The ROSES project aims to define a set of web resource syndication services and tools for locating, integrating, querying and composing RSS feeds distributed on the Web. We distinguish between two kinds of services:
Although RSS documents can be regarded as a special kind of XML/RDF document that can be queried with any existing XML (XQuery, XPath) or RDF (SPARQL) query language, the combination of RSS syndication, XML query processing, and distributed data and query processing creates new technical and scientific challenges that we intend to tackle in this project:
This project addresses the following key technical and scientific issues, which combine XML query processing, distribution and information flows:
Like search engines, which already play an important role in the modern information society, web syndication is becoming increasingly important at the economic level. One explanation for the success of web content syndication is the observation that a large share of the information published on the web consists of time-stamped, uniquely identified chunks of data with metadata (news stories, uploaded photos, events, podcasts, wiki changes, source code changes, bug reports). The ability to create, observe and aggregate well-defined information channels on the web reduces the distance (cost, time, effort) between information producers and information consumers at web scale:
The ROSES project aims to develop a flexible and efficient web syndication model for building this kind of application. Flexibility and efficiency are achieved through a high-level syndication model based on declarative languages and distributed XML data management technology.
This proposal addresses several priorities and objectives mentioned in the MDCO programme of the ANR call for projects. The main objective is to develop a web information management infrastructure combining distributed XML data management and RSS web resource syndication. The project takes into account several important dimensions of web information:
RSS syndication is used to reduce the “publication lag” of web resources. RSS feeds are XML documents distributed all over the web, and the number of RSS feeds is growing every day.
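Because an RSS feed is an ordinary XML document, a path expression is enough to select items from it. The following minimal sketch illustrates this with Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The sample feed, its URLs and the helper name `titles_in_category` are invented for illustration and are not part of the ROSES tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document, invented for this illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example news</title>
    <link>http://example.org/</link>
    <item>
      <title>First story</title>
      <guid>http://example.org/items/1</guid>
      <category>science</category>
    </item>
    <item>
      <title>Second story</title>
      <guid>http://example.org/items/2</guid>
      <category>sports</category>
    </item>
  </channel>
</rss>"""


def titles_in_category(feed_xml, category):
    """Return the titles of all items whose <category> equals `category`."""
    root = ET.fromstring(feed_xml)
    # The predicate [category='...'] keeps items that have a <category>
    # child whose text is exactly the requested value. The value is
    # interpolated directly, so it must not contain a single quote.
    items = root.findall(f"./channel/item[category='{category}']")
    return [item.findtext("title") for item in items]


if __name__ == "__main__":
    print(titles_in_category(SAMPLE_FEED, "science"))
```

A full XQuery or SPARQL engine would allow much richer queries. This sketch only shows that the feed itself needs no special treatment to be queried.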
The main research topics of this project fall under “Axe 2 : Algorithmes pour le traitement massif de données” (Axis 2: Algorithms for massive data processing, page 8):
The main expected results and contributions are:
The following figure summarizes the technical and scientific contribution of the proposed project and its integration with existing technology. The right part of the figure (gray) shows the simplest form of RSS-based web resource syndication: a feed is an evolving XML document downloaded by a specialized user interface (an RSS reader). The rest of the figure shows the architectures we will study in this proposal.
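The simplest architecture described above, in which a reader repeatedly downloads an evolving feed, can be sketched as follows. This is an illustrative assumption about how such a reader might work, not a description of any ROSES component: the class `FeedReader`, the helper `make_feed` and the identifiers `a`, `b`, `c` are hypothetical, and the network download is replaced by in-memory snapshots so that the example is self-contained.

```python
import xml.etree.ElementTree as ET


def make_feed(guids):
    """Build a tiny RSS 2.0 snapshot containing one item per guid (test helper)."""
    items = "".join(
        f"<item><title>Item {g}</title><guid>{g}</guid></item>" for g in guids
    )
    return f'<rss version="2.0"><channel><title>Demo</title>{items}</channel></rss>'


class FeedReader:
    """Remembers which items it has seen across successive feed snapshots."""

    def __init__(self):
        self.seen = set()

    def refresh(self, feed_xml):
        """Return the guids present in this snapshot but in no earlier one."""
        root = ET.fromstring(feed_xml)
        new_guids = []
        for item in root.iter("item"):
            guid = item.findtext("guid")
            # Items without a guid cannot be identified reliably; skip them.
            if guid is not None and guid not in self.seen:
                self.seen.add(guid)
                new_guids.append(guid)
        return new_guids


if __name__ == "__main__":
    reader = FeedReader()
    print(reader.refresh(make_feed(["a", "b"])))  # first download
    print(reader.refresh(make_feed(["b", "c"])))  # feed has evolved
```

The sketch relies on each item carrying a unique identifier, which is what allows a reader to tell new items from ones it has already shown.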