Abstract:
Databases (DB) and information retrieval (IR) have evolved as separate fields.
However, modern applications such as customer support, health care, and digital
libraries require capabilities for both data and text management. In such
settings, traditional DB queries, in SQL or XQuery, are not flexible enough to
handle application-specific scoring and ranking. IR systems, on the other hand,
lack efficient support for handling structured parts of the data and metadata,
and do not give the application developer adequate control over the ranking
function. This paper analyzes the requirements of advanced text- and data-rich
applications for an integrated platform. The core functionality must be
manageable, and the API should be easy to program against. A particularly
important issue that we highlight is how to reconcile flexibility in scoring
and ranking models with optimizability, in order to accommodate a wide variety
of target applications efficiently. We discuss whether such a system needs to
be designed from scratch, or can be incrementally built on top of existing
architectures. The results of our analyses are cast into a series of challenges
to the DB and IR communities.