The CompleteSearch Engine: Interactive, Efficient, and Towards IR & DB integration

Bast, Holger; Weber, Ingmar

Conference Paper

MPS-Authors

Bast, Holger
Algorithms and Complexity, MPI for Informatics, Max Planck Society

Weber, Ingmar
Algorithms and Complexity, MPI for Informatics, Max Planck Society

Citation

Bast, H., & Weber, I. (2007). The CompleteSearch Engine: Interactive, Efficient, and Towards IR & DB integration. In Third Biennial Conference on Innovative Data Systems Research (pp. 88-95).

Cite as: https://hdl.handle.net/11858/00-001M-0000-000F-20F4-C

Abstract

We describe CompleteSearch, an interactive search engine that offers the user a variety of complex features, which at first glance have little in common, yet are all provided via one and the same highly optimized core mechanism. This mechanism answers queries for what we call context-sensitive prefix search and completion: given a set of documents and a word range, compute all words from that range which are contained in one of the given documents, as well as those of the given documents which contain a word from the given range. Among the supported features are: (i) automatic query completion, for example, find all completions of the prefix “seman” that occur in the context of the word “ontology”, as well as the best hits for any such completion; (ii) semi-structured (XML) retrieval, for example, find all email messages with “dbworld” in the subject line; (iii) semantic search, for example, find all politicians who had a private audience with the pope; (iv) DB-style joins and grouping, for example, find the most prolific authors with at least one paper in both “SIGMOD” and “SIGIR”; and (v) arbitrary combinations of these. The prefix search and completion mechanism of CompleteSearch is realized via a novel kind of index data structure, which enables subsecond query processing times for collections up to a terabyte of data, on a single PC. We report on a number of lessons learned in the process of building the system and on our experience with a number of publicly used deployments.
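
The definition of context-sensitive prefix search and completion given in the abstract can be illustrated with a small sketch. The code below is a naive reference version over a plain in-memory inverted index and is not the index data structure the paper describes. The function names (`prefix_search_and_complete`, `prefix_range`), the dictionary layout of the index, and the toy documents are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right


def prefix_range(prefix):
    """Turn a prefix into an inclusive word range (hypothetical helper).

    Appending the largest Unicode code point gives an upper bound that
    sorts after every word starting with the prefix.
    """
    return prefix, prefix + "\U0010ffff"


def prefix_search_and_complete(index, doc_ids, lo, hi):
    """Naive context-sensitive prefix search and completion.

    index   -- dict mapping each word to the set of ids of documents containing it
    doc_ids -- set of candidate document ids (the "context")
    lo, hi  -- inclusive bounds of the word range

    Returns (completions, matching_docs):
      completions   -- words in [lo, hi] that occur in at least one candidate document
      matching_docs -- candidate documents that contain at least one such word
    """
    vocabulary = sorted(index)  # re-sorted on every call; acceptable for a sketch
    start = bisect_left(vocabulary, lo)
    end = bisect_right(vocabulary, hi)

    completions = []
    matching_docs = set()
    for word in vocabulary[start:end]:
        hits = index[word] & doc_ids
        if hits:
            completions.append(word)
            matching_docs |= hits
    return completions, matching_docs


# Toy inverted index (made-up data): word -> ids of documents containing it.
toy_index = {
    "ontology": {1, 2, 3},
    "search": {3},
    "semantic": {1, 4},
    "semantics": {2},
    "seminar": {5},
}

# Completions of "seman" in the context of documents that contain "ontology".
context = toy_index["ontology"]
lo, hi = prefix_range("seman")
completions, docs = prefix_search_and_complete(toy_index, context, lo, hi)
```

In this toy setup the context is simply the set of documents matching the earlier query word, which mirrors the “seman” and “ontology” example from the abstract. This version scans every word in the range and intersects its document set with the context, so its cost grows with the number of words in the range; it only pins down what the query is meant to return, not how to answer it within the processing times the abstract reports.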