Abstract
Semistructured data is of increasing importance in many application domains,
but one of its core use cases is representing documents. Consequently,
effectively retrieving information from semistructured documents is an
important problem that has seen work from both the information retrieval (IR)
and database (DB) communities. Comparing the large number of retrieval models
and systems is a non-trivial task for which established benchmark initiatives
such as TREC, with their focus on unstructured documents, are not appropriate.
This chapter gives an overview of semistructured data in general and of the
INEX initiative for the evaluation of XML retrieval, focusing on its most
prominent track, the Adhoc Search Track.