Thesis

Information Retrieval by Dimension Reduction: A Comparative Study

MPS-Authors

Parreira, Josiane Xavier
Databases and Information Systems, MPI for Informatics, Max Planck Society

Citation

Parreira, J. X. (2003). Information Retrieval by Dimension Reduction: A Comparative Study. Master Thesis, Universität des Saarlandes, Saarbrücken.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0027-F866-0
Abstract
In this work we present a study of different techniques for semantic indexing by dimension reduction, with special emphasis on Latent Semantic Indexing (LSI). Dimension reduction is important in the Information Retrieval (IR) context because it enables fast retrieval and the elimination of noisy data. LSI attempts to improve IR quality by deriving a latent semantic space of lower dimensionality, based on the co-occurrence of terms in the documents of the collection. It is a heuristic method, and although experiments show that LSI often improves retrieval performance, it still lacks a sound mathematical model and rigorous theorems. Several variants of LSI have been proposed, which differ in the function used for the mapping to the lower-dimensional space. Our comparative study is carried out using mathematical tools, such as linear algebra, and systematic experiments. We present a theoretical analysis of the two main LSI variants found in the literature, which we call Angle-stretching LSI and Angle-preserving LSI, and we prove that the results of the two can, in principle, differ arbitrarily. The experiments reveal interesting features of the LSI variants and the differences in their behavior. In our experiments, Angle-stretching LSI performs consistently worse than Angle-preserving LSI.
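
To illustrate what "variants that differ in the mapping function" can look like in practice, the following is a minimal sketch of LSI using a truncated singular value decomposition in NumPy. It shows two mappings into the reduced space that commonly appear in the LSI literature: a plain projection onto the top k left singular vectors, and the same projection rescaled by the inverse singular values. The abstract does not define Angle-stretching and Angle-preserving LSI, so this sketch makes no claim about which mapping, if either, corresponds to which variant in the thesis. All function names (`lsi_fit`, `project`, `project_scaled`, `rank_scores`) are hypothetical and chosen only for this example.

```python
import numpy as np


def lsi_fit(A, k):
    """Truncated SVD of a term-by-document matrix A (rows: terms, columns: documents)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def project(U_k, x):
    """Mapping 1 (assumed for illustration): x -> U_k^T x."""
    return U_k.T @ x


def project_scaled(U_k, s_k, x):
    """Mapping 2 (assumed for illustration): x -> Sigma_k^{-1} U_k^T x.

    Requires the k retained singular values to be nonzero.
    """
    return np.diag(1.0 / s_k) @ (U_k.T @ x)


def rank_scores(A, q, k, scaled):
    """Cosine similarity between query q and every document in the reduced space.

    Assumes neither the query nor any document maps to the zero vector.
    """
    U_k, s_k, _ = lsi_fit(A, k)
    if scaled:
        docs, query = project_scaled(U_k, s_k, A), project_scaled(U_k, s_k, q)
    else:
        docs, query = project(U_k, A), project(U_k, q)
    norms = np.linalg.norm(docs, axis=0) * np.linalg.norm(query)
    return (docs.T @ query) / norms


if __name__ == "__main__":
    # Toy collection: 5 terms, 4 documents (made-up counts, not data from the thesis).
    A = np.array([
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
    ])
    q = np.array([1.0, 1.0, 0.0, 0.0, 0.0])  # query containing the first two terms
    print("plain projection :", rank_scores(A, q, k=2, scaled=False))
    print("scaled projection:", rank_scores(A, q, k=2, scaled=True))
```

Because the two mappings weight the reduced dimensions differently, the cosine scores, and therefore possibly the document ranking, can differ between them for the same query. That is the kind of behavioral difference a comparative study of mapping functions would examine.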