Released

Paper

Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions

MPS-Authors
Biega, Asia J.
Databases and Information Systems, MPI for Informatics, Max Planck Society

Saha Roy, Rishiraj
Databases and Information Systems, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2004.02023.pdf
(Preprint), 472KB

Citation

Biega, A. J., Schmidt, J., & Saha Roy, R. (2020). Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions. Retrieved from https://arxiv.org/abs/2004.02023.


Cite as: https://hdl.handle.net/21.11116/0000-0008-09C7-E
Abstract
Translating verbose information needs into crisp search queries is a
phenomenon that is ubiquitous but hardly understood. Insights into this process
could be valuable in several applications, including the synthesis of large,
privacy-friendly query logs from public Web sources that are readily available
to the academic research community. In this work, we take a step towards
understanding query formulation by tapping into the rich potential of community
question answering (CQA) forums. Specifically, we sample natural language (NL)
questions spanning diverse themes from the Stack Exchange platform, and conduct
a large-scale conversion experiment where crowdworkers submit search queries
they would use when looking for equivalent information. We provide a careful
analysis of this data, accounting for possible sources of bias during
conversion, along with insights into user-specific linguistic patterns and
search behaviors. We release a dataset of 7,000 question-query pairs from this
study to facilitate further research on query understanding.