Query segmentation is the problem of identifying keywords in a query that together form compound concepts or phrases, such as "new york times". Such segments can help a search engine better interpret a user’s intent and tailor the search results accordingly.
As part of our research, we developed the Webis Query Segmentation Corpus 2010 (Webis-QSeC-10), which contains segmentations of 53,437 web queries, collected by crowdsourcing on Amazon Mechanical Turk (MTurk). For each query, 10 MTurk workers were asked to segment it, and the corpus records the distribution of their decisions. [api] [data] [demo]
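To illustrate what a distribution of segmentation decisions might look like, here is a minimal Python sketch. It assumes that each worker's decision is written as the query with "|" marking segment breaks; this notation, the function names, and the ten example votes are invented for illustration and are not taken from the corpus or its actual file format.

```python
from collections import Counter

# Hypothetical votes of 10 workers for the query "new york times square".
# "|" marks a segment break; these values are made up for illustration.
VOTES = (
    ["new york|times square"] * 6
    + ["new york times|square"] * 3
    + ["new york|times|square"] * 1
)


def segmentation_distribution(votes):
    """Return (segmentation, vote count) pairs, most frequent first."""
    return Counter(votes).most_common()


def break_positions(segmentation):
    """Return the word positions after which a segment break occurs."""
    positions = set()
    words_seen = 0
    segments = segmentation.split("|")
    for segment in segments[:-1]:  # no break after the last segment
        words_seen += len(segment.split())
        positions.add(words_seen)
    return positions


def break_votes(votes):
    """Count, per word position, how many workers placed a break there."""
    counts = Counter()
    for segmentation in votes:
        counts.update(break_positions(segmentation))
    return dict(counts)


if __name__ == "__main__":
    for segmentation, count in segmentation_distribution(VOTES):
        print(f"{count:2d}  {segmentation}")
    print(break_votes(VOTES))
```

Counting whole segmentations shows which reading most workers prefer, while counting votes per break position shows where the workers disagree.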
Students: Anna Beyer, Christof Bräutigam