The Webis Query Segmentation Corpus 2010 (Webis-QSeC-10) contains segmentations for 53,437 web queries, collected via crowdsourcing on Amazon Mechanical Turk (MTurk); 4,850 of these queries were used for training in our CIKM 2012 paper. Each query was segmented by at least 10 MTurk workers, and the corpus records the distribution of their decisions.
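As a rough illustration of what a distribution of segmentation decisions means, the following Python sketch tallies worker votes for a single query. The `|` separator, the example query, the vote counts, and the function name `segmentation_distribution` are all assumptions made for illustration; they are not the corpus's actual file format or contents.

```python
from collections import Counter


def segmentation_distribution(worker_segmentations):
    """Map each distinct segmentation to its share of the worker votes."""
    counts = Counter(worker_segmentations)
    total = sum(counts.values())
    # most_common() orders segmentations from most to least frequent
    return {seg: n / total for seg, n in counts.most_common()}


# Hypothetical votes from 10 workers for one query; "|" marks a segment
# boundary (an assumed notation, not the corpus format).
votes = (
    ["new york|times square"] * 6
    + ["new york times|square"] * 3
    + ["new york|times|square"] * 1
)

distribution = segmentation_distribution(votes)
for segmentation, share in distribution.items():
    print(f"{share:.1f}  {segmentation}")
```

Keeping the full distribution, rather than only the majority segmentation, preserves information about how strongly the workers agreed on a given query.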
To cite the dataset, please refer to the publications. To link to the dataset, please use the dataset permalink [doi].