The Webis Comparative Web Search Questions 2022 (Webis-CompQuestions-22) corpus comprises more than 30,000 web questions collected from the public datasets MS MARCO, Google Natural Questions, and Quora. Each question was manually annotated as comparative or not. The comparative questions were further annotated as subjective or not, direct or not, and with or without an aspect, and were labeled at the token level with comparison objects, aspects, and predicates. For a subset of the comparative questions, potential answers are labeled with stances.
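To make the annotation layers concrete, here is a minimal Python sketch of how one annotated question could be held in memory. It is an illustration only and does not reflect the actual file format of the corpus: the class name `AnnotatedQuestion`, the field names, the tag names `OBJ`, `ASP`, `PRED`, and `O`, and the example question with its label values are all assumptions made for this sketch. Answer stance labels are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical token-level tag set (assumption; the corpus may use other names).
TOKEN_LABELS = {"OBJ", "ASP", "PRED", "O"}


@dataclass
class AnnotatedQuestion:
    """One question with question-level and token-level annotations."""
    text: str
    comparative: bool
    # The three flags below only apply to comparative questions.
    subjective: Optional[bool] = None
    direct: Optional[bool] = None
    has_aspect: Optional[bool] = None
    tokens: List[str] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Token-level labels must align one-to-one with the tokens.
        if len(self.tokens) != len(self.labels):
            raise ValueError("tokens and labels must have the same length")
        unknown = set(self.labels) - TOKEN_LABELS
        if unknown:
            raise ValueError(f"unknown token labels: {sorted(unknown)}")
        # Non-comparative questions carry no further annotations.
        flags = (self.subjective, self.direct, self.has_aspect)
        if not self.comparative and any(f is not None for f in flags):
            raise ValueError("only comparative questions have sub-labels")

    def tokens_with(self, label: str) -> List[str]:
        """Return all tokens that carry the given token-level label."""
        return [t for t, l in zip(self.tokens, self.labels) if l == label]


# Invented example question (not taken from the corpus).
example = AnnotatedQuestion(
    text="Is Python faster than Java for web development?",
    comparative=True,
    subjective=False,
    direct=True,
    has_aspect=True,
    tokens=["Is", "Python", "faster", "than", "Java",
            "for", "web", "development", "?"],
    labels=["O", "OBJ", "PRED", "O", "OBJ", "O", "ASP", "ASP", "O"],
)

print(example.tokens_with("OBJ"))
print(example.tokens_with("ASP"))
```

Keeping the question-level flags optional mirrors the description above, where only questions marked as comparative receive the additional labels.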


You can access the Webis-CompQuestions-22 corpus on GitHub.

If you use the dataset in your research, please send us a copy of your publication. We kindly ask you to cite the data using [bib].