Webis-ArgQuality-20

Name: webis-argquality-20
Published: 2020
License: https://creativecommons.org/licenses/by/4.0/deed.en

Synopsis

The Webis Argument Quality Corpus 2020 (Webis-ArgQuality-20) contains 1271 arguments spanning 20 topics with scores for rhetorical, logical, dialectical, and overall quality as well as topical relevance. Quality scores are inferred from a total of 42k pairwise judgments. Arguments are sourced from the args.me corpus.

Download

To download the corpus and model, use the following links:

Documentation

The Webis-ArgQuality-20 corpus consists of two sets of data: a processed version, which provides a scalar score per argument quality dimension for each annotated argument, and the raw annotation data, which provides the individual pairwise comparison labels. The structure of both datasets is described below.

Processed Data

The dataset is split into three tables. Each key represents a column name, and the explanation field describes the data the column contains. Primary keys are marked in bold. If a composite key is used, every column that is part of the key is marked. Foreign keys that can be used to reference other tables are marked in italics.

Argument Dataset

Key	Explanation
*Topic ID*	Unique identifier for the topic context
Argument ID	Unique identifier for the item within the discussion it is part of
Discussion ID	Unique identifier of the discussion the item is part of
Is Argument?	Boolean value indicating whether the item is an argument
Stance	Stance of the item; one of Pro, Con, or Not specified
Relevance	Relevance score, z-normalised
Logical Quality	Logical quality score, z-normalised
Rhetorical Quality	Rhetorical quality score, z-normalised
Dialectical Quality	Dialectical quality score, z-normalised
Combined Quality	Combined quality score, z-normalised
Premise	Text of the item's premise
Text Length	Word count of the premise
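
The score columns above are described as z-normalised. As a rough illustration of what that means, the sketch below rescales a list of made-up scores to mean 0 and unit standard deviation. Whether the corpus normalises over all arguments or per topic, and whether it uses the population or sample standard deviation, is not stated here; the population variant is an assumption, and `z_normalise` is a hypothetical helper, not part of the provided implementation.

```python
from statistics import mean, pstdev


def z_normalise(scores):
    """Shift scores to mean 0 and scale them to unit standard deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        # All scores are identical, so every z-score is 0.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Made-up raw scores, for illustration only.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_normalise(raw)
print(z)
```

A z-normalised score of 0 therefore corresponds to an average item, positive values to above-average items, and negative values to below-average items.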

Ranking Dataset

Key	Explanation
*Topic ID*	Unique identifier for the topic context
Model	Identifier of the retrieval model that produced the ranking
Rank	The rank of the argument in the respective engine's ranking
Argument ID	Unique identifier for the argument within the discussion it is part of
Discussion ID	Unique identifier of the discussion the argument is part of

Topic Dataset

Key	Explanation
*Topic ID*	Unique identifier for the topic context
Category	Thematic category the topic belongs to
Long Query	Long query, used as input for the retrieval models
Short Query	Shortened form of the query
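
Since all three tables share the Topic ID column, they can be combined with a simple lookup join. The following is a minimal sketch using only the Python standard library. The tab-separated layout, the in-memory sample rows, and the helper names `read_rows` and `join_on_topic` are assumptions made for illustration; only the column names are taken from the tables above.

```python
import csv
import io

# Hypothetical miniature tables; all values are made up for illustration.
ARGUMENTS_TSV = (
    "Topic ID\tArgument ID\tDiscussion ID\tCombined Quality\n"
    "1\ta1\td1\t0.8\n"
    "1\ta2\td1\t-0.3\n"
    "2\ta1\td7\t1.1\n"
)
TOPICS_TSV = (
    "Topic ID\tCategory\tShort Query\n"
    "1\tEducation\thomework\n"
    "2\tHealth\tvaccines\n"
)


def read_rows(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def join_on_topic(arguments, topics):
    """Attach the topic columns to every argument row via Topic ID."""
    topics_by_id = {topic["Topic ID"]: topic for topic in topics}
    joined = []
    for argument in arguments:
        topic = topics_by_id.get(argument["Topic ID"])
        if topic is None:
            continue  # skip arguments whose topic is unknown
        joined.append({**topic, **argument})
    return joined


joined = join_on_topic(read_rows(ARGUMENTS_TSV), read_rows(TOPICS_TSV))
for row in joined:
    # Argument IDs are only unique within a discussion, so the pair
    # (Discussion ID, Argument ID) is used to identify an argument.
    key = (row["Discussion ID"], row["Argument ID"])
    print(key, row["Short Query"], row["Combined Quality"])
```

Note that `csv.DictReader` returns every value as a string, so score columns need an explicit `float(...)` conversion before any numeric use.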

Raw Data

Individual comparisons are given in a dedicated table for each argument quality dimension. Relevance annotations are included as well. Each key represents a column name, and the explanation field describes the data the column contains. Primary keys are marked in bold. If a composite key is used, every column that is part of the key is marked. Foreign keys that can be used to reference other tables are marked in italics.

Quality Annotations

Key	Explanation
*Argument ID A*	Unique identifier for argument A within the discussion it is part of
*Discussion ID A*	Unique identifier of the discussion argument A is part of
*Argument ID B*	Unique identifier for argument B within the discussion it is part of
*Discussion ID B*	Unique identifier of the discussion argument B is part of
Comparison	Denotes the direction of the comparison; can be "A" if argument A is better, "B" if argument B is better, or "Tie" if both arguments are equal.
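
To give a feel for how pairwise labels of this shape can be turned into one score per argument, the sketch below computes a simple win rate in which a tie counts as half a win for both sides. This is a deliberately naive stand-in and not the model shipped with the corpus; the sample rows and the function name `win_rates` are hypothetical.

```python
from collections import defaultdict

# Made-up comparisons in the column order of the table above:
# (Argument ID A, Discussion ID A, Argument ID B, Discussion ID B, Comparison)
comparisons = [
    ("a1", "d1", "a2", "d1", "A"),
    ("a1", "d1", "a3", "d2", "Tie"),
    ("a2", "d1", "a3", "d2", "B"),
]


def win_rates(rows):
    """Return the share of comparisons each argument won (a tie counts 0.5)."""
    wins = defaultdict(float)
    seen = defaultdict(int)
    for arg_a, disc_a, arg_b, disc_b, outcome in rows:
        # Identify each argument by its (Discussion ID, Argument ID) pair.
        a, b = (disc_a, arg_a), (disc_b, arg_b)
        if outcome == "A":
            wins[a] += 1.0
        elif outcome == "B":
            wins[b] += 1.0
        elif outcome == "Tie":
            wins[a] += 0.5
            wins[b] += 0.5
        else:
            raise ValueError(f"unexpected comparison label: {outcome!r}")
        seen[a] += 1
        seen[b] += 1
    return {key: wins[key] / seen[key] for key in seen}


rates = win_rates(comparisons)
for key, rate in sorted(rates.items()):
    print(key, rate)
```

A win rate ignores the strength of the opponents an argument was compared against, which is one reason a dedicated model is preferable for deriving the processed scores.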

Relevance Annotations

Key	Explanation
Task ID	ID of the annotation task this annotation was part of
*Argument ID*	Unique identifier for the argument within the discussion it is part of
*Discussion ID*	Unique identifier of the discussion the argument is part of
Relevance	Denotes the relevance of this argument with regard to the topic on a scale of 0 (low) to 4 (high). -2 is used to mark irrelevant text/spam
Is Argument?	Boolean value indicating whether the item is an argument
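
Because -2 is a marker rather than a point on the 0 to 4 scale, it should not be averaged together with regular relevance labels. The sketch below shows one possible way to handle this: spam-marked annotations are dropped before the mean relevance per argument is computed. Dropping them is an assumption chosen for illustration, as are the sample rows and the helper name `mean_relevance`; the processed relevance scores may be derived differently.

```python
from collections import defaultdict
from statistics import mean

SPAM_MARKER = -2  # marks irrelevant text/spam, see the table above

# Made-up annotations in the column order of the table above:
# (Task ID, Argument ID, Discussion ID, Relevance, Is Argument?)
annotations = [
    ("t1", "a1", "d1", 4, True),
    ("t2", "a1", "d1", 3, True),
    ("t1", "a2", "d1", -2, False),
    ("t2", "a2", "d1", 1, True),
]


def mean_relevance(rows):
    """Average the 0 to 4 relevance labels per argument, ignoring spam markers."""
    labels = defaultdict(list)
    for _task_id, arg_id, disc_id, relevance, _is_argument in rows:
        if relevance == SPAM_MARKER:
            continue  # assumption: spam markers are excluded from the mean
        labels[(disc_id, arg_id)].append(relevance)
    return {key: mean(values) for key, values in labels.items()}


relevance = mean_relevance(annotations)
for key, value in sorted(relevance.items()):
    print(key, value)
```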

Model Implementation

A Python implementation of the model is provided; see the code comments for additional implementation details. An example of the model's usage is also included, which can be applied to the raw data to derive the processed version.
