The web genre corpus 2004 (Genre-KI-04) is designed for evaluating genre classification techniques. It consists of 1239 web documents classified into 8 genres, along with basic metadata for each file.
Please refer to the publications when citing the dataset. To link to the dataset, please use the dataset permalink [doi].
- Benno Stein
- Sven Meyer zu Eissen