Synopsis

The web genre corpus 2004 (Genre-KI-04) is designed for evaluating genre classification techniques. It consists of 1239 web documents classified into 8 genres, along with basic metadata for each file.

Access

Please refer to this publication when citing the dataset. To link to the dataset, please use the dataset permalink [doi].

  • Download the dataset from Zenodo.
  • Find the related metadata at Google.

People

Publications