TexBiG

Synopsis

TexBiG (from the German Text-Bild-Gefüge, meaning Text-Image-Structure) is a document layout analysis dataset for historical documents from the late 19th and early 20th century. The dataset provides instance segmentation annotations (bounding boxes and polygons/masks) for 19 different classes with more than 52,000 instances. Annotations were created manually by experts and evaluated with Krippendorff's Alpha; each document image was labeled by at least two different annotators. The dataset uses the common COCO-JSON format.
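Because the annotations follow the COCO-JSON layout, they can be inspected with a few lines of Python. The following is a minimal sketch, not an official loader: the helper names `load_coco` and `count_instances_per_class`, the file name in the usage note, and the class names in the toy example are assumptions made for illustration. Only the top-level keys `images`, `annotations`, and `categories` and the per-annotation fields come from the COCO format itself.

```python
import json
from collections import Counter


def load_coco(path: str) -> dict:
    """Read a COCO-style annotation file into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_instances_per_class(coco: dict) -> Counter:
    """Count annotated instances per category name."""
    # Map numeric category ids to their human-readable names.
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])


# Toy in-memory example with made-up class names (not the real TexBiG classes).
example = {
    "images": [{"id": 1, "file_name": "page_001.jpg", "width": 800, "height": 1200}],
    "categories": [{"id": 1, "name": "paragraph"}, {"id": 2, "name": "image"}],
    "annotations": [
        # bbox is [x, y, width, height]; segmentation is a list of polygons.
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [50, 60, 300, 120],
         "segmentation": [[50, 60, 350, 60, 350, 180, 50, 180]]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 200, 300, 90],
         "segmentation": [[50, 200, 350, 200, 350, 290, 50, 290]]},
        {"id": 12, "image_id": 1, "category_id": 2, "bbox": [400, 60, 350, 400],
         "segmentation": [[400, 60, 750, 60, 750, 460, 400, 460]]},
    ],
}

counts = count_instances_per_class(example)
```

For the downloaded dataset, the same function could be applied to `load_coco("annotations.json")`, where the file name is a placeholder for whichever annotation file the archive actually contains.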

Access

To cite the dataset, please refer to this publication. To link to the dataset, please use the dataset permalink [doi].

Download the dataset from Zenodo.
Find the related metadata at Google.

People

David Tschirschwitz
Franziska Klemstein
Benno Stein
Volker Rodehorst
