Open Thesis Topics

Students who are eager to develop their skills by doing a research-oriented thesis in our group should email us their interests. Suitable topic candidates are shown in the following list, which is not meant to be complete:

  • Comparative Scholar: Exploiting Pairwise Neural Networks for High-Recall Literature Search
  • Polyglot Retrieval Systems with GraalVM

Ongoing Theses

  • Halle
    • Danik Hollatz. Precision-Oriented Argument Retrieval (supervised by Maik Fröbe and Alexander Bondarenko)
    • Erik Reuter. Interdocument-Aware Learning-to-Rank Using a Long Document Transformer (supervised by Ferdinand Schlatt)
  • Leipzig
    • Yaowei Zhang. Semi-automatic Knowledge Graph Authoring to Facilitate Retrieval of Expert Knowledge (supervised by Marcel Gohsen)
    • Leon Naumov. Last but not Least: Avoiding and Explicating Biases in Spoken Lists of Arguments (supervised by Johannes Kiesel)
    • Dominik Schwabe. Unsupervised Frame Identification in Argumentative Discussions (supervised by Shahbaz Syed)
    • Ahmad Dawar Hakimi. Contextualized Summarization of Scholarly Documents (supervised by Shahbaz Syed)
    • Deniz Simsek. Verbalizing Entity-based Answers in Conversational QA-Systems (supervised by Marcel Gohsen and Johannes Kiesel)
    • Henrik Bininda. Crawling and Analyzing the Novelupdates Corpus (supervised by Erik Körner)
    • Hannes Hansen. From contextualized to static word embeddings (supervised by Niklas Deckers, Clara Meister, and Lukas Muttenthaler)
    • Simon Kleine. How we Argue: A Study of Vocal Argument-seeking Conversations (supervised by Johannes Kiesel)
    • Kai Knappik. Simulation of Web Users for Online Discourse Analysis (supervised by Tim Gollub, Sebastian Günther, and Johannes Kiesel)
    • Gabriel Huppenbauer. Context Dynamics of the Term Sustainability (supervised by Christian Kahmann)
    • Christian Staudte. Building a Large-scale Argumentation Graph (supervised by Khalid Al-Khatib)
    • Eric Schmidt. Identifying Debating Strategies on Wikipedia (supervised by Khalid Al-Khatib)
    • Nicolas Handke. What's your Point? Identifying Values in Arguments (supervised by Johannes Kiesel)
    • Yiwen Cao. Mapping travel routes based on travelogue narrative (supervised by Andreas Niekler and Magdalena Wolska)
    • Jonas Richter. Knowledge Graph of resistance network in Nazi Germany (supervised by Andreas Niekler and Christian Kahmann)
    • Clemens Schöne. Acquiring corpora with triggering content (supervised by Andreas Niekler and Magdalena Wolska)
    • Roy Rodney. A Web-based Implementation of the Netspeak Wordgraph (supervised by Tariq Youssef)
    • Hannes Winkler. Digital Monitor of Saxony (supervised by Andreas Niekler)
    • Markus Kobold. Etymological data from Wiktionary as a graph (supervised by Thomas Efer)
    • Wolfgang Kircheis. Analyzing the History Section of Wikipedia Articles (supervised by Martin Potthast)
    • Ole Borchardt. Language Models for the Correction of OCR-Errors in Historic Documents (supervised by Tim Gollub and Janek Bevendorff)
    • Yannick Dannies. Investigating Stopping Criteria for Active Learning (supervised by Christopher Schröder)
    • Maximus Germer. Chess Report Generation with Data-to-text (supervised by Janos Borst and Andreas Niekler)
    • Ferdinand Lange. Detecting Text Reuse from Books (supervised by Lukas Gienapp)
    • Cariem El Wakil. Training a TTS Model with custom speech data for Galileofication (supervised by Andreas Niekler)
    • Mathias Halbauer. Investigating Paneling and Sampling Techniques to Approximate Polling Data through Social Media (supervised by Matti Wiegmann)
    • Bernhard Jung. Early Hype Detection - Detecting and Tracking Investment Hypes on Reddit (supervised by Matti Wiegmann, Erik Körner, and Michael Völske)
    • Moritz Brunsch. Multi-Label Active Learning with Many Irrelevant Examples (supervised by Christopher Schröder)
    • Nils Schröder. Short Text Classification (supervised by Christian Kahmann and Christopher Schröder)
    • Charly Zimmer. Improving Causal Relation Extraction from the Web (supervised by Ferdinand Schlatt)
    • Gregor Pfänder. Automatic Data Extraction for Metanalyses from Biomedical Publications (supervised by Ferdinand Schlatt)
  • Weimar
    • Ludwig Lorenz. Searching Personal Web Archives (supervised by Johannes Kiesel)
    • Sanket Gupta. Advancing and Benchmarking Large-Scale Content Extraction from the Web (supervised by Janek Bevendorff, Johannes Kiesel, and Nikolay Kolyada)
    • Oliver Singler. Quantifying evidence of poetry perception based on physiological response to recital (supervised by Jan Ehlers and Magdalena Wolska)
    • Sebastian Laverde. Disentangling Aspects from Text Representations (supervised by Tim Gollub)
    • Hans Lienhop. Rapid Prototyping for the Digital Humanities (supervised by Tim Gollub)

Resources for Students


Dear prospective PhD student, unsolicited applications to the Webis group are welcome. However, we cannot promise that open positions are available at the time of your application.

The Webis Group is a tightly cooperating research network, formed by computer science chairs at the universities of Halle, Leipzig, Paderborn, and Weimar. Our mission is to tackle challenges of the information society by conducting basic and applied research with the goal of prototyping and evaluating future information systems. We are an experienced research group where team spirit and active collaboration have top priority. We are looking for open-minded graduates and PhDs who want to develop both as researchers and as people. The working language of our group is English; fluency in German is not required.

Interested students should have finished either a master's degree or a PhD in computer science, mathematics, or a related field with excellent or very good grades. A solid background in mathematics and statistics is expected, as are very good programming skills.

Benno Stein
Bauhaus-Universität Weimar
On behalf of the Webis group