Open Thesis Topics

Students who are eager to develop their skills through a research-oriented thesis in our group should email us their interests. Suitable topic candidates are listed below. Your own topic suggestions are also welcome; our recent publications may serve as inspiration.

  • Creating a Large-Scale Graph of Causal Relations from Web Data
  • Does Training Data from Different Corpora Benefit Learning-to-Rank?
  • Incorporating Knowledge Graph Embeddings in Large Language Models
  • Reducing the Size of Dense Retrieval Indexes by Removing Unimportant Terms
  • Query Obfuscation for Dense Retrieval Models

Ongoing Theses

  • Halle
    • Jan Heinrich Reimer. Health-Related Information Retrieval (supervised by Alexander Bondarenko, Maik Fröbe, and Matthias Hagen)
    • Max Henze. Simulation von Suchanfragen durch Anchortext [Simulating Search Queries with Anchor Text] (supervised by Maik Fröbe, Sebastian Günther, and Matthias Hagen)
  • Jena
    • Niklas Rausch. Multi-Task Learning with IR Axioms (supervised by Maik Fröbe, Alexander Bondarenko, and Matthias Hagen)
    • Felix Juch. Weather Event Extraction from Tweets (supervised by Matti Wiegmann)
  • Leipzig
    • Janko Götze. Cross-domain Counterargument Retrieval using Large Language Models (supervised by Nailia Mirzakhmedova, Johannes Kiesel)
    • Marvin Vogel. Axiomatic Re-ranking for Argument Search (supervised by Maximilian Heinrich, Johannes Kiesel, Alexander Bondarenko)
    • Jerome Würf. An Annotation Game in the Media Bias Domain (supervised by Theresa Elstner, Timo Spinde, and Martin Potthast)
    • Ruben Kohlmeyer. Probing Large Language Models for Causal Knowledge (supervised by Ferdinand Schlatt)
    • Julia Peters. Manipulating Embeddings of Stable Diffusion Prompts (supervised by Niklas Deckers)
    • Thilo Brummerloh. Extracting Large-Scale Multimodal Datasets From Web Archives (supervised by Niklas Deckers)
    • Pia Sülzle. Detecting Hidden Meaning in Stock Images (supervised by Niklas Deckers)
    • Marc-Pascal Richter. Normdaten-Disambiguierung und Reconciliation auf Korpusdaten [Authority Data Disambiguation and Reconciliation on Corpus Data] (supervised by Erik Körner, Felix Helfer)
    • Thomas Abel. Comparative Scholar: Exploiting Pairwise Neural Networks for High-Recall Literature Search (supervised by Maik Fröbe, Lukas Gienapp)
    • Jonas Stahl. Adapting Sentence Embeddings to OCR erroneous data (supervised by Kim Bürgl)
    • Yaowei Zhang. Semi-automatic Knowledge Graph Authoring to Facilitate Retrieval of Expert Knowledge (supervised by Marcel Gohsen)
    • Leon Naumov. Last but not Least: Avoiding and Explicating Biases in Spoken Lists of Arguments (supervised by Johannes Kiesel)
    • Dominik Schwabe. Unsupervised Frame Identification in Argumentative Discussions (supervised by Shahbaz Syed and Khalid Al-Khatib)
    • Ahmad Dawar Hakimi. Contextualized Summarization of Scholarly Documents (supervised by Shahbaz Syed and Khalid Al-Khatib)
    • Deniz Simsek. Verbalizing Entity-based Answers in Conversational QA-Systems (supervised by Marcel Gohsen and Johannes Kiesel)
    • Gabriel Huppenbauer. Context Dynamics of the Term Sustainability (supervised by Christian Kahmann)
    • Jonas Richter. Knowledge Graph of resistance network in Nazi Germany (supervised by Andreas Niekler and Christian Kahmann)
    • Roy Rodney. A Web-based Implementation of the Netspeak Wordgraph (supervised by Tariq Youssef)
    • Hannes Winkler. Digital Monitor of Saxony (supervised by Andreas Niekler)
    • Markus Kobold. Etymological data from Wiktionary as a graph (supervised by Thomas Efer)
    • Wolfgang Kircheis. Analyzing the History Section of Wikipedia Articles (supervised by Martin Potthast)
    • Maximus Germer. Chess Report Generation with Data-to-text (supervised by Janos Borst and Andreas Niekler)
    • Cariem El Wakil. Training a TTS Model with custom speech data for Galileofication (supervised by Andreas Niekler)
    • Moritz Brunsch. Multi-Label Active Learning with Many Irrelevant Examples (supervised by Christopher Schröder)
    • Nils Schröder. Short Text Classification (supervised by Christian Kahmann and Christopher Schröder)
    • Karl Hase. Statistical Bootstrap Tests with Redundant Data (supervised by Maik Fröbe)
    • Gregor Pfänder. Generating a Large Scale Corpus of Text-aligned Medical Entities (supervised by Ferdinand Schlatt)
    • Max Staats. Estimating Corpus Statistics with Large Language Models (supervised by Matti Wiegmann)
    • Simon Reich. Integrating Information Retrieval Toolkits into TIRA (supervised by Maik Fröbe and Jan Heinrich Reimer)
    • Jennifer Rakete. Bootstrapping Training Data for Sentence-Level Trigger Detection (supervised by Matti Wiegmann and Magdalena Wolska)
  • Weimar
    • Ali Saqallah. Civilians at War: Quantifying the Effects of Real-World Conflicts on Wikipedia (supervised by Johannes Kiesel)
    • Alban Bruder. Interacting with a Multi-User Voice Search in VR (supervised by Johannes Kiesel and Marcel Gohsen)
    • Kshitij Pandit. Mining Linked Data on Web Scale (supervised by Nikolay Kolyada)
    • Ludwig Lorenz. Searching Personal Web Archives (supervised by Johannes Kiesel)

Resources for Students


Dear prospective PhD student, unsolicited applications to the Webis group are welcome. However, we cannot promise that open positions are available at the time of your application.

The Webis Group is a tightly cooperating research network, formed by computer science chairs at the universities of Groningen, Hannover, Jena, Leipzig, and Weimar. Our mission is to tackle challenges of the information society by conducting basic and applied research with the goal of prototyping and evaluating future information systems. We are an experienced research group in which team spirit and active collaboration have top priority. We are looking for open-minded graduates and PhDs who want to develop both as researchers and as people. The working language of our group is English; fluency in German is not required.

Interested students should have completed either a master's degree or a PhD in computer science, mathematics, or a related field with excellent or very good grades. A solid background in mathematics and statistics is expected, as are very good programming skills.

Benno Stein
Bauhaus-Universität Weimar
On behalf of the Webis group