Open Thesis Topics

Students who are eager to develop their skills through a research-oriented thesis in our group should email us their interests. Suitable topic candidates are listed below. Your own topic suggestions are also welcome; our recent publications can serve as inspiration.

  • Analyzing Factors of Persuasion in an Online Debate Forum
  • Assessing the Robustness of Retrieval Models to Adversarial Examples
  • Constructing and Analyzing a Diverse Corpus of AI-generated Text
  • Contrastive Ranking-Aware Learning of Representations for Retrieval
  • Evaluating Document-level Classification by Combining Sentence-level Predictions
  • Extreme Multi-Label Classification of German Book Titles
  • Facets of complexity in scholarly political language
  • Fine-granular and Web-scale Language Identification for Multi-lingual LLMs
  • Jointly Learning Decoupled Bi-Encoder Representations for Retrieval
  • Mining Documents with Trigger Warnings from the Web and Social Media
  • Psychological Features for Argumentation
  • Rating the Degree of Search Engine Optimization of Websites
  • Retrieval Augmented Generation for Political Science Question Answering
  • Simplifying the language of political argumentation
  • Text2SQL: Exploring Relational Databases with Natural Language User Interfaces

Open Student Assistant Topics

Students who want to improve their skills and work with us can apply for a position as a student assistant. We are currently looking for assistants to work on the following topics:

  • We currently have no open topics. You can always let us know that you're interested.

Ongoing Theses

  • Halle
    • Simulation von Suchanfragen durch Anchortext (supervised by Maik Fröbe, Sebastian Günther, and Matthias Hagen)
  • Jena
    • Answering Open-Ended Health-Related Questions based on Trusted Sources (supervised by Jan Heinrich Reimer, Alexander Bondarenko, and Matthias Hagen)
    • Re-Creating Twitter-based IR and NLP-Experiments on the Fediverse (supervised by Jan Heinrich Reimer and Matti Wiegmann)
  • Leipzig
    • Lightweight Passage Re-ranking Using Embeddings from Pre-trained Language Models (supervised by Ferdinand Schlatt and Harry Scells)
    • Logical Features of Neural Networks (supervised by Maximilian Heinrich)
    • Classification of Multimodal Social Media Posts (supervised by Tim Gollub)
    • Active Learning for Text Classification (supervised by Christian Kahmann and Christopher Schröder)
    • Incorporating Knowledge Graph Embeddings in Large Language Models (supervised by Ferdinand Schlatt)
    • Implicit Evaluation of Health Answers from Large Generative Text Models (supervised by Sebastian Schmidt and Harry Scells)
    • Improving Compositionality of Images Generated by Stable Diffusion (supervised by Niklas Deckers)
    • Cross-domain Counterargument Retrieval using Large Language Models (supervised by Nailia Mirzakhmedova and Johannes Kiesel)
    • Extracting Large-Scale Multimodal Datasets From Web Archives (supervised by Niklas Deckers)
    • Normdaten-Disambiguierung und Reconciliation auf Korpusdaten (supervised by Erik Körner and Felix Helfer)
    • Statistical Bootstrap Tests with Redundant Data (supervised by Maik Fröbe)
    • Collecting Fine-grained, Intensity-aware Annotations of Triggering Content (supervised by Matti Wiegmann and Magdalena Wolska)
  • Weimar
    • Statistics Retrieval for Arguments (supervised by Johannes Kiesel)
    • Mimicking Personas of Dialog Participants with Large Language Models (supervised by Marcel Gohsen)
    • Character-Driven Story Generation Through Character Networks (supervised by Marcel Gohsen)
    • Health-Related Queries in Large-Scale Query Logs (supervised by Jan Heinrich Reimer)
    • Information Extraction from Academic Mailing Lists (supervised by Tim Gollub)
    • Topic Segmentation with Large Language Models (supervised by Johannes Kiesel, Nailia Mirzakhmedova, and Matti Wiegmann)
    • Retrieval Augmented Generation for the IR-Anthology (supervised by Tim Gollub)
    • Interacting with a Multi-User Voice Search in VR (supervised by Johannes Kiesel and Marcel Gohsen)
    • Mining Linked Data on Web Scale (supervised by Nikolay Kolyada)
    • Searching Personal Web Archives (supervised by Johannes Kiesel)

Resources for Students


Dear prospective PhD student, unsolicited applications to the Webis group are welcome. However, we cannot promise that open positions are available at the time of your application.

The Webis Group is a tightly cooperating research network, formed by computer science chairs at the universities of Groningen, Hannover, Jena, Leipzig, and Weimar. Our mission is to tackle challenges of the information society by conducting basic and applied research with the goal of prototyping and evaluating future information systems. We are an experienced research group where team spirit and active collaboration have top priority. We are looking for open-minded graduates and PhDs who want to develop both as researchers and as people. The working language of our group is English; fluency in German is not required.

Interested students should have completed either a master's degree or a PhD in computer science, mathematics, or a related field with excellent or very good grades. A solid background in mathematics and statistics is expected, as are very good programming skills.

Benno Stein
Bauhaus-Universität Weimar
On behalf of the Webis group