Dear prospective PhD student, unsolicited applications to the Webis group are welcome. However, we cannot promise that open positions are available at the time of your application.

The Webis group is a tightly cooperating research network, formed by computer science chairs at the universities of Halle, Leipzig, Paderborn, and Weimar. Our mission: to tackle challenges of the information society by carrying out basic and applied research with the goal of prototyping and evaluating future information systems. We are an experienced research group where team spirit and active collaboration have top priority. We are looking for open-minded graduates and PhDs who want to develop both as researchers and as people. The working language of our group is English; fluency in German is not required.

Interested students should have finished either a master's degree or a PhD in computer science, mathematics, or a related field with excellent or very good grades. A solid background in mathematics and statistics is expected, as are very good programming skills.

Benno Stein
Bauhaus-Universität Weimar
On behalf of the Webis group


Open Thesis Topics

Students who are eager to develop their skills in a research-oriented thesis in our group should mail us their interests. Suitable topics are, for example:

  • Adversarial Learning of Writing Style Representations
  • Authorship Identification with Phonological Features
  • Axiomatic Argumentative Web Scale Document Re-ranking
  • Crowdsourcing the Translation of a Book
  • Dealing with False Memories in Web Search
  • Detecting Bias in Media
  • Detecting Text Reuse from Books
  • Developing a Collaborative Writing Tool for Wikipedia
  • Detecting Bias in Search Engines
  • Detecting Ideological Bias in Word Embeddings
  • Exploiting Argumentation Knowledge Graphs for Argument Generation
  • Exploratory Analysis of Wikipedia Text Reuse
  • Exploring Argumentation Strategies in Persuasive Texts
  • Facet Completion based on Term Embeddings
  • Harvesting the Web for Building Evidence-based Knowledge Graphs
  • Neural Netspeak – Exploring the Performance of Transformer Models as Idiomatic Writing Assistants
  • Paraphrasing Operations for Heuristic Author Obfuscation
  • Paraphrasing Texts for Conversational News
  • Re-Ranking for Total Recall in Systematic Reviews
  • Semantic Search Engines for the Analysis of Debates and Discourses
  • Semi-Automatically Supporting Crowdsourcing Approval Processes
  • Simulating Search Behavior
  • Text Mining Methods for Intelligent Writing Assistance
  • The Said and the Unsaid: Analyzing Metaphors using Word Embeddings

Ongoing Theses

  • Nico Reichenbach. Argumentative Image Search (supervised by Johannes Kiesel, Martin Potthast, and Benno Stein)
  • Till Werner. Argument Quality Assessment in Natural Language using Machine Learning (supervised by Henning Wachsmuth)
  • Artur Jurk. Clickbait Spoiling (supervised by Matthias Hagen and Martin Potthast)
  • Philipp Rothe. Comparing Keyqueries for different Retrieval Models (supervised by Matthias Hagen)
  • Shaour Haider. Few Shot Learning for Text Classification (supervised by Tim Gollub and Magdalena Wolska)
  • Counterargument Generation via Premise Rebuttal (supervised by Milad Alshomary)
  • Lukas Trautner. Graph-Based Synthesis of Counterfactuals (supervised by Khalid Al-Khatib and Benno Stein)
  • Anh Phuong Le. Harvesting the Web for Building Large-scale Argumentation Graphs (supervised by Khalid Al-Khatib)
  • Alexander Rensch. Expertise Filtering for Social Media Timelines (supervised by Matthias Hagen and Martin Potthast)
  • Lukas Gehrke. Gender Regression and Gender Prediction based on Writing Style (supervised by Matti Wiegmann and Martin Potthast)
  • Nick Düsterhus. Snippet Generation for Argument Search (supervised by Milad Alshomary)
  • Hannes Winkler. Geolocation of Social Media Posts (supervised by Matti Wiegmann, Martin Potthast, Magdalena Wolska, and Benno Stein)
  • Lars Meyer. Large-scale Comparison and Analysis of Approaches and Algorithms for Web Page Segmentation (supervised by Johannes Kiesel and Martin Potthast)
  • Fan Fan. Mining High-ethos Evidence from Wikipedia (supervised by Khalid Al-Khatib and Yamen Ajjour)
  • Jan Philipp Bittner. Near-Duplicate-Detection of Webpages (supervised by Maik Fröbe and Matthias Hagen)
  • Fatema Merchant. Neural Paraphrasing Methods for Augmented Writing Tools (supervised by Khalid Al-Khatib and Shahbaz Syed)
  • Jan Heinrich Reimer. Popularity Bias in Learning-to-Rank (supervised by Maik Fröbe and Matthias Hagen)
  • Salomo Pflugradt. Reproducing Text Alignment Algorithms from PAN (supervised by Shahbaz Syed)
  • Nina Schwanke. Retrieving Police Press Releases for News Verification (supervised by Matthias Hagen and Martin Potthast)
  • Saeed Entezari. Shared Task on Argument Retrieval (supervised by Michael Völske)
  • Valentin Dittmar. Towards Answering Comparative Web Questions (supervised by Alexander Bondarenko and Matthias Hagen)
  • Unsupervised Metaphor Categorization (supervised by Henning Wachsmuth)
  • Erika Garces. Visualizing Wikipedia Text Reuse (supervised by Michael Völske and by Patrick Riehmann from the Virtual Reality group)


  • Webis facilities:

  • Operational and organizational structure (staff only):
    • proposals-generic-notes.txt [<proposals-in-progress>/proposals-generic/]
    • publications-notes.txt [<literature>/publications/]
    • research-generic-notes.txt [<research-in-progress>/research-generic/]
    • webis-organization-notes.txt [<webis-in-progress>/webis-organization/]
    • webis-web-notes.txt [<webis-in-progress>/webis-web/]

  • Communication:
    • Discord: ask staff for access
    • Google Calendar
    • Mailing list staff:
    • Mailing list students:
    • Skype: webis
    • Twitter: @webis_de
    • Whereby: webis room

  • Meeting rules:
    • Please take notes.
    • Students should also prepare an agenda (not only the PhDs).
    • Typical meeting duration: 30 minutes with student assistants, 60 minutes for projects and PhD meetings.
    • Don't stray off-topic. Respect the time of your students and PhDs.
    • Don't meet if there is nothing to discuss.

  • Community: