Synopsis

CAIR was a cooperative research project with the Information Engineering Group (Universität Duisburg-Essen), funded by the German Research Foundation (DFG). The goal of CAIR was the theoretical, methodological, and experimental study of cluster analysis in information retrieval. The project investigated the role of semantics in four respects: (1) in specialized retrieval models that take knowledge of the retrieval task into account, (2) in multi-objective and interactive analyses that employ an explicit user model, (3) in hybrid merging strategies that combine algorithms, and (4) in improved cluster labeling.

One outcome of the project is the concept of "keyqueries" as document descriptors. Representing documents in terms of the search queries for which they are most relevant has natural applications in cluster analysis. Given a document collection, this representation allows the automatic generation of a hierarchical taxonomy with expressive cluster labels.
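
To make the idea concrete, the following is a minimal sketch and not the project's implementation. It assumes a simplified reading of a keyquery: a query built from a document's own terms that ranks the document among the top k results, returns at least a minimum number of results, and has no shorter sub-query with the same property. The toy `search` function, the sample collection `docs`, and the parameters `k`, `min_results`, and `max_len` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy collection; any real system would use a proper index.
docs = {
    "d1": "cluster analysis retrieval",
    "d2": "cluster cluster labeling",
    "d3": "retrieval retrieval models",
    "d4": "analysis analysis tools",
}


def search(query, docs):
    """Toy conjunctive search (an assumption, not a real engine).

    Returns the ids of documents containing every query term,
    ranked by summed term frequency, ties broken by document id.
    """
    hits = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        if all(term in tokens for term in query):
            score = sum(tokens.count(term) for term in query)
            hits.append((score, doc_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in hits]


def keyqueries(doc_id, docs, k=1, min_results=1, max_len=2):
    """Enumerate minimal queries that rank doc_id within the top k results."""
    vocabulary = sorted(set(docs[doc_id].lower().split()))
    found = []
    for length in range(1, max_len + 1):
        for query in combinations(vocabulary, length):
            # Minimality: skip queries that extend an already found keyquery.
            if any(set(shorter) <= set(query) for shorter in found):
                continue
            results = search(query, docs)
            if len(results) >= min_results and doc_id in results[:k]:
                found.append(query)
    return found


if __name__ == "__main__":
    for doc_id in docs:
        print(doc_id, keyqueries(doc_id, docs))
```

In this toy collection no single term ranks d1 first, so d1 is described only by two-term queries, whereas d2 is already described by single terms. Queries of this kind could then serve as candidate cluster labels for the documents they retrieve.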

As part of the project, we organized Dagstuhl Seminar 11171, "Challenges in Document Mining". Further information can be found on the project page of the Information Engineering Group. [data]

People

Publications