Retrieval Models
In the literature a distinction between empirical models, probabilistic models, and language models is often made, which is rooted in the query-oriented understanding of retrieval tasks but also has historical reasons. Our map reflects this distinction.
By clicking on a model acronym in the map, a short description of the respective retrieval model is displayed below the map. A retrieval model is either empirical, probabilistic, or of language model type. Below a model's acronym you find a code in the form of a quadruple, [1 2 3 4], which hints at the model's characteristics along four dimensions:
(1) Feature type, which defines the basic principle used to capture a document's content; possible values include document terms [T], latent or explicit taxonomic concepts [C], or an (often NLP-based) method yielding special [S] features.
(2) Foundation of the retrieval status value (RSV) computation; possible values include feature vector similarity [φ], relevance [ρ] assessment, or the ability of a document to generate [γ] a query.
(3) Dependency on a closed world; possible values are open [∪], where the document collection need not be completely given, and closed [∩], where the collection must be completely given in order to compute global characteristics.
(4) External knowledge, if used at all; possible values include none [∅], user feedback [✓], e.g., for relevance assessment purposes, and an additional [+] document collection, e.g., for computing collection-relative document similarities.
Our scheme is not intended to differentiate exactly between all particularities of a model, but is meant to pinpoint a retrieval model's strengths and weaknesses. If you find it useful, if you have hints for its improvement, or if you detect incorrect statements, please drop us a mail. Finally, we kindly ask you to refer to the overview using the related publication (Stein et al., 2017).
Boolean Model
The Boolean model represents documents as sets of terms. Its underlying idea is quite intuitive: a term is either present or absent in a document. That way, all term weights are binary, and a query is simply a Boolean expression; the similarity of a document to a query is either 1 (relevant) or 0 (not relevant). Drawbacks of the Boolean model include that (1) all words are weighted equally, (2) it retrieves only exact matches, and (3) no ranking of documents is possible. Even though the Boolean model is considered the weakest model, it is easy to understand and to implement.
G. Salton, and M.J. McGill. Introduction to Modern Information Retrieval. McGraw-Hill Book Co., New York, 1986.
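The following minimal sketch illustrates the principle with a made-up document collection; the query is expressed directly as a Boolean condition over term presence.

```python
# Minimal sketch: documents are indexed as sets of terms (binary weights),
# and a query is a Boolean condition over term presence.
documents = {
    "d1": "information retrieval with the boolean model",
    "d2": "vector space model for automatic indexing",
    "d3": "probabilistic models of information retrieval",
}
index = {doc_id: set(text.split()) for doc_id, text in documents.items()}

# Query: retrieval AND (boolean OR probabilistic)
def query(terms):
    return "retrieval" in terms and ("boolean" in terms or "probabilistic" in terms)

relevant = [doc_id for doc_id, terms in index.items() if query(terms)]
print(relevant)  # ['d1', 'd3'] -- each document is either relevant (1) or not (0)
```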
Vector Space Model
The Vector Space model (VSM) represents each text document as a vector, where each dimension of the vector is associated with a single word. Because each word is treated as a separate dimension, semantic information that is available in the original texts gets lost. The model not only disregards the original word order; synonymous and homonymous words can also cause problems. For example, consider the synonyms "buy" and "purchase": they have the same meaning, but will be represented as two distinct dimensions of the vector. By contrast, a homonymous word like "crane" refers to several meanings, such as a bird or a machine, but all meanings will be mapped to a single dimension. Thus, information gets lost in both cases. The Generalized Vector Space model addresses these problems.
G. Salton, A. Wong, and C. S. Yang. A Vector Space Model for Automatic Indexing. In Commun. ACM, 18(11):613–620, 1975.
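A minimal sketch of the idea, using plain term-frequency vectors and cosine similarity (tf-idf weighting is omitted for brevity); the documents and the query are made up for illustration.

```python
import math
from collections import Counter

documents = [
    "the user wants to buy a new car",
    "where to purchase a used car",
    "the crane is a large bird",
]
query = "buy a car"

def tf_vector(text):
    """Term-frequency vector of a text under the bag-of-words assumption."""
    return Counter(text.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

q = tf_vector(query)
ranking = sorted(documents, key=lambda d: cosine(tf_vector(d), q), reverse=True)
print(ranking[0])  # "buy" and "purchase" are distinct dimensions, so only the
                   # first document matches the query term "buy"
```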
Fuzzy Set Model
In mathematics, a fuzzy set is defined as a collection of elements whose boundaries are not well-defined; they are fuzzy. The Fuzzy Set model itself can be seen as an extension of the classic Boolean model. Each query term defines a fuzzy set of documents, and each document has a degree of membership in this set instead of being just present or absent. Furthermore, classic retrieval models assume that index terms are independent of each other, which in reality is not the case. For example, consider the two phrases "We do what we like" and "We like what we do." Hence, Fuzzy Set models utilize a term-term correlation matrix that maps relationships between terms and from which the fuzzy set of documents for a query term is derived.
Y. Ogawa, T. Morita, and K. Kobayashi. A fuzzy document retrieval system using the keyword connection matrix and a learning method. In Fuzzy Sets and Systems, 39(2):163–179, 1991.
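A small sketch along the lines of a keyword connection matrix: term-term correlations are derived from document co-occurrence, and a document's degree of membership in the fuzzy set of a query term combines the correlations of its terms. The collection and the particular combination formula are illustrative, not necessarily the exact estimates of the cited paper.

```python
# Term-term correlation from document co-occurrence counts, and a document's
# degree of membership in the fuzzy set of a query term.
documents = {
    "d1": {"fuzzy", "set", "retrieval"},
    "d2": {"fuzzy", "logic", "control"},
    "d3": {"document", "retrieval", "model"},
}

def correlation(t_i, t_l):
    n_i = sum(1 for terms in documents.values() if t_i in terms)
    n_l = sum(1 for terms in documents.values() if t_l in terms)
    n_il = sum(1 for terms in documents.values() if t_i in terms and t_l in terms)
    return n_il / (n_i + n_l - n_il)

def membership(doc_terms, query_term):
    """Degree to which a document belongs to the fuzzy set of a query term."""
    degree = 1.0
    for t in doc_terms:
        degree *= 1.0 - correlation(query_term, t)
    return 1.0 - degree

for doc_id, terms in documents.items():
    print(doc_id, round(membership(terms, "fuzzy"), 2))
# d1 and d2 contain "fuzzy" itself (membership 1.0); d3 obtains a partial
# degree via the correlated term "retrieval".
```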
Generalized Vector Space Model
The Generalized Vector Space Model (GVSM) addresses the orthogonality assumption of the Vector Space Model (VSM) in information retrieval. The orthogonality assumption refers to the independence of the terms in the VSM. For example, the semantic similarity between the synonyms "house" and "home" would not be treated any differently from that of unrelated word pairs in the VSM. The GVSM captures the semantic similarity between terms and drops the pairwise orthogonality assumption by introducing term-to-term correlations.
S. K. M. Wong, W. Ziarko, and P. C. N. Wong. Generalized Vector Spaces Model in Information Retrieval. In Proceedings of the Eighth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 18–25, Montreal, Canada, 1985.
Latent Semantic Indexing
The objective of Latent Semantic Indexing (LSI) is to overcome two problems of bag-of-words information retrieval. First, the user may specify valid synonyms of words that are not contained in relevant documents. Second, documents may be returned whose matching words carry a different meaning from the one intended by the user. The LSI model attempts to address these problems with a statistical approach: a large term-document matrix is used to form a "semantic space" in which related term-document pairs are organized close together. To uncover these patterns, the matrix is decomposed with the mathematical technique of singular value decomposition.
S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by Latent Semantic Analysis. In Journal of the American Society of Information Science, 41(6):391–407, 1990.
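A minimal sketch of the approach, assuming NumPy: a small term-document matrix is decomposed via a truncated SVD, and documents are compared in the resulting latent space; the data is made up for illustration.

```python
import numpy as np

# Rows are terms, columns are documents (raw term counts).
terms = ["house", "home", "sale", "bird", "crane"]
A = np.array([
    [2, 0, 0],   # house
    [0, 2, 0],   # home
    [1, 1, 0],   # sale
    [0, 0, 2],   # bird
    [0, 0, 1],   # crane
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                        # number of latent dimensions to keep
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T    # documents in the latent space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Documents 0 and 1 use different words ("house" vs. "home") and overlap only
# in "sale"; the latent space places them close together (cosine near 1),
# while the unrelated third document stays near 0.
print(round(cosine(doc_vectors[0], doc_vectors[1]), 2))
print(round(cosine(doc_vectors[0], doc_vectors[2]), 2))
```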
Genre Classification
Genre classification refers to classifying content into categories such as press, fiction, and scholarly work. Applied to information retrieval, it allows result lists to be filtered by genre. Discriminant analysis, a method from descriptive statistics, can be applied here: discriminant functions are derived from pre-categorized data, using input features based on parts of speech, word frequencies, and other language statistics. Unseen samples can then be categorized with the derived functions.
J. Karlgren and D. Cutting. Recognizing Text Genres with Simple Metrics using Discriminant Analysis. In Proceedings of the Fifteenth Conference on Computational Linguistics, pages 1071–1075, Kyoto, Japan, 1994.
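A hedged sketch of the approach, assuming scikit-learn's LinearDiscriminantAnalysis as the discriminant analysis implementation; the feature values and genre labels are made up for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Each row describes one document by simple, made-up language statistics
# (e.g., average word length, pronoun ratio, digit ratio); labels are genres.
X = np.array([
    [4.2, 0.08, 0.01],   # press
    [4.5, 0.06, 0.02],   # press
    [3.9, 0.14, 0.00],   # fiction
    [3.8, 0.16, 0.00],   # fiction
    [5.6, 0.03, 0.04],   # scholarly
    [5.8, 0.02, 0.05],   # scholarly
])
y = ["press", "press", "fiction", "fiction", "scholarly", "scholarly"]

# Fit discriminant functions on the pre-categorized data, then classify an
# unseen sample with them.
clf = LinearDiscriminantAnalysis().fit(X, y)
print(clf.predict([[5.5, 0.03, 0.04]]))  # likely "scholarly"
```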
SuffixTree
The suffix tree model uses a tree to represent a document, where each path from the root node to a leaf node encodes a word-level suffix of the document string. That way, the representation incorporates information about word order. To compare two documents, a tree similarity measure is employed that measures how many paths starting at the root node are shared between the two suffix trees. The suffix tree model effectively implements a similarity assessment based on all n-grams of two documents, where n ranges from 1 (as in the Vector Space model) to the length of the longer document. This suggests quadratic time and space requirements; however, suffix trees can be constructed and stored in linear time and space.
Sven Meyer zu Eißen, Benno Stein, and Martin Potthast. The Suffix Tree Document Model Revisited. In Klaus Tochtermann and Hermann Maurer, editors, Proceedings of the 5th International Conference on Knowledge Management (I-KNOW 05), Graz, Austria, Journal of Universal Computer Science, pages 596–603, July 2005. Know-Center. ISSN 0948-695x.
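A full linear-time suffix tree construction (e.g., Ukkonen's algorithm) is beyond a short sketch; the following approximation compares the sets of word n-grams of two documents, which captures the same shared-path notion of similarity. The Jaccard-style normalization is an assumption of this sketch, not necessarily the measure used in the paper.

```python
def ngrams(words, n):
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def all_ngrams(text):
    """All word n-grams of a text, n = 1 .. document length."""
    words = text.split()
    grams = set()
    for n in range(1, len(words) + 1):
        grams |= ngrams(words, n)
    return grams

def suffix_tree_similarity(text_a, text_b):
    """Fraction of distinct n-grams shared by both documents (Jaccard-style)."""
    a, b = all_ngrams(text_a), all_ngrams(text_b)
    return len(a & b) / len(a | b)

# Same vocabulary, different word order: the similarity stays well below 1.
print(suffix_tree_similarity("we do what we like", "we like what we do"))
print(suffix_tree_similarity("we do what we like", "we do what we like"))
```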
DivRand
Divergence from randomness models weight the index terms of a document relative to a collection of documents: each term found in a document is weighted by computing a divergence score of its within-document frequency from the term's presumed probability distribution in the collection. Then, the term weights are smoothed by considering a term's information gain within the subset of documents that contain the term at least once (the elite set). For example, the term "atom" appears in only a fraction of all documents and thus sets those documents apart; within the physics domain, however, it still does not carry much information about a paper's topic. Finally, the term weights are normalized with respect to document length.
Gianni Amati and Cornelis Joost Van Rijsbergen. Probabilistic Models of Information Retrieval based on Measuring the Divergence from Randomness. ACM Trans. Inf. Syst. 20, 4 (October 2002), 357–389.
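A hedged sketch of one possible instantiation of the framework: a Poisson randomness model, Laplace smoothing as the information-gain (first) normalization, and a logarithmic document-length normalization. Function names and parameter values are illustrative.

```python
import math

def dfr_weight(tf, doc_len, avg_doc_len, term_collection_freq, num_docs):
    # Length-normalized term frequency (the "second" normalization in DFR).
    tfn = tf * math.log2(1.0 + avg_doc_len / doc_len)
    # Expected frequency under the randomness (Poisson) model.
    lam = term_collection_freq / num_docs
    # Informative content: -log2 of the Poisson probability of observing tfn.
    prob_random = math.exp(-lam) * lam ** tfn / math.gamma(tfn + 1.0)
    inf1 = -math.log2(prob_random)
    # First normalization (Laplace): information gain within the elite set.
    inf2 = 1.0 / (tfn + 1.0)
    return inf2 * inf1

# A term occurring 5 times in a short document but rarely in the collection
# receives a high weight; a collection-wide frequent term receives a low one.
print(round(dfr_weight(5, 100, 150, 80, 10000), 2))
print(round(dfr_weight(5, 100, 150, 50000, 10000), 2))
```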
WebGenre
Models that aim at encoding a web page's genre rather than its topic are called Web Genre models. These models employ, among other things, features related to a web page's presentation, the writing style of its contents, and its vocabulary usage. Particularly the latter features have been found useful, since the core vocabulary of frequently observed terms within a particular genre differs from that of other genres.
Benno Stein and Sven Meyer zu Eißen. Retrieval Models for Genre Classification. Scandinavian Journal of Information Systems (SJIS), 20(1):91–117, 2008. ISSN 0905-0167.
ESA
Explicit Semantic Analysis (ESA) was introduced to compute the semantic relatedness of natural language texts. In this respect, it yields significant improvements compared to the bag-of-words (BOW) model or LSI. ESA represents a document as a high-dimensional vector whose dimensions quantify the pairwise similarities between the document and the documents of some reference collection (e.g., Wikipedia). The similarities are quantified under the BOW model. The relatedness of two documents is assessed by the cosine similarity between the corresponding vector representations.
E. Gabrilovich and S. Markovitch. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. In Proceedings of IJCAI 2007.
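A minimal sketch with a tiny made-up reference collection standing in for Wikipedia; both the concept vectors and the final relatedness score use cosine similarity under the BOW model.

```python
import math
from collections import Counter

reference_articles = {
    "Jaguar (animal)": "the jaguar is a large cat native to the americas",
    "Jaguar Cars":     "jaguar cars is a british manufacturer of luxury cars",
    "Music":           "music is an art form whose medium is sound",
}

def bow(text):
    return Counter(text.split())

def cosine_counts(a, b):
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def esa_vector(text):
    """Concept vector: BOW similarities of the text to each reference article."""
    doc = bow(text)
    return [cosine_counts(doc, bow(article)) for article in reference_articles.values()]

def esa_relatedness(text_a, text_b):
    va, vb = esa_vector(text_a), esa_vector(text_b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0

# Both texts activate the "Jaguar (animal)" concept most strongly.
print(round(esa_relatedness("a wild cat in the americas", "the jaguar hunts at night"), 2))
```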
CL-ESA
The Cross-Language Explicit Semantic Analysis (CL-ESA) is a multilingual generalization of the ESA. The CL-ESA exploits a document-aligned multilingual reference collection (e.g., Wikipedia) to represent a document as a language-independent concept vector. The relatedness of two documents in different languages is assessed by the cosine similarity between the corresponding vector representations.
M. Potthast, B. Stein, and M. Anderka. A Wikipedia-Based Multilingual Retrieval Model. In Proceedings of ECIR 2008.
BII
The Binary Independence Indexing (BII) model is a variation of the BIM that regards one document in relation to a number of queries. The probabilistic weights of a document's index terms are estimated based on a sample of queries for this document. However, the required parameters are hard to estimate in practice.
N. Fuhr. Two models of retrieval with probabilistic indexing. In Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '86), pages 249–257, 1986. ACM.
2-Poisson
The 2-Poisson model is one of the earliest probabilistic retrieval models and was originally used for identifying good index terms. Its potential for computing term weights was later exploited in models such as Best Match and Divergence from Randomness. The key assumption is that the importance of a term in a document can be determined via its distribution across the whole document collection. Whereas unimportant terms, primarily stopwords, are distributed according to a Poisson distribution, important terms, so-called "specialty words", appear frequently in only a few of the documents. In these "elite documents", the term is supposed to be distributed according to a (second) Poisson distribution. Hence the model's name, 2-Poisson.
S. Robertson and S. Walker. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 232–241, Dublin, Ireland, 1994.
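A small sketch of the underlying mixture assumption: given an observed within-document term frequency, the posterior probability that the document belongs to the term's elite set is computed from two Poisson distributions. The mixture parameters are illustrative.

```python
import math

def poisson(k, mu):
    return math.exp(-mu) * mu ** k / math.factorial(k)

def prob_elite(tf, p_elite=0.05, mu_elite=6.0, mu_other=0.3):
    """Posterior probability that a document is elite for the term, given tf."""
    elite = p_elite * poisson(tf, mu_elite)
    other = (1.0 - p_elite) * poisson(tf, mu_other)
    return elite / (elite + other)

for tf in (0, 1, 3, 6):
    print(tf, round(prob_elite(tf), 3))
# The posterior rises quickly with tf: documents in which a specialty term
# occurs several times are almost certainly elite documents for that term.
```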
BIM
The Binary Independence Model (BIM) has traditionally been used with the probabilistic ranking principle; i.e., given a query, the documents are ranked by decreasing probability of relevance. The BIM makes two assumptions, which allow for a practical estimation of the probability function. (1) "Binary": documents and queries are both represented under a Boolean model. (2) "Independence": terms are modeled as occurring in documents independently of each other.
S. E. Robertson and K. Spärck Jones. Relevance weighting of search terms. In Journal of the American Society for Information Science, 27:129–146, 1976.
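A hedged sketch using the well-known Robertson/Spärck Jones relevance weight with 0.5 smoothing as a concrete instantiation; the document and feedback statistics are made up.

```python
import math

def rsj_weight(n_i, r_i, N, R):
    """Relevance weight of a term: N documents, R judged relevant,
    n_i containing the term, r_i of those relevant."""
    p = (r_i + 0.5) / (R + 1.0)            # P(term present | relevant)
    q = (n_i - r_i + 0.5) / (N - R + 1.0)  # P(term present | non-relevant)
    return math.log((p * (1.0 - q)) / (q * (1.0 - p)))

def rsv(doc_terms, query_terms, term_stats, N, R):
    """Rank score: sum of relevance weights over query terms present in the document."""
    return sum(rsj_weight(*term_stats[t], N, R)
               for t in query_terms if t in doc_terms and t in term_stats)

# term -> (n_i, r_i): document frequency and relevant-document frequency
term_stats = {"retrieval": (100, 8), "probabilistic": (20, 6)}
print(round(rsv({"probabilistic", "retrieval", "model"},
                {"probabilistic", "retrieval"}, term_stats, N=1000, R=10), 2))
```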
Probabilistic Indexing
Probabilistic Indexing was the first major presentation of a probabilistic model for information retrieval. The idea is to weight an index term of a document by the probability that a user would use this term in a query when searching for the document. Thus, given a query, the documents can be ranked with respect to their probabilities of being relevant to the query. However, estimating the probabilistic parameters requires too much effort for a practical application of the model.
M. E. Maron and J. L. Kuhns. On relevance, probabilistic indexing and information retrieval. In Journal of the Association for Computing Machinery, 7(3):216–244, 1960.
INQUERY
The INQUERY system is based on a special form of Bayesian inference networks, so-called "document retrieval inference networks". These networks consist of two components: a document network, which represents a set of documents using different representation techniques and at varying levels of abstraction, and a query network, which represents a user's information need. A given query is converted into the query network, which is then attached to the pre-existing document network for retrieval. As a result, a belief list is returned, which contains the probability of relevance for each document.
J. P. Callan, W. Bruce Croft, and S. M. Harding. The INQUERY Retrieval System. In: Proceedings of the Third international Conference on Database and Expert Systems Applications, Springer-Verlag, 1992, pages 78–83.
BestMatch
The Okapi BM25 is a Best Match (BM) model that computes the relevance of a document to a query based on the frequencies of the query terms appearing in the document and their inverse document frequencies. Three parameters tune the influence of the document length, the document term frequency, and the query term frequency in the model.
S. E. Robertson and S. Walker. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the SIGIR 1994, pages 232–241, 1994.
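A sketch of a common formulation of the BM25 scoring function (several variants exist); the parameter defaults k1 = 1.2, b = 0.75, and k3 = 8 as well as the collection statistics are illustrative.

```python
import math

def bm25_score(query, doc, doc_len, avg_doc_len, doc_freq, num_docs,
               k1=1.2, b=0.75, k3=8.0):
    """BM25 score of a document (term -> tf) for a query (term -> qtf)."""
    score = 0.0
    for term, qtf in query.items():
        if term not in doc or term not in doc_freq:
            continue
        tf = doc[term]
        idf = math.log((num_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
        # k1 and b tune term-frequency saturation and document-length influence.
        tf_part = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
        # k3 tunes the influence of the query term frequency.
        qtf_part = (qtf * (k3 + 1)) / (qtf + k3)
        score += idf * tf_part * qtf_part
    return score

doc = {"information": 3, "retrieval": 5, "model": 1}        # term frequencies
doc_freq = {"information": 400, "retrieval": 250, "model": 700}
query = {"information": 1, "retrieval": 1}
print(round(bm25_score(query, doc, doc_len=9, avg_doc_len=12,
                       doc_freq=doc_freq, num_docs=10000), 2))
```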
Beliefnet
The BeliefNet model applies Bayes' rule using a Bayesian network to estimate the probability that a document is relevant to a query. In this context, a Bayesian network (or belief network) is a probabilistic graphical model that represents a set of topics and document features (words, phrases, etc.) and their conditional dependencies via a directed acyclic graph. The network is constructed by dividing the query into a set of constituent topics. The probabilistic dependencies between topics and document features, and among the topics themselves, are estimated by means of a training collection or specified by the user.
R. Fung and B. Del Favero. Applying Bayesian networks to information retrieval. Commun. ACM 38, 3, 1995.
Language Models
The language modeling approach to information retrieval ranks a set of documents by the probability that a document generates the given query. To this end, a language model is inferred for each document, and it is assumed that the query terms occur independently of each other. The core of the model is a maximum likelihood estimate of the probability of a query term under a document's term distribution.
J. M. Ponte and W. B. Croft. A Language Modeling Approach to Information Retrieval. Research and Development in Information Retrieval, pages 275–281, 1998.
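A minimal query-likelihood sketch: each document defines a unigram language model, and documents are ranked by the log-probability of generating the query. Jelinek-Mercer smoothing against the collection model is used here, which is a common variant rather than the exact estimate of Ponte and Croft.

```python
import math
from collections import Counter

documents = {
    "d1": "language models for information retrieval",
    "d2": "a probabilistic model of information retrieval",
    "d3": "suffix trees for text processing",
}
collection = Counter(" ".join(documents.values()).split())
collection_size = sum(collection.values())

def query_log_likelihood(query, doc_text, lam=0.8):
    """Log-probability that the document's language model generates the query."""
    doc = Counter(doc_text.split())
    doc_size = sum(doc.values())
    score = 0.0
    for term in query.split():
        p_doc = doc[term] / doc_size
        p_coll = collection[term] / collection_size
        score += math.log(lam * p_doc + (1 - lam) * p_coll)
    return score

query = "language models retrieval"
ranking = sorted(documents, key=lambda d: query_log_likelihood(query, documents[d]),
                 reverse=True)
print(ranking)  # d1 generates the query with the highest probability
```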
pLSI
Probabilistic topic models assume that any text document can be described by a fixed number of latent topics. Topic models then try to discover these latent topics on the basis of a generative model. Once a generative model is trained, statistical inference can be used to classify unseen text documents on the basis of their probability distributions over the underlying topics. To quantify the similarity between two text documents, the topic distributions inferred for each document can be compared by means of the well-known Kullback-Leibler divergence. The pLSI model allows each word of a text document to be related to some topic; hence, text documents can be composed of multiple topics. These topics are learned from the documents of a given collection.
T. Hofmann. Probabilistic Latent Semantic Indexing. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 50–57, 1999.
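A small sketch of the comparison step mentioned above: two documents, each described by a distribution over the same latent topics, are compared via the (symmetrized) Kullback-Leibler divergence. The topic distributions would come from a trained pLSI model; here they are made up.

```python
import math

def kl_divergence(p, q):
    """KL divergence between two discrete distributions over the same topics."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)

def symmetric_kl(p, q):
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

doc_a = [0.70, 0.20, 0.10]   # P(topic | document a)
doc_b = [0.60, 0.30, 0.10]
doc_c = [0.05, 0.15, 0.80]
print(round(symmetric_kl(doc_a, doc_b), 3))  # small divergence: similar topic mix
print(round(symmetric_kl(doc_a, doc_c), 3))  # large divergence: different topic mix
```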
Mixture of Unigrams
The mixture of unigrams is another example of a generative model in the context of probabilistic topic modeling. In contrast to the pLSI model, each document is generated by only a single latent topic; the set of possible latent topics must be provided in advance. The mixture of unigrams conforms to a simple unigram language model. To handle unseen words, a suitable smoothing technique such as Laplace smoothing can be applied. It should be noted that the mixture of unigrams model actually equals a naïve Bayes classifier, and thus is suited for the supervised classification of text documents.
K. Nigam, A. McCallum, S. Thrun, T. M. Mitchell. Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning 39(2/3):103–134, 2000.
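A minimal sketch of the resulting naïve Bayes classifier with Laplace smoothing; the training documents, topics, and the handling of out-of-vocabulary words are illustrative.

```python
import math
from collections import Counter

training = {
    "sports": ["the team won the game", "a great goal in the final match"],
    "politics": ["the parliament passed the new law", "the election campaign started"],
}
vocab = {w for docs in training.values() for d in docs for w in d.split()}

def topic_model(docs):
    """One unigram language model per topic, with Laplace smoothing."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}

models = {topic: topic_model(docs) for topic, docs in training.items()}
priors = {topic: len(docs) / sum(len(d) for d in training.values())
          for topic, docs in training.items()}

def classify(text):
    def log_prob(topic):
        model = models[topic]
        return math.log(priors[topic]) + sum(
            math.log(model.get(w, 1 / len(vocab))) for w in text.split())
    return max(models, key=log_prob)

print(classify("the team scored a goal"))      # sports
print(classify("a new law for the election"))  # politics
```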
LDA
Latent Dirichlet allocation (LDA) is a sophisticated generative model in the context of probabilistic topic modeling. As with the pLSI and mixture of unigrams models, LDA assumes that text documents are composed of a mixture of latent topics and that each topic is a probability distribution over words. This mixture is generated by sampling from a Dirichlet distribution. In contrast to the pLSI model, LDA thus models a smooth distribution over the latent topics. As a result, exact inference for this model is intractable, and one has to rely on approximation techniques such as Markov chain Monte Carlo.
D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. The Journal of Machine Learning Research. 3:993–1022, 2003.
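A minimal sketch using scikit-learn's LatentDirichletAllocation (which relies on variational inference rather than Markov chain Monte Carlo); corpus and parameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "the cat sat on the mat",
    "dogs and cats are popular pets",
    "the stock market fell sharply today",
    "investors worry about the market and stocks",
]

# Term counts per document, then a two-topic LDA model.
X = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)
doc_topics = doc_topics / doc_topics.sum(axis=1, keepdims=True)  # rows sum to 1
print(np.round(doc_topics, 2))  # per-document mixture over the two latent topics
```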
Legend [1 2 3 4]
(1) Feature type: document terms [T], taxonomic concepts [C], special features [S]
(2) RSV computation: feature vector similarity [φ], relevance assessment [ρ], query generation [γ]
(3) Closed world: open [∪], closed [∩]
(4) External knowledge: none [∅], user feedback [✓], additional document collection [+]