Information Retrieval Models

Foundations & Relationships

Thomas Roelleke
ISBN: 9781627050784 | PDF ISBN: 9781627050791
Copyright © 2013 | 163 Pages | Publication Date: 07/01/2013

DOI: 10.2200/S00494ED1V01Y201304ICR027

Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR).

Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works."

This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models.

A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, the average term frequency, is a binding link between several retrieval models and model parameters.
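As a rough illustration of two quantities named above, the following minimal Python sketch computes a basic TF-IDF score and a Poisson probability whose parameter is the average term frequency. It is not taken from the book: the toy collection, the function names, the use of raw term counts, the logarithmic IDF, and the definition of average term frequency as collection occurrences divided by the number of documents are all assumptions made for this example, and the book's own definitions may differ.

```python
import math
from collections import Counter

# Toy collection (hypothetical data, for illustration only).
DOCS = {
    "d1": "sailing boats sailing greece".split(),
    "d2": "boats boats greece".split(),
    "d3": "retrieval models".split(),
}


def idf(term, docs):
    """Assumed IDF variant: log(N / df); 0.0 if the term occurs nowhere."""
    df = sum(1 for words in docs.values() if term in words)
    return math.log(len(docs) / df) if df else 0.0


def tf_idf_score(query, doc_id, docs):
    """Sum over query terms of raw term frequency times IDF."""
    tf = Counter(docs[doc_id])
    return sum(tf[t] * idf(t, docs) for t in query)


def average_term_frequency(term, docs):
    """Assumed definition: occurrences in the collection / number of documents."""
    total = sum(words.count(term) for words in docs.values())
    return total / len(docs)


def poisson_probability(k, lam):
    """Poisson probability of k occurrences for rate parameter lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


if __name__ == "__main__":
    query = ["sailing", "boats"]
    for doc_id in DOCS:
        print(doc_id, tf_idf_score(query, doc_id, DOCS))

    lam = average_term_frequency("boats", DOCS)
    # Probability that a document contains "boats" exactly twice,
    # under a Poisson model with the average term frequency as parameter.
    print(lam, poisson_probability(2, lam))
```

In this toy setting, a document that repeats a rare query term scores higher under TF-IDF, while the Poisson model describes how likely a given within-document count is when only the collection-wide average is known.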

Table of Contents

List of Figures
Foundations of IR Models
Relationships Between IR Models
Summary & Research Outlook
Author's Biography

About the Author(s)

Thomas Roelleke, Queen Mary University of London
Thomas Roelleke holds a Dr rer nat (Ph.D.) and a Diplom der Ingenieur-Informatik (MSc in Engineering & Computer Science) from the University of Dortmund. After school education in Meschede, Germany, he attended the b.i.b., the Nixdorf Computer school for professions in informatics, in Paderborn. Nixdorf Computer awarded him a place on a sales and management trainee program, after which he was appointed as a product consultant in the Unix/DB/4GL marketing of Nixdorf Computer. He studied Diplom-Ingenieur-Informatik at the University of Dortmund (UniDo), and was later a lecturer/researcher at UniDo. His research focused on probabilistic reasoning and knowledge representations, hypermedia retrieval, and the integration of retrieval and database technologies. His lecturing included information/database systems, object-oriented design and programming, and software engineering. He obtained his Ph.D. in 1999 for the thesis titled "POOL: A probabilistic object-oriented logic for the representation and retrieval of complex objects - a model for hypermedia retrieval." Since 1999, he has been working as a strategic IT consultant, founder and director of small businesses, research fellow, and lecturer at the Queen Mary University of London (QMUL). Research contributions include a probabilistic relational algebra (PRA), a probabilistic object-oriented logic (POOL), the relational Bayes, a matrix-based framework for IR, a parallel derivation of IR models, a probabilistic interpretation of the BM25-TF based on "semi-subsumed" event occurrences, and theoretical studies of retrieval models. Thomas Roelleke lives in England, in a village midway between buzzy London and beautiful East Anglia.

