Information retrieval used to mean looking through thousands of strings of text to find words or symbols that matched a user's query. Today, there are many models that help index and search more effectively, so retrieval takes much less time. Information retrieval (IR) is often seen as a subfield of computer science and shares some models, applications, storage approaches, and techniques with other disciplines such as artificial intelligence, database management, and parallel computing. This book introduces the topic of IR and how it differs from other computer science disciplines. The history of modern IR is briefly presented, and the notation of IR as used in this book is defined. The complex notion of relevance is discussed. Some applications of IR are noted as well, since IR has many practical uses today. Using information retrieval with fuzzy logic to search for software terms can help find software components and ultimately help increase the reuse of software. This is just one practical application of IR that is covered in this book.
Some of the classical models of IR are presented as a contrast to extending the Boolean model. This includes a brief mention of the source of weights for the various models. In a typical retrieval environment, answers are either yes or no, i.e., on or off. Fuzzy logic, on the other hand, can bring in a "degree of" match rather than a crisp, i.e., strict, match. This, too, is explored in detail, showing how it can be applied to information retrieval. Fuzzy logic is often considered a soft computing application, and this book explores how IR with fuzzy logic, using its membership functions as weights, can help with indexing, querying, and matching. Since fuzzy set theory and logic are explored in IR systems, an explanation of where the fuzz is ensues.
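The contrast between a crisp match and a "degree of" match can be illustrated with a minimal sketch. The document names, term weights, and function names below are hypothetical and are not taken from the book; the sketch assumes the common choice of the min operator for fuzzy AND, which is only one of several possible operators.

```python
# Hypothetical membership degrees in [0, 1] for each term in each document.
# These values are illustrative assumptions, not data from the book.
docs = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.4},
    "d2": {"fuzzy": 0.2, "retrieval": 0.8},
    "d3": {"retrieval": 0.7},
}


def crisp_and(doc, terms):
    """Crisp Boolean AND: the document either matches (True) or not (False)."""
    return all(doc.get(term, 0.0) > 0.0 for term in terms)


def fuzzy_and(doc, terms):
    """Fuzzy AND via the min operator: returns a degree of match in [0, 1]."""
    return min(doc.get(term, 0.0) for term in terms)


query = ["fuzzy", "retrieval"]

# Crisp retrieval only separates matching from non-matching documents.
crisp = {name: crisp_and(doc, query) for name, doc in docs.items()}

# Fuzzy retrieval assigns each document a degree, which allows ranking.
fuzzy = {name: fuzzy_and(doc, query) for name, doc in docs.items()}
ranked = sorted(fuzzy, key=fuzzy.get, reverse=True)
```

In this sketch the crisp query treats d1 and d2 identically, while the fuzzy degrees distinguish them and allow the results to be ordered.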
The concept of relevance feedback, including pseudorelevance feedback, is explored for the various models of IR. For the extended Boolean model, the use of genetic algorithms for relevance feedback is examined.
The concept of query expansion is explored using rough set theory. Various term relationships are modeled and presented, and the model is extended for fuzzy retrieval. An example using the UMLS terms is also presented. The model is also extended for term relationships beyond synonyms.
Finally, this book looks at clustering, both crisp and fuzzy, to see how that can improve retrieval performance. An example is presented to illustrate the concepts.
Table of Contents
Introduction to Information Retrieval
Source of Weights
Relevance Feedback and Query Expansion
Clustering for Retrieval
Uses of Information Retrieval Today
About the Author(s)
Donald H. Kraft, Colorado Technical University and Louisiana State University
Donald H. Kraft's degrees are from Purdue University, where he majored in industrial engineering, specializing in operations research. He has taught at Purdue University, the University of Maryland, Indiana University, the University of California-Berkeley, the University of California-Los Angeles, the U.S. Air Force Academy, and Louisiana State University, where he served as chair and is now a professor emeritus. He is currently an adjunct professor at Colorado Technical University. He has been named a Fellow of IEEE, AAAS, and IFSA, as well as an ACM Distinguished Scientist and an LSU Distinguished Professor. He is also a winner of both the ASIST Research Award and Award of Merit. He served for 24 years as Editor of JASIST and is a Past President of ASIST. His research interests include information retrieval, fuzzy set theory, genetic algorithms, rough sets, operations research, and information science.
Erin Colvin, Western Washington University
Erin Colvin's degrees are a B.S. in computer science from Middle Tennessee State University; an M.Ed. in secondary education from Chaminade University of Honolulu; an M.S. in computer science from American Sentinel University; and a D.C.S. in computer science from Colorado Technical University. She has worked as a software engineer for Square D Company, creating front-end software for electrical metering devices for organizations such as BASF, Northrop Grumman, and the University of New Mexico. She has taught at Anne Arundel Community College, Southern New Hampshire University, Johns Hopkins University Center for Talented Youth, and Regis University. Currently, Dr. Colvin is an instructor in the Computer Science Department at Western Washington University. Her research interests include information retrieval, fuzzy set theory, genetic algorithms, and software reuse.