A major part of natural language processing now depends on the use of text data to build linguistic analyzers. We consider statistical, computational approaches to modeling linguistic structure, and we seek to unify across many approaches and many kinds of linguistic structures. Assuming a basic understanding of natural language processing and/or machine learning, we seek to bridge the gap between the two fields. The focus is on approaches to decoding (i.e., carrying out linguistic structure prediction) and on supervised and unsupervised learning of models that predict discrete structures as outputs. We also survey natural language processing problems to which these methods are being applied, and we address related topics in probabilistic inference, optimization, and experimental methodology.
Table of Contents
Representations and Linguistic Data
Decoding: Making Predictions
Learning Structure from Annotated Data
Learning Structure from Incomplete Data
Beyond Decoding: Inference
About the Author(s)
Noah A. Smith, Carnegie Mellon University
Noah A. Smith is an assistant professor in the Language Technologies Institute and Machine Learning Department at the School of Computer Science at Carnegie Mellon University. He received his Ph.D. in Computer Science from Johns Hopkins University (2006) and his B.S. in Computer Science and B.A. in Linguistics from the University of Maryland (2001). He was awarded a Hertz Foundation fellowship (2001-2006), served on the DARPA Computer Science Study Panel (2007) and the editorial board of the journal Computational Linguistics, and received a best paper award from the Association for Computational Linguistics (2009) and an NSF CAREER grant (2011). His research interests include statistical natural language processing, especially unsupervised methods, machine learning for structured data, and applications of natural language processing.