Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words. It also covers the computation-graph abstraction, which makes it easy to define and train arbitrary neural networks and is the basis for the design of contemporary neural network software libraries.
The second half of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications. Finally, we discuss tree-shaped networks, structured prediction, and the prospects of multi-task learning.
Table of Contents
Learning Basics and Linear Models
From Linear Models to Multi-layer Perceptrons
Feed-forward Neural Networks
Neural Network Training
Features for Textual Data
Case Studies of NLP Features
From Textual Features to Inputs
Pre-trained Word Representations
Using Word Embeddings
Case Study: A Feed-forward Architecture for Sentence Meaning Inference
Ngram Detectors: Convolutional Neural Networks
Recurrent Neural Networks: Modeling Sequences and Stacks
Concrete Recurrent Neural Network Architectures
Modeling with Recurrent Networks
Modeling Trees with Recursive Neural Networks
Structured Output Prediction
Cascaded, Multi-task and Semi-supervised Learning
About the Author(s)
Yoav Goldberg, Bar-Ilan University
Yoav Goldberg has been working in natural language processing for over a decade. He is a Senior Lecturer at the Computer Science Department at Bar-Ilan University, Israel. Prior to that, he was a researcher at Google Research, New York. He received his Ph.D. in Computer Science and Natural Language Processing from Ben Gurion University (2011). He regularly reviews for NLP and machine learning venues, and serves on the editorial board of Computational Linguistics. He has published over 50 research papers and received best paper and outstanding paper awards at major natural language processing conferences. His research interests include machine learning for natural language, structured prediction, syntactic parsing, processing of morphologically rich languages, and, in the past two years, neural network models with a focus on recurrent neural networks.