Discriminative Learning for Speech Recognition
Theory and Practice

Xiaodong He, Li Deng
ISBN: 9781598293081 | PDF ISBN: 9781598293098
Copyright © 2008 | 112 Pages | Publication Date: 01/01/2008

DOI: 10.2200/S00134ED1V01Y200807SAP004





In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-function form. This common form enables the use of the growth transformation (or extended Baum

Table of Contents

Introduction and Background
Statistical Speech Recognition: A Tutorial
Discriminative Learning: A Unified Objective Function
Discriminative Learning Algorithm for Exponential-Family Distributions
Discriminative Learning Algorithm for Hidden Markov Model
Practical Implementation of Discriminative Learning
Selected Experimental Results
Epilogue
Major Symbols Used in the Book and Their Descriptions
Mathematical Notation
Bibliography

About the Author(s)

Xiaodong He, Microsoft Research
Xiaodong He received his bachelor's degree from Tsinghua University, Beijing, China, in 1996, his master's degree from the Chinese Academy of Sciences in 1999, and his doctoral degree from the University of Missouri-Columbia in 2003. He joined the Speech and Natural Language group of Microsoft in 2003, and the Natural Language Processing group of Microsoft Research, Redmond, WA, in 2006, where he currently serves as a researcher. His research areas include statistical machine learning, automatic speech recognition, natural language processing, machine translation, signal processing, nonnative speech processing, and human-computer interaction. In these areas, he has authored or coauthored more than 30 refereed papers in leading international conferences and journals. He has filed more than 10 U.S. or international patents in the areas of speech recognition, language processing, and machine translation. He has served as a reviewer for major conferences and journals in the areas of speech recognition, natural language processing, signal processing, and pattern recognition, and on the program committees of various conferences in these areas. He is a member of ACL, IEEE, ISCA, and Sigma Xi.

Li Deng, Microsoft Research
Li Deng received his bachelor's degree from the University of Science and Technology of China and his Ph.D. degree from the University of Wisconsin-Madison. In 1989, he joined the Department of Electrical and Computer Engineering, University of Waterloo, Ontario, Canada, as assistant professor; he became tenured full professor in 1996. From 1992 to 1993, he conducted sabbatical research at the Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA, and from 1997 to 1998, at the ATR Interpreting Telecommunications Research Laboratories, Kyoto, Japan. From 1989 to 1999, he taught a wide range of electrical and computer engineering courses at both the undergraduate and graduate levels. In 1999, he joined Microsoft Research, Redmond, WA, as senior researcher; he currently serves as principal researcher for the same institution. Since 2000, after moving to Seattle, he has also been an affiliate professor in the Department of Electrical Engineering at the University of Washington. His past and current research areas include automatic speech and speaker recognition, statistical methods and machine learning, neural information processing, machine intelligence, audio and acoustic signal processing, statistical signal processing and digital communication, human speech production and perception, acoustic phonetics, auditory speech processing, noise robust speech processing, speech synthesis and enhancement, spoken language understanding systems, multimedia signal processing, and multimodal human-computer interaction. In these areas, he has published more than 300 refereed papers in leading international conferences and journals, and 14 book chapters, and has given keynotes, tutorials, and lectures worldwide. He has been granted more than 20 U.S. or international patents in acoustics, speech/language technology, and signal processing. He has also authored two recent books on speech processing.
