Speech Recognition Algorithms Based on Weighted Finite-State Transducers

Takaaki Hori, Atsushi Nakamura
ISBN: 9781608454730 | PDF ISBN: 9781608454747
Copyright © 2013 | 162 Pages | Publication Date: 01/01/2013

DOI: 10.2200/S00462ED1V01Y201212SAP010

This book introduces the theory, algorithms, and implementation techniques for efficient decoding in speech recognition, focusing mainly on the Weighted Finite-State Transducer (WFST) approach. The decoding process for speech recognition is viewed as a search problem whose goal is to find the sequence of words that best matches an input speech signal. Since this process becomes computationally more expensive as the system vocabulary size increases, research has long been devoted to reducing the computational cost. Recently, the WFST approach has become an important state-of-the-art speech recognition technology, because it offers improved decoding speed with fewer recognition errors compared with conventional methods. However, the algorithms used in this framework are not easy to understand, and they remain a black box for many people. In this book, we review the WFST approach and aim to provide comprehensive interpretations of WFST operations and decoding algorithms to help anyone who wants to understand, develop, and study WFST-based speech recognizers. We also mention recent advances in this framework and its applications to spoken language processing.

Table of Contents

Brief Overview of Speech Recognition
Introduction to Weighted Finite-State Transducers
Speech Recognition by Weighted Finite-State Transducers
Dynamic Decoders with On-the-fly WFST Operations
Summary and Perspective

About the Authors

Takaaki Hori, NTT Communication Science Laboratories, NTT Corporation
Takaaki Hori received the B.E. and M.E. degrees in electrical and information engineering from Yamagata University, Yonezawa, Japan, in 1994 and 1996, respectively, and a Ph.D. degree in system and information engineering from Yamagata University in 1999. Since 1999, he has been engaged in research on spoken language processing at the Cyber Space Laboratories, Nippon Telegraph and Telephone (NTT) Corporation, Kyoto, Japan. He was a visiting scientist at the Massachusetts Institute of Technology, Cambridge, from 2006 to 2007. He is currently a senior research scientist in the NTT Communication Science Laboratories, NTT Corporation. He received the 22nd Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) in 2005, the 24th TELECOM System Technology Award from the Telecommunications Advancement Foundation in 2009, and the IPSJ Kiyasu Special Industrial Achievement Award from the Information Processing Society of Japan in 2012. He is a member of the Institute of Electrical and Electronics Engineers (IEEE), the Institute of Electronics, Information and Communication Engineers (IEICE), and the ASJ.

Atsushi Nakamura, NTT Communication Science Laboratories, NTT Corporation
Atsushi Nakamura received the B.E., M.E., and Dr.Eng. degrees from Kyushu University, Fukuoka, Japan, in 1985, 1987, and 2001, respectively. In 1987, he joined Nippon Telegraph and Telephone Corporation (NTT), where he engaged in the research and development of network service platforms, including studies on the application of speech processing technologies to network services, at Musashino Electrical Communication Laboratories, Tokyo, Japan. From 1994 to 2000, he was with Advanced Telecommunications Research (ATR) Institute, Kyoto, Japan, as a Senior Researcher, working on spontaneous speech recognition, the construction of spoken language databases, and the development of speech translation systems. Since April 2000, he has been with NTT Communication Science Laboratories, Kyoto, Japan, and is currently the head of the Signal Processing Research Group. Dr. Nakamura is a senior member of the Institute of Electrical and Electronics Engineers (IEEE), and serves or has served in roles including member of the IEEE Machine Learning for Signal Processing (MLSP) Technical Committee and Vice Chair of the IEEE Signal Processing Society Kansai Chapter. He is also a member of the Institute of Electronics, Information and Communication Engineers (IEICE) and the Acoustical Society of Japan (ASJ). He received the IEICE Paper Award in 2004, and twice received the TELECOM System Technology Award of the Telecommunications Advancement Foundation, in 2006 and 2009.
