Automatic speech recognition

Overview

ABSTRACT

Speech recognition performance has recently improved greatly and is now close to that of humans, but the level of understanding achieved by present systems remains very low. These systems rely on statistical models of speech: hidden Markov models (HMMs) for acoustics, and n-gram models that store the conditional probabilities of sequences of linguistic units. Recent progress has come from coupling classical HMMs with deep neural networks, which have many hidden layers and are trained by deep learning algorithms on very large amounts of data. The main applications are text dictation, transcription of broadcast media (radio, television) and, above all, voice telematics.
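As an illustration of what "storing the conditional probabilities of sequences of linguistic units" can mean in practice, the following is a minimal sketch of a bigram (n = 2) model estimated by maximum likelihood. The toy corpus, the function names train_bigram and sentence_probability, and the boundary markers <s> and </s> are assumptions made for this example. They are not taken from the article, and real systems add smoothing for unseen word pairs.

```python
from collections import Counter


def train_bigram(sentences):
    """Estimate P(current | previous) by maximum likelihood from tokenized sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Hypothetical boundary markers so that sentence starts and ends are modeled too.
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1          # how often prev occurs as a history
            bigram_counts[(prev, cur)] += 1    # how often cur follows prev
    return {
        (prev, cur): count / history_counts[prev]
        for (prev, cur), count in bigram_counts.items()
    }


def sentence_probability(model, sentence):
    """Multiply the stored conditional probabilities along the sentence."""
    tokens = ["<s>"] + sentence + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Unseen pairs get probability 0 here; no smoothing in this sketch.
        prob *= model.get((prev, cur), 0.0)
    return prob


# Toy corpus, invented for illustration only.
corpus = [
    ["open", "the", "door"],
    ["open", "the", "window"],
    ["close", "the", "door"],
]
model = train_bigram(corpus)
print(model[("the", "door")])
print(sentence_probability(model, ["open", "the", "door"]))
```

In a recognizer, such a model would typically be used to score competing word sequences proposed by the acoustic model, so that likely word orders are preferred over unlikely ones.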

AUTHOR

Jean-Paul HATON: Professor at the University of Lorraine, LORIA/INRIA – Member of the Institut Universitaire de France

INTRODUCTION

The use of speech as a means of communication between humans and machines has been widely studied in recent decades. This article focuses on automatic speech recognition (ASR), i.e. the set of techniques that allow a person to communicate verbally with a machine. ASR is of undeniable practical interest under certain conditions of use (remote access, heavy workload, users with disabilities, etc.). Commercial products have been available for over thirty years, initially mainly for the recognition of isolated and connected words, and now for continuously spoken sentences. Most are based on dynamic programming algorithms and stochastic models (Markov sources). However, problems remain to be solved in order to make these systems more robust and to extend their dialog capabilities. Current research focuses on the recognition of noisy speech, the processing of incomplete or incorrect utterances, the definition of dialog procedures, etc.
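To make the combination of dynamic programming and Markov sources more concrete, here is a minimal sketch of the Viterbi algorithm, a standard dynamic programming procedure that finds the most likely hidden state sequence of a discrete hidden Markov model. The two states "a" and "b", the observation symbols "x" and "y", and all probabilities are invented for illustration. The sketch is not presented as the method used by any particular commercial product.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete HMM, in the log domain."""
    # best[t][s]: log-probability of the best state path that ends in s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a dict of probabilities (possibly nested) to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Toy model: all values below are assumptions chosen for the example.
states = ["a", "b"]
log_start = to_log({"a": 0.9, "b": 0.1})
log_trans = to_log({"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.1, "b": 0.9}})
log_emit = to_log({"a": {"x": 0.9, "y": 0.1}, "b": {"x": 0.2, "y": 0.8}})

log_prob, path = viterbi(["x", "x", "y", "y"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

Working in the log domain turns products of probabilities into sums, which avoids numerical underflow on long observation sequences.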

KEYWORDS

Hidden Markov Models (HMM) | deep neural networks | deep learning

