HISTORY OF SPEECH RECOGNITION


A toy breaks the ice into speech recognition

"Radio Rex" was the first success story in the field of speech recognition. It was a toy dog that sat inside a toy house; when the name "Rex" was spoken, the dog would pop out. The dog was held inside the house by an electromagnet, which stayed energized as long as current flowed through a circuit bridge. The bridge was sensitive to acoustic energy at about 500 cps (cycles per second). The energy of the vowel sound in the word "Rex" caused the bridge to vibrate, breaking the electrical circuit and allowing a spring to push Rex out of his house. Rex was the pioneer in the field of speech recognition.
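
Rex's trigger was purely mechanical, but the same idea (react only when enough acoustic energy sits near 500 cps) can be sketched in software. The example below is a minimal illustration, not a model of the actual toy: the sample rate, the threshold, and the function names `sine_wave`, `goertzel_power`, and `rex_pops_out` are all assumptions made for this sketch, and it uses the Goertzel algorithm as a stand-in for the resonant bridge.

```python
import math


def sine_wave(freq_hz, sample_rate=8000, n_samples=800, amplitude=1.0):
    """Generate a pure test tone (a hypothetical stand-in for a spoken vowel)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def goertzel_power(samples, sample_rate, target_hz):
    """Estimate signal power near target_hz with the Goertzel algorithm.

    The result is normalized so that a sine wave of amplitude A that falls
    exactly on the analyzed frequency bin yields roughly A squared.
    """
    n = len(samples)
    k = round(n * target_hz / sample_rate)          # nearest frequency bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return power / (n / 2.0) ** 2


def rex_pops_out(samples, sample_rate=8000, target_hz=500.0, threshold=0.1):
    """Return True when there is enough energy near target_hz to 'break the circuit'.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    if not samples:
        return False
    return goertzel_power(samples, sample_rate, target_hz) >= threshold


if __name__ == "__main__":
    print(rex_pops_out(sine_wave(500.0)))    # tone at the resonant frequency
    print(rex_pops_out(sine_wave(2000.0)))   # tone far from the resonant frequency
```

Like the toy, this sketch responds to any sound with enough energy near 500 cps, not only to the word "Rex", which is why it recognizes a frequency rather than speech.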

To War with Mother Russia

The U.S. Department of Defense sponsored the first academic pursuits in speech recognition in the late 1940s. In an attempt to intercept and decode Russian messages, the U.S. sought to develop an automatic language translator. The first, and most difficult, step was to create a program that could recognize speech. The project was a dismal failure. Phrases were typically mistranslated; a frequently cited example is the following phrase and its mistranslation:

"The spirit is willing but the flesh is weak."

"The vodka is strong but the meat is disgusting."

Despite the dismal failure, appreciation of and interest in the field began to grow. As a result, the government funded the Speech Understanding Research (SUR) program at Carnegie Mellon University, MIT, and some select commercial institutions. The agency that funded the research became known as the Defense Advanced Research Projects Agency (DARPA).

EARLY KEY ADVANCES

In 1952, as government-funded research began to gain momentum, Bell Laboratories developed an automatic speech recognition system that successfully identified the digits 0-9 spoken to it over the telephone.
In 1959, MIT developed a system that successfully identified vowel sounds with 93% accuracy.
In 1966, a system with a 50-word vocabulary was successfully tested.
In the early 1970s, the SUR program began to produce results in the form of the HARPY system. This system could recognize complete sentences that consisted of a limited range of grammar structures. It required massive amounts of computing power to work: 50 state-of-the-art computers.
In the 1980s, Hidden Markov Models (HMM) became the standard statistical approach to speech recognition.

At this point, three major obstacles still stood in the way of commercial use:

Computing power: a great deal was required, but little was available.
Speaker independence: the ability to recognize speech from any person (not just the particular voices the system had been designed around).
Continuous speech: the ability to follow a speaker who did not break after every word.

The successes from the 1950s to the 1980s attracted more attention and interest, and eventually continuous speech recognition became imaginable.

PHONEMES RECOGNIZED AS KEY TO SPEECH

In the 1960s, linguistic researchers examined the inherent structure of language. The results of this research led developers to concentrate speech recognition technology at the level of phonemes, the sound fragments that make up comprehensible words. By the 1980s, programmers were using more powerful hardware to implement statistical phoneme-chain recognition routines. However, limited computing power still inhibited speech recognition.

COMMERCE TAKES OVER

Speechworks and Dragon Systems took over as major producers of speech recognition technology. As these two competed in the field, a point was eventually reached where the computation required was low enough, and the computation available high enough, for widespread commercial use.

At the same time, systems handled increasingly difficult tasks while error rates fell, and this combination made widespread use possible.

In 1996, Charles Schwab, a consumer company, became the first company to implement a speech recognition system for its customer interface.
In 1997, Dragon Systems released "Naturally Speaking," the first continuous speech dictation software.
In 2002, TellMe supplied the first global voice portal, and later that year NetByTel launched the first voice enabler, which allowed users to fill out a web-based data form over the phone.



©2003 St. Norbert College Computer Science Dept. 100 Grant St. DePere, WI 54115