What Is Speech Recognition and Voice Dictation?

"...we'd like to explain why talking to a computer is not the same as talking to a person and then give you a few tips about how to speak when dictating.

Understanding spoken language is something that people often take for granted. Most of us develop the ability to recognize speech when we're very young. We're already experts at speech recognition by the age of three or so.

When people first start using speech-recognition software, they might be surprised that the computer makes mistakes. Maybe unconsciously we compare the computer to another person. But the computer is not like a person. What the computer does when it listens to speech is different from what a person does.

The first challenge in speech recognition is to identify what is speech and what is just noise. People can filter out noise fairly easily, which lets us talk to each other almost anywhere. We have conversations in busy train stations, across the dance floor, and in crowded restaurants. It would be very dull if we had to sit in a quiet room every time we wanted to talk to each other!

Unlike people, computers need help separating speech sounds from other sounds. When you speak to a computer, you should be in a place without too much noise. Then, you must speak clearly into a microphone that has been placed in the right position. If you do this, the computer will hear you just fine, and not get confused by the other noises around you.
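To make the noise problem concrete, here is a minimal sketch of one very simple way a program could separate speech from background sound: compare the loudness of each short slice ("frame") of audio with a fixed threshold. This is an illustration only, not how Dragon NaturallySpeaking works internally. The function names, the sample values, and the threshold are all assumptions made up for the example.

```python
import math

def frame_energy(samples):
    """Root-mean-square amplitude of one non-empty frame of audio samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def label_frames(frames, threshold):
    """Label each frame 'speech' if it is louder than the threshold, else 'noise'."""
    return ["speech" if frame_energy(f) > threshold else "noise" for f in frames]

# Hypothetical sample values in the range -1.0 to 1.0.
quiet_room_hum = [0.01, -0.02, 0.01, -0.01]
spoken_word = [0.5, -0.6, 0.4, -0.5]
train_station = [0.3, -0.3, 0.3, -0.3]

# Hypothetical threshold chosen for a quiet room.
labels = label_frames([quiet_room_hum, spoken_word, train_station], threshold=0.1)
```

With these made-up numbers, the quiet hum is labeled as noise and the spoken word as speech, but the loud train-station background is also labeled as speech. A loudness test alone cannot tell the two apart, which is one reason a quiet room and a well-placed microphone help.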

A second challenge is to recognize speech from more than one speaker. People do this very naturally. We have no problem chatting one moment with Aunt Grace, who has a high, thin voice, and the next moment with Cousin Paul, who has a voice like a foghorn. People easily adjust to the unique characteristics of every voice.

Speech-recognition software, on the other hand, works best when the computer has a chance to adjust to each new speaker. The process of teaching the computer to recognize your voice is called "training," and it's what you're doing right now.

The training process takes only a few minutes for most people. For a small percentage of speakers, extra training can significantly improve results. If, after you begin using the program, you find that the computer is making more mistakes than you expect, additional training may help.

Another challenge is how to distinguish between two or more phrases that sound alike. People use common sense and context--knowledge of the topic being talked about--to decide whether a speaker said "ice cream" or "I scream."

Speech-recognition programs don't understand what words mean, so they can't use common sense the way people do. Instead, they keep track of how frequently words occur by themselves and in the context of other words. This information helps the computer choose the most likely word or phrase from among several possibilities.
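The idea of counting how often words occur alone and next to other words can be sketched in a few lines of Python. The following is a toy illustration under stated assumptions, not the method any particular product uses: the four-sentence corpus, the function names, and the add-one smoothing are all invented for this example.

```python
from collections import Counter

# Hypothetical toy corpus; a real program would learn from far more text.
CORPUS = [
    "i would like some ice cream",
    "we ate ice cream after dinner",
    "ice cream melts in the sun",
    "i scream when i see a spider",
]

def train(sentences):
    """Count single words and adjacent word pairs."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def score(phrase, unigrams, bigrams):
    """Estimate how likely a phrase is from its word pairs (add-one smoothing)."""
    words = phrase.lower().split()
    vocab_size = len(unigrams)
    probability = 1.0
    for prev, cur in zip(words, words[1:]):
        probability *= (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
    return probability

def pick_most_likely(candidates, unigrams, bigrams):
    """Choose the candidate phrase with the highest estimated likelihood."""
    return max(candidates, key=lambda c: score(c, unigrams, bigrams))

unigrams, bigrams = train(CORPUS)
after_some = pick_most_likely(["some ice cream", "some i scream"], unigrams, bigrams)
before_when = pick_most_likely(["ice cream when", "i scream when"], unigrams, bigrams)
```

With this toy corpus, the sketch prefers "some ice cream" after the word "some" and "i scream when" before the word "when". The same sounds resolve to different words depending on the neighbors, even though the program has no idea what any of the words mean.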

Finally, people sometimes mumble, slur their words, or leave words out altogether. They assume, usually correctly, that their listeners will be able to fill in the gaps. Unfortunately, computers won't understand mumbled speech or missing words. They only understand what was actually spoken and don't know enough to fill in the gaps by guessing what was meant.

To understand what it means to speak both clearly and naturally, listen to the way newscasters read the news. If you copy this style when you use Dragon NaturallySpeaking, the program should successfully recognize what you say.

One of the most effective ways to make speech recognition work better is to practice speaking clearly and evenly when you dictate. Try thinking about what you want to say before you start to speak. This will help you speak in longer, more natural phrases.

Speak at your normal pace without slowing down. When another person is having trouble understanding you, speaking more slowly usually helps. It doesn't help, however, to speak at an unnatural pace when you're talking to a computer. This is because the program listens for predictable sound patterns when matching sounds to words. If you speak in syllables, Dragon NaturallySpeaking is likely to transcribe each syllable as a separate word.

With a little practice, you will develop the habit of dictating in a clear, steady voice, and the computer will understand you better.

When you read this training text, Dragon NaturallySpeaking adapts to the pitch and volume of your voice. For this reason, when you dictate, you should continue to speak at the pitch and volume you are speaking with right now. If you shout or whisper when you dictate, Dragon NaturallySpeaking won't understand you as well.

And last but not least, avoid saying extra little words you really don't want in your document, like "um" or "you know." The computer has no way of knowing which words you say are important, so it simply transcribes everything you say."