Voice Systems : Computers: Now They Talk Back

Times Staff Writer

Pedaling his bicycle along a San Juan Capistrano street in September, 1986, Tom Bocsko didn’t notice the car parked in the middle of the bike lane until it was too late. The accident left him paralyzed from the neck down.

Suddenly, simple, everyday tasks, such as answering the telephone or writing a letter, became exceedingly difficult for the 34-year-old fisheries scientist. But a futuristic speech-recognition computer named “HAL” is making his life a little easier now.

By speaking into a microphone attached to his wheelchair, Bocsko can order his personal computer to answer the phone, switch on the lights or play a jazz record on the stereo.

“HAL is really important to me,” says Bocsko, who spends six or seven hours a day working with the computer. “It relieves me from having to depend on someone else all the time.”

No Longer Science Fiction

Speech recognition systems--computers that respond to verbal commands--are moving out of science fiction movie plots and into the home, office, automobile and factory. Some examples:

--On Southern California roadways, cellular car phone enthusiasts are installing voice-activated dialers that let them phone home on command while letting their fingers do the driving.

--At some airports, airlines are using voice recognition technology to help sort baggage. A worker looks at a destination tag on baggage passing on a conveyor belt and says “L.A.,” or “New York” or “Peoria.” The bags are then routed onto the proper belt.

--At a Hughes Aircraft plant in El Segundo, workers use a voice-controlled computer to log information about defective electronics parts they spot. The company says the system has helped increase worker productivity and cut down on errors.

--The McDonald’s restaurant chain is considering using voice technology to speed hamburger and soft drink orders at its drive-through windows.

Increasingly Common

In the last few years, voice recognition systems have grown increasingly commonplace. They are most often found in factory environments where workers’ hands are occupied by other tasks. The systems enable them to record information orally rather than by writing by hand or typing on a keyboard.

Although technical advances have increased the sophistication of today’s voice systems, most are still limited in their ability to respond to human speech. But the technology is improving, and voice systems are expected to become faster, easier to use and less expensive during the next several years.

“It’s an extraordinarily powerful technology that will change the way literally everybody does their work,” said Janet Baker, president of Dragon Systems, a Newton, Mass., company specializing in voice technology.

Confident that the technical hurdles can be overcome, researchers are looking at an array of applications. U.S. auto manufacturers, for instance, are experimenting with voice-controlled systems for the cars of the 1990s. The systems would enable drivers to control the car radio, interior temperature and instruments. Further down the road, cars could be equipped with computerized maps that would respond to questions such as: “How do I get to 100 Nowhere Drive?”

For years, the Holy Grail of industry research has been a voice system that could be spoken to as easily as another person. Researchers are trying to develop a computer capable, for instance, of taking dictation from an executive. Most experts believe that such a system, known as a “talkwriter” in industry jargon, is at least five years away.

The outlook hasn’t always been so bright for the fledgling industry that has grown up around voice technology. The task of getting computers to act more like humans has posed a daunting technical challenge.

That challenge has attracted the attention of about two dozen companies, ranging from IBM and Texas Instruments to smaller firms such as Speech Systems of Tarzana and Dragon Systems in Massachusetts. These companies generally share the belief that if computers can be made less intimidating, millions of people who don’t already use them will be encouraged to do so.

Some Investors Soured

During the past decade, speech recognition companies have appeared and disappeared, and many investors have soured on an industry that trumpeted billion-dollar sales forecasts that have turned out to be little more than hot air.

But the technology is improving, and the predictions of rapid market growth seem to be coming closer to reality. “It’s very clear now that the market has started its growth and has a very steep positive slope,” said Dragon Systems’ Baker.

Until recently, most voice systems on the market had limited vocabularies, required speakers to pause after each word or required that speakers “train” the computer to understand their voices. Many systems still have one or more of these limitations.

“The ability of speech recognition to deal with just any speaker is still not good,” said Bill Creitz, editor of Voice News, a Rockville, Md., newsletter.

30,000 Words

But a Tarzana company, Speech Systems, has overcome some of those problems. Speech Systems has developed a voice recognition system with a vocabulary of 30,000 words. Most systems on the market are limited to several hundred words.

The firm’s system, which costs about $9,000, is also capable of recognizing naturally spoken, or “continuous” speech, and is not limited in the number of people it can respond to. The machine is one of the most sophisticated on the market, but it must be tailored for highly specific tasks.

“The technology has evolved to the point that it is no longer the limiting factor,” said Doug Palmer, a marketing program manager for Texas Instruments, which produces many of the computer chips used in voice-recognition systems. “The limiting factor is matching the technology to the appropriate user.”

Until recently, most speech systems were used by the handicapped, computer hobbyists or manufacturing companies looking for a way to increase productivity. The technology has proven particularly useful for so-called “hands-busy, eyes-busy” jobs, such as those on assembly lines, where workers must pay close attention to their task.

Less Work, Fewer Errors

Hughes Aircraft credits its voice system in El Segundo with savings of up to $1.5 million a year in data entry costs, a 90% reduction in paperwork and a reduced rate of inspection errors.

A drawback of most voice recognition systems is that they don’t work well in noisy factory or office environments. That problem must be overcome if the technology is to gain wider acceptance in offices and factories.

“A lot of vendors claim their products are from 85% to 99% accurate in recognizing words, but if the telephone rings or something heavy drops, it impacts the quality of the text,” said Janet Baker, who follows the industry for International Data Corp., a Framingham, Mass., market research firm. In current voice systems, the larger the vocabulary, the lower the accuracy.

One of the first places where the technology found a home was in helping the handicapped.

Cost Him $3,000

Bocsko bought a voice system for $3,000 from the Voice Connection, an Irvine company, after his 1986 bicycle accident. The Voice Connection named its system Home Automation Link, or HAL, in memory of the disobedient computer in the movie “2001: A Space Odyssey.” The system is attached to a conventional, IBM-compatible personal computer.

Today, Bocsko spends an average of seven hours a day sitting in front of HAL, talking on the phone, writing letters with a specially adapted word-processing program and teaching himself a difficult computer programming language--a skill he hopes will improve his employment prospects.

The technology is helping the handicapped in other ways as well. The U.S. Department of Education, for example, is studying how speech recognition systems could assist the deaf in improving their speaking abilities. Speech systems can be programmed to listen to a deaf person speak a sentence, display the sentence on a computer screen and “score” each word based on how well it was pronounced. The deaf person then works to improve his or her score on each word.

Speech recognition technology is just beginning to show up in consumer products, such as “talking” toys and cellular phone dialers.

Talking Doll

Dolls and animals that respond to simple commands from their playmates hit the market a few years ago. Playmates Toys of La Mirada markets a talking doll named “Julie” whose head, mouth, eyes and arms move in synchronization with its speech. The doll, which sells for $150 to $200, asks the child such questions as: “What is two plus two? Is it three, four or five?” If the child answers incorrectly, the doll says, “Try again.” If the child gives the correct answer, four, it responds: “Great, I knew you knew the answer.”

Sales of voice-activated dialers for cellular car phones are beginning to take off. Orange-based Interstate Voice Products is marketing a $400 cellular phone that dials preprogrammed numbers in response to voice commands such as “office” or “mom’s house.”

The market for speech recognition is still small, with 1987 sales of about $26 million, according to the Yankee Group, a Boston-based technology consulting firm. But the technology should experience a healthy growth of 20% to 40% annually during the next five years, said Blair Pleasant, a Yankee Group analyst.
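
To put that forecast in concrete terms, the short Python sketch below compounds the Yankee Group's 1987 figure over five years at both ends of the predicted range. It assumes steady year-over-year compounding from the $26 million base, which the forecast itself does not spell out; the function name `project_sales` is invented for this illustration.

```python
def project_sales(base_millions, annual_growth, years):
    """Compound a starting sales figure at a fixed annual growth rate."""
    return base_millions * (1 + annual_growth) ** years


# 1987 sales of about $26 million, grown for five years at 20% and at 40%.
for rate in (0.20, 0.40):
    projected = project_sales(26, rate, 5)
    print(f"{rate:.0%} growth: about ${projected:.1f} million after 5 years")
```

Under those assumptions the market would reach roughly $65 million at the low end and roughly $140 million at the high end, still far short of the billion-dollar forecasts of earlier years.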

Previous projections for the speech recognition market have been wide of the mark. In fact, analysts have been predicting--incorrectly--explosive growth for the technology since the late 1970s.

Some Became Cynical

The industry “was overdoing it for a while and then people started getting cynical,” said William Meisel, chairman of Speech Systems and a former USC computer science professor. “People kept saying speech recognition was coming every year, so then some of them said, ‘Let’s not say it’s coming until the year 2000.’ ”

“You won’t see an explosion in business next year,” Meisel continued, “but you’re going to start seeing major installations. By the end of the year, it will become very clear that something new is happening and speech recognition is going to be an explosive market.”

Other observers are more guarded in their optimism. Karl Kozarsky, vice president of Probe Research, a market research firm in Morristown, N.J., said he is “fairly convinced” that the long-forecast growth will finally occur.

How quickly the technology takes off will depend on the ability of companies to develop systems that work faster, are easier to use and cost less. The willingness of office workers to accept the notion of talking to their computers instead of punching keyboards will also be an important factor.

‘Talkwriter’ Progress

The most sought-after goal for the industry is the development of the “talkwriter.” Some companies, such as Speech Systems and Dragon Systems, already offer machines that can understand continuous speech--or speech without pauses--but they have limited vocabularies and are too expensive for many applications.

Speech Systems spent six years and $14 million developing a voice recognition system that doesn’t require the speaker to pause between words. The company calls its product the “Phonetic Engine” because it recognizes phonemes--the basic elements of speech--rather than complete words. The system is limited to a specific vocabulary, such as terms used in the medical or scientific fields.

After more than a decade of research, scientists only recently have been able to teach computers to tell apart different words that sound alike, or homophones, such as “bite” and “byte.” To accomplish that, powerful machines equipped with complex software are needed.

IBM has developed an experimental system that identifies words by the context in which they are used. The system does so by estimating which words are most likely to follow a given word. The system, whose 20,000-word vocabulary is limited to the language of the business world, was developed by studying the words most commonly used in memorandums sent among IBM offices.
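
The idea of choosing a word by what tends to follow what can be illustrated with a few lines of Python. This is only a toy sketch, not IBM's method: it counts adjacent word pairs in a tiny invented sample of text and uses those counts to choose between the sound-alikes “bite” and “byte.” The sample sentences and the names `train_pairs` and `pick_word` are assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Invented sample text standing in for a large collection of office memos.
SAMPLE_TEXT = (
    "please store each byte of data on the disk . "
    "the disk holds one byte per cell . "
    "the dog took a bite of the apple . "
    "take a bite of the sandwich ."
)


def train_pairs(text):
    """Count how often each word directly follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for previous, following in zip(words, words[1:]):
        counts[previous][following] += 1
    return counts


def pick_word(counts, previous_word, candidates):
    """Return the candidate seen most often after previous_word."""
    return max(candidates, key=lambda word: counts[previous_word][word])


counts = train_pairs(SAMPLE_TEXT)
print(pick_word(counts, "each", ["bite", "byte"]))  # byte
print(pick_word(counts, "a", ["bite", "byte"]))     # bite
```

A real system would need counts drawn from far more text and a way to handle word pairs it has never seen, which helps explain why a vocabulary restricted to business language is easier to manage.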

Still Falls Short

Despite the advances, no company has been able to develop a system that would allow a secretary to dictate a letter or a journalist to write a story simply by talking to a computer.

The IBM system, for example, requires that the system be trained to understand its user--a procedure that takes about 20 minutes. It also requires that the speaker pause briefly between words.

“The system is not yet capable of continuous speech,” said Gerald Present, a spokesman at IBM’s Thomas J. Watson Research Center in Yorktown Heights, N.Y. “That’s one of the things we’d like to achieve. At present, it’s thought that will take at least several more years of research.”

“Continuous speech is the Holy Grail that everyone is after, but no one has it yet,” said Deana J. Murchison, a Speech Systems spokeswoman. “Doing that entails building a model of the entire English language.” The Tarzana firm has employed linguists, computer scientists, mathematicians and statisticians to develop its systems.

Voice-Activated

But some industry observers believe a more immediate goal for the industry is to develop voice systems that can respond to commands over the telephone. One example: a voice-activated, computer-controlled system that would allow a user to call home before leaving work and order the system to increase the heat, turn on the lights and preheat the oven.

“The technology is still not at the level where you can use it over the telephone without using certain codes,” said Pleasant, the Yankee Group analyst.

Voice Prints, a Costa Mesa company, recently announced a $325,000 contract to supply voice technology to Amway Corp., the network marketing giant. The system will enable any of Amway’s hundreds of thousands of U.S. distributors to place an order over the telephone by speaking the code name for the product. The system eliminates the need for a human operator to take the order at the company’s Ada, Mich., headquarters.

There is also interest in linking voice recognition technology and the telephone for consumer-oriented applications, such as getting information about one’s bank account or movie theater schedules.

Some industry observers believe speech recognition will be standard equipment on personal computers in the future. “What we’re attempting to do is provide another way for someone to interact with computers or a machine without thinking about how they’re doing it,” adds Speech Systems’ Meisel.
