Speech and handwriting: Where’s the app?

A 10-inch screen, more books than you can possibly read and a gazillion potential apps: The iPad has its enticements, but for decades I’ve wished for something more.

In the early 1980s, I worked with another reporter, George Frank, on what turned out to be the Los Angeles Times’ first computer-assisted investigative project. The issue was laundered campaign contributions -- hundreds of them. And to track and cross-file the illicit donations, we lugged boxes of 3-by-5 index cards from Southern California to Sacramento and back, as we researched documents and interviewed politicians and bogus contributors.

My high-tech epiphany came at the top of the outside boarding steps as I was getting off a flight at the old Ontario airport. With a Santa Ana wind whipping my hair across my eyes, I envisioned what would happen to all those index cards if I stumbled on my way across the tarmac.

The next week, boxes holding a brand new IBM personal computer filled the back seat of my car. I had to drive from Sacramento to San Jose to buy it, but I now had all the parts of an electronic system that could run the most exciting invention I’d ever imagined: Superfile, a computerized cross-indexing program. OK. I’m a nerd.

But not the kind who enjoys fiddling with computers. For me, the attraction of computers is what they can do. And that’s the reason for my decades-old wish list.

Once cross-indexing was conquered, I believed it was only a short time until the hours spent transcribing recorded interviews would be consigned to scribes-in-the-monastery journalism. Think of the uses. Full transcripts of corporate and government meetings, court hearings, academic panel discussions and, selfishly for a journalist, news conferences, all accurately transformed from someone’s mouth to text on my computer. Presto.

Best of all, I was positive a program was just around the corner to read my terrible handwriting and translate the scrawl of notes into instant text. Hey, I even bought an Apple Newton. But you guessed that.

What’s taking so long?

As a variety of experts explained to me last week, voice and handwriting recognition software share the same challenge: They just can’t match the human brain, or the variety of human brains.

For business uses, if people print clearly or fill in forms with a specific vocabulary, software can “read” handwriting and transcribe it into text with 90% to 95% accuracy, said Pietro Parravicini, head of the U.S. subsidiary of the Swedish tech company Anoto. But when faced with the full human vocabulary, a mixture of topics and messy cursive handwriting, accuracy drops to 50% to 60%, or even less. A poorly drawn “f,” for example, may be mistaken for an “s” and “fix” becomes “six.”

“If people expect handwriting to be interpreted perfectly, it’s not possible because handwriting is not perfect,” said Parravicini.

But we won’t need handwriting in the future, will we? Not so fast: Kathleen S. Wright, national product manager for handwriting at Zaner-Bloser Educational Publishers, points to studies that link handwriting with how we learn to communicate and acquire language at a sophisticated level. Besides, writing with one hand is easier than typing with two hands, and anyway, who wants a keyboard “autograph” from a celebrity?

As for speech, software has to be taught what humans mean in order for transcripts to make sense. For example, said Dilek Hakkani-Tür, a senior research scientist at the International Computer Science Institute in Berkeley, in spoken conversation, human ears instinctively understand the difference between a question and a statement. Right? Right!

And with speech, she and others said, there are the added voice-to-text problems of people interrupting each other, background noise, male voices versus female, the sounds of happiness, sadness, individual accents and variations caused by age.

I’d had high hopes that the excitement leading up to Apple’s latest innovation meant the iPad would do it all. Sigh. Maybe next year’s model. More likely, according to the experts, five to 10 years. Or so.

“We’re the only species that really communicates by speech,” said Abeer Alwan, a UCLA professor of electrical engineering who specializes in speech processing. And, although a lot of research is being conducted, scientists still “don’t know how human brains are really doing it.”

In other words, there’s no app for that.

Tracy Wood is a writer in Southern California.
