Next Big Step: Computers That Can Read and Think : Electronics: Post office machines can scan and process addresses, but they reject 30% as unreadable.

ASSOCIATED PRESS

To most people, the phrase computer literacy means knowing how to use a computer. To Sargur Srihari, though, it means giving a computer the know-how to read and “think” like a human being.

Such a computer would need to do what people do--draw from experience and make educated guesses. Human beings “don’t just do character recognition. We don’t just look at one thing,” said Srihari, who is a professor at the State University of New York Buffalo campus. “It’s going to be many, many years before computers can read documents, such as handwritten letters.”

For now, Srihari and his team of about 30 assistants would be satisfied if a computer could read the addresses on envelopes. So would the U.S. Postal Service, which is supporting the project with a $2.1-million grant.

The central post office in Buffalo, for example, must sort as many as 3.3 million pieces of mail a night. Its existing equipment could process 42,000 letters an hour if the addresses were perfectly printed, but the machines reject more than 30% of the envelopes as unreadable, said Dennis Wnuk, a postal operations officer in Buffalo.

“Primarily, what our optical character readers can read is business-type mail, preprinted mail,” Wnuk said. “The average piece of mail that a residential customer will put into a collection box--we’ll only read about 25% of those, and that’s only if he prints very well or happens to have a typewriter.”

But business mail can also be a problem if it comes in a gaudy envelope, if the address is too high, or if it is folded in a way that allows part of the letter to show through the address window, said John Gullo, an automation and readability specialist at the post office.

“People who use printers, they don’t like to change the ribbon because they want to get every last address out of the ribbon. That gives us trouble,” Gullo said.

The first major challenge for Srihari’s team is to get a computer to find the address itself, a bewildering task for a computer, which must separate the address from other elements on items such as a magazine cover or sweepstakes mailing.

To solve the problem, the SUNY computer looks for text blocks with the expected address shape--for example, text blocks that have the lines flush on the left. It has had a 90% success rate with this method, researchers say.
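
The flush-left test can be pictured with a small Python sketch. It is only an illustration, not the SUNY team's code: the `TextLine` type, the pixel coordinates and the 5-pixel tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class TextLine:
    left: int  # x-coordinate of the line's left edge, in pixels (assumed unit)
    top: int   # y-coordinate of the line's top edge

def is_flush_left(block, tolerance=5):
    """True if the block has at least two lines whose left edges
    all fall within `tolerance` pixels of one another."""
    lefts = [line.left for line in block]
    return len(lefts) >= 2 and max(lefts) - min(lefts) <= tolerance

def candidate_address_blocks(blocks, tolerance=5):
    """Keep only the text blocks that have the expected flush-left shape."""
    return [block for block in blocks if is_flush_left(block, tolerance)]

# Made-up example: a ragged headline and a neatly aligned address.
headline = [TextLine(40, 10), TextLine(120, 60)]
address = [TextLine(300, 400), TextLine(302, 430), TextLine(299, 460)]
found = candidate_address_blocks([headline, address])
```

Here only the `address` block survives the filter. A real system would presumably weigh other cues as well, such as the block's position on the envelope.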

The next task for the computer is to figure out the address itself.

To recognize a 2, for example, a computer could be taught to look for ends in the top center and lower right, a curve in the upper right and a sharp V-like bend in the lower left, said Alan Commike, a graduate student who is handling the number-recognition aspect of the project.

The problem is that there are so many ways to write numerals that 130 such instructions are needed to identify them all.

“There are 2s with holes” or loops instead of points, Commike said. “There are 5s with hats, 5s without hats. There are British 7s and American 7s.”
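
A rough Python sketch of what such instructions might look like follows. The first rule for a 2 comes from Commike's description; the second, "looped" variant and the way features are encoded are assumptions made purely for illustration, and the project's real rule set is not given here.

```python
# Each rule is a set of (feature, region) pairs. A numeral can have
# several rules, one for each way people commonly write it.
RULES = {
    "2": [
        # The 2 described above: two stroke ends, a curve and a sharp bend.
        frozenset({("end", "top center"), ("end", "lower right"),
                   ("curve", "upper right"), ("bend", "lower left")}),
        # Hypothetical variant: a loop in place of the sharp bend.
        frozenset({("end", "top center"), ("end", "lower right"),
                   ("curve", "upper right"), ("loop", "lower left")}),
    ],
}

def classify(features):
    """Return every numeral with at least one rule whose features
    are all present in the observed feature set."""
    return sorted(
        digit
        for digit, rules in RULES.items()
        if any(rule <= features for rule in rules)
    )

# Features assumed to have been extracted from a scanned numeral.
observed = {("end", "top center"), ("end", "lower right"),
            ("curve", "upper right"), ("loop", "lower left")}
guess = classify(observed)
```

With these made-up inputs the looped 2 is still recognized as a 2, because it matches the second rule. Covering every numeral this way is what pushes the rule count up.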

The next step is to teach the computer to make educated guesses to fill in unreadable blanks. For example, if the street address is not entirely legible, the computer narrows it to a few possibilities by matching the part it can read against a list of all streets and numbers in that particular ZIP code.

“This is what is called using contextual information--bringing knowledge to bear, which is what people do,” Srihari said.
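
The idea can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' method: the street list and ZIP code are made up, and the sketch assumes the reader marks each illegible character with a "?".

```python
# Made-up directory: street names known to exist in each ZIP code.
STREETS_BY_ZIP = {
    "14201": ["MAIN ST", "MILL ST", "MAPLE ST", "PARK AVE"],
}

def candidates(partial, zip_code, unknown="?"):
    """Return the streets in `zip_code` that agree with every
    legible character of `partial`."""
    def fits(street):
        return len(street) == len(partial) and all(
            p == unknown or p == s for p, s in zip(partial, street)
        )
    return [s for s in STREETS_BY_ZIP.get(zip_code, []) if fits(s)]

# Only the first letter and the "ST" suffix could be read.
matches = candidates("M??? ST", "14201")
```

In this example the four listed streets are narrowed to two, "MAIN ST" and "MILL ST". A street number or other context would then be needed to choose between them.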

Srihari’s computers can now read about 75% of handwritten ZIP codes; current postal equipment can read only about 5%. But the new process is slow, taking as long as a minute for each piece of mail.

Speeding it up will require specially designed hardware, and that is 18 months or more away, Srihari said.

The researchers are testing their equipment with the real thing.

“We had a team of undergraduates working nights in the Buffalo post office,” Srihari said. “We didn’t delay anybody’s mail. We just took it for a few minutes, captured it and put it in our database.”

Eventually, Srihari said, the technology will have a variety of applications. For example, a busy newspaper reader could feed his paper into a computer that would cull only those stories of interest to him. Office workers could do the same with memos.

Srihari says his project might someday allow people to have it both ways--in a computer and on paper.

“People are not going to get away from hard copy. People like to have hard copy,” Srihari said. “It’s something tangible. It’s nice to be able to take it somewhere.”
