Right Tool Is a ‘Printer in Reverse’

RICHARD O'REILLY designs microcomputer applications for The Times

The information age has a serious drawback. The information you receive often is not in the form you need to make good use of it.

For instance, I see a lot of computer articles and press releases that I would like to be able to keep and retrieve later by subject. The traditional file folder in a drawer isn’t a very handy solution. What if the document covers dozens of subjects?

Then there are tables of numbers to contend with. One week I may get an annual report on Apple Computer and the next week, one on Tandy Corp. If I want to save the numbers so that I can later compare them, the best way has been to keyboard them into a spreadsheet.

What I really need is to be able to convert the material into digital form for storage in a computer. Then I can use the data with the appropriate software, be it word processing, desktop publishing, spreadsheet or text retrieval.

Recently I found just the tool to make all of this happen, TrueScan from Calera Recognition Systems Inc. in Santa Clara.

Stephen M. Dow, chairman of Calera, which recently changed its name from Palantir Corp., describes TrueScan as a “printer in reverse.” It is an apt description.

TrueScan is an expansion board for an IBM PC/AT or compatible computer that processes information from page scanners. TrueScan converts the information from the scanners into a form that can be used by most of the leading word processing, desktop publishing, spreadsheet, graphics and text retrieval software. (It is not available for IBM’s top-of-the-line PS/2 computers with the Micro Channel architecture, the models 50, 60, 70 and 80.)

The usual term for such a function is optical character recognition, but TrueScan goes well beyond that to what Calera calls complete document recognition, meaning it understands what text goes where as well as what the text is. That is a formidable task and to accomplish it, Calera’s expansion board is really a little computer unto itself, complete with a Motorola 68020 microprocessor (the same chip that powers the Macintosh II) and either two or four megabytes of operating memory.

Not That Expensive

Character recognition software for PCs has been around for several years. The trouble with most software-only solutions (as opposed to TrueScan, which is one of the first “board” systems) is that they can recognize a limited number of type styles and sizes. Some have to be taken through an extensive “training” exercise with each new kind of type.

Even the best of the software-only approaches lacks many of the document recognition features of TrueScan--including, for example, the ability to automatically recognize both columns and tables in a single scan or to handle both graphics and text, also in a single scan. When you consider the extra memory and processing power you would need in a PC to properly run a comprehensive software-only product, TrueScan’s price isn’t that high.

The two-megabyte board, TrueScan model S, sells for $2,495. The more powerful model E, which goes for $3,495, can process pages about 25% faster and work with pages that are printed sideways on the paper. For instance, a one-page double-spaced letter took the model E board 46 seconds to process, while a full-page table of numbers of marginal typographical quality took nearly four minutes.

The scanner is a separate piece of equipment available from several manufacturers, though not Calera, and it is the starting point for either character or document recognition. What a desktop scanner does is basically take a picture of the page loaded into it, whether it is a page of text, a table of numbers, a photograph or a drawing. The resulting picture is understood by the computer only as relative values of light or dark dots in a grid pattern.
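To make the "grid of dots" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how the ScanJet or TrueScan works internally: the sample patch, the 0-to-15 gray levels and the cutoff of 8 are all assumptions chosen for the example.

```python
# Hypothetical 4-row patch of a scanned page. Each value is a gray level
# from 0 (black) to 15 (white); the numbers are made up for illustration.
GRAY_PATCH = [
    [15, 15,  2,  1, 15, 15],
    [15,  3, 14, 13,  2, 15],
    [15,  1,  2,  1,  3, 15],
    [15,  2, 15, 15,  1, 15],
]


def to_dots(gray_rows, threshold=8):
    """Turn gray levels into 1 (dark dot) or 0 (light dot).

    The threshold is an assumed cutoff, not a value from any scanner.
    """
    return [[1 if level < threshold else 0 for level in row] for row in gray_rows]


def render(dot_rows):
    """Draw the grid as text: '#' for a dark dot, '.' for a light one."""
    return "\n".join("".join("#" if dot else "." for dot in row) for row in dot_rows)


dots = to_dots(GRAY_PATCH)
print(render(dots))
```

At this point the computer holds only a pattern of dark and light dots. Deciding that the pattern is a letter is the recognition system's job.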

I used a Hewlett-Packard ScanJet ($1,495) in my test of the TrueScan with excellent results.

The task that TrueScan (or any other recognition system) has to perform is to discern the numbers, letters, punctuation marks and other text symbols from the patterns of light and dark dots created by the scanner. If there are pictures or graphics on the page, it has to convert those patterns into a recognizable format.

There are several software packages available that can convert scanned images of text into files that can be edited with word processing software. The accuracy of such conversions depends on the quality of the page that was scanned. If it was an original produced on an electric typewriter with a fresh ribbon, most optical character recognition programs will do a good job.

In the real world, however, I want to convert photocopies of text, pages printed on dot matrix printers or laser printers, articles clipped from magazines, printed price lists, newsletters with multiple columns of type and pages that combine photographs, charts, drawings and paragraphs of text. TrueScan does all of that with remarkable, if not perfect, accuracy.

Using TrueScan is very easy. From a menu displayed on your computer screen, you specify what kind of document is to be scanned--text only, image only or a mixture--and what application program the resulting file will be used with.

It creates formatted files for more than 22 word-processing and desktop publishing packages. Spreadsheet files for Lotus 1-2-3, Excel and Quattro can be created. Several popular kinds of image formats are provided. In addition, TrueScan works with a number of computer FAX cards so that FAX images received can be converted into application files.

Most kinds of type faces from 6 points to 28 points (up to about 3/8 of an inch high) can be recognized by TrueScan, and it is not a problem if varying sizes of type are on the page.

The system recognizes basic kinds of formatting such as centered text, paragraph indentations, bold type and underlining, and it passes those attributes along to the file it creates. If the scanned page has multiple columns of text, they are converted into a single column of text in the file. That column, however, can be converted back to its original form if your applications software allows it.

In my tests the system worked very well with a wide variety of pages. You have to proofread the results, of course, but the errors are usually easy to spot--many are marked with an asterisk--and easy to fix with the application program.
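For readers who want help with the proofreading, a short Python sketch like the one below could list the lines that contain an asterisk. It assumes the converted file is plain text and that an asterisk stands in for a doubtful character; the article does not spell out the exact marking convention, and the sample text and function name are hypothetical.

```python
def flagged_lines(text, marker="*"):
    """Return (line_number, line) pairs for every line containing the marker."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if marker in line
    ]


# Hypothetical recognition output with two doubtful characters.
sample = (
    "The scanner takes a picture of the page.\n"
    "The bo*rd then reads the dots.\n"
    "Proofread the r*sult before saving."
)

for number, line in flagged_lines(sample):
    print(f"line {number}: {line}")
```

A genuine asterisk in the original page would be listed too, so the list is a place to start reading, not a substitute for it.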

I particularly enjoyed being able to scan a table of numbers and have TrueScan automatically create a spreadsheet file that can be put to use immediately. (You do have to proofread carefully, though, and correct any numbers that were misread.) The widths of the columns even varied in the same proportions as those of the original.

Computer File welcomes readers’ comments but regrets that the author cannot respond individually to letters. Write to Richard O’Reilly, Computer File, Los Angeles Times, Times Mirror Square, Los Angeles, Calif. 90053.

THE PRODUCT

TrueScan

TrueScan can convert images from a page scanner into text, spreadsheet tables and graphics.

Features: Model S ($2,495) has a Motorola 68020 processor, two megabytes of memory and two custom chips that allow it to work with 8 1/2-by-11-inch pages printed in the normal “portrait” mode. Model E ($3,495) has a Motorola 68020 processor, four megabytes of memory and three custom chips. It is 25% faster and also recognizes pages printed sideways in the “landscape” mode. TrueScan creates formatted text files for 22 word processing and desktop publishing programs, plus spreadsheet files for Excel, Quattro and Lotus 1-2-3. Images are converted into three popular graphic file formats, TIFF, PCX and PC Paintbrush. Accessory controller boards that plug into TrueScan and operate with Hewlett-Packard ScanJet or Canon IX-12 scanners are available for $249.

Requirements: IBM PC/AT or a compatible computer.

Manufacturer: Calera Recognition Systems; 2500 Augustine Drive, Santa Clara, Calif. 95054, (408) 986-8006

THE PRODUCT

ScanJet

ScanJet is a desktop scanner that can be connected to all models of IBM compatible personal computers and to most Macintosh computers.

Features: Capable of handling 8 1/2-by-14-inch pages (8 1/2 by 11 inches without automatic document feed accessory). With its own interface board and software, it is capable of creating images at resolutions from 38 to 600 dots per inch, with 300 dpi being standard. It stores a 16-level gray scale and allows images to be scaled from 7% to 1,578% of the original. Scanner alone costs $1,495. (This is all you need if it is used with the TrueScan accessory controller board.) Interface kits for IBM PC, PC/XT, PC/AT, PS/2 Models 30, 50, 60, 70 and 80 and Macintosh Plus, SE and II, including expansion circuit boards, software and cable, sell for $495 each. The IBM PS/2 Model 30 also requires a $495 accessory kit. Automatic document feeder costs $595.

Manufacturer: Hewlett-Packard; Customer Information Center; 19310 Pruneridge Ave.; Cupertino, Calif. 95014. Telephone inquiries should be made to Hewlett-Packard sales offices listed in local white pages directories.
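Adding up the list prices above gives a rough idea of what a complete setup costs. The Python sketch below does that sum under a few assumptions: one TrueScan board, its accessory controller board and a ScanJet are all that is needed, and the PC/AT itself, sales tax and any dealer discounts are left out. The function and variable names are made up for the example.

```python
# List prices in dollars, taken from the two product summaries above.
TRUESCAN_BOARDS = {"S": 2495, "E": 3495}
CONTROLLER_BOARD = 249   # TrueScan accessory controller for the scanner
SCANJET = 1495           # scanner alone
DOCUMENT_FEEDER = 595    # optional automatic document feeder


def system_cost(model, with_feeder=False):
    """Total list price for one board, its controller and a ScanJet."""
    total = TRUESCAN_BOARDS[model] + CONTROLLER_BOARD + SCANJET
    if with_feeder:
        total += DOCUMENT_FEEDER
    return total


for model in ("S", "E"):
    print(f"Model {model}: ${system_cost(model):,}")
    print(f"Model {model} with feeder: ${system_cost(model, with_feeder=True):,}")
```

On these assumptions a Model S setup lists at $4,239 and a Model E setup at $5,239; the document feeder adds $595 to either.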
