Nation’s History Fades Away as Data Is Lost on Computer Tapes
Archives: Information is unreadable because agencies have updated their systems and there is neither the money nor the knowledge to translate it from the earlier versions. Improper storage has also left reels damaged.
A slice of America’s history has become as unreadable as Egyptian hieroglyphics before the discovery of the Rosetta stone.
And there’s more historic, scientific and business data in danger of dissolving into a meaningless jumble of letters, numbers and computer symbols. Americans paid billions to collect the information and may now have to fork over millions more to preserve it.
That’s part of the price for the country’s eager embrace of more and more powerful computers. Much information from the last 30 years is stranded on computer tape from primitive or discarded systems--unintelligible or soon to be so.
As a result, hundreds of thousands of Americans researching family history--the largest use of the National Archives--will find fascinating records of their relatives beyond reach. Detection of a disease or environmental threat or shift in social class could be delayed because data was lost before researchers even knew what questions to ask.
“The ability to read our nation’s historical records is threatened by the complexity of modern computers,” said Rep. Bob Wise (D-W.Va.), chairman of a House information subcommittee, which wants the government to start buying computers that will preserve data for future researchers. “Already the National Archives has computer records that can’t be read,” he said.
A number of records are lost or out of reach:
* Two hundred reels of 17-year-old Public Health Service computer tapes were destroyed last year because no one could find out what the names and numbers on them meant.
* The government’s Agent Orange Task Force, asked to determine whether Vietnam soldiers were sickened by exposure to the herbicide, was unable to use Pentagon computer tapes containing the date, site and size of every U.S. herbicide bombing during the war.
* The most extensive record of Americans who served in World War II exists only on 1,600 reels of microfilm of computer punch cards. As the 50th anniversary of Pearl Harbor approaches, no manpower, money or machine is available to return the data to a computer so ordinary citizens could trace the war history of their relatives.
* Census data from the 1960s and NASA’s early scientific observations of the Earth and planets exist on thousands of reels of old tape. Some may have decomposed; others may fall apart if run through the balky equipment that survives from that era.
“We’re just scratching the surface of the problems we will have,” said Kenneth Thibodeau, director of electronic records for the National Archives. His staff of 20--Congress just agreed to add another eight--is responsible for all government computer records in the archives.
“If every agency that owes us stuff from the last 20 years gave it to us now, it would take my staff at least 25 years to process it,” Thibodeau said.
And that’s just to clear the backlog. A huge amount of data is created by the government every day and more is coming: NASA’s Earth Observing System, set for 1998 launch, could generate the equivalent of all 15 million books in the Library of Congress every 12 weeks.
One of the biggest headaches is sloppy record-keeping. Everyone who designs a computer or a program for it is supposed to write down--on paper--how the machines operate, how the program organizes data and what information is on each tape.
Often, they didn’t.
“Generally it’s the last thing you do and pay the least attention to,” said Gerald Cranford, assistant Census director for data processing.
“Documentation is a bore,” Thibodeau explained.
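To see why that paperwork matters, consider a minimal sketch in Python. The record layout, field names and sample record are hypothetical, invented for illustration, and are not drawn from any actual government tape. The point is only that the same string of characters stays a jumble until a written layout says where each field begins and ends.

```python
# Hypothetical fixed-width record layout: (field name, start, end).
# On a real tape this mapping would come from the paper documentation.
LAYOUT = [
    ("surname", 0, 10),
    ("birth_year", 10, 14),
    ("state_code", 14, 16),
    ("household_size", 16, 18),
]


def decode_record(raw, layout):
    """Slice one fixed-width record into named fields using the layout."""
    return {name: raw[start:end].strip() for name, start, end in layout}


# An invented 18-character record: a padded surname followed by digits.
raw_record = "SMITH".ljust(10) + "19423904"

record = decode_record(raw_record, LAYOUT)
print(record)
```

Without the layout, nothing in the digits `19423904` says whether they form one number or three separate fields, which is roughly the position archivists describe when the documentation is missing.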
It could get worse. The government now owns more than a million desktop personal computers. “There’s not a great deal of control over what kind of information is on them, how long it’s kept and how well it’s documented,” Thibodeau said.
Sometimes deciphering the old programs isn’t enough. New programs must be written to reorganize the data before newer computers can analyze it.
The size of this problem dwarfs the manpower and money committed to it by government, as the preservation efforts of one private business indicate.
The Huntington Bank of Columbus, Ohio, just switched to a new computer system--a four-year project that required writing 200 to 300 conversion programs. Senior Vice President John Voss said his 100-person data staff was aided by four outside companies.
At the National Archives, Thibodeau has spent $60,000 over several years on consultants and computer time to figure out how the Pentagon organized its Vietnam records and to write translation programs. He is not finished.
This file consumed so much of his sleuthing budget--only $100,000 a year for all Archives computer records--because more requests come for Vietnam data than for any other kind.
Still, when an airman sought records of his missions, Thibodeau said, “It was sad for us to say, ‘We have the data, but the only way you can get it is to buy 50 reels of tape on air operations and search through it.’ ” And he would need a programmer to help read it.
Why wasn’t one computer saved with software for the original program?
“Theoretically, that’s all we’d need,” Thibodeau said. “But in 1990, could we find spare parts for a 1960s computer? Could we find operators who remembered how to run that thing and would know errors when they came up?”
Since 1960, Cranford said, the Census Bureau has changed computers and tape drives five times--going from tapes that recorded 200 bits of data per inch to ones that record 6,250 bits per inch. Tapes were reprogrammed to run on two different kinds of computers, IBM and Univac.
With each change, he said, the old tapes were supposed to be modernized.
But there are 4,000 tapes “that weren’t copied in one or another of these phases,” Cranford said. Revamping them is estimated to cost $2 million and require a year’s work by 27 programmers.
The tapes include parts of the 1960, 1970 and 1980 censuses. Also included are some periodic surveys of unemployment, income, economic growth, crime, foreign trade, immunization of children and other topics, Cranford said.
Census still has a couple of outdated, seven-track tape drives to do this, but, Cranford said, they randomly eat parts of the tape.
“NASA is in the same boat, as are the IRS and Social Security,” Cranford said.
The National Aeronautics and Space Administration also has some problems as low-tech as basement leaks. The space agency has 1.2 million reels of computer tape, the fruits of $24 billion in space science missions since 1958.
Hundreds of thousands were stored “under deplorable conditions,” risking permanent loss of data on topics such as ozone depletion and the loss of rain forests, according to the General Accounting Office, which counted the tapes.
In one dirty basement, the accountants found NASA tape canisters coated with a film from flooding.
Restoration has a cost too. NASA’s Jet Propulsion Laboratory near Pasadena, Calif., is about to spend $1.9 million to reclaim just 135,000 tapes dating back to Earth orbiters in 1958. Some information will be lost.
The JPL already has lost 225 computerized images of Mars, Venus, Jupiter and Saturn and may lose some 1960s photos of the moon’s dark side that could be useful in the planned moon-to-Mars mission.
Scientists may never know the value of what is gone.
In 1978, Nimbus 7 recorded surprisingly low ozone levels over Antarctica. The computer marked the levels “erroneous and did not process” them, said Arthur Zygielbaum, the JPL’s science information systems manager.
After British scientists discovered a hole in Earth’s ozone layer in the 1980s, NASA took another look at its data and found that it had evidence of the hole a decade earlier.
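The episode points to one possible design choice: label suspicious readings but keep them. The Python sketch below is illustrative only. The plausibility range, the sample values and the function name are assumptions, and it does not describe NASA's actual Nimbus 7 software.

```python
# Hypothetical plausibility range for a reading (units are arbitrary here).
EXPECTED_MIN = 200
EXPECTED_MAX = 500


def quality_check(readings, low=EXPECTED_MIN, high=EXPECTED_MAX):
    """Label each reading instead of dropping the out-of-range ones."""
    checked = []
    for value in readings:
        flag = "ok" if low <= value <= high else "suspect"
        # The raw value is always kept, so it can be re-examined later.
        checked.append({"value": value, "flag": flag})
    return checked


# Invented sample values; 150 and 180 fall below the expected range.
archive = quality_check([310, 295, 150, 180, 305])
suspect = [r["value"] for r in archive if r["flag"] == "suspect"]
print(suspect)
```

Because every raw value stays in `archive`, a later researcher can revisit the flagged entries once there is reason to think they were real.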
“We ought to save everything,” Zygielbaum concluded. “We don’t know the right questions to ask, but some scientist in the future will.”