Since Sept. 11, the FBI has budgeted tens of millions of dollars to turn its massive collection of computerized case files, memos, tips and phone intercepts from an investigative black hole into a mother lode of predictive intelligence.
If the effort succeeds, by Sept. 11, 2004, it will have replaced today's system--so antiquated and cumbersome that many top FBI executives have never learned to use it--with a high-tech brain that instantly culls years of records and eventually will simultaneously check databanks in other government agencies, public records and the Internet.
And that's just the beginning.
By Sept. 11, 2011, the FBI hopes to use artificial-intelligence software to predict acts of terrorism the way the telepathic "precogs" in the movie "Minority Report" foresee murders before they take place.
The goal is to "skate where the puck's going to be, not where the puck was," said Robert J. Chiaradio, who until recently oversaw data system improvements as a top aide to FBI Director Robert S. Mueller III. "We have to get ourselves positioned for Sept. 10, not Sept. 12."
The technology plan reflects a belief that the chief weapon against terrorism will not be bullets or bombs. It will be information.
But intelligence experts, computer scientists and civil libertarians remain skeptical about whether the FBI can--or should--reverse 94 years of entrenched bias in favor of shoe-leather detective work, and turn itself into a high-tech domestic CIA. And they caution that using databases to foretell acts of terrorism is still a science fiction fantasy.
"These techniques assume that the past predicts the future," said Rakesh Agrawal, an IBM Corp. scientist and a leading "data mining" expert. "But what if the future is completely different?"
Before Sept. 11, no one had crashed a hijacked plane into a skyscraper. Before Jan. 27, when a blast ripped through Jerusalem's commercial district, there had never been a female suicide bomber.
FBI leaders insist that effective data mining--sifting investigative knowledge from voluminous electronic files--will overcome such obstacles.
They point out that rudimentary data mining already has become commonplace. Any Internet user can instantly search more than a billion Web pages for, say, "Middle Eastern flight-training students." The popular search service Google ranks results largely by how often pages are referenced by other pages, listing the most widely linked first--one formula for making sense of more information than a person can digest.
Retail stores analyze data on millions of purchases, then draw conclusions on buying habits to pitch discounts or new products.
"Just as Wal-Mart's trying to figure out what people's buying patterns are, some of that logic can translate into law enforcement," said Mark Tanner, the FBI's deputy chief information officer.
Broad Changes Needed
But to get there will require sweeping changes. Today at the FBI, a comprehensive electronic search requires separate checks of 42 databanks of case files, memos, video footage, mug shots and fingerprints. It's as different from Google as the Web is from government-issue file cabinets, where 1 billion FBI documents still reside.
That will soon change, FBI leaders promise. In the next fiscal year alone, the FBI has requested $76 million to combine and enhance its databases, on top of $730 million previously budgeted for "Trilogy"--the code name for a general technology upgrade that is the bureau's third try after two failed efforts. The bureau says it will replace paper files and inefficient text-only electronic databases with a "virtual case file" system that will allow rapid, Web browser-like views of video, photos and sounds.
Though technologically feasible, that goal remains distant, given the bureau's primitive technology.
"When I came in I said I wanted it done in a year," Mueller told a Senate committee in June. Now he estimates two to three years. "We do not have the data warehousing, we do not have the software applications [for this] kind of searching."
Still, within the FBI, Mueller is widely viewed as having a better grasp of technology than his predecessor, Louis J. Freeh, and greater drive to make changes--especially after Sept. 11.
"They're on the right track," said Nancy Savage, head of the FBI Agents' Assn. Unlike earlier failed technology efforts, she said, Mueller has involved field agents in the planning and testing.
As a model, experts point to the Defense Department's Global Command and Control System, an immensely complex and far-flung system that analyzes intelligence data, satellite imagery, troop movements, weapon status and a multitude of other inputs from all over the world, yet operates efficiently and effectively. Unlike typical government data systems, which are built from scratch, the Command and Control system was assembled largely from off-the-shelf commercial hardware and software in less than two years in the mid-1990s.
After the FBI gets its data systems operating, it will try to tie them to information held in the databanks of other agencies or private entities that may prove crucial in rooting out terrorists.
For example, by combing different agencies' records, the FBI could find a person who was denied a visa, took a flying lesson and may be moving next door to a suspected terrorist. An automated process would connect the information "for an analyst to say, 'Hey look, here's three clues,' " Chiaradio said.
That process is technically challenging because it involves many systems that use incompatible software and divergent methods to label and organize information.
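As a rough illustration of the automated matching Chiaradio describes, the sketch below groups clues from three record sets by person and flags anyone who accumulates all three for a human analyst. Everything in it is hypothetical: the field names, the sample records and, above all, the assumption that the sources share a clean common identifier (`person_id`). Real agency systems rarely do, which is a large part of the difficulty.

```python
from collections import defaultdict

# Hypothetical records; real agency data would not share a clean common key.
visa_denials = [{"person_id": "P-17", "reason": "incomplete application"}]
flight_lessons = [
    {"person_id": "P-17", "school": "Example Aviation"},
    {"person_id": "P-42", "school": "Example Aviation"},
]
address_changes = [
    {"person_id": "P-17", "near_watchlisted_address": True},
    {"person_id": "P-99", "near_watchlisted_address": False},
]


def collect_clues(visa_denials, flight_lessons, address_changes):
    """Group clue labels by person across the three sources."""
    clues = defaultdict(set)
    for rec in visa_denials:
        clues[rec["person_id"]].add("visa denied")
    for rec in flight_lessons:
        clues[rec["person_id"]].add("flight lesson")
    for rec in address_changes:
        # Only a move near a flagged address counts as a clue.
        if rec["near_watchlisted_address"]:
            clues[rec["person_id"]].add("moved near suspect")
    return clues


def flag_for_analyst(clues, threshold=3):
    """Return people with at least `threshold` distinct clues, for human review."""
    return sorted(pid for pid, found in clues.items() if len(found) >= threshold)


clues = collect_clues(visa_denials, flight_lessons, address_changes)
flagged = flag_for_analyst(clues)
print(flagged)
```

The sketch only surfaces candidates; deciding whether three coincidences mean anything is left to the analyst, as in Chiaradio's description.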
But similarly connected databases are becoming commonplace in the corporate world and gradually are being adopted in the intelligence community, according to private data-mining contractors such as Presearch Inc. and Veridian Corp.
The National Security Agency has linked about 20 disparate databases containing human intelligence, electronic eavesdropping files, pictures and sounds using software from Webmethods Inc., said Len Pomata, a company executive. Pilot projects within NSA and the Transportation Security Administration are now linking such data to public records, such as real estate ownership and marriage and death certificates, he said.
Systems can even be designed to track missing data, said James H. Vaules, a former FBI executive who heads the National Fraud Center, a data-mining subsidiary of Lexis-Nexis.
"A lack of information is probably the [biggest] red flag," he said. "If you are 40 years old and there are no public records on you in this country, then there's something up--it just doesn't happen."
Effort Was 'Pipe Dream'
The FBI has coveted such abilities since the 1980s--investing substantial time and resources without success, according to officials familiar with the project. The entire effort was "a pipe dream," said an agent who declined to be identified.
But data-mining developments are beginning to produce predictive abilities--such as banks scanning credit card purchases for anomalies that suggest fraudulent transactions.
The FBI says such techniques will preempt terrorists.
"There was not a specific warning [before Sept. 11] about an attack on a particular day. But that doesn't mean that there weren't
But systems that make sense of highly varied inputs are still in their infancy, independent experts say.
For example, the NSA may be able to find a photo of a cargo plane and an intercepted flight plan but not know what the plane carried, even if the flight manifest was accessible. Every scanned document, film clip and photo must be labeled with multiple codes to allow efficient searches--and to compare data, the labels must be consistent. To a computer, "occupation" and "employment category" are not necessarily equivalent.
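The labeling problem can be sketched in a few lines. The example below, a minimal illustration rather than any agency's actual method, maps each source's field labels onto one shared vocabulary before comparing records. The mapping table, the field names and the sample records are all invented for illustration; in practice, building and maintaining such a table across dozens of databases is the hard part.

```python
# Hypothetical mapping from each source's field labels to one shared vocabulary.
CANONICAL_FIELDS = {
    "occupation": "occupation",
    "employment category": "occupation",
    "job title": "occupation",
    "surname": "last_name",
    "family name": "last_name",
}


def normalize_record(record):
    """Rename known fields to canonical labels; keep unknown fields unchanged."""
    normalized = {}
    for label, value in record.items():
        key = CANONICAL_FIELDS.get(label.strip().lower(), label)
        normalized[key] = value
    return normalized


# Two hypothetical agencies describing the same person with different labels.
agency_a = {"Surname": "Doe", "Occupation": "pilot"}
agency_b = {"Family Name": "Doe", "Employment Category": "pilot"}

same = normalize_record(agency_a) == normalize_record(agency_b)
print(same)
```

Without the mapping step the two records share no field names at all, so a direct comparison would find nothing in common.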
The scope of that task will be staggering, given the volume of terrorism materials in question. Prosecutors in the case of Zacarias Moussaoui, allegedly the 20th Sept. 11 hijacker, declined to print out discovery material for the defendant, because the documents "would leave no room for Mr. Moussaoui in his cell ... and might even consume the entire jail."
Yet the bureau proposes to sift thousands of times as much data as a matter of routine.
Chiaradio said the biggest challenge will not be handling huge volumes of information but securing it.
"Do we want to bet that our technology is going to be one day ahead of a 13-year-old in Alabama who's getting into the system and beating it?" Chiaradio said. "It's a business risk that eventually the director or somebody is going to have to" take.
And internal spies or interagency leaks pose additional security problems.
"The more people who have access to that information, the surer it is to leak," said Michael Vatis, former director of the FBI's cyber-crime unit.
Mindful of the damage that FBI spy Robert Philip Hanssen caused by navigating intelligence files, several senators say they are concerned that the FBI may be leaning too far toward an open system in an effort to make files more accessible to all agents.
Sen. Jeff Sessions (R-Ala.) said at a recent Senate hearing that the FBI should keep a separate system for sensitive intelligence data--available only on a need-to-know basis.
Yet in a technical sense, security problems may seem trivial compared with the challenge of developing artificial-intelligence methods that can generate knowledge to stop terrorism before it occurs.
The FBI is seeking pattern-recognition algorithms that can discern hints of terrorism from what Jeffrey D. Ullman, professor of computer science at Stanford University, calls "the soup of billions of possible coincidences."
Instead of needing the right question, an analyst would merely say "show me something out there that looks odd," and get, say, a report about an influx of Middle Eastern men in flight training, he said.
But anticipating acts of terrorism by sorting billions of records with unknown relevance to unknown future attackers is incomparably more difficult than detecting credit card fraud.
Ullman called predictive data mining "one of the fundamental research problems of the age," comparing it to the Manhattan Project, which produced the atomic bomb during World War II. He said it would require an investment of at least $1 billion to accomplish the ultimate goal--"preventing a terrorist group from carrying a nuclear bomb into this country and setting it off."
Even more modest goals may have been placed in doubt by recent departures of key executives. Bob Dies, a former IBM executive who was the FBI's technology visionary, retired in the spring. He has not been replaced. Chiaradio, appointed to manage the FBI technology transition, also left in June, joining the accounting firm KPMG after only six months on the job.
Meanwhile, President Bush has slated the FBI's cyber-crime unit to move to the new Homeland Security Department.
"That would be a major loss to the FBI," said Vatis, the unit's founder. "One of the things we were successful at doing was building a cadre of technical expertise both in headquarters ... and in the field offices."
Members of Congress have grown impatient over missteps on far less ambitious projects than today's proposals. Fingerprint computers and other law-enforcement data systems have cost more than $1.7 billion since 1993--yet still don't operate reliably.
Sen. Charles E. Schumer (D-N.Y.) recently called the FBI's current system "fossil technology," and Mueller's two- to three-year estimate for minimal database efficiency "unacceptable."
Testifying before the Senate Judiciary Committee this month, Sherry Higgins, the FBI's project management executive, acknowledged that "the problems ... didn't occur overnight and they won't be fixed overnight either. That is because it is more important to get it right and know that we have the systems and capabilities that precisely fit our mission, as well as cure past problems."
Despite repeated requests from The Times, the FBI was unable or unwilling to detail its plans for technology spending, or to clarify the relationships among its many technology projects.
Civil libertarians charge that the FBI faces a crisis of competence that sophisticated new technology will only exacerbate--more deeply burying the bureau in information. Already awash in data, the FBI has not even updated its Web-based wanted posters of leading terrorists. The section on Osama bin Laden makes no mention of Sept. 11 and the Web site still lists Bin Laden lieutenant Mohammed Atef as at large, although he was reportedly killed in November.
Documents released in May under the Freedom of Information Act showed that the FBI's "Carnivore" program, which monitors e-mail in criminal probes, had inadvertently gobbled unrelated messages--a violation of privacy laws. When the error was discovered, an FBI technician destroyed the entire data file, including e-mail from presumed terrorists.
"The buck really stops at the FBI for their failure to properly analyze the information they had before Sept. 11," said Marc Rotenberg, executive director of the Electronic Privacy Information Center, the advocacy group that obtained the FBI documents. He called the surge of interest in data mining "sleight of hand" designed to distract focus away from the bureau's failures.
FBI executives agree that there should be some limit on database surveillance. But they insist that a national crisis warrants a shift in the balance between security and privacy.
Critics should ask, "How can we create civil liberties protections that don't get in the way of fighting terrorists?" said Stewart Baker, a Washington attorney and former general counsel for the National Security Agency.
He suggested that database abuses can be prevented with automated audit controls. "One way to protect civil liberties is to make people prepare to justify how they use the systems," Baker said.
Problems of Accuracy
Yet no matter how careful the FBI is, it faces a larger question about the accuracy of records. "Garbage in, garbage out," the old computer adage goes. The accuracy of all kinds of data held by the government or corporations--as victims of identity theft have learned, to their dismay--is highly suspect.
Deep within complex databases, errors can rapidly eclipse reality, as a 1999 Justice Department audit showed.
In that review of an FBI database of 93,000 Florida civil service job applicants, about 12% of those who had criminal records were not detected, while nearly 6% of applicants with no criminal record were identified as criminals.
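To see what error rates of that size mean in practice, the arithmetic can be worked through under a stated assumption. The audit figures above give the miss rate (about 12%) and the false-alarm rate (nearly 6%), but not the share of applicants who actually had criminal records. The sketch below assumes 10% purely for illustration and rounds the two rates to 0.12 and 0.06; the function and its parameter names are hypothetical.

```python
def screening_outcomes(population, prevalence, miss_rate, false_alarm_rate):
    """Split a screened population into missed, caught and falsely flagged."""
    with_record = population * prevalence
    without_record = population - with_record
    missed = with_record * miss_rate                  # real records not detected
    caught = with_record - missed                     # real records detected
    false_alarms = without_record * false_alarm_rate  # clean applicants flagged
    flagged = caught + false_alarms
    return {
        "missed": missed,
        "caught": caught,
        "false_alarms": false_alarms,
        "share_of_flags_wrong": false_alarms / flagged,
    }


# ASSUMPTION: 10% prevalence is invented for illustration, not from the audit.
result = screening_outcomes(93_000, 0.10, 0.12, 0.06)
for name, value in result.items():
    print(name, round(value, 4))
```

Under that assumed 10% share, roughly 1,100 applicants with records would slip through, about 5,000 applicants with clean records would be flagged, and about 38% of all flags would point at the wrong people. A lower true share of offenders would push that last figure higher.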
Moreover, just as spies create false personas, the Sept. 11 hijackers evaded detection, in part, by setting up bank accounts using false Social Security numbers. Such moves to pollute the data stream suggest a flaw in the logic of data mining, skeptics say.
"The people who are the greatest threats are already conducting themselves in such a way that they fall into the most innocuous profiles," said Edward Tenner, author of "Why Things Bite Back: Technology and the Revenge of Unintended Consequences."
"The question is not whether innovations in artificial intelligence are worth trying," he said. "The real issue is the opportunity cost--the other things that experienced investigators could be doing with their time," such as figuring out how to infiltrate Al Qaeda.
Fear of terrorism, the FBI's detractors suggest, has already pushed database research into the realm of the absurd--where innocuous behavior, or even the failure to leave an electronic trail, can arouse suspicion.
"That would be one of the most damaging things terrorism could do to us," Tenner said.
On the Web: The complete series is available on the Web at www.latimes.com/fbi