Searching for Logic Driving Search Engines

Search. It’s the essence of the Web. But is it redolent with thoughtfulness and knowledge or something far less edifying?

Search tools do a remarkable job of finding information in seconds. But every Web newbie soon faces two befuddling realities: Different search “engines” yield dramatically different results on the same request, and many searches bury key sources among thousands of junk links.

Given that every search site claims to offer the best of the Web, it’s fair to ask whether any of them are honest brokers of information. To understand the answer, first consider how search engines work.

Engines used by AltaVista, Excite, Infoseek, Lycos, Yahoo and other big services deploy software “robots” that probe millions of sites and decide which ones to display at the top of the hit list when you make a request.

As the Web grows explosively, those robots face a daunting challenge in “just providing relevant information,” said Harley Manning, a Web analyst with Forrester Research in Cambridge, Mass. Determining the best-quality sites is often beyond their powers.

“It’s a step above voodoo,” said Manning, explaining the disparate results offered by different search algorithms. No one understands precisely why the results vary so widely.

And unlike the quietude of a library, where a card catalog offers all references without bias, the environment in which search engines operate is akin to the floor of the New York Stock Exchange--the louder a site shouts, the more likely it will be heard above the din.

That desire to be heard leads to a more insidious reason Web searches can be unpredictable or worse: engine rigging.

Step into the shoes of a site owner. Your site comes up 42nd among 39,906 hits on AltaVista for, say, auto parts (as did Nissenbaum’s Auto Parts Inc.)--that’s in the top 0.1% of results. Sounds impressive. Unfortunately, with that showing you might as well cash out of e-commerce: few users look past the first couple of screens.

The competition among Webmasters to move up the list has grown ferocious. Some techniques they use to get ahead are simple and straightforward. These include registering with the major engines to gain favorable treatment, as well as “meta-tagging” their site--embedding keywords in the computer code of each page that can be indexed by search robots but remain invisible to users.
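To make the meta-tagging trick concrete, here is a minimal sketch--with an invented page, not any actual site’s code--of what a keyword tag looks like in a page’s source and how a simple indexing robot might read it:

```python
from html.parser import HTMLParser

# A hypothetical page header: the keywords sit in the HTML source,
# where an indexing robot can read them but a visitor never sees them.
PAGE = """
<html><head>
  <title>Discount Auto Parts</title>
  <meta name="keywords" content="auto parts, mufflers, brakes, radiators">
  <meta name="description" content="Discount auto parts shipped nationwide.">
</head><body>Welcome!</body></html>
"""

class MetaTagReader(HTMLParser):
    """Collects the content of <meta> tags, roughly as a simple robot might."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.meta[attrs["name"]] = attrs["content"]

reader = MetaTagReader()
reader.feed(PAGE)
print(reader.meta["keywords"])   # auto parts, mufflers, brakes, radiators
```

The page’s visitors see only “Welcome!”; the robot sees the whole keyword list.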

If you ignore such techniques, “you’re pretty much chucking your message into the ocean in a bottle,” said Paul Bruemmer, chief operations officer of Santa Barbara-based Web-Ignite Corp., which provides search-enhancement services.

But because every sensible Webmaster follows that advice, such methods alone no longer get top results. Many have adopted other methods of skewing searches, starting with cash.

Most of the big search sites offer favorable placement for a fee, though few Web surfers realize this.

One engine, GoTo, uses a refreshingly upfront disclosure method--once you get over the shock of search results crassly displayed in the order of the amount GoTo earns for each click.

Not surprisingly, GoTo’s performance leaves room for improvement. In a search for “Microsoft,” the company’s site, which did not pay for placement, came up 22nd. In a search for “Macintosh computing,” the first Apple Computer Web site turned up 148th.
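The mechanics are simple enough to sketch. In a pay-for-placement scheme of this kind, results are ordered by what each advertiser pays per click rather than by relevance--the sites and bid amounts below are invented for illustration:

```python
# Toy illustration of pay-for-placement ordering (invented sites and bids):
# listings are sorted by what each advertiser pays per click,
# not by how relevant the page is to the query.
listings = [
    {"site": "example-parts-superstore.com", "bid_per_click": 0.45},
    {"site": "example-oem-supplier.com",     "bid_per_click": 0.12},
    {"site": "example-manufacturer.com",     "bid_per_click": 0.00},  # didn't pay
]

ranked = sorted(listings, key=lambda hit: hit["bid_per_click"], reverse=True)
for rank, hit in enumerate(ranked, start=1):
    print(rank, hit["site"], f"${hit['bid_per_click']:.2f}")
# The non-paying site lands last, however relevant it may be.
```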

Then there are the less scrupulous ways to rise to the top. Known as spamming (after the junk e-mail scourge), they involve stuffing Web pages with scores or hundreds of meta-tag keywords because some robots interpret more keywords to mean higher relevance. Some sites print dozens of keywords on their home page (another test of relevance) but in the same color as the page’s background--rendering them invisible to the user but obvious to the colorblind robot.

Pornographic sites are notorious for using false meta-tags that land them on hit lists for such terms as “auto sales” and “IRS.”

“Spammers will go to any extreme, break any rule to get in position,” said Bruemmer. Many spend $100,000 a month to do it. Not surprisingly, scores of services that guarantee better search results have sprouted--as a quick search for “Web promotion” demonstrates.

Some, unlike Bruemmer’s company, promise top 10 placement; this almost certainly signals spamming. How big a problem is search-result subversion? The engine companies have assumed a war footing.

“Every major search engine is developing countermeasures, because if they didn’t they’d immediately be swamped,” said Kevin Brown of Inktomi Corp. of San Mateo, which provides underlying technology for many engines. Spammers respond with their own countermeasures, he said.

“It’s a little bit of spy versus spy,” he said.

A few search sites--Yahoo, Mining Co., Ask Jeeves and LookSmart--employ human editors in addition to software robots to handpick sites, an approach that remains largely immune to these problems. But that subset leaves most of the Web untouched.

Secondary searches on those sites suffer the same problems as the rest of the industry.

So while search is the essence of the Web, think before you inhale too deeply.

*

Innovation will appear every other week in The Cutting Edge. Times staff writer Charles Piller can be reached at charles.piller@latimes.com.
