Making the real-time Web relevant

If there is perhaps one universal truth about the Web, it’s that people want it now.

During the past 15 years, our expectations for how quickly information should be delivered to us over the Internet have changed. Now a delay of minutes on a breaking news story is unacceptable, as we saw during the frantic search for information in the hours after Michael Jackson died last year.

Enter real-time search. Search has been our gateway to the Web for almost as long as it has existed, and the big search players of the day are gearing up to handle a new challenge: how can the explosion of instant content produced by news organizations, blogs, and social-media users be organized in a relevant fashion, sorting through one of the worst signal-to-noise ratios in modern communication? Oh, and by the way, those results have to be displayed instantly.

“If information was generated seconds ago that’s relevant to what I am looking for, it should be available to me in one place,” said Amit Singhal, a Google Fellow and a legend in the search industry who is responsible for Google’s real-time search project. “It’s awfully hard.”

This is for real

There’s no going back to a delayed publishing model for media companies: deadlines are dead in the real-time world. And more and more regular people realize every day that there is an audience for the thoughts, rants, and banal moments in their day-to-day lives.

The result is a content explosion, the likes of which crushed Google CEO Eric Schmidt’s dream of one day indexing the entire Web. That may have been a pipe dream to begin with, but it’s definitely not going to happen now.

So if search engines are to remain relevant themselves, they’ll need to make sense of this content. And unless social-media networks are able to make their content discoverable, they won’t turn into the types of content-discovery engines that their public-relations people like to imagine are already here.

Expect the importance of real-time search to only grow over the next several years. For example, Yahoo’s search deal with Microsoft does not include real-time indexing and ranking efforts, as the company believes that it’s too important to give away.

“We think of (real-time search) as a very strategic and important asset, and we are going to continue to invest in it in a big way,” said Yahoo’s Shashi Seth.

It’s been about four months since Google integrated real-time results into its pages, and a bit longer since Google and Microsoft cut deals with Twitter to bring that service’s “firehose” feed directly into those companies. Real-time search today is in its infancy, but it’s the next stage in the evolution of Internet search.

Time to get real

So, what is “real-time” content? There are nearly as many definitions as there are companies scrambling to get their names associated with one of the more hyped developments in Internet publishing.

Most people agree it centers on the concept of microblogging, or instant publishing of content to the open Web from social-media services. But in practice, “real-time search is still primarily Twitter search,” said Danny Sullivan, editor of Search Engine Land.

Microsoft’s Paul Yiu, one of Bing’s leading real-time search experts, agreed. Bing has centered almost all of its real-time search efforts on Twitter. The 140-character service is the undisputed king of “what’s happening now” status updates and continues to grow amid high-profile events such as the uprisings in Iran and the landing of a jetliner in the Hudson River.

Beyond Twitter, however, Yiu thinks there are two components to real-time information: the actual content of the status update or post, and the link that is being shared within that update. Both parts are relevant to a searcher’s query, Yiu said.

Tobias Peggs, president of start-up OneRiot, has built an entire company on the premise that the link being shared within the status update is more relevant than the message itself. When you search for a topic with the intent of finding out what’s happening with, say, the bombings in Moscow last week, OneRiot analyzes the links being shared within status updates and user-controlled sites like Digg to determine the most relevant pieces of content being shared at a given moment.

“We filter through that real-time social noise and extract the useful signal,” Peggs said. In practice, that means surfacing the definitive Los Angeles Times story about the bombings that thousands of users are retweeting, rather than a tweet that says “OMG, those Moscow bombings are really bad.”
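To make the idea concrete, here is a rough sketch of the kind of link counting Peggs describes: pull the links out of recent status updates and rank them by how many different people shared them. It is not OneRiot's actual algorithm, which the company does not detail here. The function name, the one-hour window, the tuple layout, and the example.com addresses are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Simple pattern for links inside a status update (an illustrative assumption,
# not a complete URL grammar).
URL_PATTERN = re.compile(r"https?://\S+")


def rank_shared_links(updates, now, window_seconds=3600):
    """Rank links by how many distinct users shared them recently.

    `updates` is an iterable of (user, timestamp, text) tuples, with
    timestamps in seconds. This layout is hypothetical.
    """
    sharers = defaultdict(set)
    for user, timestamp, text in updates:
        # Ignore anything older than the window: stale links are not "real time".
        if now - timestamp > window_seconds:
            continue
        for url in URL_PATTERN.findall(text):
            # Drop trailing punctuation that often clings to a pasted link.
            sharers[url.rstrip(".,!?)")].add(user)
    # Most widely shared first; ties broken alphabetically for stable output.
    return sorted(
        ((url, len(users)) for url, users in sharers.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Made-up sample data: updates without a link contribute nothing to the ranking.
SAMPLE_UPDATES = [
    ("ann", 1000, "Moscow bombings coverage http://example.com/story"),
    ("bob", 1010, "RT http://example.com/story"),
    ("cat", 1020, "OMG, those Moscow bombings are really bad"),
    ("dan", 1030, "another take http://example.org/blog."),
    ("ann", 1040, "again http://example.com/story"),
    ("eve", 10, "old news http://example.net/old"),
]

ranked = rank_shared_links(SAMPLE_UPDATES, now=1100, window_seconds=600)
```

In this toy run, the story shared by two different users ranks first, the link-free "OMG" update never appears, and the stale link falls outside the window. Counting distinct users rather than raw mentions is one simple way to keep a single chatty account from dominating the results; a production system would presumably weigh many more signals.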