Friday, November 17, 2006

Sitemap Protocol based on XML (Extensible Markup Language)

Strange bedfellows Google, Microsoft and Yahoo have partnered to simplify how webmasters and online publishers submit their sites’ content for indexing in the companies’ search engines.

In a rare collaborative effort, Google, Microsoft and Yahoo, which compete directly in Internet search and other online services, plan to announce on Thursday their support for the open source Sitemap Protocol, which is based on XML (Extensible Markup Language).

This protocol, which Google created and has been using for about 18 months, will be adopted by Yahoo effective Thursday, and the three companies will collaborate to extend and enhance it. Yahoo has been using another protocol, which it will continue to support. Microsoft will stop using its current protocol after it implements the Sitemap Protocol in its search engine in early 2007.

A site map is a file that webmasters and publishers put on their sites to guide the search engines’ automated Web crawlers in properly indexing their Web pages.

Site maps are particularly useful for pointing crawlers to dynamic Web content that is served up on the fly. Crawlers generally index content in static Web pages without problems, but they often have difficulty with dynamic content, such as a page generated in response to a search query.

A site map can be formatted using various protocols, but supporting several formats means more work for webmasters and publishers, which is why Google, Microsoft and Yahoo are throwing their weight behind the Sitemap Protocol to promote it as a standard.

“The benefit for publishers is that they’ll get more of their content indexed more rapidly,” said Tim Mayer, Yahoo’s senior director of global search.

Meanwhile, the three companies believe the common protocol will improve site maps in general, and along the way make their search engine crawlers more comprehensive in their indexing, a benefit that will trickle down to end users.

“Ultimately what we care about is the best results for searchers and making things easy for site owners. This really does that,” said Vanessa Fox, product manager for Google’s Webmaster Central.

In addition to listing Web pages available for indexing, the Sitemap Protocol also lets publishers and webmasters include other relevant information, like when a page was last updated, how frequently it changes and how important it is relative to other pages on the site. All of this leads to more precise and effective crawling, the officials said.
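
To make the per-page information concrete, here is a minimal sketch in Python that writes a small site map file. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace follow the protocol as published at sitemaps.org; the function name build_sitemap, the example.com addresses and all of the values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the published Sitemap Protocol schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical pages; only "loc" is required, the other fields are optional hints.
EXAMPLE_PAGES = [
    {
        "loc": "http://www.example.com/",
        "lastmod": "2006-11-16",   # when the page was last updated
        "changefreq": "daily",     # how frequently it changes
        "priority": 0.8,           # importance relative to other pages (0.0 to 1.0)
    },
    {"loc": "http://www.example.com/catalog?item=12&desc=vacation"},
]


def build_sitemap(pages):
    """Return a site map XML document (as a string) for a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Emit the optional elements only when the caller supplied them.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(EXAMPLE_PAGES))
```

The second entry shows why a library is used rather than string concatenation: the ampersand in a dynamic URL has to be escaped in XML, and ElementTree does that automatically.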

Google, Microsoft and Yahoo will encourage other search engine operators, as well as makers of related software, like content management systems vendors, to support the protocol, they said.

Offered under the terms of the Attribution-ShareAlike Creative Commons License, the protocol will be publicly available starting Thursday.

