AnalyticBridge

Social Network For Analytic Professionals


Simple idea to help Google, Yahoo and MSN (Bing) improve their web crawlers

Here is a quick fix: stop downloading images that are less than 0.1 KB in size. Most of them are beacons (also known as clear GIFs, or 1x1 pixel images) used for tracking purposes.

All search engines now have web robots that download not only web pages (text) but also images, in order to build image directories. For search engines, however, downloading billions of beacons per day is a waste of bandwidth. Also, some web analytics software (not mine) counts these hits as human Internet traffic, on the assumption that robots usually do not download images.
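
To make the size rule concrete, here is a minimal sketch of the check a crawler could run on the response headers before fetching an image body. The function name, the 100-byte reading of "0.1 KB", and the choice to rely on the Content-Length header are all my assumptions for illustration, not how any search engine actually works.

```python
from typing import Optional

# Assumption: "0.1 KB" is read as 100 bytes.
MIN_IMAGE_BYTES = 100


def is_probable_beacon(content_type: str,
                       content_length: Optional[int],
                       min_bytes: int = MIN_IMAGE_BYTES) -> bool:
    """Return True when the headers describe an image smaller than min_bytes.

    content_type and content_length are the values a crawler would read
    from the HTTP response headers (hypothetical inputs in this sketch).
    """
    if not content_type.lower().startswith("image/"):
        return False  # the rule only applies to images
    if content_length is None:
        return False  # size unknown: cannot tell, so do not flag it
    return content_length < min_bytes


# A 43-byte GIF is flagged; a 20 KB JPEG is not.
print(is_probable_beacon("image/gif", 43))
print(is_probable_beacon("image/jpeg", 20480))
```

A crawler would skip the download whenever this returns True. Images served without a Content-Length header are not caught by this sketch.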


Replies to This Discussion

And another idea: use anonymous web crawlers. Many webmasters now deliver search-engine-optimized content to well-known robots (Googlebot, Yahoo Slurp, etc.) and different content to real humans, in an attempt to get top positions on search result pages for many irrelevant, popular keywords.

An anonymous web crawler is one that does not comply with the standard web robot protocol: in particular, its user agent and IP address are masked.
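
As a rough illustration of what an anonymous crawl makes possible: once the same URL has been fetched twice, once announcing itself as a known robot and once masked as an ordinary browser, the two page bodies can be compared. The sketch below covers only the comparison step. The function name and the 0.8 similarity threshold are assumptions of mine, and the fetching itself is left out.

```python
from difflib import SequenceMatcher


def looks_cloaked(body_for_robot: str,
                  body_for_browser: str,
                  threshold: float = 0.8) -> bool:
    """Return True when the two page bodies differ more than threshold allows.

    body_for_robot: page text served to a declared robot user agent.
    body_for_browser: page text served to the masked (anonymous) crawl.
    threshold: hypothetical cut-off; a similarity below it is suspicious.
    """
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent characters in long inputs, which would distort the ratio.
    matcher = SequenceMatcher(None, body_for_robot, body_for_browser,
                              autojunk=False)
    return matcher.ratio() < threshold


page = "Welcome to our family photo album."
print(looks_cloaked(page, page))                     # same content served
print(looks_cloaked("buy cheap watches " * 5, page))  # very different content
```

Pages with rotating ads or timestamps differ a little on every request, so a real check would need a more tolerant comparison than this one.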


© 2010   Created by Vincent Granville
