For Webmasters
To protect a website from being cached.
Through robots.txt
Some search engine crawlers obey the Robots Exclusion Standard. To tell them which pages to exclude, simply place a "robots.txt" file in the root directory of the web server. The "robots.txt" file should look like the following:
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
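Here the "User-agent: *" line applies the rules to all crawlers, and each "Disallow" line names a path prefix that crawlers should not fetch. As a further illustration, a file that excludes the entire site from all crawlers would contain just:
User-agent: *
Disallow: /
Rules can also be scoped to a single crawler by naming it in the "User-agent" line; for example, "User-agent: msnbot" would apply the rules that follow only to MSN's crawler.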
For Webmasters and Web Publishers
To protect web pages from being cached.
Through Meta tags
Another method is to add a NOINDEX meta tag to the web pages we want to exclude.
For Yahoo and Google, the tag should look like <META NAME="robots" CONTENT="noindex">
For MSN, the tag should look like <META NAME="msnbot" CONTENT="noindex" />
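Whichever form is used, the tag goes inside the <head> element of each page to be excluded. A minimal page might look like the following (the title and body text are placeholders):
<html>
<head>
<META NAME="robots" CONTENT="noindex">
<title>Private Page</title>
</head>
<body>
This page will not appear in search results.
</body>
</html>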
References
- Yahoo
http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html
http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-04.html
- MSN
Visit the following URL and select "How do I control which pages of my website are indexed?"
http://help.live.com/help.aspx?project=wl_webmasters&mkt=en-US
- Google
http://www.google.com/support/webmasters/bin/answer.py?answer=35301&topic=8459