web-site.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Tue, 07 Dec 2010 23:18:59 +0200
changeset 740 8189b7ad02d9
Web document structure usage.

-*- mode: outline; coding: utf-8 -*-

* Speeding up web site loading.

  http://developer.yahoo.com/performance/rules.html

* robots.txt.

To exclude all robots from the entire server:

  User-agent: *
  Disallow: /

To exclude all robots from part of the server:

  User-agent: *
  Disallow: /cgi-bin/
  Disallow: /tmp/
  Disallow: /junk/
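
The partial-exclusion rules above can be checked locally before they are
deployed. This is a minimal sketch using Python's standard
urllib.robotparser module; the agent name "ExampleBot", the helper
is_allowed and the test paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The partial-exclusion rules from the notes, one directive per line.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given rules text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# "ExampleBot" is a hypothetical crawler name; it falls under "User-agent: *".
print(is_allowed(RULES, "ExampleBot", "/index.html"))       # True
print(is_allowed(RULES, "ExampleBot", "/tmp/scratch.txt"))  # False
```

Disallow values are matched as path prefixes, so everything below /tmp/ is
blocked while paths outside the listed directories stay reachable.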

To allow a single robot:

  User-agent: Google
  Disallow:

  User-agent: *
  Disallow: /
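
How the two groups interact can be sketched with the same standard module.
An empty Disallow value permits everything for the named robot, and the "*"
group applies only to robots that match no other group. The agent name
"OtherBot" and the path are assumptions for illustration. Python's parser
compares agent names as case-insensitive substrings, so "Google" also matches
a crawler that announces itself as "Googlebot"; real crawlers may match
differently.

```python
from urllib.robotparser import RobotFileParser

# The "allow a single robot" rules from the notes.
SINGLE_ROBOT_RULES = [
    "User-agent: Google",
    "Disallow:",
    "",
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(SINGLE_ROBOT_RULES)

# The named group wins for the matching robot; everyone else gets "*".
google_ok = parser.can_fetch("Google", "/page.html")
googlebot_ok = parser.can_fetch("Googlebot", "/page.html")
other_ok = parser.can_fetch("OtherBot", "/page.html")  # hypothetical agent

print(google_ok, googlebot_ok, other_ok)  # True True False
```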

To allow all robots complete access:

  User-agent: *
  Disallow:

  http://www.robotstxt.org/
                home page
  http://www.robotstxt.org/robotstxt.html
                About /robots.txt
  http://www.robotstxt.org/faq.html
                Frequently Asked Questions
  http://googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html
                Improving on Robots Exclusion Protocol

* Web document structure usage.

  http://dev.opera.com/articles/view/mama/
                Metadata Analysis and Mining Application