author | Oleksandr Gavenko <gavenkoa@gmail.com> |
Tue, 10 Dec 2019 19:31:15 +0200 | |
changeset 2391 | aedbd074ec54 |
parent 2228 | 837f1337c59b |
permissions | -rw-r--r-- |
.. -*- coding: utf-8; -*-

==========
Web site
==========
.. contents::
   :local:

Speeding up web site loading
============================

* http://developer.yahoo.com/performance/rules.html

robots.txt
==========

To exclude all robots from the entire server::

  User-agent: *
  Disallow: /

To exclude all robots from part of the server::

  User-agent: *
  Disallow: /cgi-bin/
  Disallow: /tmp/
  Disallow: /junk/

To allow a single robot::

  User-agent: Google
  Disallow:

  User-agent: *
  Disallow: /

To allow all robots complete access::

  User-agent: *
  Disallow:

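The rule sets above can be sanity-checked before deployment. The following is a minimal sketch that uses ``urllib.robotparser`` from the Python standard library; the helper name ``can_fetch``, the constant names and the ``example.com`` URLs are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "Exclude all robots from part of the server" rule set.
PARTIAL_EXCLUDE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

# "Allow a single robot" rule set.
SINGLE_ROBOT = """\
User-agent: Google
Disallow:

User-agent: *
Disallow: /
"""
```

With these definitions, ``can_fetch(SINGLE_ROBOT, "Google", "http://example.com/page.html")`` should be true while the same call with a different user agent is false, and ``PARTIAL_EXCLUDE`` should block only URLs under the three listed directories. Real crawlers may interpret edge cases differently from this parser.
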
See:

http://www.robotstxt.org/
  Describes common robots.txt practice and discusses possible standardization
  efforts.
http://www.robotstxt.org/robotstxt.html
  About /robots.txt
http://www.robotstxt.org/faq.html
  Frequently Asked Questions.
https://en.wikipedia.org/wiki/Robots_exclusion_standard
  Wikipedia article on robots.txt.
http://googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html
  Improving on Robots Exclusion Protocol.

Sitemap
=======

The Sitemaps protocol allows a webmaster to inform search engines about URLs on a
website that are available for crawling.

http://www.sitemaps.org/protocol.html
  Sitemap protocol.
http://en.wikipedia.org/wiki/Sitemaps
  Wikipedia article.

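As an illustration of what such a file looks like, here is a minimal sketch that builds a sitemap containing only the required ``urlset``, ``url`` and ``loc`` elements. The function name ``build_sitemap`` and the ``example.com`` URLs are assumptions; optional elements such as ``lastmod`` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a str) with one <url><loc> entry per URL."""
    # Serialize the sitemap namespace as the default one (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # ElementTree escapes characters such as "&" on output.
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical URLs, for demonstration only.
    print(build_sitemap(["http://example.com/", "http://example.com/about.html"]))
```

The result is usually saved as ``sitemap.xml`` in the site root, encoded as UTF-8.
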
Web document structure usage
============================

http://dev.opera.com/articles/view/mama/
  Metadata Analysis and Mining Application

Validation
==========

* http://validator.w3.org/

Add search to your site
=======================

http://www.google.com/support/customsearch/
  Custom Search Help
http://help.yahoo.com/l/uk/yahoo/search/basics/basics-13.html
  Can I add a Yahoo! Search box to my site?

Check websites for broken links
===============================

http://linkchecker.sourceforge.net/
  linkchecker home page.
http://arthurdejong.org/webcheck/
  webcheck home page.