BitTorrent Search Engine

Information for Webmasters

All crawling and indexing is done automatically by several crawlers running at the same time. Webmasters can influence the behavior of our crawlers. For example, parts of a web site can be excluded by using the robots exclusion protocol. The information below is intended to help webmasters understand btbot.

Our robots access your page too often?

To prevent overloading web servers, our crawlers are designed not to visit a web server more than once every few seconds. Nevertheless, since several independent web crawlers run simultaneously at btbot, two different crawlers may access the same web server at once. If our robots visit your web server too often, please send an email to btbot@btbot.com to report the problem so that we can fix it.
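
The per-server delay described above can be sketched roughly as follows. This is only an illustration, not btbot's actual code: the class name HostThrottle, the 5-second default, and the injectable clock are all assumptions made for the example.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Remembers the last visit per host and reports how long to wait.

    Hypothetical sketch: the 5-second default is an assumed value for
    "a few seconds", not a documented btbot setting.
    """

    def __init__(self, min_delay_seconds=5.0, clock=time.monotonic):
        self.min_delay = min_delay_seconds
        self.clock = clock
        self.last_visit = {}  # host -> time of the most recent request

    def wait_time(self, url):
        """Return the seconds to wait before the host of `url` may be visited."""
        host = urlsplit(url).netloc.lower()
        last = self.last_visit.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_delay - (self.clock() - last))

    def record_visit(self, url):
        """Note that the host of `url` was just requested."""
        host = urlsplit(url).netloc.lower()
        self.last_visit[host] = self.clock()
```

Because each crawler would hold its own HostThrottle in this sketch, two crawlers cannot see each other's visits, which is consistent with the overlap between crawlers mentioned above.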
 
Why does the robot try to download a file called robots.txt?

The robots.txt file contains rules defined by the robot exclusion standard. Each web crawler should check for this file, which tells the robot which parts of the web server may be visited and which may not. Each crawler maintains a cache of all robots.txt files that it has downloaded, and this cache is updated periodically.
 
How can I prevent the robots from crawling my web site or parts of my web site?

To prevent our crawlers from crawling your web site, use the robot exclusion standard. Create a robots.txt file and place it in the root directory of your web site. The file may contain the following lines:

User-agent: btbot    # use * to address all robots
Disallow: /path_1    # use / to exclude the complete web site
Disallow: /path_2


For more details about the robot exclusion standard, see The Robots Exclusion Protocol.
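
To check how rules like the ones above are interpreted before publishing them, one option is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper name is_allowed and the example.com URLs are assumptions, and btbot's own parser may differ in details.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, without the explanatory annotations.
ROBOTS_TXT = """\
User-agent: btbot
Disallow: /path_1
Disallow: /path_2
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/path_1/page.html",
                "http://example.com/other.html"):
        print(url, is_allowed(ROBOTS_TXT, "btbot", url))
```

With these rules, pages under /path_1 and /path_2 are refused for btbot, while robots with other names are not affected because no rule addresses them.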
 
Why does the robot download the robots.txt so often?

Each crawler maintains its own cache of downloaded robots.txt files, which has to be updated from time to time (usually once within 24 hours). In the worst case, the number of downloads per update period therefore equals the number of running crawlers.
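
A per-crawler cache of this kind might look roughly like the sketch below. It is not btbot's implementation: the class name RobotsCache, the injected fetch function, and the exact 24-hour lifetime are assumptions chosen to match the description.

```python
import time


class RobotsCache:
    """Caches robots.txt bodies per host and refetches them when stale.

    Hypothetical sketch: `fetch` is any function that takes a host name and
    returns the robots.txt text; the real download logic is not shown.
    """

    def __init__(self, fetch, max_age_seconds=24 * 60 * 60, clock=time.time):
        self.fetch = fetch
        self.max_age = max_age_seconds
        self.clock = clock
        self.entries = {}  # host -> (time fetched, robots.txt text)

    def get(self, host):
        """Return the cached robots.txt for `host`, downloading it if needed."""
        now = self.clock()
        entry = self.entries.get(host)
        if entry is None or now - entry[0] >= self.max_age:
            body = self.fetch(host)
            self.entries[host] = (now, body)
            return body
        return entry[1]
```

If every running crawler holds its own RobotsCache, a server sees at most one robots.txt download per crawler in each refresh period, which is the worst case described above.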
 
Which parts of my web site will be analyzed?

All parts of a web site are analyzed except blocks that are commented out with <!-- .. -->. Scripts within <script>...</script> and the sources of frames are also not evaluated.
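
The effect of these rules can be illustrated with Python's standard html.parser module. This is a sketch under assumptions, not btbot's parser: the page does not say what exactly is extracted, so the example assumes link targets and visible text, and the names VisibleLinkParser and extract_links are made up for illustration.

```python
from html.parser import HTMLParser


class VisibleLinkParser(HTMLParser):
    """Collects <a href> targets and text, skipping comments, scripts, frames."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        # The src attributes of <frame> and <iframe> are deliberately ignored.

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as plain data and are dropped.
        if not self._in_script and data.strip():
            self.text.append(data.strip())

    # Commented-out blocks are passed to handle_comment, which is left at
    # its default (do nothing), so their contents are never analyzed.


def extract_links(html):
    """Return the link targets found in the analyzed parts of `html`."""
    parser = VisibleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

A webmaster could use a check like this to see which links on a page would remain visible to a crawler that follows the rules above.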
 
 
Copyright © 2004-2009 btbot   -   All rights reserved