Robots Exclusion Standard
A robots.txt file, often mistakenly called a robot.txt file, is a plain text file. You can create it in any plain text editor, such as Notepad. It tells search engine crawlers (robots) which areas of your website they may crawl, and it can also give instructions to specific search engines.
The file should be placed in the root directory of your website, where your index.html or home page resides. Even if you do not need to exclude spiders from any area of your site, you should still have one, because the major search engines look for it.
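Because the file sits in the root directory, its address can be worked out from the address of any page on the same site. The following is a minimal Python sketch of that idea; the function name robots_url and the example.com addresses are illustrative assumptions, not part of the standard.

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site serving page_url."""
    # A leading slash makes urljoin drop the page's own path and query,
    # so the result always points at the root of the same host.
    return urljoin(page_url, "/robots.txt")


print(robots_url("https://www.example.com/articles/seo/tips.html"))
```

A robots.txt file stored in a subdirectory, such as /articles/robots.txt, would not be found this way, which is why the root placement matters.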
Some reasons you may need to exclude spiders from your site include:
1. There are some private directories or information that you do not want to be crawled.
2. You’re still fixing parts of the site and some areas may contain error pages.
3. You have optimized certain pages for specific search engines and want to exclude other search engine spiders from indexing them.
4. You want to prevent some search engine robots or email harvesting bots (Bad Bots) from crawling your pages altogether.
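As a rough illustration of reasons 1, 2 and 4, the sketch below feeds a small sample file to Python's standard urllib.robotparser module and asks which pages a spider may fetch. The bot name BadBot, the directory names and the example.com address are hypothetical placeholders, and the helper allowed is introduced only for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the bot name and directory names are placeholders.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /under-construction/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Report whether the sample rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


# BadBot is shut out of the whole site (reason 4).
print(allowed("BadBot", "/index.html"))
# Every other spider is kept out of the private area (reason 1)
# and the unfinished area (reason 2), but may crawl the rest.
print(allowed("ExampleSpider", "/private/notes.html"))
print(allowed("ExampleSpider", "/under-construction/draft.html"))
print(allowed("ExampleSpider", "/index.html"))
```

Keep in mind that these rules are only a request. Well-behaved spiders honor them, but a harvesting bot is free to ignore the file.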
Syntax For File Creation
The basic instructions are placed in two lines of text.
User-agent: Spider Name
Disallow: File/Directory Name
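The two-line record above can also be generated by a short script. The sketch below is one possible way to do it in Python; the function name build_record and the /cgi-bin/ path are assumptions made for illustration. It uses the field name Disallow, and it writes an empty Disallow value when there is nothing to exclude, which tells spiders that the whole site may be crawled.

```python
def build_record(user_agent: str, disallowed_paths: list[str]) -> str:
    """Build one robots.txt record for a spider and the paths it must skip."""
    lines = [f"User-agent: {user_agent}"]
    if disallowed_paths:
        # One Disallow line per excluded file or directory.
        lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    else:
        # An empty Disallow value excludes nothing.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


# "*" addresses every spider; a specific spider name can be used instead.
print(build_record("*", ["/cgi-bin/"]))
print(build_record("*", []))
```

Saving the returned text as robots.txt in the site's root directory would give spiders a file in the expected two-line form.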
Get More Details at: GoArticles.com






