One of the most common mistakes I see websites make is trying to hide webpages and files from search engines while making it incredibly easy for people to discover those pages. Your eyes may start to glaze over while reading this, but stay with me…this will save your booty at some point.
The first and most common mistake is the improper use of a robots.txt file. Basically, this file tells spiders (search engine crawlers) which files and directories to crawl or ignore. You can usually see whether a website has one by simply adding /robots.txt after its domain name. For example, you can see Google’s by going to http://www.google.com/robots.txt, which reveals numerous directories they don’t want search engines to crawl and index…interesting stuff if you’re into that sort of thing.
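To give you a sense of what one looks like, here’s a minimal, hypothetical robots.txt (the directory names are made up for illustration):

```
# Applies to every crawler
User-agent: *

# Ask crawlers to stay out of these (hypothetical) directories
Disallow: /private/
Disallow: /drafts/
```

Notice that anyone, not just search engines, can read those Disallow lines, which is exactly why this file is a terrible place to list anything you actually want kept secret.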