
Genuine Internet Marketing

Real internet marketing strategies for websites, email, social, and more.

How people steal your hidden files

September 14, 2010 By Shane Eubanks 4 Comments

One of the most common mistakes I see websites make is trying to hide webpages and files from search engines while making it incredibly easy for people to discover those same pages. Your eyes may start to glaze over while reading this, but stay with me…this will save your booty at some point.

The first and most common mistake is the improper use of a robots.txt file. Basically, this file tells spiders (search engine crawlers) which files and directories to scan or ignore. You can usually see whether a website has one by simply adding /robots.txt after its domain name. For example, you can see Google’s by going to http://www.google.com/robots.txt, which reveals numerous directories they don’t want search engines to scan and index…interesting stuff if you’re into that sort of thing.
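If you'd rather check from a script than a browser, here's a minimal Python sketch of the same idea. The function names robots_url and fetch_robots are my own, made up for illustration, and fetch_robots needs a live network connection, so treat this as a starting point rather than a finished tool.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url):
    """Return the robots.txt address for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and domain, swap the path for /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(page_url, timeout=10):
    """Download the site's robots.txt and return it as text (needs network access)."""
    with urlopen(robots_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling robots_url on any page of a site gives you the address to look at, and fetch_robots pulls down the contents so you can read them.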


So…how can this be a bad thing for you? Well, take a look at Google’s robots.txt again and notice how they disallow directories, which is the proper way of doing it. The mistake that some websites make is to disallow actual pages or files. It could say something like:

User-agent: *
Disallow: /hiddenpage.html

or

User-agent: *
Disallow: /SuperSecretFile.pdf

Now, the “User-agent: *” part means “Hey all search engines…this applies to all of you”. Then the “Disallow: /hiddenpage.html” line means exactly what it implies…don’t scan or index hiddenpage.html. (Paths in a Disallow line start with a / and are relative to the root of the site.) While this is all fine and dandy when it comes to search engines, hiddenpage.html is now exposed to anyone who views your robots.txt file! Even worse, if there are multiple pages like this in the robots.txt, then the website has essentially listed every single secret page in one organized location for anyone to see. (More info about setting up a robots.txt file)
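To check your own robots.txt for this mistake, a rough Python sketch like the one below can help. It is a heuristic, not a full robots.txt parser: the function name exposed_files is my own invention, and it assumes that a Disallow path whose last piece contains a dot (like .html or .pdf) is an individual file rather than a directory.

```python
import posixpath


def exposed_files(robots_txt):
    """List Disallow paths that look like individual files instead of directories."""
    exposed = []
    for raw_line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        path = value.strip()
        # Skip empty rules, directories, and wildcard patterns.
        if not path or path.endswith("/") or "*" in path:
            continue
        # Assumption: a dot in the last path segment means a file name.
        if "." in posixpath.basename(path):
            exposed.append(path)
    return exposed


SAMPLE = """\
User-agent: *
Disallow: /hiddenpage.html
Disallow: /SuperSecretFile.pdf
Disallow: /secret/
"""

print(exposed_files(SAMPLE))
```

Run against the sample, it reports the two file entries and leaves the directory entry alone. Anything it reports is a page that any visitor can find just by reading the file.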

I can’t tell you how many internet marketers (and many other types of sites) I see making this mistake. Their landing page, sales letter, squeeze page, or whatever you want to call it has little more than a form asking for your name and email address to get something for free…and some even require payment before you receive a “link” to a “secret download page”. Well…just type in their domain.com/robots.txt and voila…sometimes you get instant access to the very pages you would eventually end up at. You’re not “hacking” anything and there’s nothing illegal about this. It’s just a simple misuse of the robots.txt file on their part. Unethical and immoral? Perhaps.

Now what do you do if you’re one of these very people with exposed files in your robots.txt file? The good news is, there’s a fix. The best thing you can do is to move your “secret” pages into a directory and then disallow the directory. If you move them one directory deep into a folder called “secret”, it would look something like this:

User-agent: *
Disallow: /secret

Presto…none of the files in that directory will be spidered by search engines, and you’re not revealing actual pages or files. Go one step further and stick a blank file named index.html in that directory. That way, if your server is set up to list a folder’s contents when it has no index page, anyone who browses to /secret sees a blank page instead of a list of your files.
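If you want to confirm that the directory rule covers the files inside it, Python's standard urllib.robotparser module can test URLs against a set of rules. This is only a sketch: the helper name is_blocked and the example.com URLs are made up for illustration, and real search engines may interpret rules slightly differently than Python's parser does.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /secret
"""


def is_blocked(robots_txt, url, agent="*"):
    """Return True if the given robots.txt rules tell this agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# A file inside the disallowed directory is covered by the single rule.
print(is_blocked(RULES, "http://example.com/secret/download.pdf"))
# A page outside the directory is unaffected.
print(is_blocked(RULES, "http://example.com/index.html"))
```

One thing to keep in mind: Python's parser treats the rule as a prefix, so "Disallow: /secret" also covers a page like /secret-sale.html. If you only want the folder itself, "Disallow: /secret/" with a trailing slash is the narrower rule.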

The next mistake I see made is the lack of noindex/nofollow tags on the “secret” pages. I’ll cover that in another post as this is plenty for you to chew on for now!

Filed Under: Internet Marketing Tagged With: development, internet marketing, security, technical

Comments

  1. Dan D'Laine says

    March 16, 2011 at 1:27 am

    Once again, great content. I’ve taken two pieces of useable info in half hour – from one site. The adsense/analytics link and this.
    Many Regards,
    Daniel D’Laine.

    • Shane Eubanks says

      May 24, 2011 at 8:24 pm

      Thanks, Dan!

  2. Bill (LoneWolf) Nickerson says

    May 24, 2011 at 7:35 pm

    Excellent advice here. I’ll be looking for your nofollow/noindex tag article if it’s up yet. If it is, a link would be a nice touch (not to mention good for your SEO).

    • Shane Eubanks says

      May 24, 2011 at 8:25 pm

      Thanks, Lonewolf! That article is long overdue, so thank you for reminding me!


