
What is robots.txt and Do You Even Need One for Your Small Business Website?

By John Mitchell

August 5, 2025
Reading Time: 7 minutes


If you’ve ever dived into the geeky side of running a website, chances are you’ve come across something called robots.txt. Sounds fancy, right? Like something out of a sci-fi film. But really, it’s just a small text file that can make a big difference to how search engines deal with your site. The thing is, there’s a lot of confusion around what it actually does, whether you really need one, and how it works alongside other stuff like sitemaps.

So, if you’re running a small business website and trying to figure out if this is one of those things you need to stress about — good news: this blog is just for you. We’re going to break it all down in simple terms, no tech jargon (well, only the important bits, and we’ll explain those too), and help you understand when to use a robots.txt file — and when not to bother.

What Even Is a robots.txt File?

Alright, let’s start with the basics. A robots.txt file is just a plain text file you put in the root of your website — that means the main folder where everything lives. It gives instructions to bots (mainly search engine bots like Googlebot) about what they can and can’t look at on your website.

Think of it like a polite sign on a door that says, “Oi, please don’t go snooping in here.” Bots don’t have to obey it, but most good ones will. It’s part of a thing called the Robots Exclusion Protocol. Sounds very official, doesn’t it? But it’s just a way of saying, “These are the rules for crawling my site.”

What Does a robots.txt File Look Like?

Here’s a super basic example of what you might find in a robots.txt file:

User-agent: *
Disallow: /private-folder/

That asks all bots (* means “all of you”) to stay out of the folder called “private-folder”. Easy peasy. You can get more specific, but that’s the general idea — it’s all about telling search engines which pages or folders they shouldn’t poke around in.
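If you're curious how a crawler actually interprets those two lines, Python's standard-library `urllib.robotparser` implements the same rules, so you can sanity-check a file before uploading it. A minimal sketch (the example.com URLs are just placeholders):

```python
from urllib.robotparser import RobotFileParser

# The same two-line robots.txt from above
rules = [
    "User-agent: *",
    "Disallow: /private-folder/",
]

rp = RobotFileParser()
rp.parse(rules)

# Anything under /private-folder/ is off limits...
print(rp.can_fetch("Googlebot", "https://www.example.com/private-folder/rota.html"))  # False

# ...but the rest of the site is fair game
print(rp.can_fetch("Googlebot", "https://www.example.com/about.html"))  # True
```

Handy if you've written a longer file and want to double-check you haven't blocked something by accident.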

A more advanced file might look like this (note that lines starting with # are comments: they're ignored by bots, but useful when you come back to the file in six months' time and wonder what those records are for):

#To block SiteAuditBot, used by the Semrush Site Audit tool, from crawling your site:
User-agent: SiteAuditBot
Disallow: /
#To block SemrushBot from crawling your site for Backlink Audit tool:
User-agent: SemrushBot-BA
Disallow: /
#To block SemrushBot from crawling your site for On Page SEO Checker tool and similar tools:
User-agent: SemrushBot-SI
Disallow: /
#To block SemrushBot from checking URLs on your site for SWA tool:
User-agent: SemrushBot-SWA
Disallow: /
#To block SplitSignalBot from crawling your site for SplitSignal tool:
User-agent: SplitSignalBot
Disallow: /
#To block SemrushBot-OCOB from crawling your site for ContentShake AI tool:
User-agent: SemrushBot-OCOB
Disallow: /
#To block SemrushBot-FT from crawling your site for Plagiarism Checker and similar tools:
User-agent: SemrushBot-FT
Disallow: /
#To block SemrushBot, the main Semrush crawler, from crawling your site:
User-agent: SemrushBot
Disallow: /

And for some bots, you can ask them to slow down their visits if they're putting a strain on your server, using a record like this (as with the lines above, anything after the # is treated as a comment and ignored by bots):

User-Agent: FacebookBot
Crawl-delay: 10              # 1 page per 10 seconds

If you need to do this though, check that the bot actually follows the Crawl-delay request, as not all of them do (Googlebot, for example, ignores it).
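If you want to check how a parser reads that Crawl-delay record, Python's `urllib.robotparser` exposes it directly. Bear in mind this only tells you what the file asks for, not whether the bot honours it:

```python
from urllib.robotparser import RobotFileParser

# The FacebookBot record from the example above
rules = [
    "User-Agent: FacebookBot",
    "Crawl-delay: 10",
]

rp = RobotFileParser()
rp.parse(rules)

# The parser reports the requested delay in seconds
print(rp.crawl_delay("FacebookBot"))  # 10
```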

Why Would You Want to Stop Search Engines Crawling Your Site?

So here’s the thing: not everything on your site needs to be indexed by Google. Maybe you’ve got some pages just for your team, like a staff rota or an admin login. Maybe you’ve got old product pages you don’t want to show up in search anymore. Or maybe you’re running a little experiment behind the scenes and don’t want search engines finding it just yet.

Using robots.txt, you can tell bots, “Hey, leave this stuff alone.” It doesn’t delete anything, and it doesn’t block people from seeing it — it just asks bots to ignore it.

But Here’s the Big One: You Do Not Need a robots.txt File Just to Say “Index Everything”

This is a biggie, and it trips up a lot of people. If you don’t have a robots.txt file at all, do you know what happens? Search engines will just assume they’re allowed to crawl and index everything. That’s the default behaviour.

You don’t need to write a file that says “Yes, index all my pages” — because they already will, as long as nothing else is stopping them (like a noindex tag or password protection).

Adding a robots.txt file that says:

User-agent: *
Disallow:

…does the same as not having one at all. It’s like saying, “Hey bots, you can go wherever you like,” even though they already could. So unless you’ve got something specific you want to block, it’s perfectly fine — and even preferred — to just skip the robots.txt file entirely.
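You can see this "does nothing" behaviour for yourself with Python's standard-library parser: with an empty Disallow, every URL comes back as fetchable, exactly as if the file weren't there (example.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow:",  # empty value = nothing is blocked
]

rp = RobotFileParser()
rp.parse(rules)

# Every page is allowed, same as having no robots.txt at all
print(rp.can_fetch("Googlebot", "https://www.example.com/any-page.html"))  # True
```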

So Why Do So Many People Add One Anyway?

Habit, mostly. Or they’ve read online that “every website needs one,” which isn’t true. Some website platforms create one automatically even if there’s nothing in it. And sometimes people copy what they’ve seen elsewhere without fully understanding it.

The problem is, if you get it wrong, it can accidentally stop search engines from indexing your whole site. We’ve seen it happen more than once — a small business owner thinks they’re helping by adding a robots.txt file, but ends up adding something like:

User-agent: *
Disallow: /

That one little slash means "don't crawl anything on this site". Total disaster if your business relies on people finding you on Google! Believe it or not, I've seen this so many times in my 28-year SEO career that it's the first thing I check when a new client says they're not in the Google results.
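That disaster scenario is easy to demonstrate with Python's `urllib.robotparser`: with `Disallow: /` in place, every single URL on the (placeholder) domain is blocked from crawling:

```python
from urllib.robotparser import RobotFileParser

# The accidental "block everything" file
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# Even the homepage is off limits
print(rp.can_fetch("Googlebot", "https://www.example.com/"))         # False
print(rp.can_fetch("Googlebot", "https://www.example.com/contact"))  # False
```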

When Should a Small Business Use a robots.txt File?

Right, so while you don’t need a robots.txt file if you want everything to be indexable, there are a few cases where it might come in handy. For example:

  • You’ve got a staging site or testing area and don’t want it showing up in search.
  • There are folders with technical files (like scripts or backend tools) that aren’t useful to search engines.
  • You want to block bots that aren’t helpful (though this is hit-and-miss — most bad bots ignore robots.txt anyway and there are other ways to block these bots).
  • You’re working with developers who need it for specific technical reasons.

But for the average small business with a brochure-style website, online shop, or blog? You probably don’t need to worry about it.

What About the XML Sitemap Then?

Ah yes, the trusty sitemap. This one is worth having — and you can link to it in your robots.txt file if you’ve got one, but it’s not required. An XML sitemap is basically a list of all the pages on your site you want search engines to know about. It’s like handing Google a cheat sheet that says, “Here’s everything important — crawl this lot, please.”

A lot of website platforms (like WordPress, Wix, or Shopify) will create a sitemap for you automatically. You can usually find it at:

https://www.yourdomain.co.uk/sitemap.xml

If you do decide to use a robots.txt file, you can add a line like this (I prefer to have it as the first line in the file, but it can go at the bottom if you prefer – or in fact, anywhere in the file):

Sitemap: https://www.yourdomain.co.uk/sitemap.xml

That’s just giving search engines a helpful pointer to your sitemap. But again — they’ll probably find it anyway, especially if you’ve submitted it through Google Search Console. So it’s a nice touch, but not essential.
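As an aside, that Sitemap line is machine-readable too: Python's `urllib.robotparser` (3.8+) will pick it out for you, which is a handy way to confirm a file references the sitemap you expect (yourdomain.co.uk is a placeholder):

```python
from urllib.robotparser import RobotFileParser

rules = [
    "Sitemap: https://www.yourdomain.co.uk/sitemap.xml",
    "User-agent: *",
    "Disallow:",
]

rp = RobotFileParser()
rp.parse(rules)

# Returns the list of Sitemap URLs declared in the file
print(rp.site_maps())  # ['https://www.yourdomain.co.uk/sitemap.xml']
```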

Common Myths About robots.txt (And Why They’re Wrong)

Myth 1: Every Website Needs One

Nope! Like we’ve said — if you want all your pages indexed, the best thing you can do is not have one. No file = open access. That’s what you want most of the time.

Myth 2: It Stops Pages Appearing in Google

Here’s where it gets tricky. Just because you block a page from being crawled in robots.txt, that doesn’t mean it won’t appear in search. Google might still index it if there are links pointing to it. If you want to properly remove a page from search results, you’ll need a <meta name="robots" content="noindex"> tag on the page itself (and for that tag to be seen, the page must not be blocked in robots.txt), or remove the page entirely and send a 404 or 410 code to the search engines.

Myth 3: It Makes Your Website Faster

There’s a tiny grain of truth here, but it’s mostly fluff. Blocking some files from bots can reduce crawl activity slightly, but it’s not going to speed up your website in a way that your customers notice. For proper speed boosts, you’re better off looking at image compression, caching, and a good web host.

Should You Just Delete It?

If you’ve got a robots.txt file that doesn’t do anything — like one that’s just:

User-agent: *
Disallow:

…then yeah, feel free to delete it. It’s not doing anything useful, and it’s just one more thing that could go wrong if someone edits it by mistake. Less is more sometimes.

If you’ve got a developer who’s added one for a reason, have a quick chat with them first. But if you’re managing your own site and you’re not blocking anything on purpose, it’s completely safe — and smart — to leave it out.

Quick Recap: What You Need to Know

  • robots.txt tells search engines what not to crawl.
  • If you don’t have one, search engines will crawl everything they find — which is usually what you want.
  • Don’t use robots.txt to stop pages appearing in Google — that needs a noindex tag.
  • For small business sites, you probably don’t need a robots.txt file at all.
  • Make sure you have an XML sitemap — and submit it to Google Search Console.
  • Only use robots.txt if you’ve got a specific reason to block something.

Final Thoughts

Look, there’s enough to deal with when you’re running a small business website — you don’t need extra tech headaches. The good news is, you don’t need to overthink robots.txt. If you want your whole site to be visible and indexable (and let’s be honest, you probably do), then no robots.txt file is absolutely fine, although having an empty file may stop Google Search Console from complaining about the file being missing.

Focus on creating useful pages, writing good content, keeping your site speedy and mobile-friendly, and making sure your sitemap is up to date. That’s the stuff that really makes a difference.

And remember: sometimes doing nothing is the best option. Especially when it comes to robots.