{"id":2592,"date":"2025-08-05T13:48:18","date_gmt":"2025-08-05T12:48:18","guid":{"rendered":"https:\/\/www.forestsoftware.co.uk\/blog\/?p=2592"},"modified":"2025-08-05T10:06:24","modified_gmt":"2025-08-05T09:06:24","slug":"what-is-robots-txt-and-do-you-even-need-one-for-your-small-business-website","status":"publish","type":"post","link":"https:\/\/www.forestsoftware.co.uk\/blog\/2025\/08\/what-is-robots-txt-and-do-you-even-need-one-for-your-small-business-website\/","title":{"rendered":"What is robots.txt and Do You Even Need One for Your Small Business Website?"},"content":{"rendered":"<span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\">Reading Time: <\/span> <span class=\"rt-time\"> 7<\/span> <span class=\"rt-label rt-postfix\">minutes : <\/span><\/span><h1>What is robots.txt and Do You Even Need One for Your Small Business Website?<\/h1>\n<p>If you\u2019ve ever dived into the geeky side of running a website, chances are you\u2019ve come across something called <code>robots.txt<\/code>. Sounds fancy, right? Like something out of a sci-fi film. But really, it\u2019s just a small text file that can make a big difference to how search engines deal with your site. The thing is, there\u2019s a lot of confusion around what it actually does, whether you really need one, and how it works alongside other stuff like sitemaps.<\/p>\n<p>So, if you\u2019re running a small business website and trying to figure out if this is one of those things you need to stress about \u2014 good news: this blog is just for you. We\u2019re going to break it all down in simple terms, no tech jargon (well, only the important bits, and we\u2019ll explain those too), and help you understand when to use a robots.txt file \u2014 and when not to bother.<\/p>\n<p><!--more--><\/p>\n<h2>What Even Is a robots.txt File?<\/h2>\n<p>Alright, let\u2019s start with the basics. 
A <code>robots.txt<\/code> file is just a plain text file you put in the root of your website \u2014 that means the main folder where everything lives. It gives instructions to bots (mainly search engine bots like Googlebot) about what they can and can\u2019t look at on your website.<\/p>\n<p>Think of it like a polite sign on a door that says, \u201cOi, please don\u2019t go snooping in here.\u201d Bots don\u2019t have to obey it, but most good ones will. It\u2019s part of a thing called the Robots Exclusion Protocol. Sounds very official, doesn\u2019t it? But it\u2019s just a way of saying, \u201cThese are the rules for crawling my site.\u201d<\/p>\n<h2>What Does a robots.txt File Look Like?<\/h2>\n<p>Here\u2019s a super basic example of what you might find in a <code>robots.txt<\/code> file:<\/p>\n<pre>User-agent: *\r\nDisallow: \/private-folder\/\r\n<\/pre>\n<p>That asks all bots (<code>*<\/code> means \u201call of you\u201d) to stay out of the folder called \u201cprivate-folder\u201d. Easy peasy. 
You can get more specific, but that\u2019s the general idea \u2014 it\u2019s all about telling search engines which pages or folders they shouldn\u2019t poke around in.<\/p>\n<p>A more advanced file may look like this (note that the lines starting with # are comments and are ignored by bots, but are useful when you come back to the file in six months\u2019 time and wonder what those records are for):<\/p>\n<pre>#To block SemrushBot from crawling your site for different SEO and technical issues:\r\nUser-agent: SiteAuditBot\r\nDisallow: \/\r\n#To block SemrushBot from crawling your site for Backlink Audit tool:\r\nUser-agent: SemrushBot-BA\r\nDisallow: \/\r\n#To block SemrushBot from crawling your site for On Page SEO Checker tool and similar tools:\r\nUser-agent: SemrushBot-SI\r\nDisallow: \/\r\n#To block SemrushBot from checking URLs on your site for SWA tool:\r\nUser-agent: SemrushBot-SWA\r\nDisallow: \/\r\n#To block SplitSignalBot from crawling your site for SplitSignal tool:\r\nUser-agent: SplitSignalBot\r\nDisallow: \/\r\n#To block SemrushBot-OCOB from crawling your site for ContentShake AI tool:\r\nUser-agent: SemrushBot-OCOB\r\nDisallow: \/\r\n#To block SemrushBot-FT from crawling your site for Plagiarism Checker and similar tools:\r\nUser-agent: SemrushBot-FT\r\nDisallow: \/\r\n#To block the main SemrushBot from crawling your site:\r\nUser-agent: SemrushBot\r\nDisallow: \/<\/pre>\n<p>And for some bots you can ask them to slow down their visits if they are impacting your server, using a record like this (as with the lines above, anything after the # is treated as a comment and ignored by bots):<\/p>\n<pre>User-Agent: FacebookBot\r\nCrawl-delay: 10              # 1 page per 10 seconds<\/pre>\n<p>If you do need to do this, check that the bot actually honours the Crawl-delay request, as not all of them do (Googlebot, for example, ignores it).<\/p>\n<h2>Why Would You Want to Stop Search Engines Crawling Your Site?<\/h2>\n<p>So here\u2019s the thing: not everything on your site needs to be indexed by Google. 
Maybe you\u2019ve got some pages just for your team, like a staff rota or an admin login. Maybe you\u2019ve got old product pages you don\u2019t want to show up in search anymore. Or maybe you\u2019re running a little experiment behind the scenes and don\u2019t want search engines finding it just yet.<\/p>\n<p>Using <code>robots.txt<\/code>, you can tell bots, \u201cHey, leave this stuff alone.\u201d It doesn\u2019t delete anything, and it doesn\u2019t block people from seeing it \u2014 it just asks bots to ignore it.<\/p>\n<h2>But Here\u2019s the Big One: You Do <em>Not<\/em> Need a robots.txt File Just to Say \u201cIndex Everything\u201d<\/h2>\n<p>This is a biggie, and it trips up a lot of people. If you don\u2019t have a <code>robots.txt<\/code> file at all, do you know what happens? Search engines will just assume they\u2019re allowed to crawl and index everything. That\u2019s the default behaviour.<\/p>\n<p>You don\u2019t need to write a file that says \u201cYes, index all my pages\u201d \u2014 because they already will, as long as nothing else is stopping them (like a noindex tag or password protection).<\/p>\n<p>Adding a <code>robots.txt<\/code> file that says:<\/p>\n<pre>User-agent: *\r\nDisallow:\r\n<\/pre>\n<p>\u2026does the same as not having one at all. It\u2019s like saying, \u201cHey bots, you can go wherever you like,\u201d even though they already could. So unless you\u2019ve got something specific you want to block, it\u2019s perfectly fine \u2014 and even preferred \u2014 to just skip the robots.txt file entirely.<\/p>\n<h2>So Why Do So Many People Add One Anyway?<\/h2>\n<p>Habit, mostly. Or they\u2019ve read online that \u201cevery website needs one,\u201d which isn\u2019t true. Some website platforms create one automatically even if there\u2019s nothing in it. 
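<\/p>\n<p>WordPress, for example, serves up a \u201cvirtual\u201d robots.txt by default if you\u2019ve never created one yourself. It typically looks something like this (the exact contents can vary between versions and plugins, so treat this as an illustration rather than gospel):<\/p>\n<pre>User-agent: *\r\nDisallow: \/wp-admin\/\r\nAllow: \/wp-admin\/admin-ajax.php<\/pre>\n<p>That keeps bots out of the admin area while still letting them reach the one admin file that some themes and plugins use to load content. Harmless enough, and a good example of a file that does one small, specific job.<\/p>\n<p>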
And sometimes people copy what they\u2019ve seen elsewhere without fully understanding it.<\/p>\n<p>The problem is, if you get it wrong, it can accidentally stop search engines from indexing your whole site. We\u2019ve seen it happen more than once \u2014 a small business owner thinks they\u2019re helping by adding a <code>robots.txt<\/code> file, but ends up adding something like:<\/p>\n<pre>User-agent: *\r\nDisallow: \/\r\n<\/pre>\n<p>That one little slash means \u201cdon\u2019t crawl anything on this site.\u201d Total disaster if your business relies on people finding you on Google! Believe it or not, I\u2019ve seen this so many times in my 28-year SEO career that it\u2019s the first thing I check when a new client says that they are not in the Google results.<\/p>\n<h2>When Should a Small Business Use a robots.txt File?<\/h2>\n<p>Right, so while you don\u2019t <em>need<\/em> a robots.txt file if you want everything to be indexable, there are a few cases where it might come in handy. For example:<\/p>\n<ul>\n<li>You\u2019ve got a staging site or testing area and don\u2019t want it showing up in search.<\/li>\n<li>There are folders with technical files (like scripts or backend tools) that aren\u2019t useful to search engines.<\/li>\n<li>You want to block bots that aren\u2019t helpful (though this is hit-and-miss \u2014 most bad bots ignore <code>robots.txt<\/code> anyway and there are other ways to block these bots).<\/li>\n<li>You\u2019re working with developers who need it for specific technical reasons.<\/li>\n<\/ul>\n<p>But for the average small business with a brochure-style website, online shop, or blog? You probably don\u2019t need to worry about it.<\/p>\n<h2>What About the XML Sitemap Then?<\/h2>\n<p>Ah yes, the trusty sitemap. This one <em>is<\/em> worth having \u2014 and you can link to it in your <code>robots.txt<\/code> file if you\u2019ve got one, but it\u2019s not required. 
An XML sitemap is basically a list of all the pages on your site you want search engines to know about. It\u2019s like handing Google a cheat sheet that says, \u201cHere\u2019s everything important \u2014 crawl this lot, please.\u201d<\/p>\n<p>A lot of website platforms (like WordPress, Wix, or Shopify) will create a sitemap for you automatically. You can usually find it at:<\/p>\n<pre>https:\/\/www.yourdomain.co.uk\/sitemap.xml\r\n<\/pre>\n<p>If you do decide to use a <code>robots.txt<\/code> file, you can add a line like this (I prefer to have it as the first line in the file, but it can go at the bottom if you prefer &#8211; or in fact, anywhere in the file) :<\/p>\n<pre>Sitemap: https:\/\/www.yourdomain.co.uk\/sitemap.xml\r\n<\/pre>\n<p>That\u2019s just giving search engines a helpful pointer to your sitemap. But again \u2014 they\u2019ll probably find it anyway, especially if you\u2019ve submitted it through Google Search Console. So it\u2019s a nice touch, but not essential.<\/p>\n<h2>Common Myths About robots.txt (And Why They\u2019re Wrong)<\/h2>\n<h3>Myth 1: Every Website Needs One<\/h3>\n<p>Nope! Like we\u2019ve said \u2014 if you want all your pages indexed, the best thing you can do is <strong>not<\/strong> have one. No file = open access. That\u2019s what you want most of the time.<\/p>\n<h3>Myth 2: It Stops Pages Appearing in Google<\/h3>\n<p>Here\u2019s where it gets tricky. Just because you block a page from being crawled in <code>robots.txt<\/code>, that doesn\u2019t mean it won\u2019t appear in search. Google might still index it if there are links pointing to it. 
If you want to properly remove a page from search results, you\u2019ll need a <code>&lt;meta name=\"robots\" content=\"noindex\"&gt;<\/code> tag on the page itself, or remove the page entirely and send a <a href=\"https:\/\/www.forestsoftware.co.uk\/blog\/2025\/01\/understanding-html-error-codes-a-beginners-guide\/\">404 or 410 code<\/a> to the search engines. (And don\u2019t block that page in <code>robots.txt<\/code> at the same time, or the bots will never crawl it and see the noindex tag.)<\/p>\n<h3>Myth 3: It Makes Your Website Faster<\/h3>\n<p>There\u2019s a tiny grain of truth here, but it\u2019s mostly fluff. Blocking some files from bots can reduce crawl activity slightly, but it\u2019s not going to speed up your website in a way that your customers notice. For proper speed boosts, you\u2019re better off looking at image compression, caching, and a good web host.<\/p>\n<h2>Should You Just Delete It?<\/h2>\n<p>If you\u2019ve got a robots.txt file that doesn\u2019t do anything \u2014 like one that\u2019s just:<\/p>\n<pre>User-agent: *\r\nDisallow:\r\n<\/pre>\n<p>\u2026then yeah, feel free to delete it. It\u2019s not doing anything useful, and it\u2019s just one more thing that could go wrong if someone edits it by mistake. Less is more sometimes.<\/p>\n<p>If you\u2019ve got a developer who\u2019s added one for a reason, have a quick chat with them first. 
But if you\u2019re managing your own site and you\u2019re not blocking anything on purpose, it\u2019s completely safe \u2014 and smart \u2014 to leave it out.<\/p>\n<h2>Quick Recap: What You Need to Know<\/h2>\n<ul>\n<li><code>robots.txt<\/code> tells search engines what not to crawl.<\/li>\n<li>If you don\u2019t have one, search engines will crawl everything they find \u2014 which is usually what you want.<\/li>\n<li>Don\u2019t use <code>robots.txt<\/code> to stop pages appearing in Google \u2014 that needs a noindex tag.<\/li>\n<li>For small business sites, you probably don\u2019t need a <code>robots.txt<\/code> file at all.<\/li>\n<li>Make sure you have an XML sitemap \u2014 and submit it to Google Search Console.<\/li>\n<li>Only use <code>robots.txt<\/code> if you\u2019ve got a specific reason to block something.<\/li>\n<\/ul>\n<h2>Final Thoughts<\/h2>\n<p>Look, there\u2019s enough to deal with when you\u2019re running a small business website \u2014 you don\u2019t need extra tech headaches. The good news is, you don\u2019t need to overthink <code>robots.txt<\/code>. If you want your whole site to be visible and indexable (and let\u2019s be honest, you probably do), then no <code>robots.txt<\/code> file is absolutely fine, although having an empty file may stop Google Search Console from complaining about the file being missing.<\/p>\n<p>Focus on creating useful pages, writing good content, keeping your site speedy and mobile-friendly, and making sure your sitemap is up to date. That\u2019s the stuff that really makes a difference.<\/p>\n<p>And remember: sometimes doing nothing is the best option. 
Especially when it comes to robots.<\/p>\n","protected":false},"excerpt":{"rendered":"<p><span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\">Reading Time: <\/span> <span class=\"rt-time\"> 7<\/span> <span class=\"rt-label rt-postfix\">minutes : <\/span><\/span>What is robots.txt and Do You Even Need One for Your Small Business Website? If you\u2019ve ever dived into the geeky side of running a website, chances are you\u2019ve come across something called robots.txt. Sounds fancy, right? Like something out of a sci-fi film. But really, it\u2019s just a small text file that can make [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,3],"tags":[],"class_list":["post-2592","post","type-post","status-publish","format-standard","hentry","category-business-advice","category-seo"],"_links":{"self":[{"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/posts\/2592","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/comments?post=2592"}],"version-history":[{"count":0,"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/posts\/2592\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/media?parent=2592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.forestsoftware.co.uk\/blog\/wp-json\/wp\/v2\/categories?post=2592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.forestsoftware.co.uk\/blog\
/wp-json\/wp\/v2\/tags?post=2592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}