The robots.txt file controls search engine crawling for a website, so it plays a critical role in the search engine optimization (SEO) of a Blogger blog. In this article, we’ll look at the best implementation of the robots.txt file for a Blogger blog.
What are the functions of the robots.txt file?
The robots.txt file tells search engines which pages they should and shouldn’t crawl. Hence it allows us to control the behavior of search engine bots.
In the robots.txt file, we declare User-agent, Allow, Disallow, and Sitemap rules for search engines like Google, Bing, Yandex, etc. Let’s understand the meaning of all these terms.
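As a quick reference, here is a minimal, hypothetical robots.txt that uses all four directives (the paths and domain are placeholders, not Blogger’s actual defaults):

```
User-agent: *        # which crawler the rules below apply to
Disallow: /private/  # a section crawlers should not fetch
Allow: /             # everything else may be crawled
Sitemap: https://www.example.com/sitemap.xml  # where the sitemap lives
```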
Usually, we use robots meta tags to let all search engine bots index blog posts and pages across the web. But if you want to save crawl budget and block search engine bots from some sections of the website, you have to understand the robots.txt file for the Blogger blog.
Analyze the default Robots.txt file of the Blogger Blog
To create a perfect custom robots.txt file for a Blogger (BlogSpot) blog, we first have to understand the structure of the Blogger blog and analyze the default robots.txt file.
By default, this file looks like this:
```
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```
- The first line of this file declares the bot type. Here it’s Mediapartners-Google (the Google AdSense bot), which is disallowed from nothing. That means AdSense ads can appear throughout the website.
- The next user-agent is *, which means all search engine bots are disallowed from the /search pages. That blocks all search and label pages (they share the same URL structure).
- The Allow: / rule defines that all pages other than the disallowed section are allowed to be crawled.
- The next line contains a post sitemap for the Blogger blog.
This file is almost perfect for controlling search engine bots and instructing them which pages to crawl and which not to. Please note, however, that allowing a page to be crawled does not guarantee that it will be indexed.
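We can sanity-check these default rules with Python’s built-in robots.txt parser. Note that `urllib.robotparser` only does simple prefix matching, so it works for this wildcard-free file but cannot evaluate Google-style `*` patterns. The URLs below are placeholders for illustration.

```python
from urllib import robotparser

# Blogger's default robots.txt, as described above.
DEFAULT_ROBOTS = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(DEFAULT_ROBOTS.splitlines())

# Search/label pages are blocked for ordinary crawlers...
print(rp.can_fetch("Googlebot", "https://www.example.com/search/label/SEO"))  # False
# ...but posts are allowed to be crawled,
print(rp.can_fetch("Googlebot", "https://www.example.com/2023/05/post.html"))  # True
# and the AdSense bot may fetch everything (empty Disallow).
print(rp.can_fetch("Mediapartners-Google", "https://www.example.com/search/label/SEO"))  # True
```

The same approach works against a live site: call `rp.set_url(...)` and `rp.read()` instead of `rp.parse(...)`.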
But this file allows the archive pages to be crawled and indexed, which can cause a duplicate content issue. In other words, it lets junk pages pile up for the Blogger blog.
Create a Perfect robots.txt File for the Blogger Blog
Now that we understand how the default robots.txt file works for the Blogger blog, let’s optimize it for the best SEO.
The default robots.txt allows the archive pages to be indexed, which causes the duplicate content issue. We can prevent this by stopping the bots from crawling the archive section. For this, we apply a Disallow rule for /20* in the robots.txt file, since Blogger archive URLs begin with the year. But post URLs begin the same way, so this rule would also stop the crawling of posts. To avoid that, we apply a new Allow rule for /*.html, which allows the bots to crawl posts and pages.
The default sitemap includes posts, not pages. So you have to add a sitemap for pages, located at https://example.blogspot.com/sitemap-pages.xml, or at https://www.example.com/sitemap-pages.xml for a custom domain.
So the new, optimized robots.txt file for the Blogger blog will look like this:
```
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search*
Disallow: /20*
Allow: /*.html

Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-pages.xml
```
You have to replace www.example.com with your Blogger domain or custom domain name. For example, if your custom domain name is www.iashindu.com, then the sitemap will be at https://www.iashindu.com/sitemap.xml. In addition, you can check the current robots.txt at https://www.example.com/robots.txt.
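To see how the wildcard rules above interact, here is a small sketch of Google-style robots.txt matching: `*` matches any run of characters, and among all matching rules the one with the longest pattern wins, with Allow winning ties. This is an illustrative toy, not a full parser, and the example paths are hypothetical.

```python
import re

# The User-agent: * rules from the optimized file above.
RULES = [
    ("disallow", "/search*"),
    ("disallow", "/20*"),
    ("allow", "/*.html"),
]

def _to_regex(pattern):
    # Translate a robots.txt pattern into a regex: '*' -> '.*',
    # everything else matched literally, anchored at the path start.
    return re.compile("^" + re.escape(pattern).replace(r"\*", ".*"))

def is_allowed(path):
    # Find the longest matching rule; Allow wins a length tie.
    best = None  # (pattern length, is_allow)
    for rule, pattern in RULES:
        if _to_regex(pattern).match(path):
            key = (len(pattern), rule == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]

print(is_allowed("/2023/05/my-post.html"))  # True: Allow /*.html (len 7) beats Disallow /20* (len 4)
print(is_allowed("/search/label/SEO"))      # False: only Disallow /search* matches
print(is_allowed("/p/about.html"))          # True: static pages end in .html
print(is_allowed("/2023/05/"))              # False: archive page, blocked by /20*
```

This is why posts stay crawlable even though archive URLs are blocked: the longer, more specific /*.html rule overrides /20* for any URL ending in .html.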
The file above is best practice for robots.txt as well as for SEO. It will save the website’s crawl budget and help the Blogger blog appear in the search results. Along with it, you have to write SEO-friendly content to rank in the search results.
How to edit the robots.txt file of the Blogger blog?
A robots.txt file must be located at the root level of a website. But Blogger gives no access to the root, so how do you edit this robots.txt file?
Blogger provides all root-level file settings, such as robots.txt and ads.txt, in its settings. You have to log in to your Blogger account and edit the robots.txt file there:
- Go to the Blogger Dashboard and click on the Settings option.
- Scroll down to the "Crawlers and indexing" section.
- Enable "Custom robots.txt" with the toggle switch.
- Click on "Custom robots.txt"; a window will open up. Paste the robots.txt file and update.
After updating the custom robots.txt file, check it by visiting https://www.example.com/robots.txt, where www.example.com should be replaced with your domain address.
We have now understood the function of the robots.txt file. Blogger users can set up the robots.txt file above for the best results.
In the default robots.txt file, the archive section is also allowed to be crawled, which causes duplicate content issues for the search engine. The search engine then gets confused about what to display in the search results and may not consider your pages at all.
This means robots rules are essential for the SEO of a website. You can consider combining both the robots.txt file and robots meta tags in the Blogger blog for the best results.
I hope you like this article. If you have any doubts or questions regarding Blogger or WordPress SEO, you can comment below.