The robots.txt file controls how search engines crawl a website, so it plays a critical role in the search engine optimization (SEO) of a Blogger blog. In this article, we'll look at the best way to set up the robots.txt file for a Blogger blog.
What is the function of the robots.txt file?
With the help of the robots.txt file, we tell search engines which pages they should and shouldn't crawl. It therefore lets us control how search engine bots move through the site.
In the robots.txt file, we use the User-agent, Allow, Disallow, and Sitemap directives to declare, respectively, which bots a group of rules applies to, which paths they may crawl, which paths they may not crawl, and where the sitemap is located.
Usually, the same rules are applied to all search engine bots. To go beyond that, you have to understand how the robots.txt file works on a Blogger blog specifically.
The Best Robots.txt file for the Blogger Blog
To create a perfect custom robots.txt file for BlogSpot, we first have to understand how a Blogger blog is structured. For this, let's analyze the default robots.txt file.
By default, this file looks like this:
    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search
    Allow: /

    Sitemap: https://www.example.com/sitemap.xml
- The first line declares which bot the following rules apply to. Here it's Mediapartners-Google, the Google AdSense crawler, and its Disallow value is empty, so nothing is blocked for it. That means AdSense ads can appear throughout the website.
- The next user agent is *, which matches all other bots. They are disallowed from crawling /search pages, which covers both search result pages and label pages, since the two share the same URL structure.
- The Allow: / rule states that every page outside the disallowed section may be crawled.
- The last line gives the location of the post sitemap for the Blogger blog.
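To make the structure of the default file concrete, here is a minimal Python sketch that groups the rules by user agent and collects the sitemap lines. The function name parse_robots and the constant DEFAULT_ROBOTS are made up for this illustration, and the parser is deliberately simplified; it is not how any particular search engine reads the file.

```python
def parse_robots(text):
    """Group Allow/Disallow rules by user agent and collect Sitemap URLs."""
    groups = {}            # user agent -> list of (directive, path)
    sitemaps = []
    current_agents = []
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                current_agents = []           # a new group starts here
            current_agents.append(value)
            groups.setdefault(value, [])
            last_was_agent = True
            continue
        last_was_agent = False
        if field in ("allow", "disallow"):
            for agent in current_agents:
                groups[agent].append((field, value))
        elif field == "sitemap":
            sitemaps.append(value)
    return groups, sitemaps


# The Blogger default file, with the placeholder domain used in this article.
DEFAULT_ROBOTS = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

groups, sitemaps = parse_robots(DEFAULT_ROBOTS)
for agent, rules in groups.items():
    print(agent, rules)
print(sitemaps)
```

The empty path in the Mediapartners-Google group is what "disallowed to none" looks like once parsed: a Disallow rule that blocks nothing.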
The default file is almost perfect for controlling search engine bots and telling them which pages to crawl and which to skip. Note that allowing a page to be crawled does not guarantee that it will be indexed.
However, this file also lets bots crawl the archive pages, which can cause a duplicate content issue. That means it will create junk pages for the Blogger blog.
We have to prevent this duplicate content issue caused by the archive section. That can be achieved by stopping the bots from crawling the archive section. For this, we add a Disallow rule for /20* to the robots.txt file. On its own, though, this rule would also stop the crawling of posts, because Blogger post URLs begin with the year as well. To avoid this, we add an Allow rule for /*.html, which lets the bots crawl posts and pages.
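To see why the Allow rule keeps the posts crawlable, here is a minimal Python sketch of the longest-match evaluation described in Google's robots.txt documentation and in RFC 9309: among the rules that match a path, the one with the longest pattern wins, and Allow wins a tie. The helper names (pattern_to_regex, is_allowed) and the sample paths are assumptions made for this illustration, and real crawlers may differ in the details.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:                      # an empty Disallow matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# Rules of the "User-agent: *" group discussed above.
RULES = [
    ("disallow", "/search*"),
    ("disallow", "/20*"),
    ("allow", "/*.html"),
]

# Hypothetical paths: a post, an archive page, a static page, a label page, the home page.
for path in ["/2023/05/my-post.html", "/2023/05/", "/p/about.html", "/search/label/SEO", "/"]:
    print(path, is_allowed(path, RULES))
```

Under these assumptions, the post and the static page come out as allowed because /*.html is a longer pattern than /20*, while the archive and label paths stay blocked.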
The default sitemap includes posts but not static pages. So you also have to add the sitemap for pages, which is located at https://example.blogspot.com/sitemap-pages.xml, or at https://www.example.com/sitemap-pages.xml for a custom domain.
So the new perfect robots.txt file for the Blogger blog will look like this.
    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search*
    Disallow: /20*
    Allow: /*.html

    Sitemap: https://www.example.com/sitemap.xml
    Sitemap: https://www.example.com/sitemap-pages.xml
Please replace www.example.com with your Blogger domain or custom domain name. For example, if your custom domain name is www.iashindu.com, the sitemap will be at https://www.iashindu.com/sitemap.xml. You can also check the current robots.txt file at https://www.example.com/robots.txt.
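If you manage more than one blog, the domain replacement can be scripted. The following Python sketch fills the domain into the file described above; the function name build_robots_txt is hypothetical, and it assumes you pass the host name without the https:// prefix.

```python
def build_robots_txt(domain):
    """Return the custom robots.txt text for the given host name (no scheme)."""
    base = f"https://{domain}"
    lines = [
        "User-agent: Mediapartners-Google",
        "Disallow:",
        "",
        "User-agent: *",
        "Disallow: /search*",
        "Disallow: /20*",
        "Allow: /*.html",
        "",
        f"Sitemap: {base}/sitemap.xml",
        f"Sitemap: {base}/sitemap-pages.xml",
    ]
    return "\n".join(lines) + "\n"


# Example with the custom domain mentioned above.
print(build_robots_txt("www.iashindu.com"))
```

The printed text can then be pasted into Blogger's custom robots.txt setting.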
This robots.txt setup follows best practice and is good for SEO. It saves the crawl budget of the website and helps the Blogger blog appear in the search results. Along with this, you have to write SEO-friendly content to appear in the search results.
How to edit the robots.txt file of the Blogger blog?
The robots.txt file is always located at the root level of a website. But Blogger gives no access to the root directory, so how do you edit this file?
Blogger exposes root-level files such as robots.txt and ads.txt through its settings. Log in to your Blogger account and edit the robots.txt file as follows:
- Go to the Blogger dashboard and click on the Settings option.
- Scroll down to the Crawlers and indexing section.
- Enable custom robots.txt with the toggle switch.
- Click on Custom robots.txt. A window will open; paste the robots.txt file into it and save.
After updating the custom robots.txt file, check it by visiting https://www.example.com/robots.txt, where www.example.com should be replaced with your domain address.
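The same check can be done from a script. This is a rough Python sketch that downloads the live file and reports which of the expected directives are missing; the names EXPECTED, fetch_robots, and missing_lines are made up for this example, and the network call is left as a comment so the block also runs offline.

```python
import urllib.request

# Directives the custom file from this article should contain.
EXPECTED = [
    "Disallow: /search*",
    "Disallow: /20*",
    "Allow: /*.html",
]


def fetch_robots(domain):
    """Download https://<domain>/robots.txt and return it as text."""
    with urllib.request.urlopen(f"https://{domain}/robots.txt", timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def missing_lines(robots_text, expected=EXPECTED):
    """Return the expected directives that do not appear in the given file."""
    present = {line.strip() for line in robots_text.splitlines()}
    return [line for line in expected if line not in present]


# Offline demonstration with a file that still has the Blogger default rules.
sample = "User-agent: *\nDisallow: /search\nAllow: /\n"
print(missing_lines(sample))

# With network access, replace the domain with your own:
# print(missing_lines(fetch_robots("www.example.com")))
```

An empty list means every expected directive was found in the file.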
I hope you liked this article. If you have any doubts or questions regarding Blogger or WordPress SEO, you can comment below.