Every search engine bot first interacts with a website’s robots.txt file to read its crawling rules. That means robots.txt plays a critical role in the search engine optimization (SEO) of a Blogger blog. This article explains how you can create a perfect custom robots.txt file for Blogger.
What are the functions of the robots.txt file?
The robots.txt file tells search engines which pages should and shouldn’t be crawled. Hence, it allows us to control how search engine bots behave on the site.
In the robots.txt file, we declare User-agent, Allow, Disallow, and Sitemap directives for search engines like Google, Bing, Yandex, etc. Let’s understand the meaning of all these terms.
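For example, a minimal robots.txt file using all four directives could look like this (the paths and the sitemap URL here are placeholders, not Blogger defaults):

```
# Apply the rules below to every crawler
User-agent: *
# Block everything under /private/
Disallow: /private/
# But still allow this one page inside it
Allow: /private/public-page.html
# Point crawlers at the sitemap
Sitemap: https://www.example.com/sitemap.xml
```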
Usually, we use robots meta tags to index or noindex individual blog posts and pages, and robots.txt to control the search engine bots. You can allow the complete website to be crawled, but that will exhaust the crawl budget of the website. To save the crawl budget, you have to block the archive and label sections of the website.
Analyze the default Robots.txt file of the Blogger Blog
To create a perfect custom robots.txt file for the Blogger blog, we first have to understand the Blogger blog structure and analyze the default robots.txt file.
By default, this file looks like this:
```
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```
- The first line (User-agent) of this file declares the bot type. Here it’s Mediapartners-Google, the Google AdSense bot, which is disallowed from nothing (the empty Disallow on the second line). That means AdSense ads can appear throughout the website.
- The following user agent is *, which means all search engine bots are disallowed from the /search pages. Since search and label pages share the same URL structure, this blocks both.
- The Allow: / rule states that all pages other than the disallowed section are allowed to be crawled.
- The last line contains the post sitemap of the Blogger blog.
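You can check how these default rules behave with Python’s standard `urllib.robotparser` module (the rules below are copied from the default file; the sample paths are hypothetical). Note that this standard parser does not understand the `*` wildcard inside paths, so it is only suitable for checking the default file, not the wildcard rules added later:

```python
import urllib.robotparser

# The default Blogger rules (sitemap line omitted; it does not affect matching)
DEFAULT_RULES = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(DEFAULT_RULES.splitlines())

# Search/label pages are blocked for ordinary crawlers...
print(rp.can_fetch("*", "/search/label/SEO"))           # False
# ...but posts are allowed...
print(rp.can_fetch("*", "/2023/05/my-post.html"))       # True
# ...and the AdSense bot may fetch everything, even /search.
print(rp.can_fetch("Mediapartners-Google", "/search"))  # True
```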
This is an almost perfect file for controlling search engine bots and instructing them which pages to crawl or not crawl. Please note that allowing a page to be crawled does not ensure that the page will be indexed.
But this file allows the archive pages to be crawled and indexed, which can cause a duplicate content issue. That means it will create junk results for the Blogger blog.
Create a Perfect custom robots.txt file for the Blogger Blog
We now understand how the default robots.txt file performs its function for the Blogger blog. Let’s optimize it for the best SEO.
The default robots.txt allows the archive section to be indexed, which causes the duplicate content issue. We can prevent this by stopping the bots from crawling the archive section. For this:
- A Disallow rule /search* will disable crawling of all search and label pages.
- Apply a Disallow rule /20* to stop the crawling of the archive section (Blogger archive URLs start with the year, e.g., /2023/05/).
- The /20* rule will also block the crawling of all posts, because post URLs share the same year-based prefix. To avoid this, we have to apply a new Allow rule for /*.html, which allows the bots to crawl posts and pages.
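Google treats `*` in these rules as “any sequence of characters”, and when both an Allow and a Disallow rule match a URL, the longer (more specific) rule wins. A minimal sketch of that matching logic (the `rule_matches` helper and the sample paths are hypothetical, not part of any library):

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Google-style robots.txt matching: '*' matches any characters,
    '$' anchors the end of the URL; patterns match from the start."""
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\$", "$")
    return re.match(regex, path) is not None

# Disallow: /search* blocks search and label pages.
print(rule_matches("/search*", "/search/label/SEO"))     # True
# Disallow: /20* blocks year-based archive URLs...
print(rule_matches("/20*", "/2023/05/"))                 # True
# ...and would also match posts, but Allow: /*.html matches them too,
# and as the longer rule it wins, so posts stay crawlable.
print(rule_matches("/*.html", "/2023/05/my-post.html"))  # True
```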
The default sitemap includes posts only, not pages. So you have to add a sitemap for pages, located at https://example.blogspot.com/sitemap-pages.xml, or at https://www.example.com/sitemap-pages.xml for a custom domain. You can also submit the Blogger sitemaps to Google Search Console for good results.
So the new, perfect custom robots.txt file for the Blogger blog will look like this:
```
User-agent: Mediapartners-Google
Disallow:

# The lines below control all search engines: they block all search
# and archive pages and allow all blog posts and pages.
User-agent: *
Disallow: /search*
Disallow: /20*
Allow: /*.html

# Sitemaps of the blog
Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-pages.xml
```
You have to replace www.example.com with your Blogger subdomain or custom domain name. For example, suppose your custom domain name is www.iashindu.com; then the sitemap will be at https://www.iashindu.com/sitemap.xml. In addition, you can check your current robots.txt at https://www.example.com/robots.txt.
The file above is the best robots.txt practice for SEO. It saves the website’s crawl budget and helps the Blogger blog appear in the search results. Of course, you still have to write SEO-friendly content to rank in the search results.
But if you want to allow bots to crawl the complete website, the best possible setting is to combine an advanced robots meta tag configuration with the robots.txt file. That combination is one of the best practices to boost the SEO of a Blogger blog.
How to implement the custom robots.txt file to Blogger?
The robots.txt file is located at the root level of the website, and in Blogger there is no direct access to the root. So how do you edit this robots.txt file?
You can control root-level behavior such as robots.txt and the custom robots header tags from the Settings section of Blogger.
- Go to the Blogger dashboard and click on the Settings option.
- Scroll down to the Crawlers and indexing section.
- Enable Custom robots.txt with the toggle button.
- Click on Custom robots.txt; a window will open. Paste the robots.txt file and update.
After updating the custom robots.txt file for the Blogger blog, check it by visiting https://www.example.com/robots.txt, where www.example.com is replaced with your domain address.
We have now understood the function of the robots.txt file and created a perfect custom robots.txt file for the Blogger blog.
In the default robots.txt file, the archive section is also allowed to be crawled, which causes duplicate content issues for the search engine: it gets confused about which version to display in the search results and which to leave out. In such a case, Google may not consider any of the pages for the search results.
Robots rules are essential for the SEO of a website. If you don’t want to block any section from crawling, you can combine robots.txt with robots meta tags on the Blogger blog. Alternatively, download a responsive, SEO-friendly template for the Blogger blog.
I hope you like this article. You can comment below if you have any doubts or questions regarding Blogger or WordPress SEO.