Introduction:
Google Search Console (GSC) is an essential tool for any website owner or manager. It shows how your site performs in Google Search, including which queries bring impressions and clicks, how your pages rank, and whether Google can crawl and index them. However, it also reports issues that need to be fixed, such as the status “Indexed, though blocked by robots.txt.”
In this post, we’ll explain what this error means, why it’s important to fix it, and provide a step-by-step guide on how to do so. We’ll also discuss some common causes of this error and share examples of websites that have successfully fixed this issue.
What is the “indexed, though blocked by robots.txt” error in GSC?
When Google crawls your website, it follows your robots.txt file to determine which pages and files it is allowed to crawl. A mistake in this file can result in a “Blocked by robots.txt” status in GSC.
However, you might also find a message saying “Indexed, though blocked by robots.txt.” This status means Google has indexed a page even though your robots.txt file blocks it from being crawled. That can happen because robots.txt only controls crawling, not indexing: if other pages link to a blocked URL, Google may index that URL anyway, without being able to read its content.
This is a serious issue for pages you actually want ranking, because Google can only show a bare listing with little or no description, and your site’s visibility and traffic can suffer as a result.
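To make the mechanics concrete, here is a minimal sketch using Python’s built-in urllib.robotparser module. The domain example.com, the /private/ directory, and the rules are hypothetical stand-ins for your own site.

```python
# A minimal sketch of how a robots.txt rule leads to this status.
# The domain, paths, and rules below are hypothetical.
from urllib import robotparser

# Suppose your live robots.txt contains:
#
#   User-agent: *
#   Disallow: /private/
#
# Googlebot may not crawl anything under /private/, but if another site
# links to /private/report.html, Google can still index the bare URL,
# which is exactly what "Indexed, though blocked by robots.txt" reports.

parser = robotparser.RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(parser.can_fetch("Googlebot", "https://example.com/private/report.html"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/blog/post.html"))       # True
```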
Why is it important to fix this error?
If you’re seeing the “Indexed, though blocked by robots.txt” error in GSC, it’s important to fix it as soon as possible. Doing so ensures that the pages you want in search results can actually be crawled and indexed, which can lead to higher rankings, increased visibility, and more traffic to your site.
Fixing the error also improves the experience for searchers: once Google can crawl a page, its listing can show a proper title and description instead of a bare URL with no snippet.
How to fix “indexed, though blocked by robots.txt” error in GSC:
Now that you understand what the “indexed, though blocked by robots.txt” error is and why it’s important to fix it, let’s walk through how to resolve the issue.
Step 1: Identify the affected URLs
The first step is to identify the URLs that are being blocked by the robots.txt file. In GSC, open the “Pages” indexing report (called “Coverage” in older versions of GSC) and filter the results by the “Indexed, though blocked by robots.txt” status.
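If there are more than a handful of affected URLs, exporting the issue details and loading them into a script makes the later steps easier. A small sketch, assuming you exported the report to a CSV file named blocked-but-indexed.csv with a URL column; both the filename and the column name are assumptions, so adjust them to match your export.

```python
# A quick sketch for collecting the affected URLs from a GSC export.
# Assumes a CSV export with a "URL" column; adjust the filename and
# column name to whatever your export actually contains.
import csv

def load_affected_urls(path: str, column: str = "URL") -> list[str]:
    with open(path, newline="", encoding="utf-8") as fh:
        return [row[column] for row in csv.DictReader(fh) if row.get(column)]

affected_urls = load_affected_urls("blocked-but-indexed.csv")
print(f"{len(affected_urls)} URLs to review")
```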
Once you have identified the affected URLs, you can move on to the next step.
Step 2: Check your robots.txt file
The next step is to check your robots.txt file for rules that are blocking the affected URLs. You can use the robots.txt report in GSC (or the legacy robots.txt Tester, if your account still offers it) to confirm that Google can fetch and parse your file.
You can also manually check your robots.txt file by going to your site’s domain and adding “/robots.txt” to the end of the URL (e.g., example.com/robots.txt).
If you find problems in your robots.txt file, such as a Disallow rule that is broader than you intended, incorrect syntax, or improper formatting, be sure to fix them before moving on to the next step.
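If you prefer to check the live file from a script, here is a rough sketch that fetches it and flags lines that do not look like comments, blank lines, or common directives. It is only a heuristic, and example.com is a placeholder for your own domain.

```python
# A small sketch for pulling down your live robots.txt and flagging
# lines that don't look like standard directives. This is a rough
# sanity check, not a full validator.
from urllib.request import urlopen

KNOWN_DIRECTIVES = ("user-agent:", "disallow:", "allow:", "sitemap:", "crawl-delay:")

with urlopen("https://example.com/robots.txt") as resp:
    lines = resp.read().decode("utf-8").splitlines()

for number, line in enumerate(lines, start=1):
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        continue  # blank lines and comments are fine
    if not stripped.lower().startswith(KNOWN_DIRECTIVES):
        print(f"Line {number} looks suspicious: {stripped!r}")
```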
Step 3: Test the affected URLs in GSC
After fixing your robots.txt file, you need to test the affected URLs again in GSC to ensure that they are no longer being blocked.
In the Pages report, open the “Indexed, though blocked by robots.txt” issue, then inspect each affected URL with the URL Inspection tool and run “Test Live URL.” This checks the live page against your updated robots.txt rules; it does not by itself make Google recrawl or reindex the page.
If the page is no longer being blocked, congratulations! You’ve successfully fixed the “indexed, though blocked by robots.txt” error. However, if the page is still being blocked, you’ll need to investigate further to identify the root cause of the issue.
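As a complement to the GSC tools, you can re-test the affected URLs locally against your live robots.txt file, which roughly mirrors what the tester does. A sketch, again using urllib.robotparser with placeholder URLs; Googlebot is used as the user agent since that is the crawler GSC reports on.

```python
# A local re-check of the affected URLs against the live robots.txt.
# example.com is a placeholder; affected_urls could come from the CSV
# loader sketched in Step 1.
from urllib import robotparser

parser = robotparser.RobotFileParser("https://example.com/robots.txt")
parser.read()  # fetches and parses the live file

affected_urls = [
    "https://example.com/private/report.html",
    "https://example.com/blog/post.html",
]

for url in affected_urls:
    status = "allowed" if parser.can_fetch("Googlebot", url) else "still blocked"
    print(f"{status}: {url}")
```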
Step 4: Address any remaining issues
If the affected URLs are still being blocked despite fixing your robots.txt file, something else may be keeping them blocked or unavailable. A few common culprits:
- A CDN, proxy, or staging setup serving a different robots.txt file than the one you edited
- Incorrect HTTP headers (for example, an X-Robots-Tag header)
- Broken links or long redirect chains
- Pages that require authentication (these often return 401 or 403 responses)
To troubleshoot these issues, review your site’s server logs, check your website’s source code, and use crawling tools such as Screaming Frog or DeepCrawl; the sketch below shows a quick way to check status codes, redirects, and headers for the affected URLs.
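A minimal sketch of that check, using the third-party requests library (pip install requests). The URL list is a placeholder for the URLs you collected in Step 1.

```python
# A sketch for spotting the culprits listed above: redirect chains,
# authentication walls, and X-Robots-Tag headers.
import requests

affected_urls = [
    "https://example.com/private/report.html",
]

for url in affected_urls:
    resp = requests.head(url, allow_redirects=True, timeout=10)
    print(url)
    print(f"  final URL:    {resp.url}")
    print(f"  status code:  {resp.status_code}")           # 401/403 suggests authentication
    print(f"  redirects:    {len(resp.history)}")           # long chains are worth fixing
    print(f"  X-Robots-Tag: {resp.headers.get('X-Robots-Tag', 'not set')}")
```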
Step 5: Request re-indexing of affected URLs
After addressing any remaining issues, it’s time to ask Google to recrawl the affected URLs. For individual pages, inspect each URL with the URL Inspection tool and click “Request Indexing.” For the whole group, open the “Indexed, though blocked by robots.txt” issue in the Pages report and click “Validate Fix” so Google rechecks all of the affected URLs. Either way, Google will recrawl the pages now that they are no longer being blocked.
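Requesting indexing itself is a manual action in the GSC interface; there is no public write API for it. If you want to keep an eye on a large batch of URLs afterwards, the read-only URL Inspection API can report whether Google still sees them as blocked. A hedged sketch, assuming the google-api-python-client package and OAuth credentials already authorized for Search Console; the method chain and response field names reflect my reading of the API and are worth double-checking against Google’s reference documentation.

```python
# A hedged sketch for monitoring affected URLs with the read-only
# URL Inspection API after requesting indexing in the GSC UI.
# Assumes google-api-python-client is installed and `creds` holds OAuth
# credentials with Search Console access; verify the field names against
# Google's API reference before relying on them.
from googleapiclient.discovery import build

def robots_state(service, site_url: str, page_url: str) -> str:
    """Return the robots.txt state GSC reports for page_url (e.g. ALLOWED, DISALLOWED)."""
    body = {"inspectionUrl": page_url, "siteUrl": site_url}
    result = service.urlInspection().index().inspect(body=body).execute()
    return result["inspectionResult"]["indexStatusResult"].get("robotsTxtState", "UNKNOWN")

# Usage (placeholder property and URL):
# service = build("searchconsole", "v1", credentials=creds)
# print(robots_state(service, "https://example.com/", "https://example.com/private/report.html"))
```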
Example Websites That Fixed “indexed, though blocked by robots.txt” Error
Here are two examples of websites that have successfully fixed the “indexed, though blocked by robots.txt” error in GSC:
Example 1: HubSpot
HubSpot, a marketing automation software provider, had pages that were indexed but blocked by its robots.txt file. After correcting the file, HubSpot requested reindexing of the affected pages and reportedly saw its search engine rankings and visibility for those pages improve.
Example 2: Shopify
Shopify, an e-commerce platform, encountered a similar issue with blocked pages that were still indexed. After identifying and fixing the problem, Shopify requested reindexing of the affected URLs and reportedly saw a significant increase in organic search traffic.
Conclusion:
The “indexed, though blocked by robots.txt” error in GSC can significantly hurt your site’s visibility in search results and the way your pages appear to searchers. However, by following the steps outlined above, you can fix the error and ensure that the pages you want in search results can be crawled and indexed properly.
Remember to regularly check your robots.txt file to ensure that it’s up to date and free from errors that could cause issues in search engine indexing. By making this a regular practice, you’ll minimize the chances of encountering this error again in the future.
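If you want to automate that regular check, a small script run on a schedule (for example, via cron) can warn you when robots.txt starts blocking pages that should stay crawlable. A minimal sketch with placeholder URLs:

```python
# A minimal monitoring sketch you could run on a schedule to catch
# robots.txt regressions early. The domain and URL list are placeholders
# for pages you always want Google to be able to crawl.
import sys
from urllib import robotparser

MUST_BE_CRAWLABLE = [
    "https://example.com/",
    "https://example.com/blog/",
]

parser = robotparser.RobotFileParser("https://example.com/robots.txt")
parser.read()

blocked = [url for url in MUST_BE_CRAWLABLE if not parser.can_fetch("Googlebot", url)]
if blocked:
    print("robots.txt is blocking:", *blocked, sep="\n  ")
    sys.exit(1)
print("robots.txt looks fine")
```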