Removing Noindex From Your Robots.txt Files

Well, it’s official now. Google is sending reminders to webmasters telling them to stop relying on noindex in their robots.txt files. Support for the noindex directive in robots.txt is being removed altogether, and many in the SEO community have already received the notice.

The notification itself comes in the form of a message in Google Search Console with a subject line reading, “Remove ‘noindex’ statements from the robots.txt of…” The body of the message goes on to say that Google never officially supported the rule, and that it will cease to function on September 1, 2019. It then directs the reader to the help centre for information on how to remove it.

Google announced the cut-off date of September 1 more than a month ago and is now working to ensure that the news spreads by sending out these reminders.

What Needs to Be Done?

If you have received one of these notices, be sure to follow up on it. The most important step is to confirm that your robots.txt file no longer uses the noindex directive. If it does, switch to one of the alternatives described below before September 1 rolls around, so that whatever the noindex rule was covering is handled by a supported method.

You should also check whether your robots.txt file uses the nofollow or crawl-delay rules. If it does, switch to the supported methods for those directives going forward.
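To make that check concrete, here is a minimal Python sketch that scans the text of a robots.txt file for the three rules mentioned above. The function name find_unsupported_rules and the sample file contents are made up for illustration; this is not a tool provided by Google.

```python
# Rules named in this article as unsupported in robots.txt.
UNSUPPORTED_RULES = ("noindex", "nofollow", "crawl-delay")


def find_unsupported_rules(robots_txt):
    """Return (line_number, line) pairs for lines that use an unsupported rule."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field in UNSUPPORTED_RULES:
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Hypothetical robots.txt contents, for illustration only.
    sample = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Noindex: /drafts/\n"
        "Crawl-delay: 10\n"
    )
    for number, line in find_unsupported_rules(sample):
        print(f"line {number}: {line}")
```

Any line it reports is a candidate for replacement with one of the alternatives listed in the next section.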

What Alternatives Are There?

There are some options that you should probably have been using already, but if not, this is what Google suggests:

Noindex in robots meta tags

The noindex directive is supported both in HTTP response headers and in HTML. As long as crawling is allowed, this is the most effective way to remove URLs from the index.
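In practice, the HTML form is a tag such as <meta name="robots" content="noindex"> and the header form is X-Robots-Tag: noindex. As a rough sketch of how you might verify that a page carries one of them, the Python below checks a response's headers and HTML. The is_noindexed helper is hypothetical, and it deliberately ignores crawler-specific variants (for example a meta tag named googlebot or a user-agent prefix in the header), so treat it as a starting point only.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindexed(headers, html):
    """Return True if the X-Robots-Tag header or a robots meta tag says noindex."""
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives.update(p.strip().lower() for p in value.split(","))

    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives
```

For example, is_noindexed({"X-Robots-Tag": "noindex"}, "") and is_noindexed({}, '<meta name="robots" content="noindex">') both return True.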

404 and 410 HTTP status codes

Both of these codes indicate that the page does not exist, so Google drops such URLs from its index once they have been crawled and processed.
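How you return these codes depends entirely on your server or CMS. Purely as an illustration, here is a tiny Python WSGI application that answers 410 Gone for pages you have deliberately retired and 404 Not Found for anything else it does not recognise. The paths and page contents are invented for the example.

```python
# Hypothetical site data, for illustration only.
LIVE_PAGES = {"/": b"<h1>Home</h1>"}
REMOVED_PATHS = {"/old-campaign"}


def app(environ, start_response):
    """Minimal WSGI app: 200 for live pages, 410 for retired ones, 404 otherwise."""
    path = environ.get("PATH_INFO", "/")

    if path in LIVE_PAGES:
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [LIVE_PAGES[path]]

    # 410 says the page was removed on purpose; 404 says it was not found.
    status = "410 Gone" if path in REMOVED_PATHS else "404 Not Found"
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [status.encode("ascii")]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serves on http://localhost:8000/ until interrupted.
    make_server("", 8000, app).serve_forever()
```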

Disallow in robots.txt

Search engines can only index pages they know about, so blocking a page from being crawled usually means its content will not be indexed. A search engine may still index the URL itself based on links from other pages, without seeing the actual content, although Google says it aims to make such pages less visible.
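If you want to confirm that a Disallow rule covers the URLs you expect, Python's standard library includes a robots.txt parser. The sketch below wraps it in a hypothetical is_crawl_allowed helper. Keep in mind that this parser is not Google's own, so its matching may differ from Googlebot's in edge cases such as wildcards.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical rules and URLs, for illustration only.
    rules = "User-agent: *\nDisallow: /private/\n"
    print(is_crawl_allowed(rules, "https://example.com/private/report.html"))
    print(is_crawl_allowed(rules, "https://example.com/blog/"))
```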

Password protection

If your page is hidden behind a login, that will generally remove it from Google’s index, unless markup is used to indicate subscription or paywalled content.

Search Console Remove URL tool

Quick and easy to use, this tool removes a URL temporarily from Google’s search results.

With the September 1 deadline fast approaching, now is the time to make the changes you need and remove noindex from your robots.txt files.

Read the official Google announcement here:
https://webmasters.googleblog.com/2019/07/a-note-on-unsupported-rules-in-robotstxt.html