Everything You Need To Know About The X-Robots-Tag HTTP Header

Search engine optimization, in its most basic sense, relies upon one thing above all others: search engine spiders crawling and indexing your site.

But nearly every website is going to have pages that you don’t want to include in this exploration.

For example, do you really want your privacy policy or internal search pages showing up in Google results?

In a best-case scenario, these pages are doing nothing to actively drive traffic to your site, and in a worst-case scenario, they could be diverting traffic from more important pages.

Fortunately, Google allows webmasters to tell search engine bots what pages and content to crawl and what to ignore. There are several ways to do this, the most common being a robots.txt file or the meta robots tag.

We have an excellent and detailed explanation of the ins and outs of robots.txt, which you should definitely read.

But in high-level terms, it’s a plain text file that lives in your website’s root and follows the Robots Exclusion Protocol (REP).

Robots.txt provides crawlers with instructions about the site as a whole, while meta robots tags include directions for specific pages.

Some meta robots tags you might use include index, which tells search engines to add the page to their index; noindex, which tells them not to add a page to the index or include it in search results; follow, which instructs a search engine to follow the links on a page; nofollow, which tells it not to follow links; and a whole host of others.

Both robots.txt and meta robots tags are useful tools to keep in your toolbox, but there’s also another way to instruct search engine bots to noindex or nofollow: the X-Robots-Tag.

What Is The X-Robots-Tag?

The X-Robots-Tag is another way for you to control how your webpages are crawled and indexed by spiders. As part of the HTTP header response to a URL, it controls indexing for an entire page, as well as the specific elements on that page.

And whereas using meta robots tags is fairly straightforward, the X-Robots-Tag is a bit more complicated.

But this, of course, raises the question:

When Should You Use The X-Robots-Tag?

According to Google, “Any directive that can be used in a robots meta tag can also be specified as an X-Robots-Tag.”

While you can set robots directives with both the meta robots tag (in a page’s HTML) and the X-Robots-Tag (in the headers of an HTTP response), there are certain situations where you would want to use the X-Robots-Tag, the two most common being when:

  • You want to control how your non-HTML files are being crawled and indexed.
  • You want to serve directives site-wide instead of on a page level.

For example, if you want to block a specific image or video from being indexed, the HTTP response method makes this easy.

The X-Robots-Tag header is also useful because it allows you to combine multiple tags within an HTTP response, or to use a comma-separated list to specify several directives at once.

Maybe you don’t want a certain page to be cached and want it to be unavailable after a certain date. You can use a combination of the “noarchive” and “unavailable_after” tags to instruct search engine bots to follow these instructions.
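As a rough sketch of that combination, the Python snippet below assembles a single comma-separated header value. The function name build_x_robots_tag and the cutoff date are made up for this illustration. The date is written in ISO 8601, one of the formats Google lists as acceptable for unavailable_after, which also avoids putting an extra comma inside the list.

```python
from datetime import datetime, timezone


def build_x_robots_tag(directives, unavailable_after=None):
    """Join directives into one comma-separated X-Robots-Tag value."""
    parts = list(directives)
    if unavailable_after is not None:
        # ISO 8601 keeps the date free of commas, so it cannot be
        # confused with the separator between directives.
        parts.append("unavailable_after: " + unavailable_after.isoformat())
    return ", ".join(parts)


# Hypothetical cutoff date, chosen only for the example.
cutoff = datetime(2025, 6, 25, 15, 0, 0, tzinfo=timezone.utc)
header_value = build_x_robots_tag(["noarchive"], unavailable_after=cutoff)

print("X-Robots-Tag:", header_value)
# X-Robots-Tag: noarchive, unavailable_after: 2025-06-25T15:00:00+00:00
```

The resulting string is what your server or application would send as the value of the X-Robots-Tag response header.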

Essentially, the power of the X-Robots-Tag is that it is much more flexible than the meta robots tag.

The advantage of using an X-Robots-Tag with HTTP responses is that it allows you to use regular expressions to apply directives to non-HTML files, as well as to set parameters on a larger, global level.

To help you understand the difference between these directives, it’s helpful to categorize them by type. That is, are they crawler directives or indexer directives?

Here’s a handy cheat sheet to explain:

Crawler Directives

  • Robots.txt: uses the user-agent, allow, disallow, and sitemap directives to specify where on a site search engine bots are allowed to crawl and not allowed to crawl.

Indexer Directives

  • Meta Robots tag: allows you to specify and prevent search engines from showing particular pages on a site in search results.
  • Nofollow: allows you to specify links that should not pass on authority or PageRank.
  • X-Robots-Tag: allows you to control how specified file types are indexed.

Where Do You Put The X-Robots-Tag?

Let’s say you want to block specific file types. An ideal approach would be to add the X-Robots-Tag to an Apache configuration or a .htaccess file.

The X-Robots-Tag can be added to a site’s HTTP responses in an Apache server configuration via a .htaccess file.

Real-World Examples And Uses Of The X-Robots-Tag

So that sounds great in theory, but what does it look like in the real world? Let’s take a look.

Let’s say we wanted search engines not to index .pdf file types. This configuration on Apache servers would look something like the below:

<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

In Nginx, it would look like the below:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Now, let’s look at a different scenario. Let’s say we want to use the X-Robots-Tag to block image files, such as .jpg, .gif, .png, etc., from being indexed. You could do this with an X-Robots-Tag that would look like the below:

<FilesMatch "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
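If you can’t edit the server configuration, the same file-type matching could be approximated in application code. The sketch below is one possible way to do it as a small Python WSGI middleware. The RULES list, the x_robots_value helper, and the XRobotsMiddleware class are all hypothetical names for this illustration, not part of any server or framework.

```python
import re

# Hypothetical rules: a regular expression on the URL path, paired
# with the X-Robots-Tag value to send when the path matches.
RULES = [
    (re.compile(r"\.pdf$", re.IGNORECASE), "noindex, nofollow"),
    (re.compile(r"\.(png|jpe?g|gif)$", re.IGNORECASE), "noindex"),
]


def x_robots_value(path):
    """Return the header value for the first matching rule, or None."""
    for pattern, value in RULES:
        if pattern.search(path):
            return value
    return None


class XRobotsMiddleware:
    """Wrap a WSGI app and add X-Robots-Tag to matching responses."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        value = x_robots_value(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            # Only touch the response when a rule matched the path.
            if value is not None:
                headers = list(headers) + [("X-Robots-Tag", value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Because the rules only match specific extensions, ordinary HTML pages pass through untouched, which is the safeguard you want against accidentally noindexing everything.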

Please note that understanding how these directives work and the impact they have on one another is crucial.

For example, what happens if both the X-Robots-Tag and a meta robots tag are located when crawler bots discover a URL?

If that URL is blocked by robots.txt, then certain indexing and serving directives cannot be discovered and will not be followed.

If directives are to be followed, then the URLs containing them cannot be disallowed from crawling.
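To make that dependency concrete, here is a minimal sketch using Python’s standard urllib.robotparser module. The robots.txt content, the URLs, and the header_can_be_seen helper are invented for the example, and Python’s parser only approximates how a given search engine interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks crawling of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def header_can_be_seen(robots_txt, url, user_agent="Googlebot"):
    """Return True if a crawler may fetch the URL.

    A crawler only reads the X-Robots-Tag header (or a meta robots tag)
    after fetching the URL, so a disallowed URL hides those directives.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = header_can_be_seen(ROBOTS_TXT, "https://example.com/private/report.pdf")
allowed = header_can_be_seen(ROBOTS_TXT, "https://example.com/public/report.pdf")
print(blocked, allowed)
```

In this setup, a noindex header on the /private/ PDF would never be read, because the crawler is not allowed to request the file in the first place.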

Check For An X-Robots-Tag

There are a few different methods that can be used to check for an X-Robots-Tag on a site.

The easiest way to check is to install a browser extension that will tell you X-Robots-Tag information about the URL.

Screenshot of Robots Exclusion Checker, December 2022

Another plugin you can use to determine whether an X-Robots-Tag is being used is the Web Developer plugin.

By clicking on the plugin in your browser and navigating to “View Response Headers,” you can see the various HTTP headers being used.
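If you would rather spot-check a URL from a script, something like the following sketch could work, using only Python’s standard library. The helper names extract_x_robots_tag and fetch_x_robots_tag are hypothetical, and the URL in the usage comment is a placeholder.

```python
from urllib.request import Request, urlopen


def extract_x_robots_tag(header_items):
    """Return every X-Robots-Tag value from (name, value) header pairs."""
    return [
        value.strip()
        for name, value in header_items
        if name.lower() == "x-robots-tag"  # header names are case-insensitive
    ]


def fetch_x_robots_tag(url, timeout=10):
    """Send a HEAD request and return the X-Robots-Tag values, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return extract_x_robots_tag(response.headers.items())


# Example usage (placeholder URL, requires network access):
# print(fetch_x_robots_tag("https://example.com/whitepaper.pdf"))
```

An empty list means the response carried no X-Robots-Tag header. Note that some servers answer HEAD requests differently from GET requests, so treat this as a quick check rather than a full audit.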

Another method that scales well, and can pinpoint issues on websites with a million pages, is Screaming Frog.

After running a site through Screaming Frog, you can navigate to the “X-Robots-Tag” column.

This will show you which sections of the site are using the tag, along with which specific directives.

Screenshot of Screaming Frog Report. X-Robot-Tag, December 2022

Using X-Robots-Tags On Your Site

Understanding and controlling how search engines interact with your website is the cornerstone of search engine optimization. And the X-Robots-Tag is a powerful tool you can use to do just that.

Just be aware: It’s not without its dangers. It is very easy to make a mistake and deindex your entire site.

That said, if you’re reading this piece, you’re probably not an SEO beginner. So long as you use it wisely, take your time, and check your work, you’ll find the X-Robots-Tag to be a useful addition to your arsenal.

Featured Image: Song_about_summer