Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

How to verify Googlebot

Wednesday, September 20, 2006 at 11:45 AM

Lately I've heard a couple of smart people ask that search engines provide a way to know that a bot is authentic. After all, any spammer could name their bot "Googlebot" and claim to be Google, so which bots do you trust and which do you block?

The common request we hear is to post a list of Googlebot IP addresses in some public place. The problem with that is that if/when the IP ranges of our crawlers change, not everyone will know to check. In fact, the crawl team migrated Googlebot IPs a couple years ago and it was a real hassle alerting webmasters who had hard-coded an IP range. So the crawl folks have provided another way to authenticate Googlebot. Here's an answer from one of the crawl people (quoted with their permission):


Telling webmasters to use DNS to verify on a case-by-case basis seems like the best way to go. I think the recommended technique would be to do a reverse DNS lookup, verify that the name is in the googlebot.com domain, and then do a corresponding forward DNS->IP lookup using that googlebot.com name; e.g.:

> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

I don't think just doing a reverse DNS lookup is sufficient, because a spoofer could set up reverse DNS to point to crawl-a-b-c-d.googlebot.com.
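
For anyone who wants to automate the check rather than run host by hand, here is a minimal Python sketch of the same two-step lookup. It is an illustration under stated assumptions, not an official Google tool: the names is_googlebot, reverse_lookup, forward_lookup and TRUSTED_SUFFIX are made up for this example, only the googlebot.com domain mentioned above is accepted, and only IPv4 addresses are handled.

```python
import socket

# Assumption: accept only the googlebot.com domain named in the post.
TRUSTED_SUFFIX = ".googlebot.com"


def reverse_lookup(ip):
    """Return the reverse DNS (PTR) hostname for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the IPv4 addresses hostname resolves to (empty list on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Reverse lookup, domain check, then forward lookup back to the same IP.

    The two resolver arguments are hypothetical hooks so the logic can be
    exercised without touching the network.
    """
    hostname = reverse(ip)
    if not hostname:
        return False
    # Normalize a trailing dot and case before comparing the domain.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(TRUSTED_SUFFIX):
        return False
    # A spoofer can control the reverse record for their own IP, but not
    # the forward record inside googlebot.com, so the IP must round-trip.
    return ip in forward(hostname)
```

Calling is_googlebot("66.249.66.1") performs live DNS queries, so the result depends on the DNS records at the time you run it. Since each lookup can be slow, caching the verdict per IP for a while is probably a reasonable design choice on a busy server.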


This answer has also been provided to our help-desk, so I'd consider it an official way to authenticate Googlebot. In order to fetch from the "official" Googlebot IP range, the bot has to respect robots.txt and our internal hostload conventions so that Google doesn't crawl you too hard.

(Thanks to N. and J. for help on this answer from the crawl side of things.)

8 comments:

mycall said...

One other thing to realize is that you need to look at the hostname from the RDNS lookup (i.e., it must have "google.com" at the end of the string). Also, many of the search engines' IP addresses do not resolve to any DNS name, so this is kind of buggy (see iplists.com). Finally, IP spoofing can get around this altogether.

Mike Adewole said...

The solution described here is more complicated than necessary. I've explained a much simpler solution here:

http://botsosphere.blogspot.com/2007/05/automatic-verification-of-machine.html

Markus said...

It might also help if we know what Googlebot actually "looks" like. Since you're the closest to knowing, would you care to give us any hints for the contest? ;)
http://blog.auinteractive.com/googlebot-competition

Matt said...

Matt, good to see your post here, and of course great information.

Pratheep

Lucian said...

You can check the IP address on websites like IP address lookup, and you will know whether it is Googlebot or not.

mpaine said...

When is GoogleBot going to support challenge SHA256 keys which domains can register? This way, nothing can be spoofed?

raghu said...

The post explains how to use DNS to verify Googlebot. But I am a beginner at web hosting, and I have created a site on Rudraksha. I can't afford to hire anyone to maintain it, and I don't know how to use DNS to verify IPs. In this case, is there any alternate solution to verify Googlebot?

Google Webmaster Central said...

Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Group.

Thanks and take care,
The Webmaster Central Team