Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

7 must-read Webmaster Central blog posts

Tuesday, February 12, 2008 at 2:42 PM

Our search quality and Webmaster Central teams love helping webmasters solve problems. But since we can't be in all places at all times answering all questions, we also try hard to show you how to help yourself. We put a lot of work into providing documentation and blog posts to answer your questions and guide you through the data and tools we provide, and we're constantly looking for ways to improve the visibility of that information.

While I always encourage people to search our Help Center and blog for answers, there are a few articles in particular to which I'm constantly referring people. Some are recent and some are buried in years' worth of archives, but each is worth a read:

  1. Googlebot can't access my website
    Web hosts seem to be getting more aggressive about blocking spam bots and aggressive crawlers from their servers, which is generally a good thing; however, sometimes they also block Googlebot without knowing it. If you or your host are "allowing" Googlebot through by whitelisting Googlebot IP addresses, you may still be blocking some of our IPs without knowing it (since our full IP list isn't public, for reasons explained in the post). To be sure you're allowing Googlebot access to your site, use the method in this blog post to verify whether a crawler is Googlebot.
  2. URL blocked by robots.txt
    Sometimes the web crawl section of Webmaster Tools reports a URL as "blocked by robots.txt", but your robots.txt file doesn't seem to block crawling of that URL. Check out this list of troubleshooting tips, especially the part about redirects. This thread from our Help Group also explains why you may see discrepancies between our web crawl error reports and our robots.txt analysis tool.
  3. Why was my URL removal request denied?
    (Okay, I'm cheating a little: this one is a Help Center article and not a blog post.) In order to remove a URL from Google search results you need to first put something in place that will prevent Googlebot from simply picking that URL up again the next time it crawls your site. This may be a 404 (or 410) status code, a noindex meta tag, or a robots.txt file, depending on what type of removal request you're submitting. Follow the directions in this article and you should be good to go.
  4. Flash best practices
    Flash continues to be a hot topic for webmasters interested in making visually complex content accessible to search engines. In this post Bergy, our resident Flash expert, outlines best practices for working with Flash.
  5. The supplemental index
    The "supplemental index" was a big topic of conversation in 2007, and it seems some webmasters are still worried about it. Instead of worrying, point your browser to this post on how we now search our entire index for every query.
  6. Duplicate content
    Duplicate content is another perennial concern of webmasters. This post talks in detail about duplicate content caused by URL parameters, and also references Adam's previous post on deftly dealing with duplicate content, which gives lots of good suggestions on how to avoid or mitigate problems caused by duplicate content.
  7. Sitemaps FAQs
    This post answers the most frequent questions we get about Sitemaps. And I'm not just saying it's great because I posted it. :-)
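
For item 1, the verification method in the linked post boils down to a reverse DNS lookup on the crawler's IP address, a check that the resulting host name is in the googlebot.com domain, and a forward DNS lookup to confirm that the name maps back to the same IP. Here is a minimal Python sketch of that idea. The function name is_googlebot and its injectable lookup parameters are our own illustration, not part of any Google tool, and the forward lookup used here only handles IPv4.

```python
import socket

# Assumption: host names of genuine Googlebot crawlers end in this domain,
# as described in the "how to verify Googlebot" post.
GOOGLEBOT_DOMAIN = ".googlebot.com"


def is_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse-then-forward DNS check.

    `reverse_lookup` and `forward_lookup` are hypothetical hooks so the
    function can be exercised without network access; by default they
    use the standard library resolver.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist); IPv4 only
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:
        # No reverse DNS record: cannot confirm the crawler's identity.
        return False

    # Step 2: the name must sit inside the googlebot.com domain.
    if not host.lower().rstrip(".").endswith(GOOGLEBOT_DOMAIN):
        return False

    # Step 3: the name must resolve back to the original IP address.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling is_googlebot(ip) with an address taken from your server logs runs the live DNS lookups. A False result only means the address could not be confirmed as Googlebot, so treat it as a hint for further checking rather than a verdict.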

Sometimes, knowing how to find existing information is the biggest barrier to getting a question answered. So try searching our blog, Help Center and Help Group next time you have a question, and please let us know if you can't find a piece of information that you think should be there!

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

23 comments:

Dani said...

Hi Google,

I'm Daniel Tinjeala owner and administrator of example(dot)com !

I've seen today that my website has lost all rankings !

I've checked site:example(dot)com, it's ok it's indexed

I've checked link:example(dot)com... it's ok they are still the same links

I've searched for example(dot)com .... in google ... my website is first

But what happened with all my rankings ? I know I've only done white hat SEO, not black hat !

And one week ago I started validating my HTML code !

Waiting for your answer !

Thanks for your precious time !

Have a nice day !

Best regards,
Daniel Tinjeala

Jennifer Mathews Somogyi said...

Thanks for the quick links to some of MY favorite posts. You forgot the linking and meta tag posts you made just in the past few months. I am always and forever referencing those two the most.

See you at SMX!

Jenn

Michael Martinez said...

Just stop telling people they don't need to worry about the Supplemental Index.

Clearly, since Google continues to promote less relevant Main Web Index content above more relevant Supplemental Index content, PEOPLE DO NEED TO WORRY ABOUT THE SUPPLEMENTAL INDEX.

I know a lot of Web searchers and content providers will be much, much happier with Google's currently mediocre results when Google starts focusing on RELEVANCE again.

Webnauts said...

I fully agree with Michael Martinez. Please stop telling us fairy tales.

dotcompals said...

makes good reading. all in one place. thanks

daniel said...

Hi Google,
I'm daniel, owner and administrator of danieldaphone.com. May I describe my problem? Thank you for your time.
I have registered with Google Webmaster Tools and have already completed site verification, but I have a problem: in the Links section no links are found, and the same goes for Pages with external links, Pages with internal links, and Sitelinks.
I am very confused by this problem. Please help me, thank you.

John L said...

Probably the wrong group, and I apologize, but I can't find an answer anywhere on the Google site.

A Google search often shows the #1 ranked hit with multiple pages - indexed and hot-linked. For instance, search the word "Aspen" and the #1 hit (Aspen Skiing Co) has eight indexed pages, all hot-linked, such as "Buy Lift Tickets" / "Daily Snow Report" and so forth.

Our company normally comes up 1st for the search word "Millennia" and has been there for years.

How can we get a Google result that lists and hot-links our top 8 pages in the Google search listing?

Thanks for helping.

KD said...

@John L

Search for "Google Sitelinks" and you will find the answer you seek.

Prashant Vikram Singh said...

Hi Google, I have a blog on Blogspot whose address is http://prashant-vikram.blogspot.com. My blog is ranking for some keywords, but the same URL is not ranking; the URL is ranking for keywords containing http://www. Why is this happening? Please solve my problem.

blogger ku said...

good
thanks

gumbi said...

Hi Google!

Thanks for the great list of links, some really good reading there!

I have several sites that I use Webmaster tools for, and it seems that recently when I log in, I have to reverify most of the sites. This happens at least once every couple of months.

Am I missing something with this, or why would I continually have to reverify these sites? It takes quite a long time to reupload all of the verification meta tags every time.

Any help you could provide would be great!

Susan Moskwa said...

Hi gumbi:

Thank you!

Regarding verification, you should leave those meta tags on your site rather than getting rid of them. As noted here, we periodically re-check to make sure that your verification method is still there. If we no longer find the verification method on your site, we will mark the site as unverified in your account.

Manish Chauhan said...

Thanks Google...for your tips.
I have a blog http://seocrazy.blogspot.com.
Could you please let me know how I can restrict unidentified persons from visiting my blog?

Thanks in advance.

Ireneusz said...

Hi,
I think that if a new tag "<noindex>...</noindex>" were added (by analogy with "<noscript>"), we would have no problems with duplicate content, syndication, etc. Everything inside this tag would be invisible to robots and wouldn't be indexed. Part of a page could be indexed normally, and another part could contain repeated text or whatever the webmaster wants.
Best regards
Ireneusz Dybczyński

Marius said...

Thanks for the great tips! The problem I am having, however, is that the URLs have changed from php to asp. I have 590 URLs flagged as not found because of this. Here is one URL:
http://www.mysite.com/proddetail.asp?prod=ARR571979

Should be:
http://www.mysite.com/proddetail.php?prod=ARR571979

My XML feed has it correct as php, so where is the Google crawler getting this from?

Puzzled. Any help would be appreciated.

Vanessa Alexander said...

Why does my blog verify and then unverify? I placed the Google meta tag in the right place. It verified (check) and then I go back and it is un-verified. Could you post the reasons this happens?

I have two other blogs I added. Then they disappeared. Today they were there. All three verified and then about an hour later the blogs I added had disappeared from the dashboard and my main blog needed verifying again.

Why?

Thanks in advance...

Susan Moskwa said...

If you have questions about your site that are unrelated to this blog post, our Webmaster Help Group would be a much more appropriate place for them.

Seo said...

Hi Google

I have an SEO blog, http://wwwseoservices.blogspot.com. Please have a look and guide me.
Thanks & Regards
:)

info said...

Hi!

My site completely disappeared from Google.

One day it was the "first" of the list for words like:

sang vivant
analyse sang vivant
sang vivant geba

But now it won't even appear in the search engine anymore, even though Yahoo still places it at the top of the list?

Any help?

Patrick
For http://www.sangvivantgeba.com

Rakesh Jha said...

You need to create some good links that are frequently crawled by Google.

Regards,
Rakesh Jha

amit thakur said...

Hi
My blog has not been cached for the last few days. Can anyone help me understand why Google is not crawling my blog?

http://amicars.blogspot.com

Maile Ohye said...

Hi everyone,

Since some time has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Forum.

Thanks and take care,
The Webmaster Central Team