Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Our new search index: Caffeine

Tuesday, June 08, 2010 at 5:00 PM

(Cross-posted on the Official Google Blog)

Today, we're announcing the completion of a new web indexing system called Caffeine. Caffeine provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered. Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was ever possible before.

Some background for those of you who don't build search engines for a living like us: when you search Google, you're not searching the live web. Instead you're searching Google's index of the web which, like the list in the back of a book, helps you pinpoint exactly the information you need. (Here's a good explanation of how it all works.)

So why did we build a new search indexing system? Content on the web is blossoming. It's not just growing in size and numbers; with the advent of video, images, news and real-time updates, the average webpage is also richer and more complex. In addition, people's expectations for search are higher than they used to be. Searchers want to find the latest relevant content and publishers expect to be found the instant they publish.

To keep up with the evolution of the web and to meet rising user expectations, we've built Caffeine. The image below illustrates how our old indexing system worked compared to Caffeine:

Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and when we made it available to you.

With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before — no matter when or where it was published.
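The difference between refreshing a whole layer and updating continuously can be sketched with a toy inverted index. This is only an illustration of the general idea, not a description of how Caffeine is implemented: the `TinyIndex` class, its method names, and the example URLs are all hypothetical.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each word to the set of URLs containing it.

    Hypothetical illustration only; not Google's implementation.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # word -> {url, ...}
        self.page_words = {}              # url -> words currently indexed

    def rebuild(self, pages):
        """Batch style: discard everything and re-analyze every page."""
        self.postings.clear()
        self.page_words.clear()
        for url, text in pages.items():
            self.update(url, text)

    def update(self, url, text):
        """Incremental style: touch only the entries for one page."""
        new_words = set(text.lower().split())
        old_words = self.page_words.get(url, set())
        for word in old_words - new_words:
            self.postings[word].discard(url)
        for word in new_words - old_words:
            self.postings[word].add(url)
        self.page_words[url] = new_words

    def search(self, word):
        """Return the URLs that currently contain the word, sorted."""
        return sorted(self.postings.get(word.lower(), set()))


index = TinyIndex()
index.rebuild({"a.example": "old news story", "b.example": "forum post"})
index.update("c.example", "breaking news")    # new page, no full rebuild
index.update("a.example", "archived story")   # changed page
print(index.search("news"))  # ['c.example']
```

In the batch style, a new page only becomes searchable after the next full `rebuild`; in the incremental style, `update` makes it searchable as soon as that one page has been processed.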

Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.
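As a rough sanity check on the iPod comparison, the figures quoted above can be divided out. This sketch assumes that "that much information" refers to the nearly 100 million gigabytes of total storage; the variable names are illustrative, and the miles-to-meters factor is the standard 1,609.344 m per mile.

```python
TOTAL_STORAGE_GB = 100_000_000  # "nearly 100 million gigabytes" (from the post)
IPOD_COUNT = 625_000            # from the post
STACK_MILES = 40                # "more than 40 miles" (from the post)

METERS_PER_MILE = 1609.344

# Capacity each iPod would need in order to hold an equal share of the data.
gb_per_ipod = TOTAL_STORAGE_GB / IPOD_COUNT

# Minimum length of each iPod for the stack to reach 40 miles end-to-end.
cm_per_ipod = STACK_MILES * METERS_PER_MILE * 100 / IPOD_COUNT

print(f"Implied capacity per iPod: {gb_per_ipod:.0f} GB")
print(f"Implied length per iPod: at least {cm_per_ipod:.1f} cm")
```

Under that assumption the numbers work out to 160 GB and roughly 10.3 cm per device, which appears consistent with the largest iPod model sold at the time.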

We've built Caffeine with the future in mind. Not only is it fresher, it's a robust foundation that makes it possible for us to build an even faster and more comprehensive search engine that scales with the growth of information online, and delivers even more relevant search results to you. So stay tuned, and look for more improvements in the months to come.

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

116 comments:

Luke Wieselmann said...

Great update to the index

Alejandro Guidotti said...

YAY!

Simon Lynch said...

All very cool, until you get someone complaining about a post we deleted 160+ days ago which is still in Google under their name. Freshness everywhere pls.

http://www.infopediaonline.com/ said...

This is a good system. Can we expect our newly published posts to be indexed in Google more quickly than before?

David Rivers said...

I love the graphics. Caffeine looks a bit chaotic though, eh.

Alecco Locco said...

Very interesting. Can you please elaborate a bit more on the new system internals? Please!

oscarmartell said...

This is a great name; I hope it will be in great form, as we have become accustomed to.

Kimchi n Rice said...

that is awesome! cant wait

Suzen said...

re: I love the graphics. Caffeine looks a bit chaotic though, eh.

Why is it that lines look "organized" and curves look chaotic? Too much 3-D animation eh? to me looks like the cosmos vs a railroad track. lol

ht990332 said...

Will this change the googlebot useragent?

Lakkineni said...

Awesome post, very informative on how things work. Can you clarify what percentages (%) of the index are photos, content, and real-time updates?

Massimo said...

Looking forward to a better search experience. Thanks.

shrapnel09 said...

How do you predict this will affect the problem of SEO Poisoning websites that quickly climb the ranks of popular search terms?

Vithar said...

So what's the significance of the inverse square law in Caffeine?

SoloMonster said...

Significant update!!!

Poker_Maniac said...

Does that mean, for example, that Google would update meta descriptions much more frequently? I am not talking about blog, tweet or FB updates; I am talking about static content, e.g. in an e-shop. If I put the price of an item in my meta description, is there a chance that it would be indexed much faster if I change the price? Does that affect rich snippet updates as well?

paygseo said...

that sounds really exciting as an seo!

does this mean that the push for QDF is going to continue and that it is now easier to rank well for fresh content instead of just <a href="http://www.paygseo.co.uk">building backlinks</a>?

Oğuz Kağan Aslan said...

SEO consultants have to work harder from now on! :)

Maung Nanda Linn Aung said...

This is interesting..

I am just a heavy coffee drinker and I got this here via twitter by the way..

Blog budowlany said...

It's definitely much better than the previous index. We can see improvements straight away - for example, blog posts are being indexed right after they are published, news is much fresher, etc. Good work!

Eric said...

finally the Caffeine!

this means quicker and quicker. for "fresher" sites/blogs they get to top the serp quicker than before.

correct me if i was wrong :)

Gareth said...

'So stay tuned, and look for more improvements in the months to come.' Hmm... wonder what else in the pipeline?

dotcompals said...

looking forward to the new caffeine magic !

k madhav said...

Yes, it is certain that all SEOs will have to work harder. This feature is very cool and good for the ordinary person who wants only relevant and updated information for his search. Good job... But how does this change affect the search results?

k madhav said...

Yes, this can be very helpful for social media sites and bookmarking sites because the content on these sites is frequently updated.

Prodosh said...

Let's see how fast it works. It seems it'll prioritize social media for ranking and indexing. It's also somehow beneficial for SEO consultants like us, as it proves again that SEO is not a one-time deal ;-)

k madhav said...

What is going on here? On the official Google Webmaster Central Blog, useless and spam comments are approved. Even if the links are nofollow, this is total spamming... So how are our blogs and sites safe from these spammers?

This is a blog for public announcements, not for website promotion.

Google should take immediate action against these spammers.

Eswari said...

Fresh and informative blogs and sites will stand first now. :) So stay updated.

Freek De Man said...

Cool, really curious to see how this is going to work!

Preeti said...

You are right, madhav. Spam comments should be removed. What the hell are Google's spam controllers doing?!

Dobiatowski said...

Could you submit some ip addresses where caffeine is working now? and please give more information when it will be available globally. regards.

Shaikul Akbar said...

So if Google indexes web pages in small portions, then websites like forums, blogs, Twitter, Facebook, etc. will get indexed faster than other, normal websites.

Typhoon said...

Superb news.. Let's see how it affects my blog rankings..

http://www.smartbloggerz.com

ZvrxxX said...

Thanks for the information..

We Love Google :))


watch video

Ruchira said...

great information - look forward to working with it. :-)

Iulian Basescu said...

How does Caffeine behave with Google Sitemap Generator installed on the host?

Greg Da Sole said...

I've been using the Caffeine beta for a while now and must say I prefer the depth of search options available.

Typhoon said...

Great News Google!

http://www.smartbloggerz.com

David Eaves said...

I don't get it, you are saying that it's the largest collection of web content ever offered, but in the last few weeks millions of pages have been dropped from the index.

MadArtjewelry said...

I am wondering how this will affect those of us who are Etsy sellers. http://MadArtjewelry.etsy.com

kRemtronicz said...

I love the new search engine. Yes, it is a bit different but it certainly seems to work faster. Good job!

Redcort Software said...

While I'm delighted in the caffeine enhancements, I find it interesting that the perpetual spammy top 2-3 SERP positions for my top search words (time clock software) for our Virtual TimeClock product haven't changed.

Either the black hat SEO guys are on top of caffeine or the full benefits have yet to arrive.

Yuhong Bao said...

What about Google Groups search?

Yuhong Bao said...

This comment has been removed by the author.

stang.app said...

Google rocks! Can't wait for Caffeine to kick in :)

tyler said...

Carrie Grimes... Anyone ever call you Grimey?

On-topic : This is good news. I am interested in playing around with this new engine and finding new ways to test SEO.

Tater Lady said...

I have some articles that are not indexing on google, can someone tell me who I can contact with questions?

thanks.
Kelly

Vernon Chalmers said...

Web 3 in the making...new challenges for SEO indeed. My clients will benefit for sure.

Sumit Pranav said...

Interesting.
Google now giving more fresh and relevant results.

Benj Arriola said...

Ok, time to check the results... did our sites go up, or go down. *LOL*

Debt Free Hispanic said...

Okay, at least I'm getting some information. Thanks for posting and letting us webmasters in on what's going on. Really appreciate it.

Hangar 17 said...

It's interesting. I hope the new system will give better results than the previous one.

Looking forward to getting more experience with it....

-luzie- said...

Hmm, a robust foundation for the future? There's live real time direct search lurking around the corner, so just re-vamping the dinosaur won't save you from going where others have gone before. The internet itself will be searchable, not a private index ... :D

-luzie-

Dhapri Software said...

Great and awesome; the world always admires Google technology...

wm said...

I have an idea. Can we have a software that we install on our server or local computer which allows us to index our site when & as often as we want & how much to index. It then sends the raw info to Google. This takes the strain away from Google's computer, but ensures that our site is up to date & correct with Google.

Oscar said...

The new Google is great, but I was just wondering why it doesn't see all the updates to my site http://www.iberestudios.com !! We worked really hard to get really fresh content. I also wonder why Google still looks at old content when we changed the whole web structure six months ago.
Thanks

Indian ePapers said...

Excellent work guys. Keep it up.

www.greenitweb.co.za said...

Thank you for this update South Africa will be green IT by www.greenitweb.co.za

Denise said...

Freshness is good and changes are good. BUT, SMBs have a hard time keeping track of it all. See how SMBs can make Caffeine work for them: http://www.zebworks.com/zeblog/google-caffeine-chaos-whats-an-smb-to-do

EastCode team said...

I wonder how this will affect blog search or blog indexing? Faster recognition of daily updates?

徵信情感小屋 said...

Index it very quickly and content is more effective. Congratulations!
But where is the road of SEO.

Sathish said...

I have received a couple of Tweets from my friends regarding a problem with Google search and caffeine with inappropriate and irrelevant content showing up as the first result!!! can u please check it. I have posted the link here.

http://www.google.com/search?hl=en&source=hp&q=symbiosis+distance+mba&aq=f&aqi=g-c1g7g-m1&aql=&oq=&gs_rfai=

Article Guru said...

Was wondering what the soft 404's were for. Had six show up for my article directory.

webalytics said...

Very interesting! Let's see what the impact on SEO will be...

johnc said...

Under Caffeine, will new pages index faster?
My client hosts their website with UniteU and is integrating their website inventory with the Retail Pro inventory management system. This integration will change the URL structure, breaking all current URLs of department and product pages indexed by Google. The UniteU Account Manager says that they cannot do anything to redirect visitors who click on an indexed link containing the old URL to the new URL. So the link will take them to a "sorry, page not found" message on their site.

What are your thoughts?

.... said...

that illustration of how your old indexing system worked compared to Caffeine just cracks me up!

Great graphic!

@rul said...

Thanks for the information. I think this article is very useful.

Justin said...

I saw my restaurants in cape town website jump from page 18 to 1 then back down to 4 thanks to this new indexing system.

el kybalion said...

does caffeine consider your web history when it gives you search results??

Stropdas-info said...

It seems like new blog posts can be found quickly on Google, but the 'old fashioned' websites and shops are getting a lower rank. So it is a good thing that fresh information is ranked first; however, it is a pity that good information sites or shops will get a lower rank...

manoj said...

It's a great resource of update. that sounds really exciting as an off page seo

deeps said...

great stuff better for users as well as for new publishers

Bipin Preet Singh said...

this is good stuff, but where are the details?

NEY said...

Incredible... up-to-date info

pratik said...

How do they assign a page rank to a fresher webpage as compared to an older page in Caffeine indexing?

Neha Arora said...

Fabulous c:)

Insulation said...

what about seo services now

Emmanuel said...

Great news!! Looking forward to get more details about the core process.

About Glad Its Them said...

Does SEO still factor into this or not?

Leo Sigh said...

I'm already noticing blog posts indexing faster. Nice job Google.

Power Sales Jobs said...

Does keyword density play a factor in indexing now?

Brilliant Success said...

I'm all excited with this new indexing system. Hope google would have a video on this, to help us spread the message to others.

SwayamDas2010 said...

I guess, this new update in the Search Algorithms is a vital strategic point of difference against other Search Engine Competitors! ;)

Well Done, Google! :)

Yours Forever,
SwayamDas2010.

john said...

Does this mean at last I can be on page 1

http://www.pension-transfers-qrops.com/

John Smith said...

Interesting. Now we will get fresh results. Thanks to Google for such a great service

dailywebarticles said...

Great step from Google. I am dreaming that one day a search engine can explore all websites in the world without spending time on indexing and ranking. It’s far from reality, but the Caffeine system is one step toward making it real.
Mike, Dailywebarticles.com

sparky said...

this is good news, hopefully the new system works well.

indoor led lighting

Nigel Coates said...

Great news for bloggers! Gives us a better chance vs. the established sites in our niche :)

Y said...

crawling and indexing... just it.

Tung said...

Finally. A definitive answer.

Chintala Radhika said...

Great update for fresh results, and advantageous, but previously the web cache of one of my sites refreshed every 5 days, and now it hasn't changed since Caffeine was completed. If I add new content to my page, when will it be cached and shown in search results?

TooHow Administrator said...

Great.........Keep it up...Google

=====================
http://www.toohow.blogspot.com/
http://toohow.blogspot.com

junglenotes.com said...

Wow.. it's a really great improvement!!
I have been waiting 2 weeks for my new site junglenotes.com to be indexed!

But with this new search index - caffeine, I'm really excited!!

Hope we can enjoy it sooner!
Thanks Google!

admin carhireshop said...

I love Googlebot when it comes to my site again and again!

ericc858 said...

So basically, how do I have my website http://betfootballspreads.com appear in google search engines? Is it something that I have to 'add url' or will it automatically catch it?

Adnanfaridi said...

Hmm, sounds all cool, but nobody can find our website, GoingBusiness.com, in the US when they search for its topic - businesses for sale. Instead it shows up in India, and we're a US-based company with proof right on our About Us page.
Bing and Yahoo both have us covered in the right location (US). We don't rank great on them, but I guess something is better than nothing.

Artemis said...

Too bad that link spammers can still get their site to PageRank 5, but good SEO doesn't get you any higher than a 3 with months of work :(

Coke said...

Amazing again! So what is next? I know Google, basic advantage of it is that it never sleeps!

Marketingonlinept.com said...

Wow, that's great for information and real-time search, but it requires a new line of thought from online marketing professionals around the world.

anil webmaster said...

I think Google is reading JavaScript and showing it in search results.

It indexed unwanted URLs in JavaScript code, and in Webmaster Tools it shows them as 404/500 page errors.

What do you say?

IT Professional said...

really a very good and much needed update

jhakingshuk said...

Sir,
Can you kindly give me this information:
how can I have my website http://prithwiraj-jha.weebly.com
shown by using google search engines?
I had already indexed the site with http://www.google.com/addurl about 2 months ago, but still my site does not appear through google search............. can someone please help me out???


Yours faithfully,

Prithwiraj Jha

ndalbo said...

So... how can my website be indexed by Google?

ndalbo said...

Will this change the page rank?

ronnie said...

I'm looking for more information on how to get listed on google maps. Can anyone direct me to a page that can help?

Misticyoda said...

Great speed for web 2.0 post and media.

Congrats

beweb said...

Too much caffeine? This new search index seems to have indexed an unrelated Greek PDF as our homepage. Here is an unbelievable cached copy of our site www.datarecovery.co.nz

http://webcache.googleusercontent.com/search?hl=en&q=cache%3Ahttp%3A%2F%2Fwww.datarecovery.co.nz

This unrelated content has killed our site's ranking.

PennyStockHot.com said...

i hope this works out for all the 'honest' webmasters out there :-)

Saklı said...

I'm always waiting for Googlebot to index my website www.saklilezzetler.com
I know that Google always tries to improve itself. And I'm sure that the new system will achieve its goals.
I hope this system will be better for webmasters.

Kapil Nakra said...

I posted a page 10 days back. It got crawled 5 days back. But it is not yet indexed. Is Caffeine working?

Sansar Grewal said...

Thanks Google for this new crawler

Jared said...

Caffeine may work for bloggers, but SUCKS for ecommerce users/owners. Now a simple search for a product we've had for years returns HIGHLY irrelevant results. What a joke. The only one benefiting from this is Google. Their CPC revenues will go through roof. Wonderful job Google ... NOT.

Ardian T. Saragih said...

Hope it will be better. Good effort from the Google team. I appreciate it.

Real Estate SEO Expert said...

Really great update. Now indexing process is so brilliant and fast.

Saputro Herbals said...

This gives us more motivation to continue providing the latest information for our readers.

Google Webmaster Central said...

Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team