How Googlebot Crawls the Web

By Bobby McIntyre / June 12, 2025

In this episode of Search Off the Record, Martin and Gary from the Google Search Relations team take a deep dive into how Googlebot and web crawling work—past, present, and future. Through their humorous and thoughtful conversation, they explore how crawling evolved from the early days of the internet, when scripts could index a chunk of the web from a single homepage, to the more complex and considerate systems used today. They discuss the basics of what a crawler is, how tools like cURL or Wget relate, and how policies like robots.txt ensure crawlers play nice with web infrastructure.

The conversation also covers Google’s internal shift to unified infrastructure for all crawling needs, highlighting how different teams moved from separate crawlers to a shared system that enforces consistent policies. They explain why some fetches bypass robots.txt (like user-initiated actions) and the rising impact of automated traffic from new products and AI agents. With a nod to initiatives like Common Crawl, the episode ends with a look at the road ahead, acknowledging growing internet congestion but remaining optimistic about the web’s capacity to adapt.
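As a rough illustration of the robots.txt check described above, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the `user_initiated` flag are all assumptions made for this example. It is not how Googlebot is implemented; it only shows the general idea of a crawler consulting a site's policy before fetching, with user-initiated fetches modeled as skipping that check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed in memory so that no network
# request is needed to run this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer policy questions."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str,
              user_initiated: bool = False) -> bool:
    """Return True if the fetch is allowed under this sketch's policy.

    Assumption: a fetch triggered directly by a user is modeled as
    skipping the robots.txt check, loosely mirroring the episode's point
    that some user-initiated fetches bypass robots.txt.
    """
    if user_initiated:
        return True
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
allowed_public = may_fetch(parser, "ExampleBot", "https://example.com/public/page")
allowed_private = may_fetch(parser, "ExampleBot", "https://example.com/private/data")
allowed_user = may_fetch(parser, "ExampleBot", "https://example.com/private/data",
                         user_initiated=True)
delay_seconds = parser.crawl_delay("ExampleBot")
```

A polite crawler would also wait `delay_seconds` between requests to the same host when the site specifies a crawl delay.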

Resources:

Episode transcript → https://goo.gle/sotr092-transcript

Listen to more Search Off the Record → https://goo.gle/sotr-yt
Subscribe to Google Search Channel → https://goo.gle/SearchCentral

Search Off the Record is a podcast series that takes you behind the scenes of Google Search with the Search Relations team.

#SOTRpodcast #SEO #SearchOfTheRecord

Speakers: Martin Splitt, Gary Illyes
Products Mentioned: Googlebot, Gemma, Google AI


12 thoughts on “How Googlebot Crawls the Web”

@frederikstilunddrost2924
June 12, 2025 at 9:28 am

My name is Frederik and my Job is: Job Title XD

@missamericausa
June 12, 2025 at 9:28 am

Liar info to intentionally harm….
And they won’t cooperate with IP addresses to take it down. We can have all the information photographs, etc., etc. and they don’t care. And it keeps reposting. Even not googling and not looking at it is seeing off of Google. You can look six months later and it’ll still be there. You could still send them a message they don’t help at all.

@missamericausa
June 12, 2025 at 9:28 am

Google is the worst. False info represented over Google and other search engines that destroyed my reputation from people posting lies from stalkers. And it’s impossible to change and take down. Needless to say lots of jobs hard time finding a place to live and family and friends disowning you. All due to false information on Google. Spoken to Google 1000 times, police report, nobody does nothing. This is going on for eight years plus I’m still trying to take things down. And now they just ignore you. Have had reputation correctors and lawyers. And they still don’t cooperate. They keep accepting posts from unknown sources and keep posting negative information.

@glennferrell2902
June 12, 2025 at 9:28 am

Disappointing. I doubt anyone watching this wants to discuss the 1990s. With input from a team leader (or anyone closer to the development process) and a little preparation, this could have provided some real value in half the time.

@Praharshrhtdl
June 12, 2025 at 9:28 am

Provide us a meditation room to understand your talks, since there is no visual representation. It's like a voice coming from space….
Very boring explanation….

@طارقمكاوي-ذ6ظ
June 12, 2025 at 9:28 am

❤

@AustineMorgan-p2f
June 12, 2025 at 9:28 am

Very informative episode! Can you see HasData playing a role in scraping data for SEO analysis like Googlebot does?

@RossDunn
June 12, 2025 at 9:28 am

Heads up – The link to the transcript goes to a denied access error.

@LouStoriale
June 12, 2025 at 9:28 am

I hope you are employed at Google for the rest of your long lives, because this unprofessionalism and lack of useful content wouldn't fly at any other company.

@THEROBINKEVIN
June 12, 2025 at 9:28 am

How do I increase crawls on my website?

@eloueryaghlymohamed
June 12, 2025 at 9:28 am

Hello.
Before moving the podcast to YouTube, a document containing the video script was published with each podcast.
This was helpful as it made the content more accessible to non-English speakers, at least for translation.

Would you like to share the video script in future videos or elsewhere? Please.

Thank you very much.

@Pastora-q6l
June 12, 2025 at 9:28 am

Thanks 🎉

