Google: Gary Illyes and Lizzi Sassman on Web Crawlers

In the latest episode of Search Off the Record, Gary Illyes and Lizzi Sassman took a deep dive into crawling the web: what a web crawler is and how it really works.

The key takeaways from the conversation:

  • A crawler, also known as a spider or web spider, is a software program that downloads information from websites. Search engines use crawlers to discover and download pages so that they can be indexed.
  • Crawlers schedule, fetch, and process information retrieved from the web. They schedule what to fetch based on various signals including links from other sites and sitemaps. Fetching refers to downloading the content of the webpages. Processing refers to analyzing the fetched content.
  • Crawl budget is a term used to describe the amount of resources that a search engine is willing to spend on crawling a particular website. This can be influenced by factors such as the size of the website, the frequency of updates, and the importance of the website to search users.
  • There are myths about crawl budget. For example, some people believe that you can pay Google to increase your crawl budget. This is not the case.
  • Crawlers that are not careful can overload web servers. Well-behaved crawlers respect robots.txt files, which site owners use to tell crawlers which parts of a website should not be crawled.
  • Some webmasters mistakenly believe that they need to get all of their pages crawled by search engines. This is not necessarily the case. In fact, it can be beneficial to block search engines from crawling pages that are not useful for search results, such as login pages or pages with duplicate content.
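
To make the schedule, fetch, and process steps and the robots.txt check more concrete, here is a minimal sketch in Python. It is a toy illustration, not how Googlebot or any production crawler is built. The example.com pages, the robots.txt rules, the "toy-crawler" user agent, and the crawl() and LinkExtractor names are all hypothetical, and fetching is simulated with an in-memory dictionary so the code runs without network access. Only the standard library is used.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "website" so the sketch runs without network access.
PAGES = {
    "https://example.com/": '<a href="/blog">Blog</a> <a href="/login">Log in</a>',
    "https://example.com/blog": '<a href="/">Home</a> <a href="/blog/post-1">Post</a>',
    "https://example.com/blog/post-1": "<p>No links here.</p>",
    "https://example.com/login": "<form>...</form>",
}

# Hypothetical robots.txt that keeps crawlers away from the login page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login
"""


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (part of the 'process' step)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, user_agent="toy-crawler"):
    """Breadth-first toy crawl that skips URLs disallowed by robots.txt."""
    robots = RobotFileParser()
    robots.parse(ROBOTS_TXT.splitlines())

    frontier = deque([seed])  # schedule: URLs waiting to be fetched
    seen = {seed}             # avoid scheduling the same URL twice
    crawled = []

    while frontier:
        url = frontier.popleft()
        if not robots.can_fetch(user_agent, url):
            continue  # respect robots.txt: do not fetch disallowed URLs
        html = PAGES.get(url)  # fetch: stand-in for a real HTTP request
        if html is None:
            continue
        crawled.append(url)
        parser = LinkExtractor()
        parser.feed(html)  # process: analyze the content, here only for links
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                frontier.append(link)  # newly found links feed the schedule
    return crawled


if __name__ == "__main__":
    for page in crawl("https://example.com/"):
        print(page)
```

In this sketch the login page is discovered through a link but never fetched, because the hypothetical robots.txt disallows it. A real crawler would also weigh signals such as sitemaps and server load when deciding what to fetch and how fast, which this example leaves out.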

PPC Land is an international news publication headquartered in Frankfurt, Germany. PPC Land delivers daily articles brimming with the latest news for marketing professionals of all experience levels.
