What is Crawl Budget?
There are over a billion websites across the globe today. And this number continues to grow. In fact, data suggests that over 250,000 new websites are created every day. For Google, that’s a lot of ground to cover!
To manage this massive task, Google uses a “crawl budget” to determine how many pages on each website it can crawl and index within a given time frame. Put simply? It’s the amount of attention and resources Google’s crawlers can allocate to your site.
How Does It Work?
A crawler, like Googlebot, will systematically work through a list of URLs on a site. It starts with URLs it already knows about, following links within the site and discovering new pages as it goes. As it finds new links on these pages, it adds them to the list of pages to crawl next.
However, with so many pages on the internet, crawlers don’t have unlimited time or resources, so they prioritise certain pages based on factors like content quality, freshness, and relevance.
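To make that loop concrete, here’s a minimal Python sketch of a crawl frontier. The link graph, priority scores and page URLs below are all hypothetical, and real crawlers like Googlebot use far more sophisticated scheduling, but the core idea is the same: pick the highest-priority known URL, “fetch” it, and queue up any new links it reveals, until the budget runs out.

```python
import heapq

# Hypothetical link graph: each URL maps to the links found on that page.
SITE = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/about": ["/"],
    "/blog/post-1": ["/blog"],
    "/blog/post-2": ["/blog", "/blog/post-1"],
}

# Hypothetical priority scores (higher = crawl sooner), standing in for
# signals like freshness, inbound links and relevance.
PRIORITY = {"/": 10, "/blog": 8, "/blog/post-1": 5, "/blog/post-2": 5, "/about": 2}

def crawl(start: str, budget: int) -> list[str]:
    """Crawl up to `budget` pages, highest-priority first."""
    # heapq is a min-heap, so scores are negated to pop the highest first.
    frontier = [(-PRIORITY[start], start)]
    seen = {start}
    crawled = []
    while frontier and len(crawled) < budget:
        _, url = heapq.heappop(frontier)
        crawled.append(url)  # "fetch and index" the page
        for link in SITE.get(url, []):  # discover new links on the page
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-PRIORITY.get(link, 0), link))
    return crawled

print(crawl("/", budget=3))  # ['/', '/blog', '/blog/post-1']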
What Determines Crawl Budget?
A website’s “crawl budget” is determined by a number of factors, including the size of your site, your site’s health (think loading speed and server response times) and how frequently your content is updated. The most common factors include:
- Importance of webpage: Since web crawlers can’t crawl the entire internet, they decide which pages to crawl based on how many pages link to a given page, how many visitors it gets, and other signals that suggest it is relevant or important to users.
- Updated/New Webpages: If a site regularly adds new pages or updates existing ones, crawlers are likely to revisit these pages to ensure they are up to date.
- Robots.txt Requirements: A robots.txt file guides crawlers on which pages they should or shouldn’t crawl. If properly configured, it can help focus crawl budget on the pages that matter most (see the example after this list).
- Size of Site: Larger sites naturally require more crawl budget simply because they have more pages to be indexed. However, if a large portion of these pages are low-quality or duplicate content, it can waste the budget and leave important pages uncrawled.
- Health of Site: The speed and reliability of a site also play a role in budget allocation. If a site is slow or frequently returns errors, crawlers might spend less time on it, reducing the crawl budget.
- Crawl limit: This is how much crawling a website can handle, and it is influenced by factors such as server capacity and any crawl rate limit set by the site owner in Search Console.
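For illustration, here’s what a simple robots.txt might look like. The paths below are hypothetical, so the rules would need tailoring to your own site:

```text
User-agent: *
# Keep crawlers away from low-value URLs (hypothetical paths)
Disallow: /search
Disallow: /cart

Sitemap: https://www.example.com/sitemap.xml
```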
While Google doesn’t share exact details of your website’s crawl budget, you can monitor crawl activity in Search Console’s Crawl Stats report.
Why Does Crawl Budget Matter?
When you have a website, you want your most important pages (new, old and refreshed) to be discovered, crawled, and indexed by search engines as quickly as possible.
However, not all pages on your website are created equal. Some pages are more important than others, and if search engines run out of crawl budget before getting to these pages, they might not be indexed or updated in search results as quickly as you’d like.
While not necessarily a big concern for small or new websites, efficient use of your crawl budget can help ensure that search engines focus on your most valuable content. This can improve your site’s visibility in search results, ensuring that users find the pages that matter most to them when searching for relevant information.
When Is Crawl Budget an Issue?
As outlined by Google, crawl budget allocation may become an issue if you have:
- Large sites (1 million+ unique pages) with content that changes around once a week.
- Medium or larger sites (10,000+ unique pages) with content that changes daily.
- Sites with URLs classified by Search Console as “Discovered – currently not indexed”.
How To Optimise Your Crawl Budget
There are a number of ways to make the most of your website’s crawl budget. These include:
- Remove or Consolidate Low-Value Pages: If you have pages that don’t add much value, consider removing them or merging them with more relevant content.
- Improve Page Load Times: Faster pages mean crawlers can get through more of your site in less time. Compress images, minify code, and use caching to boost speed. (This isn’t just great for crawl budget, it’s also fantastic for users!)
- Fix Broken Links: Broken links lead crawlers to dead ends, wasting valuable crawl budget. Regularly check for and fix any broken links so crawlers can move cleanly between pages across your site (a simple link-checker sketch follows this list).
- Optimise Internal Links: Googlebot prioritises pages with more backlinks and internal links, as these signal to the crawler that a page is important. By optimising your internal and external link building strategy, you can guide crawlers to key pages.
- Use a Sitemap: Submitting a sitemap to search engines helps them find your important pages more easily, ensuring they get crawled and indexed (see the example sitemap below).
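As a starting point for the broken-link check above, here’s a minimal Python sketch using the widely used requests library. The URL list is hypothetical; in practice you’d feed in URLs collected from your sitemap or a site crawl:

```python
import requests

# Hypothetical list of internal links to check; in practice, collect these
# from your sitemap or a crawl of your site.
urls = [
    "https://www.example.com/",
    "https://www.example.com/blog",
    "https://www.example.com/old-page",
]

for url in urls:
    try:
        # HEAD keeps the check lightweight; note some servers only answer GET.
        response = requests.head(url, allow_redirects=True, timeout=10)
        if response.status_code >= 400:
            print(f"Broken ({response.status_code}): {url}")
    except requests.RequestException as exc:
        print(f"Failed to reach {url}: {exc}")
```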
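And to illustrate the final point, a minimal XML sitemap looks like this (the URLs and dates are placeholders). You can submit it through Search Console or reference it from your robots.txt file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/important-post</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```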