Crawl Budget

The limit of resources that a search engine allocates to crawl your site. Especially important for large sites (10k+ pages).

In brief

Crawl budget is the number of URLs a search crawler can and wants to crawl on your site within a given period. If the budget is wasted on low‑value pages, new useful content may remain unindexed.

What is Crawl Budget

Crawl budget is the limit of resources that a search engine allocates to crawling your site. It matters most for large sites with 10k+ pages.

Why it Matters

If the crawler wastes time on duplicate filter pages or session URLs, it may never reach new, useful products. Optimizing the budget means keeping trash out of the crawl with robots.txt and improving internal linking.

Factors Affecting Budget

  • Site popularity — more backlinks → higher budget
  • Server response speed — slow responses reduce budget
  • Amount of low‑value pages — duplicates, filters, sessions
  • Content freshness — frequent updates increase budget

How to Optimize Crawl Budget

  • Block useless pages from crawling in robots.txt (this controls crawling, not indexing)
  • Use noindex on trash pages (filters, sorts, sessions)
  • Ensure fast server response (caching, CDN)
  • Link to important pages internally so the crawler reaches them quickly
  • Properly configure XML Sitemap — include only canonical, useful URLs
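  • Avoid infinite spaces (calendars, pagination date loops)

Keep in mind that robots.txt and noindex do different jobs. A robots.txt block stops the crawler from fetching a URL, which saves budget, but the URL can still end up in the index (without its content) if internal or external links point to it. A noindex directive keeps a page out of the index, but the crawler has to fetch the page to see it, so on its own it saves no budget. Do not combine the two on one URL: if the page is blocked in robots.txt, its noindex is never read. The URL Parameters tool in GSC was retired in 2022, so parameter URLs now have to be handled with robots.txt rules, canonical tags, and consistent internal linking.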
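To check which URLs a set of robots.txt rules would keep out of the crawl, a minimal sketch with Python's standard urllib.robotparser could look like this. The rules and paths (/filter/, /session/, /cart) and the helper name crawlable are hypothetical, chosen only for illustration. The standard-library parser matches plain path prefixes, so the sketch avoids wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an online shop; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filter/
Disallow: /session/
Disallow: /cart
"""


def crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/products/widget",
        "https://example.com/filter/color-red",
        "https://example.com/cart",
    ):
        print(candidate, crawlable(candidate))
```

Running a list of known URLs through such a check before deploying new rules helps confirm that only low-value sections are blocked and that product pages stay reachable.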

Common questions

Do small sites need to worry about crawl budget? Usually not. Google can easily crawl 1,000 pages in one go. Indexing problems on small sites usually have other causes: server errors, a robots.txt block, or poor internal linking.
How do I find out my site's crawl budget? Google does not disclose an absolute number. You can infer it from the Crawl Stats report in Google Search Console. Look at the trend: how many requests Googlebot makes per day and how many pages it crawls.
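The same trend can be cross-checked against your own server logs. The sketch below counts requests per day whose log line mentions Googlebot. It assumes logs in the common combined format, and the sample lines and the function name googlebot_hits_per_day are made up for illustration. A user agent string can be spoofed, so treat the counts as an approximation.

```python
import re
from collections import Counter

# Date part of a combined-log timestamp, e.g. [12/Mar/2024:06:25:01 +0000]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical sample log lines, not real traffic.
SAMPLE_LOG = [
    f'66.249.66.1 - - [12/Mar/2024:06:25:01 +0000] "GET /products/widget HTTP/1.1" 200 5123 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [12/Mar/2024:06:25:09 +0000] "GET /filter/color-red HTTP/1.1" 200 4980 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [12/Mar/2024:07:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    f'66.249.66.1 - - [13/Mar/2024:01:02:03 +0000] "GET /products/gadget HTTP/1.1" 200 5000 "-" "{GOOGLEBOT_UA}"',
]


def googlebot_hits_per_day(log_lines):
    """Count requests per day for lines whose text mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


if __name__ == "__main__":
    for day, count in googlebot_hits_per_day(SAMPLE_LOG).items():
        print(day, count)
```

Grouping the same counts by URL path instead of by day would show which sections of the site absorb most of the crawl.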
How can I increase my crawl budget? Improve server speed (reduce TTFB), keep useless URL parameters out of the crawl with robots.txt rules and canonical tags, remove duplicates, set up clean internal linking, and keep your XML Sitemap up to date.
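For the sitemap part, one possible approach is to generate the file from your page inventory and let in only URLs that are indexable and are their own canonical. The sketch below assumes the inventory is available as (url, canonical_url, indexable) tuples. That data shape and the function name build_sitemap are assumptions for this example, not part of any particular CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, canonical_url, indexable) tuples.

    A URL is included only if it is indexable and is its own canonical.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, canonical, indexable in pages:
        if indexable and url == canonical:
            url_element = ET.SubElement(urlset, "url")
            ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page inventory.
    inventory = [
        ("https://example.com/a", "https://example.com/a", True),
        ("https://example.com/a?sort=price", "https://example.com/a", True),
        ("https://example.com/tmp", "https://example.com/tmp", False),
    ]
    print(build_sitemap(inventory))
```

In this example only https://example.com/a ends up in the output: the sorted variant points to a different canonical and the temporary page is marked as not indexable.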
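Does pagination affect crawl budget? Yes. If you have many pagination pages with non‑canonical parameters (page=2, page=3…), Google may crawl them. Google no longer uses rel='prev'/'next' as an indexing signal, though other search engines may still read it. If deep pagination adds no value, keep it out of the crawl with robots.txt rules, and make sure the items it lists are reachable through categories or the XML Sitemap.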
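Why are my new pages not getting crawled or indexed? The budget is likely going to outdated or low‑value content. Check for duplicates, endless filters, and session URLs. Keep unnecessary dynamic parameters out of the crawl with robots.txt rules and canonical tags (the URL Parameters tool in GSC was retired in 2022).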
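One way to check for duplicates is to normalize crawled URLs by dropping parameters that do not change the content, then group URLs that collapse to the same address. This is a rough sketch: the parameter names in IGNORED_PARAMS and the function names are hypothetical, and the right set of parameters depends on how your site builds its URLs.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change page content on the example site.
IGNORED_PARAMS = {"sessionid", "sort", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url):
    """Drop ignored parameters, sort the rest, and remove the fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Return {normalized_url: [original urls]} for groups with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    crawled = [
        "https://example.com/shoes?color=red&sort=price",
        "https://example.com/shoes?sessionid=abc&color=red",
        "https://example.com/shoes?color=blue",
    ]
    for canonical, variants in group_duplicates(crawled).items():
        print(canonical, variants)
```

Large groups in the output point to parameters that are worth blocking in robots.txt or consolidating with canonical tags.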