Search Bot (Crawler)

A search engine's crawler (e.g. Googlebot, YandexBot).

In brief

A search bot (crawler, spider) is an automated search-engine program that traverses web pages, downloads their content, and follows links to discover new documents. Crawling and indexing are controlled via robots.txt and the robots meta tag. Examples: Googlebot, YandexBot, Bingbot.

What is a crawler

A crawler (spider) is a bot that browses pages and collects content for indexing; the best known are Googlebot, YandexBot, and Bingbot. Bots run continuously, visiting sites at varying frequencies.

Managing bots

  • robots.txt — blocks crawling of specified sections of the site.
  • meta robots (noindex, nofollow) — page‑level control over indexing and link following.
  • X‑Robots‑Tag — an HTTP response header that applies the same directives to non‑HTML files (PDFs, images).
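As a sketch of how these controls combine, a minimal robots.txt might look like the following (all paths and the domain are hypothetical; note that the `*` wildcard in paths is an extension supported by Google, Yandex, and Bing rather than part of the original robots.txt standard):

```
# robots.txt — hypothetical example
User-agent: *
Disallow: /admin/          # keep bots out of the admin area
Disallow: /*?sort=         # avoid spending crawl budget on sort parameters
Sitemap: https://example.com/sitemap.xml

# For non-HTML files, the equivalent is an HTTP header set in server config:
#   X-Robots-Tag: noindex
```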

Crawl budget

Search bots have a limited crawl budget for each site (depends on popularity, size, server speed). Prioritising pages helps the bot spend that budget on the most important content.

If important pages aren’t being refreshed in the index, check your logs — the bot may be wasting budget on filter parameters or poorly optimised pagination.
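A log check like the one described above can be sketched in a few lines of Python. This example assumes access logs in the common "combined" format and uses a few hypothetical sample lines; it counts which URLs Googlebot requests and how many hits go to parameterised URLs (a typical symptom of wasted crawl budget):

```python
import re
from collections import Counter

# Hypothetical sample lines in combined log format.
LOG_LINES = [
    '66.249.66.1 - - [10/May/2024:06:25:01 +0000] "GET /catalog?color=red&sort=price HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/May/2024:06:25:03 +0000] "GET /catalog?color=blue HTTP/1.1" 200 5100 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/May/2024:06:25:05 +0000] "GET /about HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [10/May/2024:06:25:09 +0000] "GET /blog/post-1 HTTP/1.1" 200 7300 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

# Extract the request path from the quoted request line.
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP')

def googlebot_paths(lines):
    """Count URL paths requested by Googlebot (matched by user-agent substring)."""
    counts = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        m = REQUEST_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

counts = googlebot_paths(LOG_LINES)
with_params = sum(n for url, n in counts.items() if "?" in url)
print(counts.most_common())
print(f"{with_params} of {sum(counts.values())} Googlebot hits went to parameterised URLs")
```

On real logs, a high share of parameterised URLs in this output suggests the bot is crawling filter combinations instead of your important pages.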

Common questions

How often do bots visit a site? Frequency depends on the site's authority and how often it updates. Googlebot may visit a new site every few days, and a popular news site every few minutes.

Can crawling be blocked entirely? Yes, with "User-agent: *" followed by "Disallow: /" in robots.txt. But be careful — this will stop crawling of the entire site.

How can I see what the bot has crawled? Check your server logs or the Crawl Stats report in Google Search Console.
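When checking logs, keep in mind that any client can claim to be Googlebot in its user-agent string. Google's documented verification procedure is a reverse DNS lookup on the IP, a check that the hostname belongs to googlebot.com or google.com, then a forward lookup confirming the hostname resolves back to the same IP. A minimal sketch of that logic (the DNS calls require network access, so only the offline hostname check is exercised here):

```python
import socket

# Domains Google documents for its crawler hostnames.
GOOGLE_DOMAINS = (".googlebot.com", ".google.com")

def hostname_is_google(hostname):
    """Offline check: does a reverse-DNS hostname belong to Google's crawler domains?"""
    return hostname.endswith(GOOGLE_DOMAINS)

def verify_googlebot(ip):
    """Full check: reverse DNS, domain match, then forward DNS back to the same IP."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]              # reverse lookup
        if not hostname_is_google(hostname):
            return False
        return ip in socket.gethostbyname_ex(hostname)[2]   # forward confirmation
    except (socket.herror, socket.gaierror):
        return False
```

The forward-confirmation step matters: without it, an attacker controlling their own reverse DNS could present a spoofed googlebot.com hostname.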