Pagination in SEO

How pagination affects crawl budget and indexing, when to use noindex vs canonical, with code examples for HTML, PHP and sitemap.
Pagination splits large content lists into separate pages with unique URLs: /catalog/page/2, /catalog/page/3. The concept is simple, but pagination is one of the biggest drains on crawl budget for large sites: Googlebot spends its visits on hundreds of templated listing pages instead of product pages, and part of the catalog never makes it into the index.
What is pagination
Pagination splits a large set of content — products, articles, reviews — into sequential pages with unique URLs. Common patterns: /catalog/?page=2, /catalog/page/2/, /catalog/2/. Search engines treat each URL as a separate document.
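Because the same series can be expressed in several URL shapes, audit scripts usually need to normalize them first. The following is a minimal Python sketch, assuming only the three patterns above. The `page_number` helper is hypothetical, and treating any trailing numeric segment as a page number is an assumption that breaks if /catalog/2/ is a product ID on your site.

```python
import re
from urllib.parse import urlparse, parse_qs

# Path patterns from the text: /catalog/page/2/ and /catalog/2/
_PATH_PAGE = re.compile(r"/page/(\d+)/?$")
_PATH_BARE = re.compile(r"/(\d+)/?$")  # assumption: trailing number means page


def page_number(url: str) -> int:
    """Return the pagination page number encoded in a URL (1 if none)."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    # Query-string pattern: /catalog/?page=2
    if "page" in query and query["page"][0].isdigit():
        return int(query["page"][0])
    for pattern in (_PATH_PAGE, _PATH_BARE):
        match = pattern.search(parts.path)
        if match:
            return int(match.group(1))
    return 1  # no page marker: treat as the first page
```

With this helper, `page_number("https://example.com/catalog/page/2/")` and `page_number("https://example.com/catalog/?page=2")` both return 2, so later checks can treat the two as the same position in the series.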
Three pagination formats behave fundamentally differently from an SEO perspective:
- Numbered pages. Buttons "1 2 3 ... 100", where each page has a unique URL. Flexible SEO control: noindex, canonical, or full indexing.
- Infinite scroll. Content loads on scroll without URL changes. Without JavaScript rendering, Googlebot only sees the first screen. Requires an HTML fallback.
- "Load more" button. The URL doesn't change; content is appended on click. Same crawling problem: without JS rendering, only the initial content is visible to Googlebot.
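A rough way to see what a crawler gets without JavaScript is to inspect the raw HTML for plain `<a href>` pagination links. The Python sketch below uses only the standard library. The `/page/` marker and the function name are assumptions for illustration, and it does not fetch or render anything.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in raw, unrendered HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawlable_pagination_links(raw_html: str, marker: str = "/page/") -> list[str]:
    """Return pagination links that exist without executing JavaScript."""
    collector = LinkCollector()
    collector.feed(raw_html)
    return [href for href in collector.hrefs if marker in href]
```

An empty result for a listing page that clearly has more items is a hint that the series depends on scroll or click handlers and needs an HTML fallback.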
Crawl budget and duplicate content
A category with 500 products at 20 per page generates 25 pagination URLs. With 50 such categories, that's 1,250 URLs Googlebot must crawl before reaching individual product pages.
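The arithmetic generalizes to any catalog. Below is a small Python sketch, with hypothetical parameter names, that counts listing URLs (page 1 included) under the assumption that every category holds the same number of products.

```python
import math


def pagination_url_count(products_per_category: int, per_page: int, categories: int) -> int:
    """Total listing URLs across all categories, counting page 1 of each."""
    # A partially filled last page still needs its own URL, hence ceil.
    pages_per_category = math.ceil(products_per_category / per_page)
    return pages_per_category * categories


print(pagination_url_count(500, 20, 50))  # 25 pages x 50 categories = 1250
```

Raising the page size is the cheapest lever here: at 40 products per page the same catalog needs 13 pages per category, or 650 listing URLs.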
... noindex, follow to pages 2+, indexing time dropped to 3–5 days.
Three specific problems pagination creates for SEO:
- Duplicate content. Title and Description for the same category are identical across /page/1 and /page/15. The section heading repeats. Google sees similar pages and doesn't know which one to rank.
- PageRank dilution. Internal links are distributed across dozens of pagination URLs. Products on page 10+ receive minimal internal authority.
- Crawl budget waste. On small and mid-size sites, Googlebot burns crawl budget on templated listing pages instead of priority content.
Pagination format comparison
| Format | Crawlability | Duplicates | SEO solution |
|---|---|---|---|
| Numbered pages | Full crawl of all URLs | Yes (Title/Description) | noindex / canonical / full indexing |
| Infinite scroll | First screen only (no JS) | No (single URL) | HTML fallback + rendering |
| "Load more" | Visible content only | No (single URL) | SSR or JS rendering |
SEO configuration strategies
Strategy selection comes down to one question: do pages 2+ have standalone search value? If users never land on /catalog/page/7 from organic search, there's no point spending crawl budget on it.
Strategy 1: noindex pages 2+. Best for blogs, news and content sites, where pagination pages have no standalone search value. The noindex, follow directive removes them from the index while preserving link crawling, so crawl budget flows to actual content.
Strategy 2: rel=canonical to page 1. Softer than noindex: pages 2+ stay accessible (for direct links, ads), but their canonical points to the first page of the section, e.g. /catalog/. Because canonical is a hint rather than a directive, Google usually, but not always, treats the series as one section with page 1 as the canonical version. Google's own pagination guidance recommends a self-referencing canonical on each page instead, so weigh this strategy against Strategy 3.
Strategy 3: full indexing. For large e-commerce sites where each pagination page contains unique products that need to be indexed. Requires a unique Title and Description per page, e.g. "Buy Laptops - Page 3", plus a self-referencing canonical on each page.
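The three strategies differ only in which tags pages 2+ emit. Here is a hedged Python sketch of that decision, assuming page 1 lives at the section root and deeper pages at `page/N/` beneath it. The strategy labels, the `head_tags` name, and the title format are illustrative, not a standard.

```python
from html import escape


def head_tags(base_url: str, page: int, strategy: str, section: str = "Catalog") -> list[str]:
    """Build the SEO-relevant <head> tags for one listing page.

    strategy: "noindex", "canonical" or "index" (hypothetical labels
    for Strategies 1, 2 and 3 respectively).
    """
    first = base_url  # assumption: page 1 is the section root
    current = first if page == 1 else f"{base_url}page/{page}/"

    if page == 1:
        # Page 1 is always indexable and self-canonical.
        return ['<meta name="robots" content="index, follow">',
                f'<link rel="canonical" href="{escape(first)}">']
    if strategy == "noindex":
        return ['<meta name="robots" content="noindex, follow">']
    if strategy == "canonical":
        return [f'<link rel="canonical" href="{escape(first)}">']
    if strategy == "index":
        return [f"<title>{escape(section)} - Page {page}</title>",
                f'<link rel="canonical" href="{escape(current)}">']
    raise ValueError(f"unknown strategy: {strategy}")
```

Keeping the page 1 branch first means no strategy can accidentally apply noindex or a foreign canonical to the main category page.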
If pagination pages drive any organic traffic, think twice before applying noindex.
Technical implementations
Most tasks require just three constructs: two in <head>, <meta name="robots"> and <link rel="canonical">, plus sitemap entries. Here are code examples for each.
rel=next/prev: deprecated attribute
Before 2019, Google supported rel="next" and rel="prev" to identify pagination series. In March 2019, Google announced it had stopped using these attributes. Adding them for Google is pointless. Bing formally supports them, but Bing's share in most markets is marginal.
noindex for pagination pages
Add to <head> on pages 2+. Key detail: use "noindex, follow" rather than "noindex, nofollow". With nofollow, Googlebot won't follow links from pagination pages, so products or articles linked only from those pages may never be discovered or indexed.
<!-- Page /catalog/page/2/ and above -->
<head>
<meta name="robots" content="noindex, follow">
</head>
<!-- Page /catalog/ (first page) — no noindex -->
<head>
<meta name="robots" content="index, follow">
</head>
In WordPress, implement this via a wp_head hook that checks is_paged():
// functions.php
// Print noindex, follow on pages 2+ of paginated archives only.
add_action('wp_head', function () {
    if (is_paged()) {
        echo '<meta name="robots" content="noindex, follow">' . "\n";
    }
});
rel=canonical to page 1
When pages 2+ must remain accessible but shouldn't duplicate the section, canonical points to the first page. Page 1 gets a self-referencing canonical.
<!-- /catalog/page/3/ -->
<head>
<link rel="canonical" href="https://example.com/catalog/">
</head>
<!-- /catalog/ — self-canonical (required) -->
<head>
<link rel="canonical" href="https://example.com/catalog/">
</head>
Pagination in sitemap.xml
Rule: only include pages in sitemap.xml that you want indexed. Pages with noindex in the sitemap create conflicting signals and waste crawl budget.
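One way to enforce this rule in a generator is to filter on an indexability flag before writing any `<url>` entry. The sketch below is a minimal Python illustration, not a drop-in generator: the `indexable` key and the `build_sitemap` name are hypothetical, and the output omits the XML declaration and optional tags.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[dict]) -> str:
    """Serialize only the pages whose 'indexable' flag is true."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if not page["indexable"]:
            continue  # noindex pages stay out of the sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
    return ET.tostring(urlset, encoding="unicode")
```

Driving the sitemap and the meta robots tag from the same flag is the design point: the two signals cannot disagree if they share one source of truth.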
<!-- sitemap.xml — indexable pages only -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<!-- Page 1 — always include -->
<url>
<loc>https://example.com/catalog/</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
<!-- Pages 2+ — only with full indexing strategy -->
<url>
<loc>https://example.com/catalog/page/2/</loc>
<changefreq>weekly</changefreq>
<priority>0.4</priority>
</url>
</urlset>
Common pagination mistakes
Most mistakes come from inattentive implementation rather than conceptual misunderstanding. CMS sitemap generators and meta tag templates often fail to account for pagination:
- Mistake 1. The if page > 1 condition is written incorrectly, so the main category page gets noindex too. Section traffic can drop to zero within weeks.
- Mistake 2. The sitemap generator automatically adds all URLs. Google receives conflicting signals: the sitemap says "index it", meta robots says "no". Crawl budget gets wasted resolving the conflict.
- Mistake 3. Page 1 is at /catalog/ but canonicals from pages 2+ point to /catalog/page/1/ (which returns 404). Google gets a broken canonical chain.
- Mistake 4. All pagination is JavaScript-only, with no static URLs. Googlebot sees only the first N items, and the rest of the catalog never gets indexed.
- Mistake 5. Under the full indexing strategy, all pages share one Title tag. Google sees duplicate content and reduces the section's relevance. Each page needs a unique Title and Description that include the page number.
Most of these errors surface in Google Search Console → Coverage (the Page indexing report in current versions of Search Console): pages marked noindex, excluded URLs, and canonical conflicts are all visible in the report.
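Several of these mistakes can be caught before deployment from a crawl export. The sketch below is a minimal Python illustration under stated assumptions: the dictionary keys (`url`, `page`, `robots`, `canonical`, `title`, `status`) and the `audit_pagination` name are hypothetical, and it only inspects the data you pass in rather than crawling anything.

```python
from collections import Counter


def audit_pagination(pages: list[dict], sitemap_urls: set[str]) -> list[str]:
    """Flag common pagination mistakes in a crawl export.

    Each page dict is assumed to hold: url, page (int), robots,
    canonical, title and status (HTTP code).
    """
    issues = []
    known_ok = {p["url"] for p in pages if p["status"] == 200}
    # Count titles among indexable pages only.
    titles = Counter(p["title"] for p in pages if "noindex" not in p["robots"])

    for p in pages:
        noindex = "noindex" in p["robots"]
        if p["page"] == 1 and noindex:
            issues.append(f"{p['url']}: first page is noindexed")
        if noindex and p["url"] in sitemap_urls:
            issues.append(f"{p['url']}: noindex page listed in sitemap")
        # A canonical outside the crawled 200 set is treated as broken.
        if p["canonical"] and p["canonical"] not in known_ok:
            issues.append(f"{p['url']}: canonical target is not a known 200 page")
        if not noindex and titles[p["title"]] > 1:
            issues.append(f"{p['url']}: duplicate title")
    return issues
```

Running a check like this in CI against a staging crawl catches template regressions before they reach Search Console, where the same problems take a recrawl to appear.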