401 Unauthorized: meaning and SEO impact

What HTTP 401 Unauthorized means, how it differs from 403 Forbidden, how it affects Googlebot crawling, and what to do when it appears on public pages.

In brief

401 Unauthorized is an HTTP response code the server returns when a request carries no authentication credentials or the credentials fail verification. Unlike 403, it is not a final denial: it is an invitation to authenticate.

What is 401 Unauthorized

401 Unauthorized is an HTTP status from the 4xx group (client errors) that the server returns when a request requires authentication but none was provided or the credentials failed verification. Along with the status, the server sends a WWW-Authenticate header (required by RFC 9110) indicating which authentication scheme to use.

The name 'Unauthorized' is slightly misleading: the code signals missing or invalid authentication, not a lack of authorization. 'Unauthenticated' would be more accurate, and that is the meaning RFC 9110 gives the code: the request 'lacks valid authentication credentials for the target resource'.

401 is not a verdict — it is a prompt: 'Identify yourself and we will let you in.' The server assumes that a repeated request with valid credentials will succeed. That is why browsers automatically show a login dialog when they receive 401.
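
To make the "prompt" concrete, here is a minimal sketch of how a client might read the WWW-Authenticate challenge to learn which scheme the server expects. The parseChallenge function and the Challenge type are hypothetical names introduced for illustration, and the parser assumes a single challenge per header with key="value" or key=value parameters; real headers can be more complex.

```typescript
// Parsed form of one challenge, e.g. Basic realm="Admin Area"
interface Challenge {
  scheme: string;
  params: Record<string, string>;
}

// Hypothetical helper, not a library API.
// Assumption: one challenge per header; multi-challenge headers and
// token68 values are not handled.
function parseChallenge(header: string): Challenge {
  const trimmed = header.trim();
  const spaceIndex = trimmed.indexOf(' ');
  if (spaceIndex === -1) {
    // Scheme only, no parameters
    return { scheme: trimmed, params: {} };
  }

  const scheme = trimmed.slice(0, spaceIndex);
  const rest = trimmed.slice(spaceIndex + 1);
  const params: Record<string, string> = {};

  // Matches key="quoted value" or key=unquoted
  const paramPattern = /([A-Za-z0-9_-]+)\s*=\s*(?:"([^"]*)"|([^\s,]+))/g;
  let match: RegExpExecArray | null;
  while ((match = paramPattern.exec(rest)) !== null) {
    params[match[1].toLowerCase()] = match[2] !== undefined ? match[2] : match[3];
  }

  return { scheme, params };
}
```

A client that receives 401 can pass the header value to this function, check the scheme (Basic, Bearer, and so on), and decide whether it is able to retry with credentials.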

401 vs 403: the difference

Both codes indicate an access refusal, but for different reasons:

Parameter | 401 Unauthorized | 403 Forbidden
Reason for refusal | Authentication not provided or failed | Access denied regardless of authentication
What the server knows about the user | Nothing: the user has not identified themselves | User is known but lacks the required permissions
Answer to 'Who are you?' | You did not answer or answered incorrectly | You are known but you are not allowed here
Retry with credentials | May work | Will not help: different permissions are needed
Example | Password-protected page without login | Admin-only page for a logged-in regular user

Misusing these codes is a common mistake. If you know for certain the user is authenticated but lacks the required permissions, return 403, not 401. Otherwise the browser will show a login dialog again even though the user is already logged in.
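
The rule above fits in a few lines. This is a sketch only: the User shape, the role model, and the accessStatus name are assumptions made for illustration, not part of any framework.

```typescript
// Hypothetical user model: an id plus a list of role names
interface User {
  id: string;
  roles: string[];
}

// Returns the status a guard would send for a resource that needs `requiredRole`.
// `user` is null when the request carried no valid credentials.
function accessStatus(user: User | null, requiredRole: string): 200 | 401 | 403 {
  if (user === null) {
    return 401; // not authenticated: ask the client to log in
  }
  if (user.roles.indexOf(requiredRole) === -1) {
    return 403; // authenticated, but lacks the required permission
  }
  return 200;
}
```

The order of the checks matters: authentication is tested first, so a logged-in user without the role gets 403 and never triggers another login prompt.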

Impact on SEO and indexing

When Googlebot requests a URL and receives a 401 in response, it cannot read the page content. The result: the page is not indexed or gets dropped from the index. If 401 appears on public pages, it is a serious SEO problem:

  • The page can lose rankings and drop out of the index, typically within a few weeks.
  • Crawl budget is wasted: the crawler visits the URL but receives nothing.
  • Link equity pointing to blocked pages is not passed further.
  • Google Search Console lists the URLs in the Page indexing report (formerly Coverage) under 'Blocked due to unauthorized request (401)'.
Google does not penalize sites for having 401 on closed sections (user account areas, CMS, APIs). The problem only arises when 401 prevents indexing of pages that should be publicly accessible.

Key facts:
4xx (client errors): request-side HTTP error group
401 (authenticate): asks the client to provide credentials
403 (forbidden): credentials won't help, no permission
0% of content reaches Googlebot: the page is not indexed on 401

Common causes of 401 on public pages

  1. HTTP Basic Auth on the entire site or section. Often set up on staging and then forgotten before launch, or accidentally applied to the production environment.
  2. Misconfigured CDN or reverse proxy. The intermediate layer requires a token or signature that Googlebot does not have.
  3. Expired API tokens. Pages generated via an API with an expired key return 401 instead of content.
  4. Application code error. A middleware or guard accidentally requires authentication where it should not.
  5. WordPress or other CMS after an update. Content-protection plugins may change settings during an update.
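Staging environments deserve special attention. Googlebot sometimes discovers staging domains through stray links, and a Basic Auth rule set up there is easy to carry over to production by accident. For staging, combine Basic Auth with a robots.txt Disallow rule or an IP allowlist rather than relying on Basic Auth alone, and confirm at launch that none of these restrictions reached the production configuration.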
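
For cause 4, one way to keep a guard from leaking onto public pages is to list the protected path prefixes explicitly and match them on path boundaries. The sketch below is illustrative: PROTECTED_PREFIXES and requiresAuth are hypothetical names, and the prefixes are example values rather than a recommendation for any specific site.

```typescript
// Hypothetical prefixes that need a login; every other path stays public
const PROTECTED_PREFIXES: string[] = ['/admin', '/account', '/api/private'];

// True when the path is a protected prefix itself or sits below one.
// Matching on "prefix + '/'" prevents '/administrator-guide' from
// being treated as part of '/admin'.
function requiresAuth(
  pathname: string,
  protectedPrefixes: string[] = PROTECTED_PREFIXES
): boolean {
  return protectedPrefixes.some(
    (prefix) => pathname === prefix || pathname.indexOf(prefix + '/') === 0
  );
}
```

A middleware can call requiresAuth before it checks credentials and let every other request through untouched, so a new public URL never inherits a 401 by accident.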

How to fix 401 on public pages

Diagnosis and resolution workflow:

  1. Find problematic URLs via Google Search Console → Page indexing (formerly Coverage) or a crawler (Screaming Frog).
  2. Check the response code with curl — confirm the problem is reproducible.
  3. Identify where authentication is configured: server (Apache/Nginx), CDN, or application middleware.
  4. Remove the authentication requirement from public URLs or add an exception for them.
  5. If staging needs protection — add robots.txt with Disallow: / or restrict access by IP.
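
Steps 1 and 2 can also be scripted when the list of public URLs is long. The sketch below is one possible shape, not a finished tool: StatusFetcher, AuditRow, classify, and findBlockedUrls are hypothetical names, and the function that performs the HTTP request is passed in so the logic stays independent of any particular HTTP client.

```typescript
// A function that returns the HTTP status code for a URL (supplied by the caller)
type StatusFetcher = (url: string) => Promise<number>;

interface AuditRow {
  url: string;
  status: number;
  blocked: boolean; // true when a crawler would be asked to authenticate
}

// Pure classification step: a 401 on a URL meant to be public is a problem
function classify(url: string, status: number): AuditRow {
  return { url, status, blocked: status === 401 };
}

// Requests every URL with the supplied fetcher and returns only the blocked ones
async function findBlockedUrls(
  urls: string[],
  getStatus: StatusFetcher
): Promise<AuditRow[]> {
  const rows = await Promise.all(
    urls.map(async (url) => classify(url, await getStatus(url)))
  );
  return rows.filter((row) => row.blocked);
}

// Example fetcher, assuming Node 18+ where fetch is available globally:
// const getStatus: StatusFetcher = (url) =>
//   fetch(url, { redirect: 'manual' }).then((res) => res.status);
// findBlockedUrls(['https://example.com/page'], getStatus).then(console.log);
```

Feeding it the URLs from the sitemap gives a quick list of public pages that currently answer 401, which can then be confirmed one by one with curl.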

```bash
# Check the response code
curl -o /dev/null -s -w "%{http_code}\n" https://example.com/page
# Expected: 200 (or 301/302)
# If you see 401, there is a problem

# Check with Basic Auth credentials
curl -u login:password -o /dev/null -s -w "%{http_code}\n" https://example.com/page
```

```apache
# Apache: protect a directory while keeping public pages open
<Directory /var/www/html/admin>
    AuthType Basic
    AuthName "Admin Area"
    AuthUserFile /etc/.htpasswd
    Require valid-user
</Directory>

# Public pages are NOT touched: accessible without authentication
```

```nginx
# Nginx: Basic Auth for /admin only, public pages remain open
server {
    # Public section
    location / {
        try_files $uri $uri/ =404;
    }

    # Protected section
    location /admin/ {
        auth_basic "Admin Area";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
```
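
```typescript
// Next.js: API route with a correct 401
import { NextRequest, NextResponse } from 'next/server';

// Placeholder check (an assumption, not part of Next.js): replace with real
// verification such as a JWT signature check or a session lookup.
// API_TOKEN is a hypothetical environment variable.
function isValidToken(header: string): boolean {
  const expected = process.env.API_TOKEN;
  return Boolean(expected) && header === `Bearer ${expected}`;
}

export async function GET(request: NextRequest) {
  const token = request.headers.get('authorization');

  // Missing or invalid credentials: answer 401 and name the expected scheme
  if (!token || !isValidToken(token)) {
    return NextResponse.json(
      { error: 'Authentication required' },
      {
        status: 401,
        headers: { 'WWW-Authenticate': 'Bearer realm="API"' },
      }
    );
  }

  return NextResponse.json({ data: 'protected content' });
}
```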

Common questions

A persistent 401 removes a page from the index gradually. If Googlebot consistently receives 401 on a URL that was previously accessible, it lowers the crawl priority and eventually excludes the page from the index. The process may take a few weeks, and Google may keep rechecking more authoritative pages for longer.

Using 401 to keep staging out of the index works, because Googlebot cannot get past Basic Auth and never sees the content. It can still discover staging URLs through links and keep requesting them, so add a second layer: robots.txt with Disallow: / or an IP restriction. Best practice: robots.txt plus Basic Auth together.

For indexing, 401 and 403 have the same effect: Googlebot receives no content and cannot index the page. The difference is semantic: 401 means 'authenticate yourself', 403 means 'you have no permission even if you authenticate'. For SEO, what matters is that both codes block indexing of public content.

After the 401 is fixed, Google picks up the change at the next crawl of that URL. If the page is important and previously had traffic, the recheck often happens within a few days. You can speed this up via URL Inspection in Google Search Console → Request indexing.

The WWW-Authenticate header is required with 401 per RFC 9110; a 401 without it is technically non-compliant. Browsers and clients expect this header to know which authentication method is supported. In practice, many APIs return 401 without it and nothing breaks, but following the standard is the correct approach.

The Disavow tool is not needed after fixing 401: it is a tool for link spam, not HTTP codes. Once the 401 is fixed, Google will update the status on the next crawl, and there is no need to manually remove URLs from Search Console.