Start with the URL Inspection tool
Before guessing, paste the URL into Google Search Console → URL Inspection. The tool reports the page's indexing status and the reason Google gives for excluding it, which narrows down which of the nine reasons below applies. The most common labels you'll see are:
- Discovered – currently not indexed: Google knows about the URL but hasn't crawled it. Almost always a crawl-budget or content-quality issue.
- Crawled – currently not indexed: Google fetched the page but decided not to keep it. Usually thin content, duplicates, or low domain trust.
- Excluded by noindex tag: A technical block on your end — easy to fix.
- Blocked by robots.txt: A different technical block — also easy.
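- Page with redirect / Alternate page with proper canonical tag: The URL redirects elsewhere or declares a different canonical, so Google indexes the target instead. This is only a problem if you wanted this exact URL indexed.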
Reason 1: A noindex directive is set
Either a <meta name="robots" content="noindex"> tag in the HTML head or an X-Robots-Tag: noindex HTTP response header is telling Google to skip the page. This is a common cause of indexing failure on new sites because some frameworks and staging setups ship with a development-mode noindex that never gets removed in production.
Fix: run curl -I <url> and check the response headers for X-Robots-Tag. Then view the page source and search for the literal string "noindex". Remove the directive wherever it appears and redeploy. Then request re-indexing with URL Inspection → Request Indexing. Google documents the Indexing API only for job-posting and livestream pages, so treat it as unsupported for other page types.
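The header and source checks above can be scripted for many URLs at once. The following is a minimal sketch, assuming you have already fetched the response headers and HTML yourself; the function name find_noindex is illustrative, not part of any Google tooling.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())


def find_noindex(headers, html):
    """Return where a noindex directive appears: "header", "meta", both, or neither."""
    found = []
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            found.append("header")
            break
    parser = _RobotsMetaParser()
    parser.feed(html)
    if any("noindex" in directive for directive in parser.directives):
        found.append("meta")
    return found
```

To run it against a live page, pass in the headers and decoded body from urllib.request.urlopen(url). The sketch does not handle the "none" shorthand, which is equivalent to "noindex, nofollow".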
Reason 2: robots.txt is blocking the URL
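Open https://yoursite.com/robots.txt and check for any Disallow rules that match your URL. A common mistake is shipping a blanket Disallow: / for staging and forgetting to update it on production. Note that robots.txt blocks crawling rather than indexing, so Google also cannot see a noindex tag on a blocked page.
Fix: remove or narrow the matching Disallow rule, redeploy, and re-test the URL in the URL Inspection tool.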
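The Disallow check can also be done programmatically. Below is a small sketch using Python's standard-library robots.txt parser; is_blocked is a hypothetical helper name, and the stdlib parser uses simple prefix matching, so it may disagree with Googlebot on rules that rely on * or $ wildcards.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# A staging file that was accidentally shipped to production.
staging_rules = "User-agent: *\nDisallow: /\n"
# A production file that only hides the admin area.
production_rules = "User-agent: *\nDisallow: /admin/\n"
```

With these samples, is_blocked(staging_rules, "https://yoursite.com/blog/post") reports the page as blocked, while the production rules block only URLs under /admin/.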
Reason 3: Soft 404
Your page returns HTTP 200, but the content reads like a 404 page: "sorry, we couldn't find what you're looking for" or an empty product listing. Google flags this as a soft 404 and refuses to index it. Worse, a large number of such pages may weaken how Google treats the surrounding URL paths.
Fix: either populate the page with real content or return a proper 404 status code so Google ignores it cleanly.
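To find candidates before Google does, you can run a rough heuristic over your own pages. This is only a sketch: the phrase list and the 50-word floor are assumptions chosen for illustration, not values Google publishes, and looks_like_soft_404 is a hypothetical name.

```python
# Phrases that commonly appear on "not found" or empty-result pages (illustrative list).
NOT_FOUND_PHRASES = (
    "couldn't find",
    "page not found",
    "no results found",
    "no products found",
)


def looks_like_soft_404(status, body_text, min_words=50):
    """Flag a page that returns 200 but reads like an error or empty page."""
    if status != 200:
        # A real 404/410 is the correct signal, so it is not a *soft* 404.
        return False
    text = body_text.lower()
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        return True
    # Very little visible text is the other common symptom.
    return len(text.split()) < min_words
```

Pages this flags deserve a manual look; the heuristic will produce false positives on legitimately short pages.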
Reason 4: Duplicate or thin content
If your page substantially duplicates another page on your site (or anywhere on the web), Google picks one canonical and discards the rest. Thin pages (very short, mostly boilerplate, or repeated across many URLs) get the same treatment. Google publishes no minimum word count; "under 300 words" is a rough rule of thumb, not a threshold.
Fix: consolidate duplicates with a 301 redirect or rel=canonical. Beef up thin pages with substantive, unique content — not filler.
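One way to spot near-duplicates in your own content is to compare pages by overlapping word sequences. The sketch below uses Jaccard similarity over 5-word shingles; this is a generic text-similarity technique, not a description of how Google detects duplicates, and the shingle size is an assumption.

```python
def shingles(text, k=5):
    """Return the set of k-word sequences in `text` (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=5):
    """Similarity in [0, 1]: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)
```

Pairs of pages that score close to 1.0 are candidates for a 301 redirect or a rel=canonical; where to draw the line is a judgment call for your site.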
Reason 5: Wrong canonical declaration
Your page declares its canonical to be a different URL, either accidentally (a layout-level canonical inherited by every child page) or because a CMS plugin is auto-canonicalizing. Google treats rel=canonical as a strong hint and usually honors it, so the page you wanted indexed is left out.
Fix: in Next.js (App Router), set metadataBase and alternates: { canonical: "./" } in your root layout's metadata so that each page should resolve to its own URL as canonical. In WordPress, audit Yoast/RankMath settings. Always confirm with: curl -s <url> | grep -i canonical.
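The grep check tells you a canonical exists but not whether it points at the right place. A minimal sketch of a stricter check follows, assuming you already have the page HTML; canonical_status is a hypothetical helper, and it ignores a trailing-slash difference when comparing URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _CanonicalParser(HTMLParser):
    """Collects href values from <link rel="canonical"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {k: (v or "") for k, v in attrs}
        if "canonical" in attr.get("rel", "").lower().split():
            self.hrefs.append(attr.get("href", ""))


def canonical_status(page_url, html):
    """Return "missing", "self", or "other: <url>" for the page's first canonical tag."""
    parser = _CanonicalParser()
    parser.feed(html)
    if not parser.hrefs:
        return "missing"
    # Resolve relative hrefs against the page URL before comparing.
    declared = urljoin(page_url, parser.hrefs[0])
    if declared.rstrip("/") == page_url.rstrip("/"):
        return "self"
    return "other: " + declared
```

A result of "other: https://yoursite.com/" on every blog post is the typical signature of a layout-level canonical inherited by child pages.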
Reason 6: Low domain authority
Brand new domains and domains with no backlinks have effectively zero trust signals. Google will crawl your pages but treat them as low priority and may decline to index unless the content is exceptional.
Fix: this is a months-long project, not a one-day fix. Build legitimate backlinks (guest posts, product directories, PR mentions), publish substantive content, and use internal linking to concentrate authority on your priority pages.
Reason 7: Slow server response
If Googlebot routinely sees 5+ second TTFB or 5xx errors from your server, it backs off. Crawl rate drops, indexing slows, and eventually pages disappear from the index entirely.
Fix: check the Crawl Stats report in GSC. Target sub-200ms TTFB. Add caching, upgrade hosting, use a CDN, fix slow database queries.
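For a quick spot check outside GSC, you can time a few requests yourself. The sketch below is an approximation under stated assumptions: measure_ttfb includes DNS, connection, and TLS time, it runs from your machine rather than from Googlebot's network, and the thresholds simply reuse the 200 ms target and 5 second figure from this section.

```python
import time
import urllib.request
from statistics import median


def measure_ttfb(url, timeout=10.0):
    """Seconds from starting the request until the first body byte is read."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read(1)  # urlopen returns once headers arrive; read one body byte
    return time.perf_counter() - start


def classify_ttfb(samples, target=0.2, backoff=5.0):
    """Label the median of several TTFB samples (in seconds)."""
    m = median(samples)
    if m >= backoff:
        return "critical"
    if m > target:
        return "slow"
    return "ok"
```

Collect several samples, for example [measure_ttfb(url) for _ in range(5)], and classify the list; a single request is too noisy to act on.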
Reason 8: JavaScript rendering failure
If your content only appears after client-side JavaScript runs, Googlebot has to render the page in two passes. The second pass is queue-based and can lag by days. If your JS errors out, Google sees a near-empty page and won't index.
Fix: server-side render critical content. In the URL Inspection tool, run Test live URL, then open View tested page → Screenshot to see what Google actually rendered. If the screenshot is blank, you have a rendering problem.
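A cheap first test is whether your key content exists in the raw HTML at all, before any JavaScript runs. The following sketch checks a list of phrases against the visible text of the unrendered response; missing_from_raw_html is a hypothetical helper and does no rendering, so it complements the screenshot check rather than replacing it.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script, style, and noscript content."""

    SKIPPED = ("script", "style", "noscript")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_raw_html(html, phrases):
    """Return the phrases that do NOT appear in the unrendered page text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so phrases match across line breaks.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [p for p in phrases if p.lower() not in text]
```

If your headline or product copy shows up in the returned list, that content depends on client-side rendering and is a candidate for server-side rendering.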
Reason 9: Insufficient internal links
If your new page is only reachable from a single sitemap entry with no internal links from elsewhere on the site, Google treats it as low priority. Internal linking signals importance.
Fix: link to new pages from your homepage, related blog posts, navigation, or category pages. The minimum bar is one inbound internal link from a regularly-crawled page.
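If you have a crawl export of your own site, finding pages that miss this minimum bar is straightforward. The sketch below assumes a hypothetical link_graph mapping each page path to the internal paths it links to; how you build that mapping (crawler, CMS export) is up to you.

```python
from collections import Counter


def inbound_link_counts(link_graph):
    """Count distinct internal pages linking to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in link_graph})
    for source, targets in link_graph.items():
        for target in set(targets):  # count each linking page once
            if target != source:
                counts[target] += 1
    return dict(counts)


def orphan_pages(link_graph):
    """Return known pages that no other page links to, sorted by path."""
    counts = inbound_link_counts(link_graph)
    return sorted(page for page in link_graph if counts[page] == 0)


# Illustrative site: post-b exists (e.g. only in the sitemap) but nothing links to it.
example_graph = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/", "/blog/post-a"],
    "/pricing": ["/"],
    "/blog/post-a": [],
    "/blog/post-b": [],
}
```

Here orphan_pages(example_graph) singles out /blog/post-b, which is the page to link from the blog index or a related post.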
After fixing: force a re-crawl
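Once you've fixed the technical blocker, you don't have to wait for Google to rediscover the page on its own. Use Request Indexing in the URL Inspection tool, or submit the URL via the Google Indexing API (which Google officially supports only for job-posting and livestream pages) or a stacked indexer like Instant URL Indexer. A re-crawl can follow within minutes, but Google guarantees neither timing nor inclusion; if the fix is real, the page typically enters the index soon after it is re-crawled.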