SitemapGroup — status code
Wygard fetches every XML sitemap and sitemap index your site declares and alerts you as soon as a crawl finds one that no longer returns HTTP 200.
| Scope | Tier | Default | Alert |
|---|---|---|---|
| Site-wide | Basic | On (every sitemap declared in robots.txt) | 🔴 Danger |
Why it matters
Your sitemaps are how you hand Google the list of URLs you want crawled. If a sitemap returns 404 or 5xx, Google can't read it — it falls back to whatever it already knows, and newly published or updated pages wait far longer to be discovered. Because the Sitemap: line still sits in robots.txt and the rest of the site loads fine, there's no visible symptom; the only sign is slower indexing weeks down the line.
The default severity is Danger because a broken sitemap silently throttles discovery of your most important new content, and the cause — a regenerated file at a new path, a generator timing out, a deploy that 404s the sitemap route — is rarely noticed until rankings stall.
What Wygard checks
On every run, the crawler:
- Reads robots.txt and collects every Sitemap: directive.
- Follows each sitemap index and resolves the child sitemaps it points to.
- Requests every sitemap and sitemap-index URL and records its HTTP status code.
- Flags any that don't return 200.
Only what's in robots.txt
Wygard tests the sitemaps declared in your robots.txt and the sitemap indexes they reference — not arbitrary sitemap URLs. Keeping robots.txt as the single source of truth means the test follows your site exactly the way a search engine does.
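The crawl steps above can be sketched in Python roughly as follows. This is an illustration only, not Wygard's actual crawler: the function names, the `fetch` callable (a hypothetical helper that takes a URL and returns a status code and body), the example.com URLs, and the choice to treat an unreadable robots.txt as "no sitemaps declared" are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Tuple

# Hypothetical fetcher: takes a URL, returns (status_code, body_text).
Fetch = Callable[[str], Tuple[int, str]]

SITEMAP_DIRECTIVE = re.compile(
    r"^\s*sitemap\s*:\s*(\S+)", re.IGNORECASE | re.MULTILINE
)


def declared_sitemaps(robots_txt: str) -> List[str]:
    """Collect the URL of every Sitemap: directive in a robots.txt body."""
    return SITEMAP_DIRECTIVE.findall(robots_txt)


def child_sitemaps(xml_text: str) -> List[str]:
    """Return the <loc> URLs of a sitemap index, or [] for anything else."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []
    # Drop the XML namespace so the tag comparison works with or without it.
    if root.tag.rsplit("}", 1)[-1] != "sitemapindex":
        return []
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text and el.text.strip()
    ]


def check_sitemaps(robots_url: str, fetch: Fetch) -> Dict[str, int]:
    """Map each sitemap and sitemap-index URL to the status code it returned."""
    status, robots_txt = fetch(robots_url)
    if status != 200:
        return {}  # assumption: no readable robots.txt means nothing to test
    statuses: Dict[str, int] = {}
    queue = declared_sitemaps(robots_txt)
    while queue:
        url = queue.pop(0)
        if url in statuses:  # skip URLs already requested (e.g. index loops)
            continue
        code, body = fetch(url)
        statuses[url] = code
        if code == 200:
            queue.extend(child_sitemaps(body))
    return statuses


def flagged(statuses: Dict[str, int]) -> Dict[str, int]:
    """Keep only the sitemaps that did not return 200."""
    return {url: code for url, code in statuses.items() if code != 200}


# In-memory stand-in for a site: one index with two children, one missing.
pages = {
    "https://example.com/robots.txt": (
        200,
        "User-agent: *\nSitemap: https://example.com/sitemap_index.xml\n",
    ),
    "https://example.com/sitemap_index.xml": (
        200,
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<sitemap><loc>https://example.com/posts.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/pages.xml</loc></sitemap>"
        "</sitemapindex>",
    ),
    "https://example.com/posts.xml": (200, "<urlset/>"),
}


def fake_fetch(url: str) -> Tuple[int, str]:
    return pages.get(url, (404, ""))


print(flagged(check_sitemaps("https://example.com/robots.txt", fake_fetch)))
# prints {'https://example.com/pages.xml': 404}
```

Passing `fetch` in as a parameter keeps the sketch free of network calls; a real run would swap `fake_fetch` for a function that performs the HTTP request. The missing child sitemap in the example corresponds to the case where the index loads but one of the sitemaps it references does not.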
Common alerts
- Sitemap returned a non-200 status: a sitemap that was 200 now returns 404 (deleted or moved) or 5xx (generator error, timeout).
- Sitemap-index child unreachable: the index file loads, but one of the child sitemaps it references is broken.
- New sitemap broken on deploy: a sitemap path changed and the old URL still listed in robots.txt no longer resolves.
Why the default is Danger
A sitemap that quietly 404s doesn't break anything a visitor can see — it just stops feeding Google the URLs you most want crawled. By the time slow indexing shows up in your traffic, days or weeks have passed. Danger severity surfaces the failure immediately after the crawl batch instead.
Responding to an alert
- Open the alert and review which sitemap URL failed and the status it now returns.
- Decide whether the change was intended (you really did retire that sitemap) or accidental (a broken route, a generator error, a path that changed on deploy).
- If accidental, fix the source: regenerate the sitemap, correct the route, or update the Sitemap: line in robots.txt to the new path.
- If the sitemap is genuinely gone, remove its Sitemap: line from robots.txt; the next crawl drops it from the test and turns it green.
Pair it with Status code
This test confirms the sitemaps respond; the per-URL Status code test confirms the pages listed inside them respond. Together they cover both ends of the discovery path — the map and the destinations it points to.