XML Sitemap: A Complete Guide for SEO

An XML sitemap is a structured file that lists the important URLs on your website so search engines can discover and crawl them efficiently. It is the equivalent of handing Google a table of contents for your site. Sitemaps speed up indexing for new pages, surface deep content that internal links miss, and let you signal which URLs are canonical, when they were updated, and how they relate to alternate language versions. Every site that wants reliable Google coverage should have one.

What an XML Sitemap Actually Is

An XML sitemap is a plain text file written in a standardized XML schema. It lives at a public URL such as https://example.com/sitemap.xml and contains a list of <url> entries, each with a <loc> tag pointing to a canonical URL. Optional tags describe when the page was last updated, how frequently it changes, and how important it is relative to other pages.

Google, Bing, and other search engines read the sitemap during their regular crawl cycles. A new or recently changed URL in the sitemap tells the crawler that the page is worth fetching soon. A sitemap does not guarantee indexing, since Google still decides what is worth keeping, but it makes URLs much easier to discover, especially on large or deep sites.

The Basic Sitemap Structure

A minimal valid sitemap looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-05-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/</loc>
    <lastmod>2026-05-20</lastmod>
  </url>
</urlset>

The <loc> tag holds the URL. The <lastmod> tag tells Google when the content last changed. Older optional tags like <changefreq> and <priority> exist but Google has explicitly said it ignores them.
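If you ever need to produce this file from a script, the structure above can be sketched with Python's standard library. This is a minimal sketch rather than a production generator; the build_sitemap helper and its (loc, lastmod) input format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes for a list of (loc, lastmod) pairs.

    lastmod may be None, in which case the tag is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


xml_bytes = build_sitemap([
    ("https://example.com/", "2026-05-15"),
    ("https://example.com/blog/", "2026-05-20"),
])
print(xml_bytes.decode("utf-8"))
```

Building the tree with a library instead of string concatenation means characters such as & in a URL are escaped for you, which is one of the easier ways to end up with an invalid sitemap.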

Why XML Sitemaps Matter for SEO

Search engines find pages two ways: by following links and by reading sitemaps. Internal links work well for pages that are one or two clicks from the homepage, but they fail for deep content, newly published posts, or pages that have no incoming links yet. Sitemaps fill that gap.

  • Faster discovery of new pages — a blog post added to your sitemap can be crawled within hours instead of days.
  • Better coverage on large sites — sites with thousands of URLs get more pages indexed when a sitemap acts as the spine.
  • Cleaner crawl budget — Google focuses on the URLs you signal as important rather than wasting crawls on filter pages or parameter combinations.
  • Multilingual signals — hreflang annotations in the sitemap tell Google which language version to show to which audience.
  • Last-mod hints — a fresh <lastmod> value nudges Google to recrawl pages you have updated.
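The multilingual point is the least obvious one in the list, so here is a hedged Python sketch of what hreflang annotations look like inside a sitemap. Each language version gets its own url entry, and every entry lists all versions, itself included. The build_hreflang_sitemap helper and the two example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(pages):
    """pages: list of dicts mapping a language code to that version's URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:xhtml": XHTML_NS})
    for alternates in pages:
        for url in alternates.values():
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
            # Every version lists all versions, itself included.
            for lang, href in alternates.items():
                ET.SubElement(
                    entry,
                    "xhtml:link",
                    {"rel": "alternate", "hreflang": lang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_hreflang_sitemap([
    {"en": "https://example.com/pricing", "de": "https://example.com/de/preise"},
])
print(xml_text)
```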

Types of Sitemaps

The standard URL sitemap is the most common, but there are several specialized formats for different content types.

URL Sitemap

The default. Lists pages, posts, category pages, and any other indexable HTML URL on the site. Most sites need only this format.

Image Sitemap

Lists the image URLs that appear on each page. Google has retired the older caption and title tags, so the image location is the part that counts today. Useful for image-heavy sites like photographers, retailers, or stock libraries. Pair this with strong image optimization to maximize visibility in Google Images.

Video Sitemap

Includes thumbnail URLs, durations, descriptions, and player URLs for embedded video content. Sites with significant original video should publish one.

News Sitemap

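Specifically for Google News publishers. Each entry carries the publication name, language, article title, and publication date, and the file should list only articles published in the last two days. A news sitemap is not strictly required for Google News, since Google can also discover news content by crawling, but it helps new articles get found quickly.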

Sitemap Index

A sitemap of sitemaps. When you exceed the limit of 50,000 URLs or 50 MB (uncompressed) per file, you split the URLs across multiple sitemaps and reference them all from a sitemap index. Large sites with categorized content often use one sitemap per content type (posts, pages, products, categories) under a master index.
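The splitting step can be sketched in Python as follows. This is an illustration under assumptions, not how any particular plugin does it: the build_sitemap_files helper, the sitemap-N.xml naming, and the sitemap_index.xml filename are all made up for the example, and the demo uses a tiny limit so the output stays readable.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def build_sitemap_files(base_url, urls, max_urls=MAX_URLS_PER_FILE):
    """Split urls into child sitemaps and return {filename: xml_bytes}.

    The returned dict also contains a sitemap index that references
    every child file under base_url.
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(urls), max_urls):
        name = f"sitemap-{start // max_urls + 1}.xml"
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for loc in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = loc
        files[name] = ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
        # Register the child file in the index.
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/{name}"
    files["sitemap_index.xml"] = ET.tostring(
        index, encoding="UTF-8", xml_declaration=True
    )
    return files


# Demo with a limit of 2 URLs per file instead of 50,000.
demo_urls = [f"https://example.com/p/{i}" for i in range(5)]
demo_files = build_sitemap_files("https://example.com", demo_urls, max_urls=2)
print(sorted(demo_files))
```

The sketch only enforces the URL count. A real generator would also check the uncompressed byte size of each child file against the 50 MB limit.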

How to Create an XML Sitemap

You rarely have to write XML by hand. Most platforms generate sitemaps automatically.

WordPress

WordPress 5.5 and later includes a built-in sitemap at /wp-sitemap.xml. For more control, install a plugin like Rank Math, Yoast SEO, or All in One SEO. These plugins let you exclude specific content types, control which post statuses appear, and add multilingual support.

Framer

Framer generates and serves sitemap.xml automatically. Every published page and CMS entry is included. The sitemap updates the moment you publish, so there is no manual step.

Webflow, Squarespace, Wix, Shopify

All four hosted platforms auto-generate sitemaps. Find them at /sitemap.xml or under the SEO settings panel. Each has a checkbox or toggle to exclude individual pages from the sitemap if you do not want them indexed.

Custom Sites (Next.js, React, Astro, SvelteKit)

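For custom-built sites, generate the sitemap at build time or serve it from a route. In Next.js, an app/sitemap.ts (or sitemap.js) file exports a default function that returns an array of URL entries, and the framework serves the result at /sitemap.xml. Astro offers an official sitemap integration, and SvelteKit sites typically add a server route that returns the XML. Plain React apps have no built-in convention, so if you generate a static file yourself, write it into the folder your setup serves as static assets (often public/) so it is reachable at /sitemap.xml.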

How to Submit Your Sitemap to Google Search Console

Once your sitemap is live at a public URL, you should submit it explicitly to Google.

  1. Open Google Search Console and select your property.
  2. Click Sitemaps in the left navigation.
  3. Enter your sitemap URL (typically sitemap.xml or sitemap_index.xml) and click Submit.
  4. Give Google time to process it, often 24 to 72 hours. Once the file has been read, the status column will show Success with a count of discovered URLs.

Also reference your sitemap from robots.txt with a line like Sitemap: https://example.com/sitemap.xml. Other search engines like Bing and Yandex use this signal to find your sitemap without you submitting it manually.
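To confirm that the Sitemap: line is readable by crawlers, you can parse robots.txt the way a bot would. The sketch below uses Python's built-in urllib.robotparser on an inline sample; the robots.txt content is an assumed example, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Assumed sample; in practice you would load your own robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap: lines are present.
sitemaps = parser.site_maps() or []
print(sitemaps)
```

site_maps() is available from Python 3.8 onward. An empty result here usually means the line is missing or misspelled.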

What to Include — and Exclude

Your sitemap should list every URL you want indexed and nothing else. Inclusion in the sitemap is a signal that the URL is important and canonical.

Include

  • Canonical versions of every important page.
  • Blog posts, product pages, category pages, landing pages.
  • Pages with at least some unique, valuable content.
  • The current version, not historical or paginated archive pages unless they are uniquely valuable.

Exclude

  • Non-canonical URLs: duplicate versions whose rel="canonical" link points to a different URL.
  • Pages blocked by robots.txt: Google cannot crawl them, so listing them sends a conflicting signal.
  • Pages with noindex directives.
  • Search results pages, faceted navigation, filter pages.
  • Thin content, thank-you pages, account pages, internal tools.
  • Redirected URLs: list only the destination.
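The include and exclude rules above boil down to a filter you can run before writing the file. Here is a minimal Python sketch; the Page record and its fields are a hypothetical data model, and how you populate it (a crawl, a CMS export) is up to you. Judgment calls such as "thin content" are left out because they cannot be reduced to a flag.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    canonical: str            # href of the page's rel="canonical" link
    status: int               # HTTP status code the URL returns
    noindex: bool = False
    blocked_by_robots: bool = False


def belongs_in_sitemap(page):
    """True only for live, canonical, indexable, crawlable pages."""
    return (
        page.status == 200                  # drops redirects and errors
        and page.canonical == page.url      # drops non-canonical duplicates
        and not page.noindex
        and not page.blocked_by_robots
    )


pages = [
    Page("https://example.com/", "https://example.com/", 200),
    Page("https://example.com/?ref=x", "https://example.com/", 200),
    Page("https://example.com/old", "https://example.com/old", 301),
    Page("https://example.com/thanks", "https://example.com/thanks", 200, noindex=True),
]
sitemap_urls = [p.url for p in pages if belongs_in_sitemap(p)]
print(sitemap_urls)
```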

Common XML Sitemap Mistakes

Sitemaps look simple but are easy to break. Most issues fall into the same handful of patterns.

Listing Non-Canonical URLs

If your sitemap lists https://example.com/page but the canonical tag on that page points to https://www.example.com/page, Google sees a conflicting signal. Make sure every sitemap URL matches its self-referencing canonical exactly.
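One way to catch this mismatch is to read the canonical link out of each page's HTML and compare it with the sitemap URL. The Python sketch below does that with the standard library's HTML parser; the canonical_of helper is an assumption for illustration, and the HTML is an inline sample rather than a fetched page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_of(html):
    """Return the canonical URL declared in html, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


sitemap_url = "https://example.com/page"
page_html = '<head><link rel="canonical" href="https://www.example.com/page"></head>'

declared = canonical_of(page_html)
if declared != sitemap_url:
    print(f"Mismatch: sitemap lists {sitemap_url}, page declares {declared}")
```

The comparison is a strict string match on purpose, since a difference in host, scheme, or trailing slash is exactly the kind of conflict described above.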

Including Blocked or Noindex Pages

Pages blocked by robots.txt or marked noindex should not appear in the sitemap. This wastes Google’s time and creates errors in Search Console.
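The robots.txt half of this check is easy to automate. The following Python sketch filters a list of sitemap URLs against a robots.txt file using urllib.robotparser; the crawlable helper and the sample rules are assumptions. Note that Python's parser does simple prefix matching and does not implement Google's wildcard extensions, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def crawlable(urls, robots_txt, user_agent="Googlebot"):
    """Return the URLs that robots_txt allows user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
"""

candidates = [
    "https://example.com/blog/",
    "https://example.com/account/orders",
]
allowed = crawlable(candidates, ROBOTS_TXT)
blocked = [url for url in candidates if url not in allowed]
print("Remove from sitemap:", blocked)
```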

Stale Last-Mod Values

Either keep <lastmod> values accurate or omit them. Google has said it relies on <lastmod> only when the values are consistently and verifiably accurate, so stamping every URL with today’s date teaches it to ignore the field. An accurate signal is helpful; a fake one is worthless.
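An accurate value starts with the real modification time of the content, formatted the way the sitemap protocol expects (W3C Datetime, which is either a plain date or a full timestamp with a timezone offset). A small Python sketch, where lastmod_value is a hypothetical helper and the timestamp stands in for whatever your CMS or database stores:

```python
from datetime import datetime, timedelta, timezone


def lastmod_value(modified_at):
    """Format a timezone-aware datetime for use in <lastmod>."""
    if modified_at.tzinfo is None:
        raise ValueError("modified_at must be timezone-aware")
    # Normalize to UTC so every entry uses the same offset.
    return modified_at.astimezone(timezone.utc).isoformat(timespec="seconds")


# The same instant expressed in two timezones gives the same value.
utc_time = datetime(2026, 5, 20, 14, 30, tzinfo=timezone.utc)
berlin_time = datetime(2026, 5, 20, 16, 30, tzinfo=timezone(timedelta(hours=2)))
print(lastmod_value(utc_time))
print(lastmod_value(berlin_time))
```

The key design choice is the input: pass in the stored modification time, never the time the sitemap was generated.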

Exceeding Size Limits

Each sitemap is limited to 50,000 URLs and 50 MB uncompressed. Sites that exceed these limits need a sitemap index splitting URLs into multiple files. Most plugins handle this automatically.

404s and Server Errors in the Sitemap

If sitemap URLs return 404 or 5xx codes, Google flags them. Run a periodic crawl of your sitemap to catch broken URLs.
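A periodic check like this can be sketched in a few lines of Python. The helper names below are assumptions, and the demo swaps the network call for a lookup table so it runs offline. Two caveats: urlopen follows redirects, so a redirected URL reports the status of its destination, and connection failures raise an exception that this sketch does not catch.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

LOC_TAG = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"


def sitemap_locs(xml_bytes):
    """Return every <loc> value in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter(LOC_TAG) if el.text]


def http_status(url, timeout=10):
    """Send a HEAD request and return the final status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions


def broken_urls(xml_bytes, get_status=http_status):
    """Map each sitemap URL answering 400 or above to its status code."""
    broken = {}
    for loc in sitemap_locs(xml_bytes):
        status = get_status(loc)
        if status >= 400:
            broken[loc] = status
    return broken


# Offline demo with a stand-in for the network call.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""
fake_statuses = {"https://example.com/": 200, "https://example.com/old-page": 404}
print(broken_urls(SAMPLE, get_status=fake_statuses.get))
```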

Forgetting to Reference It from Robots.txt

Adding a Sitemap: line to robots.txt takes thirty seconds and helps every crawler find your sitemap without you submitting it to each one individually.

Sitemaps and Crawl Budget

For most sites, crawl budget is not a concern. Google can comfortably crawl sites with thousands of pages. But for very large sites — ecommerce stores with millions of SKUs, news sites with decades of archives — every URL Google wastes on a stale or low-value page is a URL it does not spend on something important.

A well-curated sitemap that lists only canonical, valuable URLs concentrates Google’s attention where it matters. Combine this with strong internal linking and your important pages get crawled often and indexed reliably.

Monitoring Your Sitemap in Search Console

Submit and forget is not the play. Check the Sitemaps report in Search Console monthly to catch issues early.

  • Discovered URLs vs Indexed URLs: if Google discovers 1,000 URLs but indexes only 400, investigate the gap. Thin content, duplication, or canonical conflicts are usually the culprits.
  • Errors: fix any flagged sitemap errors promptly. A sitemap that Google cannot fetch or parse does nothing for discovery.
  • Recently Discovered: confirm that new pages appear in the sitemap quickly. If they do not, your sitemap generator is probably not regenerating on publish.
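To investigate the discovered-versus-indexed gap, it helps to know exactly which URLs are missing. The Python sketch below assumes you already have two lists, the URLs in your sitemap and the URLs reported as indexed (for example from a Search Console export). The coverage_gap helper is hypothetical and the data is invented for the demo.

```python
def coverage_gap(sitemap_urls, indexed_urls):
    """Return (missing, rate): unindexed sitemap URLs and the indexed share."""
    listed = set(sitemap_urls)
    indexed = set(indexed_urls)
    missing = sorted(listed - indexed)
    rate = len(listed & indexed) / len(listed) if listed else 0.0
    return missing, rate


in_sitemap = [
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/blog/post-a",
    "https://example.com/blog/post-b",
]
in_index = [
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/blog/post-a",
]

missing, rate = coverage_gap(in_sitemap, in_index)
print(f"{rate:.0%} of sitemap URLs indexed; missing: {missing}")
```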

FAQ

Do I really need an XML sitemap if my site has good internal linking?

Yes. Internal linking handles most discovery for small sites, but a sitemap accelerates indexing for new pages, provides redundancy when links fail, and gives you a measurable feed in Search Console. The marginal cost of having one is zero on every modern platform, and the upside is real, so there is no reason to skip it.

How often should I update my sitemap?

Sitemaps should regenerate whenever content changes. Most platforms — WordPress, Framer, Webflow, Shopify — do this automatically on publish. If you maintain a custom sitemap, rebuild it daily at minimum, or whenever you publish, whichever is more frequent.

What is the difference between sitemap.xml and an HTML sitemap?

An XML sitemap is a machine-readable file for search engines. An HTML sitemap is a human-readable page that links to important sections of the site. They serve different audiences and you can have both, though XML is the one that matters for SEO.

Want a site that handles sitemaps, indexing, and SEO infrastructure correctly from day one? See our pricing or get in touch.
