TL;DR
- The problem: AI needs to understand not just your pages, but how they connect to each other and to your brand
- LLM Sitemap is a new idea we're testing - a semantic HTML page combining FAQs, comparison tables, and process documentation
- It builds on existing standards: XML (discovery) + HTML (structure) + llms.txt (context) + deep semantic layer
- Early results: We've seen pages getting crawled, indexed, and cited by LLMs more often
- This is a work in progress - we'd love your feedback on what works and what doesn't
The Evolution of Sitemaps
Sitemaps have evolved alongside how machines consume our content. Each format solved a different problem:
XML sitemaps help search engines discover and index your pages. They list URLs and track freshness via lastmod. While AI systems don't crawl sitemaps directly, they rely on search engines that do - so indexed pages become available to AI through search results. Essential infrastructure.
HTML sitemaps organize your site for humans. They group pages by section, provide navigation structure, and help visitors find what they need. But they're typically just titles and links - no descriptions, no context about what each page covers or how content relates.
llms.txt (proposed by Jeremy Howard in 2024) adds semantic context. It's a Markdown file that provides background information about your site and curated links to key resources. Designed specifically for LLM inference - when users ask AI about your content.
Each format does something valuable. But for content-heavy sites, there's still a gap: how do you help AI systems not just find your pages, but understand them well enough to cite accurately?
That's what the LLM Sitemap addresses.
HTML Sitemaps: The Missing Middle
Before we get to llms.txt, let's talk about HTML sitemaps - they're often overlooked but represent an important step in this evolution.
A typical HTML sitemap organizes your site by section:
- Solutions: By Industry, By Use Case, By Team Size
- Resources: Blog, Templates, Case Studies
- Company: About, Pricing, Contact
This is useful for humans navigating your site. But for AI systems trying to understand and cite your content, HTML sitemaps have significant gaps:
- Just titles and links - no descriptions of what each page covers
- No semantic context - doesn't explain relationships between content
- No depth - can't tell which pages are comprehensive vs. supporting
- No pre-answered questions - doesn't match how users actually query AI
The LLM Sitemap is essentially an HTML sitemap with semantic depth added - descriptions, FAQs, comparison data, and relationship mapping.
llms.txt: Context for LLM Inference
The llms.txt proposal (by Jeremy Howard, September 2024) was specifically designed for LLM inference - when users ask AI about your content at runtime, not for training. It's a Markdown file that provides brief background information and guidance, along with links to Markdown files providing more detailed information.
The format follows a specific structure:
# Project Name (required H1)
> Brief description in a blockquote (key context)
Optional detailed paragraphs about how to interpret the content.
## Docs (H2 sections with file lists)
- [Link title](url): Optional notes about this resource
- [Another link](url): More notes
## Optional (special section - can be skipped for shorter context)
- [Secondary resource](url): Less critical information
The LLM gets context about what you do AND curated links to key pages. The "Optional" section has special meaning - those URLs can be skipped when shorter context is needed.
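To make the role of the "Optional" section concrete, here is a minimal Python sketch that groups the links of an llms.txt file by H2 section and drops "Optional" when context is tight. The function names, the sample file, and the example.com URLs are hypothetical, and the parser only handles the simple link-line shape shown above.

```python
import re

# Matches "- [Title](url)" with an optional ": note" suffix
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$")

def parse_llms_txt(text: str) -> dict[str, list[tuple[str, str, str]]]:
    """Group the link lines of an llms.txt file by their H2 section name."""
    sections: dict[str, list[tuple[str, str, str]]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and (m := LINK.match(line)):
            sections[current].append((m["title"], m["url"], m["note"] or ""))
    return sections

def select_links(
    sections: dict[str, list[tuple[str, str, str]]],
    include_optional: bool = True,
) -> list[tuple[str, str, str]]:
    """Flatten all links, skipping the "Optional" section when asked to."""
    return [
        entry
        for name, entries in sections.items()
        if include_optional or name != "Optional"
        for entry in entries
    ]

# Hypothetical sample file for illustration
SAMPLE = """\
# Example Project
> A short summary of the project.

## Docs
- [Quick start](https://example.com/quickstart.md): Setup steps
- [API reference](https://example.com/api.md)

## Optional
- [Changelog](https://example.com/changelog.md): Release history
"""

sections = parse_llms_txt(SAMPLE)
print(len(select_links(sections)))                          # 3
print(len(select_links(sections, include_optional=False)))  # 2
```

How a given AI tool actually consumes the file is up to that tool; this only illustrates the structure.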
But for content-heavy sites, there are gaps.
When llms.txt Works Best
llms.txt is designed to coexist with existing standards, not replace them. It's perfect for documentation sites, software projects, and focused products where a curated subset makes sense. But for content-heavy sites with hundreds of pages across multiple topics - blogs, resource hubs, SaaS platforms - you may need something more complete. That's where the LLM Sitemap comes in.
Introducing: The LLM Sitemap
Definition by Growtika
LLM Sitemap /ˌel-el-ˈem ˈsīt-map/ noun
A semantic HTML page that helps AI systems understand, explain, and accurately cite your content. Combines human navigation, content hierarchy, first-person FAQs, comparison tables, and "how it works" documentation into a single crawlable resource.
This isn't about replacing your XML sitemap or llms.txt. Those do important work. But for content-heavy sites with hundreds of pages across multiple topics, you need an additional semantic layer that helps AI not just find your content, but understand it well enough to recommend accurately.
An LLM Sitemap combines:
- Human navigation - visitors can browse your content
- Crawlable links - search engines and AI can follow URLs
- Rich semantic context - explains what each section covers
- Content hierarchy - organized by site sections or authority topics
- First-person FAQs - pre-answer queries exactly how users ask AI
- Comparison tables - real pricing and competitor data AI can cite
- "How it works" documentation - process flows that help AI explain your product
- Cross-topic relationships - related links show how content connects
Why This Matters for AI Citations
XML sitemaps get your pages indexed by the search engines that AI systems rely on. llms.txt gives AI context about your business. The LLM Sitemap adds semantic depth - the FAQs, comparisons, and process documentation that help AI answer user questions accurately and cite your content as the source.
Implementation Guide
Step 1: Define Your Sections or Authority Topics
Choose how to organize based on your site structure. Two common approaches:
- By site sections: /learn, /blog, /academy, /solutions, /resources - mirrors your navigation
- By authority topics: DSPM, Cloud Security, SSPM, Identity Management - the themes you want AI to associate with your brand
Either works. Pick 5-15 main groupings that make sense for your content.
Step 2: Map Content to Sections
Group all your content under relevant sections or topics. Each page should belong somewhere. If a page doesn't fit, either create a new section or consider whether the content is necessary.
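If you organize by site sections, the mapping in Steps 1 and 2 can be roughed out with URL prefixes. The sketch below is one possible approach, not a required one: the section list reuses the example prefixes above, and the function name and test paths are made up for illustration. Pages that match no section are returned separately so you can decide whether to add a section or drop the page.

```python
from collections import defaultdict

# Example prefixes from the "by site sections" approach
SECTIONS = ["/learn", "/blog", "/academy", "/solutions", "/resources"]

def map_pages(paths: list[str], sections: list[str] = SECTIONS):
    """Return (section -> paths, unmapped paths) using URL-prefix matching."""
    grouped: dict[str, list[str]] = defaultdict(list)
    unmapped: list[str] = []
    for path in paths:
        # Match "/blog" itself or anything under "/blog/", but not "/blogging-tips"
        match = next(
            (s for s in sections if path == s or path.startswith(s + "/")),
            None,
        )
        if match is None:
            unmapped.append(path)
        else:
            grouped[match].append(path)
    return dict(grouped), unmapped

grouped, unmapped = map_pages(
    ["/blog/what-is-dspm", "/learn/cloud-security", "/legal/terms"]
)
print(unmapped)  # ['/legal/terms']
```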
Step 3: Write Section Context
For each major section, write 2-3 sentences explaining:
- What this section covers
- Who it's for
- What problems it solves
- Key topics included (natural keyword integration)
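As a rough illustration of where the section context ends up, here is a small Python sketch that renders one section as semantic HTML: a heading, the context paragraph, then the link list. The Section class, the function name, and the choice of tags are assumptions made for this example; any markup that keeps the description next to its links would serve the same purpose.

```python
from dataclasses import dataclass, field
from html import escape

@dataclass
class Section:
    title: str
    context: str  # the 2-3 sentence description from Step 3
    pages: list[tuple[str, str]] = field(default_factory=list)  # (link text, URL)

def render_section(section: Section) -> str:
    """Render one section as HTML: heading, context paragraph, link list."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(text)}</a></li>'
        for text, url in section.pages
    )
    return (
        "<section>\n"
        f"  <h2>{escape(section.title)}</h2>\n"
        f"  <p>{escape(section.context)}</p>\n"
        f"  <ul>\n{items}\n  </ul>\n"
        "</section>"
    )

# Hypothetical content for illustration
print(render_section(Section(
    title="Cloud Security",
    context="Guides for security teams evaluating cloud data protection.",
    pages=[("What is DSPM?", "/learn/dspm")],
)))
```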
Step 4: Add Cross-Links
After each major cluster, add "Related Topics" links to content in OTHER sections. This shows AI how your content interconnects.
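One way to pick "Related Topics" candidates is by shared tags. The sketch below assumes each page record carries a section name and a list of tags, which is an assumption of this example rather than part of the framework; the URLs and tags are made up. It only suggests pages from other sections, ranked by how many tags they share.

```python
def related_links(pages: list[dict], section: str, limit: int = 3) -> list[str]:
    """Pick URLs from OTHER sections that share at least one tag with `section`."""
    own_tags = {t for p in pages if p["section"] == section for t in p["tags"]}
    candidates = [
        (len(own_tags & set(p["tags"])), p["url"])
        for p in pages
        if p["section"] != section
    ]
    # Most shared tags first; ties fall back to alphabetical URL order
    ranked = sorted((c for c in candidates if c[0] > 0), key=lambda c: (-c[0], c[1]))
    return [url for _, url in ranked[:limit]]

# Hypothetical page records for illustration
PAGES = [
    {"url": "/learn/dspm", "section": "DSPM", "tags": ["data", "cloud"]},
    {"url": "/learn/cloud-basics", "section": "Cloud Security", "tags": ["cloud"]},
    {"url": "/learn/sspm", "section": "SSPM", "tags": ["saas"]},
    {"url": "/blog/data-in-cloud", "section": "Cloud Security", "tags": ["data", "cloud"]},
]

print(related_links(PAGES, "DSPM"))
# ['/blog/data-in-cloud', '/learn/cloud-basics']
```

Treat the output as suggestions to review by hand; shared tags are only a proxy for real topical overlap.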
Step 5: Add First-Person Section FAQs
This is the secret weapon. For each strategic section, add 3-5 FAQs that pre-answer the queries users actually search for.
Why First-Person FAQs Work
When a user asks "I'm a therapist drowning in notes. Will this actually help?" and that exact question and its answer are on your sitemap page, the AI has a direct match to retrieve. You're essentially writing the answers AI will give.
Critical: Write FAQs in First Person
- Don't write: "What are the benefits of [product] for [audience]?"
- Do write: "I'm a [role] drowning in [pain point]. Will [product] actually help?"
First-person questions match how users actually talk to AI.
What to cover in Section FAQs:
- Persona pain points: "I'm a [role] struggling with [problem]. Will this help?"
- Comparison questions: "How does this compare to [competitor]?"
- Fit questions: "I'm a [specific situation]. Is this right for me?"
- Objection questions: "I've been burned before by [concern]. How is this different?"
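The four question patterns above can be turned into drafts with simple templates. This is a minimal sketch: the template keys, the function name, and the sample persona are invented for illustration, and the answers still have to be written by someone who knows the product.

```python
# The four patterns from the list above, with placeholders to fill in
FAQ_TEMPLATES = {
    "pain_point": "I'm a {role} struggling with {problem}. Will this help?",
    "comparison": "How does this compare to {competitor}?",
    "fit": "I'm a {situation}. Is this right for me?",
    "objection": "I've been burned before by {concern}. How is this different?",
}

def draft_questions(persona: dict[str, str]) -> list[str]:
    """Fill every template whose placeholders the persona can supply."""
    questions = []
    for template in FAQ_TEMPLATES.values():
        try:
            questions.append(template.format(**persona))
        except KeyError:
            continue  # the persona lacks a field this template needs
    return questions

# Hypothetical persona for illustration
for question in draft_questions({
    "role": "therapist",
    "problem": "session notes",
    "competitor": "a general-purpose notes app",
}):
    print(question)
```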
Step 6: Add "How It Works" Documentation
For product sites, add a comprehensive section explaining your product's capabilities. This isn't marketing copy - it's structured documentation that helps AI understand and explain your product accurately.
For each major capability, include:
- "Why We Offer This" - The problem this solves (helps AI understand when to recommend)
- "How We're Different" - Specific differentiators from alternatives (helps AI compare)
- "How It Works" - Process flow explanation (helps AI explain accurately)
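A simple completeness check can catch capabilities that are missing one of the three parts. The sketch below is an assumption about how you might store this content; the Capability class and its field names are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Capability:
    name: str
    why_we_offer_this: str = ""   # the problem this solves
    how_were_different: str = ""  # differentiators from alternatives
    how_it_works: str = ""        # process flow explanation

def missing_parts(cap: Capability) -> list[str]:
    """Return the names of the panel parts that are still empty."""
    return [
        f.name
        for f in fields(cap)
        if f.name != "name" and not getattr(cap, f.name).strip()
    ]

# Hypothetical capability for illustration
draft = Capability(
    name="Automated notes",
    why_we_offer_this="Clinicians lose hours each week to documentation.",
    how_it_works="Record the session, review the draft, approve the note.",
)
print(missing_parts(draft))  # ['how_were_different']
```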
Step 7: Add Comparison Tables with Real Data
Don't just say you're better - show actual pricing comparisons, and include the date the figures were last verified.
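One way to keep the verification date attached to the data is to render both together. Here is a minimal sketch that emits an HTML table with the date in its caption. The function name, column headings, and the rows are placeholders for illustration only, not real pricing.

```python
from datetime import date
from html import escape

def render_comparison(rows: list[tuple[str, str, str]], verified_on: date) -> str:
    """Render (product, plan, price) rows as an HTML table with a dated caption."""
    body = "\n".join(
        "    <tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<table>\n"
        f"  <caption>Pricing verified on {verified_on.isoformat()}</caption>\n"
        "  <thead><tr><th>Product</th><th>Plan</th><th>Price</th></tr></thead>\n"
        f"  <tbody>\n{body}\n  </tbody>\n"
        "</table>"
    )

# Placeholder rows, not real products or prices
print(render_comparison(
    [("Product A", "Starter", "$10/month"), ("Product B", "Basic", "$15/month")],
    verified_on=date(2025, 1, 15),
))
```

Putting the date in the caption means it travels with the table if the table is quoted on its own.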
Step 8: Add "Browse All" Links for Large Content Sets
When you have 200+ pages in a category (like templates or blog posts), show featured examples with a note:
"The articles below are featured examples - the full collection of all blog posts is available on the main blog page."
Then link to the full archive. This gives AI context without overwhelming the sitemap.
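The featured-plus-archive rule can be sketched as a small helper. The 200-page threshold comes from the step above; the function name, the default of 10 featured links, and the return shape are assumptions of this example.

```python
def summarize_category(
    urls: list[str],
    archive_url: str,
    label: str,
    threshold: int = 200,
    featured: int = 10,
) -> dict:
    """List every URL for small categories; feature a subset plus an archive link for large ones."""
    if len(urls) < threshold:
        return {"links": urls, "note": None, "archive": None}
    note = (
        f"The {label} below are featured examples - the full collection "
        f"is available at {archive_url}."
    )
    return {"links": urls[:featured], "note": note, "archive": archive_url}

# Hypothetical blog with 250 posts
posts = [f"/blog/post-{i}" for i in range(250)]
summary = summarize_category(posts, archive_url="/blog", label="articles")
print(len(summary["links"]))  # 10
print(summary["note"])
```

In practice you would choose the featured pages deliberately rather than taking the first ten.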
Step 9: Write Explicit Expertise Signals
Don't rely on visual badges that say "Pillar" or "Featured" - the LLM won't see them. Instead, use explicit text:
- "This is our comprehensive guide to..." (not just a badge)
- "Start here if you're new to..." (explicit onboarding signal)
- "Our most popular resource on..." (social proof in text)
- "Complete reference covering..." (scope indicator)
The text IS the signal. Write like you're describing the page to someone who can't see your design.
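If your CMS already stores badge labels, a lookup like the one below can turn them into the explicit sentences listed above. The badge keys and the function name are hypothetical; only the sentence patterns come from this step.

```python
# Maps a visual badge label to a sentence that survives text extraction
BADGE_TEXT = {
    "pillar": "This is our comprehensive guide to {topic}.",
    "start-here": "Start here if you're new to {topic}.",
    "popular": "Our most popular resource on {topic}.",
    "reference": "Complete reference covering {topic}.",
}

def expertise_sentence(badge: str, topic: str) -> str:
    """Return the explicit text equivalent of a visual badge."""
    template = BADGE_TEXT.get(badge.lower())
    if template is None:
        raise ValueError(f"No explicit text defined for badge {badge!r}")
    return template.format(topic=topic)

print(expertise_sentence("Pillar", "cloud security"))
# This is our comprehensive guide to cloud security.
```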
Step 10: Add "About This LLM Sitemap" Meta Section
Add a brief explanation of what makes this sitemap special:
- Page Groupings - how content is organized
- FAQ Sections - what they cover
- "How It Works" Panels - capability documentation
- Relationship Mapping - how topics connect
This signals to AI that the page is intentionally structured for their use.
A Note on This Framework
Work in Progress
The LLM Sitemap is a new idea we're actively testing. It might have logic issues or aspects we haven't fully considered yet. But so far, we've seen positive impact when implementing it - pages getting crawled, indexed, and cited by LLMs more often than before.
We'd love to get your feedback on this approach. If you try implementing an LLM Sitemap, let us know what works and what doesn't.
Conclusion
As AI systems become a primary way people discover and evaluate products, having the right content isn't enough - AI needs to understand how your content connects and when to recommend it.
The LLM Sitemap builds on existing standards (XML sitemaps, HTML sitemaps, llms.txt) by adding the semantic depth that helps AI cite your content accurately: first-person FAQs that match how users actually ask questions, comparison tables with real data AI can reference, and process documentation that explains how your product works.
Start simple - organize your content into clear sections, add FAQs that answer the questions your prospects actually ask, and include comparison data that helps AI give accurate recommendations. You can always expand from there.
