For most of the last decade, the audience for technical documentation was a developer at a keyboard with a search box open. Today, half the time, that developer never lands on the page. An AI assistant — Claude Code, Cursor, Windsurf, Copilot, the in-house agent the platform team is quietly shipping — reads your docs on their behalf, summarises the answer, pastes a code snippet, and the developer never sees the URL.
This is the most consequential shift in technical writing in years, and most teams are still writing docs as if it hasn't happened.
The good news is that a docs site that's good for LLMs is also good for humans. The bad news is that the converse isn't true: plenty of pages humans tolerate are unreadable to a retrieval system, and the gaps don't show up in your analytics until your support load starts creeping up. This post is a working list of the patterns that make documentation legible to both audiences.
The new reader has different failure modes
A human reader scrolls. They scan headings, glance at code blocks, jump back to the table of contents, and accept that the page they need is the one they're already reading. They forgive prose that meanders because their eyes can skip ahead.
An LLM doesn't get any of those affordances. A retrieval system pulls a chunk — typically a few hundred to a couple of thousand tokens — and the model answers from that chunk alone. If the chunk has no headings, the model doesn't know which section it's in. If the chunk references the previous section, the model has no previous section. If the chunk needs a config example that lives three pages over, the model invents one.
The failure mode of an unhelpful page for a human is wasted time. The failure mode for an LLM is a confidently wrong answer that ships into a developer's codebase. The stakes have quietly moved from frustration to correctness.
Six patterns that work
The patterns below aren't theoretical. They're the recurring shape of documentation that performs well when piped through retrieval, embeddings, and chat-style answer generation.
1. Self-contained pages
Treat every page as if it might be the only page the model ever sees. That doesn't mean copying every concept into every page; it means writing the opening paragraph as a context-setter, naming the product and the surface you're documenting, and resolving any pronouns before you reach the first code block.
A page that opens with "First, configure the webhook" is invisible to retrieval. A page that opens with "This page documents how to configure outbound webhooks for the Doccupine API" is found, ranked, and cited.
2. Stable, semantic URLs
LLMs cite URLs the way a footnote cites a paper. When a citation breaks, the model either hallucinates a replacement or refuses to answer. Either is a worse experience than getting an outdated answer at a stable URL.
Pick URL slugs that describe the content, not the marketing position of the moment. Avoid renaming pages once they're indexed. If you must reorganise, ship a permanent redirect rather than a 404 — and keep the redirect alive longer than feels necessary, because LLM training cycles and crawler caches outlive your sense of "long enough".
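If your docs build runs on a JavaScript framework, the redirect can live next to the content and go through the same review as everything else. A minimal sketch, assuming a Next.js-style config; both slugs are hypothetical:

```ts
// next.config.ts: keep old slugs alive with permanent redirects.
import type { NextConfig } from "next";

const config: NextConfig = {
  async redirects() {
    return [
      {
        source: "/docs/webhooks-setup",         // old slug, already indexed (hypothetical)
        destination: "/docs/outbound-webhooks", // new canonical slug (hypothetical)
        permanent: true,                        // a 308, so crawlers update their caches
      },
    ];
  },
};

export default config;
```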
3. Headings that describe the answer, not the area
A retrieval system uses headings to understand what a chunk is about. "Authentication" tells the model very little. "How to authenticate API requests with a personal access token" tells the model exactly when to surface the chunk.
Write headings the way you'd write the question you imagine a developer typing into a chat box. The page reads more naturally for humans too.
4. Code that runs without context
A snippet that depends on three earlier code blocks for its imports, its config object, and its example data is unusable as a citation. Treat each code block as a copy-paste unit: include the imports it needs, name the variables it depends on, and make sure the example values are realistic enough to swap into a real project.
If a snippet is genuinely meant to be a fragment, mark it. A short comment like // inside your existing handler does more work than a paragraph of prose, and it survives the trip through a vector database.
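To make that concrete, here's the shape of a block that survives being lifted off the page: every import and value it needs is present, and nothing refers elsewhere. A sketch only; the signing scheme and environment variable are hypothetical, not Doccupine's actual API:

```ts
// Verify an incoming webhook signature. Self-contained: the imports,
// the secret, and the helper all live in this one block.
// The sha256-hex scheme is a hypothetical example.
import { createHmac, timingSafeEqual } from "node:crypto";

const secret = process.env.WEBHOOK_SECRET!; // set in your deployment, not in code

export function verifySignature(rawBody: string, signatureHeader: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on unequal lengths, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```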
5. Examples before abstractions
A reference page that opens with a parameter table forces the LLM to construct an example from the schema. It usually does this badly, because the schema doesn't tell it which field combinations are valid in the real product.
Open with a working example, then break down the parameters underneath. The model will quote the example verbatim, which is the cheapest possible win for accuracy.
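As a sketch of what that lead example can look like (the endpoint, fields, and event names are hypothetical stand-ins):

```ts
// The working example a reference page opens with: complete, realistic,
// and quotable verbatim. Endpoint and fields are hypothetical.
const response = await fetch("https://api.example.com/v1/webhooks", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.API_KEY}`, // a personal access token
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://example.com/hooks/incoming", // delivery target; must be HTTPS
    events: ["page.published"],                // the events to subscribe to
  }),
});

console.log(response.status);
```

The parameter breakdown then sits underneath, where both kinds of reader consult it after seeing one valid combination.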
6. Metadata in formats machines already trust
Frontmatter, structured data, and OpenAPI specs are not optional polish. They're the difference between a docs page that retrieves cleanly and one the model summarises by guesswork.
The minimum useful set:
- A `title` and `description` in frontmatter, used as the canonical short summary.
- A `category` field (or a tag list, on platforms that support arrays) that lets retrieval systems narrow scope.
- A `date` field, with an optional `updated` companion for the last-modified timestamp, so models can tell whether the content is fresh.
- A JSON-LD `Article` block in the rendered HTML, so general-purpose crawlers understand the page.
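These fields only help if they're actually present, so it's worth linting for them at build time. A minimal sketch, assuming the gray-matter package and a flat content/ directory of markdown pages:

```ts
// Warn about pages missing the minimum useful frontmatter set.
import fs from "node:fs";
import path from "node:path";
import matter from "gray-matter"; // parses YAML frontmatter from markdown

const required = ["title", "description", "category", "date"] as const;

for (const file of fs.readdirSync("content")) {
  if (!file.endsWith(".md")) continue;
  const { data } = matter(fs.readFileSync(path.join("content", file), "utf8"));
  const missing = required.filter((key) => !(key in data));
  if (missing.length > 0) {
    console.warn(`${file}: missing frontmatter field(s): ${missing.join(", ")}`);
  }
}
```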
If the page documents an API, ship an OpenAPI spec alongside the prose. The spec is the contract; the prose is the explanation.
The new infrastructure layer: llms.txt, sitemaps, and MCP
Three pieces of infrastructure now sit between your documentation and the agents that read it. None of them are universally adopted yet, but each is cheap to ship and the cost of skipping them rises every quarter.
`llms.txt` is the documentation equivalent of `robots.txt`: a top-level file that points AI tools at the canonical pages they should retrieve. The `llms.txt` standard pairs it with two companion artifacts that work in concert:
- `/llms.txt` — a curated markdown index of every page worth retrieving, grouped by section and ordered the way you want a model to walk the docs.
- `/llms-full.txt` — every page's raw markdown concatenated into a single file, sized for one-shot ingestion by long-context models.
- `/<slug>.md` — per-page raw markdown at predictable URLs, so an agent that already knows which page it needs can fetch the source without scraping rendered HTML.
The three together turn your docs into something an LLM can navigate the way a developer navigates a repository: an index to start from, a single-page fetch for surgical reads, and a full-content file for the cases where the model wants to read everything in one pass. Even a minimal version of each cuts hallucination rates, because the model has a hard target to retrieve against instead of guessing from snippets it pulled out of search results.
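None of this needs platform support to get started. A minimal sketch of a build step that emits /llms.txt from a hand-maintained manifest; the site URL and pages are hypothetical:

```ts
// Emit a curated llms.txt index: pages grouped by section, each entry
// linking to the per-page markdown with a one-line summary.
import fs from "node:fs";

const site = "https://docs.example.com"; // hypothetical
const pages = [
  { section: "Getting started", title: "Quickstart", slug: "quickstart", summary: "Install the SDK and make a first request." },
  { section: "Webhooks", title: "Outbound webhooks", slug: "outbound-webhooks", summary: "Configure event delivery to your endpoint." },
];

let out = "# Example Docs\n\n";
for (const section of [...new Set(pages.map((p) => p.section))]) {
  out += `## ${section}\n\n`;
  for (const page of pages.filter((p) => p.section === section)) {
    out += `- [${page.title}](${site}/${page.slug}.md): ${page.summary}\n`;
  }
  out += "\n";
}

fs.mkdirSync("public", { recursive: true });
fs.writeFileSync("public/llms.txt", out);
```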
A sitemap remains the simplest way to tell crawlers — including agent-based crawlers — what pages exist and how recently they changed. If your documentation platform doesn't generate a sitemap automatically, that's a signal worth investigating.
MCP servers are the newer entry. The Model Context Protocol lets an AI tool query your documentation directly, by structured request, instead of scraping the rendered HTML. Pages get retrieved with their metadata intact, the model gets clean markdown rather than text recovered from stripped HTML, and the retrieval is scoped to your content alone. Doccupine ships an MCP server with every site by default, but the broader principle holds whatever platform you use: if your docs aren't queryable by an MCP client, they're harder to integrate into the agents your users are already running.
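For a sense of scale, a docs-search tool over MCP is small. A sketch using the official TypeScript SDK; searchDocs is a hypothetical stand-in for whatever retrieval you already run:

```ts
// A minimal MCP server exposing one documentation-search tool over stdio.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "docs", version: "1.0.0" });

server.tool(
  "search_docs",
  "Search the documentation and return matching pages as markdown.",
  { query: z.string() },
  async ({ query }) => ({
    content: [{ type: "text", text: await searchDocs(query) }],
  })
);

// Hypothetical: replace with your real retrieval call.
async function searchDocs(query: string): Promise<string> {
  return `No results for "${query}" yet; wire this to your index.`;
}

await server.connect(new StdioServerTransport());
```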
What to stop doing
A few practices made sense in the human-only era and now actively cost you.
Hiding text behind tabs and accordions is fine for a human who clicks. Many crawlers and embedders only see the default-visible state. Either render all states in the HTML and toggle their visibility with CSS, or accept that the hidden content is invisible to retrieval. Your platform's choice here matters; check before you commit.
Splitting a single concept across many tiny pages for marketing tidiness. Retrieval is page-scoped. A 200-word page on each of "what is X", "why X matters", "how to set up X", and "X examples" gets pulled into four disconnected chunks, none of which contains the full picture. Consolidate or accept the fragmentation.
Image-only diagrams. A flowchart shipped as a PNG is invisible to LLMs. Either render the diagram inline as Mermaid or another text format, or pair the image with a paragraph that describes the same flow in prose. Both is best.
Versioning by URL parameter. A query string like `?v=2` is routinely stripped or canonicalised away by crawlers, so the distinction never reaches the model. If you ship a v1 and a v2, give them distinct slugs (`/v1/auth`, `/v2/auth`) and link between them. The cost of getting it wrong is shipping the wrong version into a developer's prompt.
A practical checklist
Before publishing a new page, ask:
- If a retrieval system pulled the first 500 tokens, would the model know what product, version, and surface this page documents?
- Could a developer copy any code block on this page and run it without reading anything else?
- Are the headings questions or descriptions of an answer, rather than category labels?
- Is the URL stable, semantic, and unlikely to change?
- Is there frontmatter or structured data telling crawlers what kind of page this is?
- If the page documents an API, is there a machine-readable spec alongside the prose?
A yes to all six puts the page in the top decile of what we see in the wild. The pages that perform best in both human analytics and LLM citation always answer all six.
A note on tooling
Most of the patterns above are platform-agnostic. Stable URLs, good headings, runnable examples — you can ship those on any docs stack with enough discipline. A few of them are easier when the platform handles them by default.
Doccupine handles a fair amount of this for you. Every site ships with the structural pieces wired up by default:
- A Git-backed source of truth, so docs and code share the same review workflow.
- A `robots.txt` on every build, plus a `sitemap.xml` whenever a site URL is configured.
- An `llms.txt` index, an `llms-full.txt` concatenated body, and per-page `.md` fetches for surgical retrieval — emitted into the site's `public/` directory on every build, with stale files cleaned up across runs.
- An MCP server in front of your content, so agents can query it without scraping HTML.
- An AI assistant pre-wired to your docs.
The result is that most of the patterns above stop being extra discipline and start being the default. The editorial work — clear headings, runnable examples, good prose — is yours, and that's true on any platform you pick.
If you're retrofitting an existing docs site for AI readers and bumping into something specific, drop me a line at [email protected]. The interesting cases tend to come from teams who realise their docs were answering the wrong reader.