Essay

A field guide to the agent-legible web

The agent-legible web board · Aish Brown · Published 2025-12 · ~6 min
GET   /v1/publications/field-guide-to-the-agent-legible-web
kind   Essay
published   2025-12-08
board   /v1/boards/agent-legible-web
author   Aish Brown
cite_as   Xooplab (2025). "A field guide to the agent-legible web." xooplab.com/publications/field-guide-to-the-agent-legible-web
Machine abstract · key claims
  1. Retrieval systems read passages, not pages; the trusted passages have structure.
  2. Four compounding habits: answer-first sections, inline structured data, curated /llms.txt, claim-level provenance.
  3. Agent legibility is not SEO — SEO optimises for ranking, agent legibility optimises for the evidence pipeline.
  4. Sites that don't adapt fall into invisibility, where models paraphrase them without attribution.

A field guide implies you've been somewhere. We have been somewhere for about a year, mostly watching how retrieval systems actually read the sites they cite, and the punchline is small enough to fit on the back of a card: crawlers don't read pages, they read passages, and the passages they trust have structure.

The agent-legible web is the web written for that fact. Not a new web — the same one, with four habits that turn out to compound:

  • Answer-first sections. Claim before context.
  • Inline structure (JSON-LD, schema.org) generated from the same data the human page renders.
  • A curated /llms.txt that points an agent at the right things first.
  • Provenance attached to claims, not to pages.
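The second habit is the easiest to show in code. What follows is a minimal sketch, not this board's emitter: it assumes a single publication record (the `PUBLICATION` dict, whose field names are hypothetical) and renders both the human-facing header and a schema.org `Article` JSON-LD block from it, so the two cannot drift apart. The function names `render_html` and `render_json_ld` are illustrative only.

```python
import json
from html import escape

# Hypothetical record: the one source of truth for this publication.
# The values come from the page's own metadata card; the keys are assumptions.
PUBLICATION = {
    "title": "A field guide to the agent-legible web",
    "author": "Aish Brown",
    "published": "2025-12-08",
}


def render_html(pub: dict) -> str:
    """Render the human-facing header from the record."""
    title = escape(pub["title"])
    author = escape(pub["author"])
    date = escape(pub["published"])
    return (
        f"<h1>{title}</h1>\n"
        f'<p>{author}, <time datetime="{date}">{date}</time></p>'
    )


def render_json_ld(pub: dict) -> str:
    """Render a schema.org Article block from the same record."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": pub["title"],
        "author": {"@type": "Person", "name": pub["author"]},
        "datePublished": pub["published"],
    }
    # Escape "</" so a value can never close the <script> element early.
    # "\/" is a valid JSON escape, so parsers still recover the original text.
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    print(render_html(PUBLICATION))
    print(render_json_ld(PUBLICATION))
```

The design choice worth copying is the shared input rather than the particular fields: if the title changes in the record, the heading and the structured data change together.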

Why now

Three years ago this would have been a vanity exercise. The crawlers were either too dumb to use structured data well or too rapacious to be welcomed. Today both sides have moved: retrieval systems are very good at extracting structured records, and they would prefer to cite sources they can verify. Sites that publish for that audience get cited more. Sites that don't adapt fall back into invisibility, where the model paraphrases them without attribution.

What this isn't

It isn't SEO with extra steps. SEO optimises for a ranking algorithm; agent legibility optimises for a retrieval system's evidence pipeline. The difference is whether you're trying to be found or trying to be trusted. They're related but not the same, and the techniques diverge fast.

It also isn't a brand-safety play. This is research infrastructure. The point is that the web's second reader, whatever it ends up being called this quarter, is now part of the audience your work has to function for. You write for it the same way you write for a human reader: clearly, with structure, and without lying.

What ships from this board

Four artefacts, in rough order: a provenance primitive (see the dedicated note), an llms.txt benchmarking method, a passage-friendly markup convention, and a reference emitter. The goal is small specs that an unaffiliated site can adopt in an afternoon, not yet another standard nobody implements.
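None of these artefacts is specified in this essay, so the following is only an illustration of the "afternoon" scale, not the board's reference emitter. It sketches a tiny generator for a curated /llms.txt, assuming the layout of the public llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of Markdown links). The `Link` type, the `render_llms_txt` function, the site name, and the example.com URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One curated entry. The note is optional."""
    title: str
    url: str
    note: str = ""


def render_llms_txt(site: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Emit an llms.txt-style Markdown document.

    Sections are written in the order given, so the caller decides
    what an agent sees first. That ordering is the curation.
    """
    lines = [f"# {site}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site and URLs, for illustration only.
    text = render_llms_txt(
        site="Example Lab",
        summary="Research notes on the agent-legible web.",
        sections={
            "Start here": [
                Link(
                    "A field guide to the agent-legible web",
                    "https://example.com/publications/field-guide-to-the-agent-legible-web",
                    "overview essay",
                ),
            ],
            "Optional": [
                Link("Archive", "https://example.com/publications"),
            ],
        },
    )
    print(text)
```

Keeping the section order in the caller's hands is deliberate: a curated file is mostly a statement about priority, and a generator that sorts alphabetically would throw that away.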

From the agent-legible web board. Replications, counter-arguments, and "you reinvented X" corrections all welcome in the thread.