When architecture decisions age well

Jun 3, 2026 · 4 min read

A conversation about llms.txt turned into unexpected validation. The content architecture built for clarity and ownership turned out to already be structured the way AI systems want to consume knowledge.

A few weeks ago I wrote about why I deliberately chose a statically generated website.

The reasoning was about content ownership, signal clarity, and building a foundation that remains easy to reason about as systems evolve.

Recently, a conversation about llms.txt made something concrete that was previously just a direction.

The architecture I had built for my own reasons turned out to already be structured the way AI systems want to consume knowledge.

That is usually the sign of a sound architectural decision: it adapts to requirements that did not exist when the decision was made.

What llms.txt is

llms.txt is an emerging convention, proposed in 2024, that places a structured Markdown file at the root of a website to help AI systems understand the site's content and structure.

It is analogous to robots.txt and sitemap.xml, but designed for AI crawlers and language models rather than search engines.

The file provides:

  • a site-level summary
  • structured links to key pages
  • clean Markdown versions of content at predictable URLs
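To make the file layout concrete, here is a minimal sketch of rendering such an index. It follows the layout the proposal describes (an H1 title, a blockquote summary, then H2 sections of Markdown links). The type names, the function name, and the example.com URL are illustrative assumptions, not part of the convention or of this site's build.

```typescript
// Hypothetical shapes for illustration; these names are not defined by the proposal.
interface PageLink {
  title: string;
  url: string; // ideally the clean Markdown version of the page
  description?: string;
}

interface SiteIndex {
  name: string;
  summary: string;
  sections: Record<string, PageLink[]>; // section heading -> links
}

// Render the index: H1 title, blockquote summary, then H2 sections of links.
function renderLlmsTxt(site: SiteIndex): string {
  const lines: string[] = [`# ${site.name}`, "", `> ${site.summary}`];
  for (const [heading, links] of Object.entries(site.sections)) {
    lines.push("", `## ${heading}`, "");
    for (const link of links) {
      const suffix = link.description ? `: ${link.description}` : "";
      lines.push(`- [${link.title}](${link.url})${suffix}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Example data (made up for this sketch).
const llmsTxt = renderLlmsTxt({
  name: "Example Site",
  summary: "Notes on content architecture.",
  sections: {
    Articles: [
      {
        title: "Why static generation",
        url: "https://example.com/articles/static.md",
        description: "Content ownership and signal clarity",
      },
    ],
  },
});

console.log(llmsTxt);
```

The point of the sketch is that the index is plain Markdown: anything that can list pages with a title and a URL can produce it.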

Adoption is early. No major AI crawler has confirmed reading it yet.

The shift mirrors what happened with search engines, but the mechanism is different:

[Figure: SEO era versus AI-native era. Two-column comparison. SEO era: robots.txt controls crawling, sitemap.xml maps URLs, HTML pages rank in search results. AI-native era: robots.txt still controls crawling, llms.txt provides structured site context, .md files expose clean content for reasoning.]

Even with adoption this early, the direction is clear: AI systems increasingly need structured, machine-readable content to reason about a site accurately.

What the original article already anticipated

When I wrote about static generation, I argued that content should be:

  • portable
  • versionable
  • structurally interpretable
  • AI-processable
  • and separable from the presentation layer

That list was not written with llms.txt in mind. The convention did not exist yet when I was forming those principles.

But when I looked at what llms.txt requires, the content architecture I had already built satisfied it almost completely:

  • YAML source files already separate content from rendering
  • each page is self-contained with explicit sections
  • per-page Markdown files are a direct transformation of existing YAML
  • the llms.txt index is generated from the same source at build time

Nothing needed to be restructured. The pipeline extended naturally.
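As an illustration of why per-page Markdown can be a direct transformation, here is a minimal sketch that turns one already-parsed page into Markdown. The actual YAML schema is not shown in this article, so the `Page` and `Section` shapes and the `pageToMarkdown` name are assumptions chosen for the example.

```typescript
// Assumed shape of one page after its YAML source has been parsed.
// The real schema is not shown in the article; these field names are illustrative.
interface Section {
  heading: string;
  paragraphs: string[];
  bullets?: string[];
}

interface Page {
  title: string;
  summary: string;
  sections: Section[];
}

// Map structure to Markdown one-to-one: title -> H1, section -> H2,
// paragraphs -> blank-line-separated text, bullets -> a "-" list.
function pageToMarkdown(page: Page): string {
  const out: string[] = [`# ${page.title}`, "", page.summary];
  for (const section of page.sections) {
    out.push("", `## ${section.heading}`);
    for (const paragraph of section.paragraphs) {
      out.push("", paragraph);
    }
    if (section.bullets && section.bullets.length > 0) {
      out.push("", ...section.bullets.map((b) => `- ${b}`));
    }
  }
  return out.join("\n") + "\n";
}

// Example page (made up for this sketch).
const examplePage: Page = {
  title: "What llms.txt is",
  summary: "An emerging convention for AI-readable sites.",
  sections: [
    {
      heading: "The file provides",
      paragraphs: ["Three things:"],
      bullets: ["a site-level summary", "structured links to key pages"],
    },
  ],
};

console.log(pageToMarkdown(examplePage));
```

Because the source already carries explicit sections, the transformation needs no HTML parsing or content extraction step.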

The image summary field as a concrete example

One specific design decision became unexpectedly relevant in this context.

When SVG diagrams are referenced in content, I use an image_sc shortcode with a summary field:

:image_sc[
  src="publishing-flow-horizontal.svg",
  alt="Traditional publishing process",
  summary="content creation, editing, transformation, publication, distribution"
]

The alt field describes the image for accessibility.

The summary field captures what the diagram actually communicates: the semantic content, not the visual description.

I had solved this before knowing about llms.txt.

The reason was the same: meaning should not be locked inside a visual artifact. It should be explicit, portable, and machine-readable.

When generating per-page Markdown files for AI consumption, the shortcode is not rendered as an image reference. The summary field becomes a simple inline line:

*Diagram: content creation, editing, transformation, publication, distribution*

The image becomes a structured concept list that an LLM can reason about without needing to process the visual at all.
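The replacement described above can be sketched with a small text transform. This is a simplified illustration, not the site's actual implementation: the function names are hypothetical, the regular expression assumes attribute values never contain `]` or `"`, and falling back to the alt text when no summary is present is an assumption of this sketch.

```typescript
// Matches the :image_sc[ ... ] shortcode, including multi-line bodies.
// Simplification: assumes no "]" appears inside an attribute value.
const IMAGE_SHORTCODE = /:image_sc\[([\s\S]*?)\]/g;

// Pull a key="value" attribute out of a shortcode body.
function readAttribute(body: string, name: string): string | undefined {
  const match = new RegExp(`\\b${name}="([^"]*)"`).exec(body);
  return match ? match[1] : undefined;
}

// Replace every image shortcode with an inline "*Diagram: ...*" line.
function shortcodesToDiagramLines(source: string): string {
  return source.replace(IMAGE_SHORTCODE, (_whole: string, body: string) => {
    // Assumption of this sketch: use alt when summary is missing, else drop the image.
    const summary = readAttribute(body, "summary") ?? readAttribute(body, "alt");
    return summary ? `*Diagram: ${summary}*` : "";
  });
}

const shortcodeInput = [
  "Before.",
  "",
  ":image_sc[",
  '  src="publishing-flow-horizontal.svg",',
  '  alt="Traditional publishing process",',
  '  summary="content creation, editing, transformation, publication, distribution"',
  "]",
  "",
  "After.",
].join("\n");

console.log(shortcodesToDiagramLines(shortcodeInput));
```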

You can see this in practice in the Markdown version of the original article.

That is content-presentation separation applied at the image level.

The distinction that matters

Traditional SEO optimised content for search engine crawlers:

  • keyword density
  • metadata
  • backlink structure
  • heading hierarchies for ranking signals

AI systems do not rank pages. They synthesise content.

What matters for AI consumption is different:

  • clarity of meaning
  • consistency of structure
  • semantic density
  • self-contained sections that can be retrieved and reasoned about independently
  • explicit relationships between ideas

The interesting observation is that these requirements are not new to AI.

They are the same requirements that make content worth writing in the first place.

Optimising for AI consumption and optimising for clarity are not separate concerns. They converge on the same architectural decisions.

Sound decisions adapt without restructuring

The practical consequence of this conversation was minimal.

Two Astro endpoint files were added to the build pipeline:

  1. one generating /llms.txt as a structured index of all content
  2. one generating per-page .md files at predictable URLs

The content itself required no changes. The build pipeline extended without modification to the source structure. The image summary field already handled the one case that could have been a gap.
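For readers unfamiliar with Astro static endpoints, the per-page case might look roughly like the sketch below. The file path, the `MarkdownPage` shape, and the inline `pages` array are assumptions for illustration; in a real project the data would come from the content source, and the handler could be typed with Astro's `APIRoute` instead of the hand-written parameter type used here to keep the block standalone.

```typescript
// Sketch of a file such as src/pages/[slug].md.ts (the path is an assumption).
// Written without Astro imports so it reads standalone.

interface MarkdownPage {
  slug: string;
  markdown: string;
}

// Stand-in for Markdown generated from the content sources at build time.
const pages: MarkdownPage[] = [
  { slug: "static-site", markdown: "# Why static\n\nContent ownership.\n" },
];

// Astro calls getStaticPaths at build time to learn which URLs to emit.
export function getStaticPaths() {
  return pages.map((page) => ({
    params: { slug: page.slug },
    props: { markdown: page.markdown },
  }));
}

// Called once per path; in a static build the response body is written to /<slug>.md.
export function GET({ props }: { props: { markdown: string } }): Response {
  return new Response(props.markdown, {
    headers: { "Content-Type": "text/markdown; charset=utf-8" },
  });
}
```

In a static build only the response body ends up in the generated file; the header mainly matters when the endpoint is served by the dev server. The /llms.txt endpoint would follow the same pattern without `getStaticPaths`, since it produces a single file.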

That is what I mean by a foundation that ages well.

Not that it anticipated every future requirement.

But that the underlying principles — explicit structure, separated concerns, semantic clarity — made adaptation straightforward when new requirements arrived.

What I am still building toward

The statically generated website was never the final goal.

It is the foundation.

What I am building toward is a content system where:

  • ideas evolve traceably
  • perspectives refine explicitly over time
  • structure remains machine-readable at every layer
  • and AI can assist in reasoning over knowledge without becoming a source of uncontrolled noise itself

The llms.txt conversation was one data point.

It confirmed the direction.

It did not change it.