I research how AI agents work, how they consume information, and how the ecosystems around them are evolving. This page collects that work in one place. For my documentation background, see Documentation & Developer Education. For my programming projects, see Programming.
Talks & Interviews
- State of Docs Report 2026 - Featured discussing AI consumption of documentation. Podcast forthcoming.
- Can ANY AI Pass This Agent Reading Test? (BetterStack YouTube, 2026) - Walkthrough of the Agent Reading Test, a benchmark for measuring how AI agent web fetch pipelines handle documentation failure modes.
- Why AI Agents Struggle with Modern Documentation (YouTube, 2026) - Interview covering how agents access documentation in real time and the failure modes most docs teams don't know about.
- Designing Documentation for Agents, Not Just Users (Behind the Docs podcast, March 2026) - How AI agents struggle to use existing documentation, common structural issues like truncation and hidden content, and the Agent-Friendly Documentation Spec as a solution framework.
- When AI Reads the Docs: LLMs, Agents, and Documentation Design (Deborah Emeni's YouTube Coffee Chat, March 2026) - How LLMs and agents are two very different documentation consumers, what "AI-friendly documentation" actually means, how documentation structure affects machine interpretation, and how documentation practices may evolve as machines increasingly consume technical content.
Specifications & Standards
Agent-Friendly Documentation Spec
A specification defining 22 checks across 7 categories for evaluating how well a documentation site serves AI agent consumers. Covers llms.txt discovery, markdown availability, page size, content structure, URL stability, and more. Based on real-world agent access patterns I've been researching since late 2025.
- Links: agentdocsspec.com
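As a flavor of what one of the spec's discovery checks involves, here is a minimal sketch. It assumes, purely for illustration, that a check can be expressed as deriving well-known root URLs from any docs page URL; `llms-full.txt` as a companion file is a common convention, not a claim about the spec's exact check list (see agentdocsspec.com for the real definitions).

```typescript
// Derive candidate agent-discovery URLs for a docs site.
// llms.txt lives at the site root per the llms.txt proposal;
// llms-full.txt is included here as an assumed companion convention.
function discoveryUrls(siteUrl: string): string[] {
  const origin = new URL(siteUrl).origin;
  return [`${origin}/llms.txt`, `${origin}/llms-full.txt`];
}
```

A check like this would pass when fetching a candidate URL returns a 200 with a plain-text or markdown body; the fetch itself is left to the caller.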
Tools
afdocs
A CLI tool that implements the Agent-Friendly Documentation Spec and tests docs sites against it. Point it at a URL and it reports where your docs stand. Published on npm. Fern's Agent Score directory uses afdocs to score API documentation sites at scale.
- Language: TypeScript
- Links: afdocs.dev ・ GitHub ・ npm
skill-validator
A CLI that validates Agent Skills against the agentskills.io specification. Checks directory structure, frontmatter, content quality, cross-contamination risk, and token budget composition.
- Language: Go
- Links: GitHub
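The tool itself is written in Go, but the shape of a frontmatter check can be sketched in a few lines of TypeScript. This assumes, for illustration only, that a skill's SKILL.md must open with a YAML frontmatter block carrying non-empty `name` and `description` fields; the agentskills.io specification is the authority on the actual required fields.

```typescript
// Illustrative frontmatter check: extract the YAML block at the top of
// a SKILL.md and confirm the fields this sketch assumes are required
// ("name", "description") are present and non-empty.
function checkFrontmatter(skillMd: string): string[] {
  const match = skillMd.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return ["missing frontmatter block"];
  const errors: string[] = [];
  for (const field of ["name", "description"]) {
    const line = match[1]
      .split("\n")
      .find((l) => l.startsWith(`${field}:`));
    if (!line || line.slice(field.length + 1).trim() === "") {
      errors.push(`missing or empty field: ${field}`);
    }
  }
  return errors;
}
```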
Research & Analysis
Agent Skill Ecosystem Analysis
An ecosystem-scale analysis of 673 Agent Skills across 41 repositories, examining compliance with the Agent Skills specification and content quality. Includes an interactive dashboard and a downloadable paper.
- Links: Interactive Report ・ Blog post
Agent Skill Implementation Research
Empirical research into how agent platforms actually implement Agent Skill loading, management, and presentation. Catalogs 23 checks across 9 categories, with 17 benchmark skills containing canary phrases for testing platform behavior without relying on model self-reporting. A community-driven project accepting per-platform contributions.
- Links: agentskillimplementation.com ・ GitHub ・ Blog post
Agent Reading Test
A benchmark for measuring how AI agent web fetch pipelines handle real-world documentation failure modes. 10 test pages target specific failures (truncation, SPA shells, tabbed content, redirects, soft 404s) using canary tokens embedded at strategic positions. Task-first design prevents relevance-layer priming, and human-side scoring avoids agent self-report inflation.
- Links: agentreadingtest.com ・ GitHub ・ Blog post
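The human-side scoring idea reduces to a simple operation: search the agent's transcript for the canary tokens embedded in the test pages, rather than asking the agent whether it "saw" the content. A minimal sketch (token names here are hypothetical):

```typescript
// Count how many of a page's canary tokens an agent actually surfaced
// in its transcript. Scoring the transcript directly sidesteps agent
// self-report inflation.
function scoreCanaries(canaries: string[], transcript: string): number {
  return canaries.filter((token) => transcript.includes(token)).length;
}
```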
Agent Web Fetch Behavior
Research into how coding agents actually fetch and process web content, including truncation behavior, redirect handling, and content negotiation across platforms.
- Links: Blog post
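One of the behaviors under study, content negotiation, can be probed by requesting the same URL with different `Accept` headers and comparing the responses. The header values below are standard HTTP media types; whether a given docs host honors them is exactly what the research measures.

```typescript
// Build an Accept header that prefers markdown (or HTML) so the same
// URL can be fetched twice and the two responses compared.
function acceptHeaderFor(prefersMarkdown: boolean): Record<string, string> {
  return {
    Accept: prefersMarkdown
      ? "text/markdown, text/plain;q=0.9, text/html;q=0.5"
      : "text/html, application/xhtml+xml;q=0.9",
  };
}
```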
Agent-Friendly Documentation Audit
An analysis of hundreds of documentation pages across popular developer tools, examining how well they serve AI agent consumers. The research that led to the Agent-Friendly Documentation Spec.
- Links: Blog post
Writing
I write about agents, documentation, and the AI ecosystem on this blog and at AE Shift.
Selected articles:
- Designing an Agent Reading Test - Building a benchmark that survives score inflation, relevance-layer priming, and the Hawthorne effect
- Measure Agent Web Traffic Redux - Revisiting agent web traffic measurement with updated methods
- The Verification Gap in AI Content Pipelines - Where AI content pipelines break down between generation and publication
- When a Feature Request Becomes a Research Project - How an evals/ directory question turned into a 26-platform empirical research project
- Why a Platform Shouldn't Own an Open Spec - How Anthropic's stewardship of the Agent Skills spec is fragmenting the ecosystem
- Is Your llms.txt Already Stale? - Building a freshness check and discovering the tools were the problem
- Agent Skill Mega Repo Woes - Validating a 23.7k-star skill mega repo and finding problems the star count won't tell you
- An Agent is More Than Its Brain - What's inside a coding agent, and why the model is only one piece
- LLMs vs. Agents as Docs Consumers - Why "AI-friendly docs" means two different things
- Case Study: upgrade-stripe Agent Skill - Deep dive on a real-world Agent Skill
- Make Your Hugo Site Agent-Friendly - Practical how-to for static site owners
- Upskilling in the AI Age - Advice for people getting started with AI tools
For all AI-related posts, see the ai tag.