Markdown for AI Agents is a lightweight WordPress plugin that enables HTTP content negotiation for your site’s content. When a client (like an AI agent or a custom script) requests a page with the Accept: text/markdown header, the plugin intercepts the request and returns a clean, structured Markdown representation of the post or page content.
This is ideal for AI crawlers, RAG (Retrieval-Augmented Generation) systems, and non-browser clients that prefer machine-friendly text over complex HTML.
Important note: This plugin is primarily a developer/integration tool. Human visitors browsing your site will never see any difference — the Markdown output is only served when explicitly requested via the Accept: text/markdown HTTP header. Normal browser requests always receive the standard HTML page.
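As a minimal sketch of what "explicitly requested" means on the client side, the following Python snippet builds a request that carries the Accept: text/markdown header. The function names and the example URL are illustrative assumptions, not part of the plugin.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for the Markdown representation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body decoded as UTF-8."""
    request = build_markdown_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (hypothetical URL; point it at a post on a site running the plugin):
#   print(fetch_markdown("https://example.com/sample-post/"))
```

A request without this header, such as one from a normal browser, should continue to receive the HTML page.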
Key Features:
- Automatically detects Accept: text/markdown headers.
- Converts HTML content to clean Markdown using the League HTMLToMarkdown library.
- Strips away theme layout, navigation, headers, footers, and sidebars — serving only the main content.
- Adds useful HTTP response headers: Content-Type: text/markdown, Vary: Accept, and X-Markdown-Word-Count.
- Respects WordPress visibility rules and filters.
- No configuration required — works out of the box for posts, pages, and custom post types.
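A client can use the response headers listed above to confirm that it actually received Markdown. The sketch below is one way to do that in Python. It assumes that X-Markdown-Word-Count carries a plain integer, which the description does not state, and the function name is hypothetical.

```python
from typing import Mapping, Optional, Tuple


def inspect_markdown_response(headers: Mapping[str, str]) -> Tuple[bool, Optional[int]]:
    """Return (is_markdown, word_count) based on the response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Drop parameters such as "; charset=UTF-8" before comparing the media type.
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if media_type != "text/markdown":
        return (False, None)

    # Assumption: the word count header holds a non-negative integer.
    raw_count = normalized.get("x-markdown-word-count", "").strip()
    word_count = int(raw_count) if raw_count.isdigit() else None
    return (True, word_count)
```

If the first element is False, the site most likely returned ordinary HTML and the client should handle it accordingly.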
How It Works
This plugin uses a standard web technique called HTTP content negotiation. The same URL on your site can serve different representations of the same content depending on what the client asks for:
- A regular browser sends Accept: text/html and receives your normal HTML page.
- An AI agent sends Accept: text/markdown and receives a clean Markdown version of the same page.
No extra URLs, no duplicate content, no configuration needed. The plugin hooks into WordPress’s template_redirect action, detects the Accept header, captures the rendered HTML, converts it to Markdown, and returns it with appropriate headers.
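The detection step can be modeled in a few lines. This Python sketch is only an illustration of the negotiation decision, not the plugin's PHP code: the exact matching rules the plugin applies are not described here, so the handling of q-values and wildcards below is an assumption.

```python
def accepts_markdown(accept_header: str) -> bool:
    """Return True if the Accept header explicitly lists text/markdown with q > 0."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "text/markdown":
            continue

        quality = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        return quality > 0

    # Wildcards such as */* are deliberately not treated as a Markdown request,
    # so that ordinary browsers keep receiving HTML.
    return False
```

For example, accepts_markdown("text/html,application/xhtml+xml,*/*;q=0.8") is False, while accepts_markdown("text/markdown") is True.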
Why Markdown for AI Agents?
When building RAG (Retrieval-Augmented Generation) applications or AI pipelines that ingest web content, HTML is extremely noisy. A typical WordPress page contains thousands of tokens' worth of HTML tags, inline styles, navigation menus, scripts, and layout markup, none of which carries meaning for an AI model.
Serving clean Markdown instead can reduce token consumption by up to 60%, which means:
- Lower API costs — fewer tokens ingested when loading pages into vector stores or LLM pipelines.
- Faster processing — less text for the model to parse, filter, and discard.
- Better retrieval accuracy — higher signal-to-noise ratio improves the quality of RAG results.
- Simpler pipelines — no need for custom HTML stripping logic on the client side; the plugin handles it server-side.
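To check the savings on your own pages, you can compare the size of the HTML and Markdown responses for the same URL. The helper below is a rough sketch that uses character counts as a stand-in for tokens. That proxy is an assumption; for real numbers, substitute the tokenizer your model uses.

```python
def reduction_percent(html_text: str, markdown_text: str) -> float:
    """Percentage by which the Markdown version is smaller than the HTML version.

    Sizes are measured in characters, which is only a crude proxy for tokens.
    """
    if not html_text:
        return 0.0
    return 100.0 * (len(html_text) - len(markdown_text)) / len(html_text)
```

Feed it the two response bodies for one page. A negative result means the Markdown version was larger than the HTML.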
Any AI agent, crawler, or ingestion script that sends Accept: text/markdown in its request header will automatically receive the clean Markdown version — no extra URLs, no separate endpoints, no changes to your content workflow.