Markdown for AI Agents

Serve clean Markdown versions of WordPress content to AI agents using HTTP content negotiation.

By Selvakumar Duraipandian

Version 1.0.0 | Active Installs 0+ | Updated 7 days ago | 6 days old

Description

Markdown for AI Agents is a lightweight WordPress plugin that enables HTTP content negotiation for your site’s content. When a client (like an AI agent or a custom script) requests a page with the Accept: text/markdown header, the plugin intercepts the request and returns a clean, structured Markdown representation of the post or page content.

This is ideal for AI crawlers, RAG (Retrieval-Augmented Generation) systems, and non-browser clients that prefer machine-friendly text over complex HTML.

Important note: This plugin is primarily a developer/integration tool. Human visitors browsing your site will never see any difference — the Markdown output is only served when explicitly requested via the Accept: text/markdown HTTP header. Normal browser requests always receive the standard HTML page.
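For illustration, a client might request the Markdown representation roughly as follows. This is a minimal sketch using only the Python standard library; the function names build_markdown_request and fetch_markdown and the example URL are hypothetical, not part of the plugin.

```python
from urllib.request import Request, urlopen


def build_markdown_request(url: str) -> Request:
    """Build a GET request that explicitly asks for text/markdown."""
    return Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL and return the response body as text.

    Performs a real network call; the URL is assumed to point at a
    WordPress site with the plugin active.
    """
    with urlopen(build_markdown_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

Calling fetch_markdown("https://example.com/hello-world/") against a site running the plugin should return Markdown. The same call without the Accept header would return the normal HTML page.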

Key Features:

  • Automatically detects Accept: text/markdown headers.
  • Converts HTML content to clean Markdown using the League HTMLToMarkdown library.
  • Strips away theme layout, navigation, headers, footers, and sidebars — serving only the main content.
  • Adds useful HTTP response headers: Content-Type: text/markdown, Vary: Accept, and X-Markdown-Word-Count.
  • Respects WordPress visibility rules and filters.
  • No configuration required — works out of the box for posts, pages, and custom post types.
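A client can use the response headers listed above to confirm that it actually received Markdown. The sketch below is one possible way to do that in Python. The helper name summarize_markdown_response is hypothetical, and it assumes X-Markdown-Word-Count carries a plain integer, which the description does not state explicitly.

```python
from typing import Mapping


def summarize_markdown_response(headers: Mapping[str, str]) -> dict:
    """Inspect response headers and report whether Markdown was served."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Drop parameters such as "; charset=UTF-8" from the media type.
    media_type = lowered.get("content-type", "").split(";")[0].strip().lower()

    # Vary may list several header names separated by commas.
    vary = [v.strip().lower() for v in lowered.get("vary", "").split(",") if v.strip()]

    # Assumption: the word count is sent as a plain decimal integer.
    raw_count = lowered.get("x-markdown-word-count", "").strip()

    return {
        "is_markdown": media_type == "text/markdown",
        "varies_on_accept": "accept" in vary,
        "word_count": int(raw_count) if raw_count.isdigit() else None,
    }
```

Checking is_markdown before ingesting the body lets a pipeline fall back to its own HTML handling when the plugin is not installed on a given site.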

How It Works

This plugin uses a standard web technique called HTTP content negotiation. The same URL on your site can serve different representations of the same content depending on what the client asks for:

  • A regular browser sends Accept: text/html and receives your normal HTML page.
  • An AI agent sends Accept: text/markdown and receives a clean Markdown version of the same page.

No extra URLs, no duplicate content, no configuration needed. The plugin hooks into WordPress’s template_redirect action, detects the Accept header, captures the rendered HTML, converts it to Markdown, and returns it with appropriate headers.
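The detection step can be sketched as follows. The plugin itself runs in PHP inside WordPress and its exact matching logic is not described here, so this Python version only illustrates the general technique of Accept-header negotiation; parse_accept and wants_markdown are hypothetical names. The sketch deliberately ignores wildcards such as */*, which matches the stated behavior that normal browser requests always receive HTML.

```python
def parse_accept(header: str) -> dict:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media = pieces[0].lower()
        if not media:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    # Treat a malformed q-value as "not acceptable".
                    q = 0.0
        prefs[media] = q
    return prefs


def wants_markdown(header: str) -> bool:
    """True only when text/markdown is listed explicitly with a non-zero q."""
    return parse_accept(header).get("text/markdown", 0.0) > 0.0
```

A fuller implementation might also compare the q-values of text/markdown and text/html and serve whichever the client ranks higher. This sketch only checks that Markdown was asked for by name.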

Why Markdown for AI Agents?

When building RAG (Retrieval-Augmented Generation) applications or AI pipelines that ingest web content, HTML is extremely noisy. A typical WordPress page contains thousands of tokens' worth of HTML tags, inline styles, navigation menus, scripts, and layout markup, none of which carries meaning for an AI model.

Serving clean Markdown instead can reduce token consumption by up to 60%, which means:

  • Lower API costs — fewer tokens ingested when loading pages into vector stores or LLM pipelines.
  • Faster processing — less text for the model to parse, filter, and discard.
  • Better retrieval accuracy — higher signal-to-noise ratio improves the quality of RAG results.
  • Simpler pipelines — no need for custom HTML stripping logic on the client side; the plugin handles it server-side.
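To make the cost argument concrete, the arithmetic can be sketched like this. The function name estimate_savings, the page size, and the price per million tokens are all hypothetical inputs chosen for illustration. The 60% figure is the upper bound quoted above, not a measured result for any particular site.

```python
def estimate_savings(html_tokens: int, reduction: float, price_per_million: float) -> dict:
    """Estimate tokens and cost saved when a page is served as Markdown.

    reduction is the fraction of tokens removed (0.6 means 60% fewer tokens).
    price_per_million is a hypothetical price per one million input tokens.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    markdown_tokens = round(html_tokens * (1.0 - reduction))
    tokens_saved = html_tokens - markdown_tokens
    return {
        "markdown_tokens": markdown_tokens,
        "tokens_saved": tokens_saved,
        "cost_saved": tokens_saved * price_per_million / 1_000_000,
    }
```

For example, estimate_savings(10_000, 0.6, 3.0) models a 10,000-token HTML page at the best-case reduction. Real savings depend on the theme and the content, so measuring a few representative pages gives a more reliable figure.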

Any AI agent, crawler, or ingestion script that sends Accept: text/markdown in its request header will automatically receive the clean Markdown version — no extra URLs, no separate endpoints, no changes to your content workflow.
