# Serve Lean Markdown to LLM Agents via Accept Header

> Detect the Accept header and serve lean Markdown to LLMs while humans see normal HTML. Includes Cloudflare Workers and Caddy configs for Astro static sites.

Agents don't need to see websites with markup and styling. Anything other than plain Markdown is wasted money spent on context tokens. By inspecting the `Accept` header on incoming requests, you can serve a lean Markdown version of your pages to LLMs while humans continue to see normal HTML.

This technique was inspired by [a post from the Bun team on X](https://x.com/bunjavascript/status/1971934734940098971) and is now live on skeptrune.com. You can verify it right now:

```bash theme={null}
curl -H "Accept: text/markdown" https://www.skeptrune.com
curl -H "Accept: text/plain" https://www.skeptrune.com
```

The motivation is both economic and strategic. The Bun team reported a 10x token drop for Markdown vs HTML. Every token an agent reads costs money, so the bet is that cheaper pages get scraped more often, are more likely to end up in training data, and earn a little extra lift from AI assistants and search.

## Static site generators are already halfway there

Static site generators like Astro and Gatsby already generate a big folder of HTML files, typically in a `dist` or `public` folder through `npm run build`. The only missing piece is converting those HTML files to Markdown.

There's a great CLI tool for this: [html-to-markdown](https://www.npmjs.com/package/@wcj/html-to-markdown-cli). Install it as a dev dependency:

```bash theme={null}
npm install -D @wcj/html-to-markdown-cli
```

Here's a Bash script that converts all HTML files in `dist/html` to Markdown files in `dist/markdown`, preserving the directory structure:

```bash theme={null}
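#!/usr/bin/env bash
# convert-to-markdown.sh
# Convert every HTML file under dist/html to Markdown under dist/markdown,
# mirroring the directory structure.
set -euo pipefail

mkdir -p dist/markdown

find dist/html -type f -name "*.html" | while IFS= read -r file; do
    relative_path="${file#dist/html/}"                   # e.g. blog/post/index.html
    dest_path="dist/markdown/${relative_path%.html}.md"  # e.g. dist/markdown/blog/post/index.md
    mkdir -p "$(dirname "$dest_path")"
    npx @wcj/html-to-markdown-cli "$file" --stdout > "$dest_path"
done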
```

Wire this into your `package.json` as a post-build action:

```json theme={null}
"scripts": {
    "build": "astro build && yarn mv-html && yarn convert-to-markdown",
    "mv-html": "mkdir -p dist/html && find dist -type f -name '*.html' -not -path 'dist/html/*' -exec sh -c 'for f; do dest=\"dist/html/${f#dist/}\"; mkdir -p \"$(dirname \"$dest\")\"; mv -f \"$f\" \"$dest\"; done' sh {} +",
    "convert-to-markdown": "bash convert-to-markdown.sh"
}
```

<Note>
  Moving all HTML files to `dist/html` first is only necessary if you're using Cloudflare Workers, which will serve existing static assets before falling back to your Worker. If you're using a traditional reverse proxy, skip that step and convert directly from `dist` to `dist/markdown`.
</Note>

<Note>
  There's also a simpler Cloudflare-specific alternative: set `run_worker_first` to `true` inside the `assets` block of your `wrangler.jsonc`. This forces the Worker to run before static assets are served, so you don't have to move files around at all.
</Note>

## Cloudflare Workers configuration

If you're hosting on Cloudflare Workers, configuring this requires more steps than a traditional reverse proxy. If you're using Nginx or Caddy, skip this section — you'll have an easier time.

Cloudflare Workers force you into a different paradigm. What would normally be a simple Nginx rewrite rule becomes custom `wrangler.jsonc` configuration, shadow directories, and JavaScript that manually checks headers and uses `env.ASSETS.fetch` to serve files.

This is also what makes Next.js middleware click: it's not middleware in the REST API sense. It's more like "use this where you would normally have a real reverse proxy." Both Cloudflare Workers and Next.js Middleware are JavaScript-based reverse proxies that intercept requests before they hit your application.

### wrangler.jsonc

Reference a new worker script and bind your build output directory as a static asset namespace:

```jsonc theme={null}
{
  "main": "worker.js",
  "assets": {
    "directory": "./dist",
    "binding": "ASSETS"
  }
}
```

### Worker script

Below is a minimal worker that inspects the `Accept` header and serves Markdown when requested, otherwise falling back to HTML:

```javascript theme={null}
export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const acceptHeader = request.headers.get("accept") || "";
    const acceptTypes = acceptHeader.split(",");

    const plainIndex = acceptTypes.findIndex(
      (t) => t.includes("text/plain") || t.includes("text/markdown")
    );
    const htmlIndex = acceptTypes.findIndex((t) => t.includes("text/html"));
    const prefersMarkdown =
      plainIndex !== -1 && (htmlIndex === -1 || plainIndex < htmlIndex);

    const tryServeContent = async (format) => {
      let contentType;
      if (format === "markdown") {
        if (url.pathname === "" || url.pathname === "/") {
          const sitemapResponse = await env.ASSETS.fetch(
            new Request(new URL("/sitemap-0.xml", request.url))
          );
          if (sitemapResponse.ok) {
            const content = await sitemapResponse.text();
            return new Response(content, {
              headers: {
                "Content-Type": "application/xml; charset=utf-8",
                "Cache-Control": "public, max-age=3600",
                // The response depends on the Accept header, so caches must key on it.
                Vary: "Accept",
              },
            });
          }
        }

        contentType = "text/plain; charset=utf-8";
        let distPath = `/markdown${url.pathname}`;

        if (!distPath.endsWith(".md") && !distPath.endsWith("/")) {
          distPath += "/index.md";
        } else if (distPath.endsWith("/")) {
          distPath += "index.md";
        }

        if (url.pathname === "/") {
          distPath = "/markdown/index.md";
        }

        try {
          const response = await env.ASSETS.fetch(
            new Request(new URL(distPath, request.url))
          );
          if (response.ok) {
            const content = await response.text();
            return new Response(content, {
              headers: {
                "Content-Type": contentType,
                "Cache-Control": "public, max-age=3600",
                // The response depends on the Accept header, so caches must key on it.
                Vary: "Accept",
              },
            });
          }
        } catch (error) {
          console.error(`Error fetching markdown file from ${distPath}:`, error);
        }
      } else {
        contentType = "text/html; charset=utf-8";
        let distPath = `/html${url.pathname}`;

        if (!distPath.endsWith(".html") && !distPath.endsWith("/")) {
          distPath += "/index.html";
        } else if (distPath.endsWith("/")) {
          distPath += "index.html";
        }

        if (url.pathname === "/") {
          distPath = "/html/index.html";
        }

        try {
          const response = await env.ASSETS.fetch(
            new Request(new URL(distPath, request.url))
          );
          if (response.ok) {
            const content = await response.text();
            return new Response(content, {
              headers: {
                "Content-Type": contentType,
                "Cache-Control": "public, max-age=3600",
                // The response depends on the Accept header, so caches must key on it.
                Vary: "Accept",
              },
            });
          }
        } catch (error) {
          console.error(`Error fetching HTML file from ${distPath}:`, error);
        }
      }

      return null;
    };

    if (prefersMarkdown) {
      const markdownResponse = await tryServeContent("markdown");
      if (markdownResponse) return markdownResponse;

      const htmlResponse = await tryServeContent("html");
      if (htmlResponse) return htmlResponse;
    } else {
      const htmlResponse = await tryServeContent("html");
      if (htmlResponse) return htmlResponse;

      const markdownResponse = await tryServeContent("markdown");
      if (markdownResponse) return markdownResponse;
    }

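    // Serve the custom 404 page with a real 404 status rather than the asset's own status.
    const notFound = await env.ASSETS.fetch(
      new Request(new URL("/html/404.html", request.url))
    );
    return new Response(notFound.body, {
      status: 404,
      headers: notFound.headers,
    });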
  },
};
```
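
The worker above decides by position: whichever of `text/markdown`, `text/plain`, or `text/html` appears first in the `Accept` header wins. Clients can also attach `q` weights to each type, which a position check ignores. The sketch below shows one way to honor those weights. `negotiateFormat` is a hypothetical helper, not part of the worker, and it deliberately ignores wildcards such as `*/*`.

```javascript
// Sketch: pick a format from an Accept header, honoring q-values.
// `negotiateFormat` is a hypothetical helper, not part of the worker above.
function negotiateFormat(acceptHeader) {
  const markdownTypes = new Set(["text/markdown", "text/plain"]);
  // Default to HTML when nothing relevant is acceptable.
  let best = { format: "html", q: 0 };

  for (const part of (acceptHeader || "").split(",")) {
    const [rawType, ...params] = part.trim().split(";");
    const type = rawType.trim().toLowerCase();

    let format;
    if (markdownTypes.has(type)) format = "markdown";
    else if (type === "text/html") format = "html";
    else continue; // ignore wildcards and unrelated types

    // A type without an explicit q parameter has weight 1.
    let q = 1;
    for (const param of params) {
      const [key, value] = param.trim().split("=");
      if (key.trim().toLowerCase() === "q") {
        const parsed = Number.parseFloat(value);
        if (!Number.isNaN(parsed)) q = parsed;
      }
    }

    // Higher q wins; on a tie, the type listed first is kept.
    if (q > best.q) best = { format, q };
  }

  return best.format;
}
```

With this helper, `prefersMarkdown` could be computed as `negotiateFormat(acceptHeader) === "markdown"`. A header like `text/html;q=0.5, text/markdown` then resolves to Markdown even though `text/html` is listed first.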

<Tip>
  When Markdown is requested, make the root path `/` serve your sitemap (`sitemap-0.xml` in the worker above) instead of the Markdown version of your homepage. That way, an agent visiting your root URL can see all the links on your site and discover content efficiently.
</Tip>
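
The routing rules in the worker, including the root-path special case, can be condensed into one pure function, which makes them easier to read and test. This is a sketch only: `resolveAssetPath` is a hypothetical name, and the `/markdown` and `/html` prefixes and `/sitemap-0.xml` are taken from the worker above.

```javascript
// Sketch: map a request pathname to a file inside the assets directory.
// `resolveAssetPath` is a hypothetical helper mirroring the worker's rules.
function resolveAssetPath(pathname, format) {
  const isMarkdown = format === "markdown";
  const prefix = isMarkdown ? "/markdown" : "/html";
  const extension = isMarkdown ? ".md" : ".html";

  // Root special case: agents get the sitemap rather than the homepage.
  if (isMarkdown && (pathname === "" || pathname === "/")) {
    return "/sitemap-0.xml";
  }

  let path = `${prefix}${pathname}`;
  if (path.endsWith("/")) {
    // Directory-style URL: /blog/post/ -> /blog/post/index.md
    path += `index${extension}`;
  } else if (!path.endsWith(extension)) {
    // Extensionless URL: /blog/post -> /blog/post/index.md
    path += `/index${extension}`;
  }
  return path;
}
```

Unlike the worker, this sketch does not fall back to `/markdown/index.md` when the sitemap is missing, so a caller would still need to handle a failed fetch.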

## Caddy configuration

If you're using a traditional reverse proxy, Caddy makes this significantly simpler. Here's a complete `Caddyfile` configuration:

```caddyfile theme={null}
your-personal-domain.com {
    root * /path/to/your/dist
    file_server

    @markdown {
        header Accept *text/markdown*
        header Accept *text/plain*
        not header Accept *text/html*
    }
    handle @markdown {
        rewrite * /markdown{path}/index.md
        try_files {path} {path}.md /markdown/index.md
        file_server
    }

    handle {
        rewrite * /html{path}/index.html
        try_files {path} {path}.html /html/index.html
        file_server
    }

    handle_errors {
        rewrite * /html/404.html
        file_server
    }
}
```
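
The `@markdown` matcher is stricter than the worker's position check. The sketch below restates its logic in JavaScript for comparison, assuming Caddy ORs repeated values for the same header field and ANDs the `not` clause with them. `caddyMatchesMarkdown` is a hypothetical name used only for illustration.

```javascript
// Sketch of the @markdown matcher's logic, for comparison with the worker.
// `caddyMatchesMarkdown` is a hypothetical name, not a Caddy API.
function caddyMatchesMarkdown(acceptHeader) {
  const accept = acceptHeader || "";
  // Either text type is enough (the two `header Accept` lines)...
  const wantsText =
    accept.includes("text/markdown") || accept.includes("text/plain");
  // ...but any mention of text/html disqualifies the request (the `not` line).
  const wantsHtml = accept.includes("text/html");
  return wantsText && !wantsHtml;
}
```

Under this reading, a request with `Accept: text/markdown, text/html` gets HTML from Caddy, while the worker would serve Markdown because `text/markdown` is listed first.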

Nginx configuration is left as an exercise for the reader — or the reader's LLM of choice.

## Conclusion: a more accessible web for agents

Serving lean, semantic Markdown to LLM agents can cut token usage substantially (the Bun team reported roughly 10x) while making your content easier to consume for the AI systems that increasingly browse the web.

This optimization isn't just about saving money on tokens. It's about GEO (Generative Engine Optimization) for a world where millions of users discover content through AI assistants. Cheaper pages get scraped more, are more likely to end up in training data, and earn more visibility from assistants and search.

Astro's flexibility made this implementation surprisingly straightforward — only a couple of hours to get a personal blog and a production app to support the feature.

For a fun exercise, copy the URL of a blog post and ask your favorite LLM to "Use the blog post to write a Cloudflare Worker for my own site." See how it does. Source code for a working implementation is at [github.com/skeptrunedev/personal-site](https://github.com/skeptrunedev/personal-site).
