Markdown export for every page + AI-bot-aware routing #10

Merged
nadyyym merged 1 commit into master from feat/markdown-pages on May 2, 2026
nadyyym commented May 2, 2026

Summary

  • Every page on www.getbeton.ai is now available as a Markdown file (PostHog-style: append .md to any URL).
  • Adds /llms.txt (curated index, auto-generated) and /llms-full.txt (concatenated bundle).
  • AI-crawler User-Agents (GPTBot, ClaudeBot, PerplexityBot, etc.) are auto-rewritten to the .md version of any indexed page via Vercel rewrites — humans and search engines hit the HTML unchanged.
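
A minimal sketch of how the UA-conditioned rewrites could be generated, assuming Vercel's `has` header condition on each rewrite. The helper name `buildAiRewrites`, the three example routes, and the exact regex are illustrative assumptions, not the contents of the PR's `vercel.json`; the bot list only covers the names mentioned above.

```typescript
// Hypothetical helper: builds a Vercel `rewrites` array that sends AI-crawler
// User-Agents to the Markdown twin of each page. Not the PR's actual config.

type HeaderCondition = { type: "header"; key: string; value: string };
type Route = { source: string; destination: string };
type Rewrite = Route & { has: HeaderCondition[] };

// Bots named in the PR summary; the real list ("etc.") may be longer.
const AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"];

// Matches any User-Agent string that contains one of the bot tokens.
const AI_BOT_PATTERN = `.*(${AI_BOTS.join("|")}).*`;

function buildAiRewrites(routes: Route[]): Rewrite[] {
  return routes.map((route) => ({
    source: route.source,
    destination: route.destination,
    // The rewrite only applies when the request's User-Agent header matches.
    has: [{ type: "header", key: "user-agent", value: AI_BOT_PATTERN }],
  }));
}

// Example routes (assumed shapes, for illustration only).
const rewrites = buildAiRewrites([
  { source: "/", destination: "/index.md" },
  { source: "/about", destination: "/about.md" },
  { source: "/integrations/:slug", destination: "/integrations/:slug.md" },
]);

// Prints a block that could be pasted under "rewrites" in vercel.json.
console.log(JSON.stringify({ rewrites }, null, 2));
```

Requests without a matching User-Agent fall through every rule, so browsers and search-engine crawlers keep getting the static HTML.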

Why this instead of ostr.io

The site is fully static (Astro SSG): every page is pre-rendered HTML at build time. Routing bots through ostr.io would add a network hop with no prerendering value to extract. AI bots in particular benefit far more from clean Markdown than from re-rendered HTML, so we serve them the .md directly. Standard search-engine crawlers (Googlebot, Bingbot) keep hitting the static HTML, which is already optimal for them.

What's added

| Path | Source |
| --- | --- |
| /index.md | src/data/pages/home.md + integrations collection |
| /about.md, /privacy.md, /terms.md | corresponding src/data/pages/*.md body |
| /pricing.md | src/data/pages/pricing.md + tiers/addons/FAQ JSON |
| /team.md | team collection |
| /blog/index.md, /blog/<slug>.md | blog collection (raw md body) |
| /integrations/index.md, /integrations/<slug>.md | integrations JSON, rendered |
| /tools/dryfit.md + scenarios | dryfit-scenarios JSON, rendered |
| /llms.txt | auto-generated curated index |
| /llms-full.txt | auto-generated full bundle |

Plus: <link rel="alternate" type="text/markdown"> on every indexed HTML page for explicit discoverability, and a robots.txt reference to /llms.txt.
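
For illustration, the page-to-Markdown path mapping and the alternate link could look roughly like this. The function names, the `INDEX_DIRS` list, and the use of an absolute `href` are assumptions inferred from the table above, not code taken from `Head.astro`.

```typescript
// Hypothetical mapping from an HTML page path to its Markdown twin.
// Assumption: the section index pages listed here are exported as
// <dir>/index.md, and every other page is exported as <path>.md.
const INDEX_DIRS = ["/blog", "/integrations"];

function markdownPathFor(pathname: string): string {
  // Treat "/about/" and "/about" as the same page.
  const trimmed = pathname.replace(/\/+$/, "");
  if (trimmed === "") return "/index.md"; // the home page
  return INDEX_DIRS.indexOf(trimmed) !== -1
    ? `${trimmed}/index.md`
    : `${trimmed}.md`;
}

// Builds the <link> tag for a page. `origin` is an assumed parameter;
// the PR does not show whether the real href is absolute or relative.
function alternateLinkTag(origin: string, pathname: string): string {
  const href = origin + markdownPathFor(pathname);
  return `<link rel="alternate" type="text/markdown" href="${href}">`;
}

console.log(alternateLinkTag("https://www.getbeton.ai", "/integrations/posthog/"));
```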

Files

  • src/lib/markdown-export.ts — shared rendering helpers (integration / scenario / competitor / agent JSON → Markdown).
  • src/pages/**/*.md.ts, src/pages/llms.txt.ts, src/pages/llms-full.txt.ts — Astro static endpoints.
  • src/components/seo/Head.astro — adds the markdown alternate link (skipped on noindex pages).
  • vercel.json — adds rewrites block with UA-conditioned routes.
  • public/llms.txt — deleted (now generated).
  • public/robots.txt — adds reference to /llms.txt and /llms-full.txt.
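
As a rough sketch of what one of the static endpoints might look like: Astro maps a file such as `src/pages/about.md.ts` to `/about.md` and calls its `GET` export at build time. The inline `body` string is a placeholder; per the table above, the real endpoints read from `src/data/pages/*.md` and the content collections.

```typescript
// src/pages/about.md.ts (sketch only, not the file from this PR).

// Placeholder body; the PR sources this from src/data/pages/about.md.
const body = "# About\n\nPlaceholder content for illustration.\n";

// Astro calls GET once during the static build and writes the response
// body to the output file named after this endpoint.
export function GET(): Response {
  return new Response(body, {
    headers: { "Content-Type": "text/markdown; charset=utf-8" },
  });
}
```

In a fully static build the hosting layer, rather than this header, may decide the Content-Type that clients see, which is one reason the test plan checks it with curl.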

Test plan

  • Verify Vercel preview build succeeds.
  • Curl /about.md, /blog/we-rebuilt-beton-automated-signal.md, /integrations/posthog.md, /llms.txt, /llms-full.txt and confirm Content-Type: text/markdown (or text/plain for .txt) and clean output.
  • Curl with User-Agent: GPTBot/1.0 against /, /about/, /integrations/posthog/ and confirm response body matches the corresponding .md file (rewrite working).
  • Curl with a normal browser UA against the same paths and confirm HTML response (no rewrite triggered).
  • View source on a couple of pages and confirm <link rel="alternate" type="text/markdown" href="..."> is present (and absent on /seqd/ and /404).

🤖 Generated with Claude Code

Every page now has a `.md` companion (PostHog-style: append `.md` to any
URL). Generated at build time via Astro endpoints reading the same
content collections as the HTML pages — single source of truth.

What changed:

- Per-page `.md` endpoints under `src/pages/` for home, about, pricing,
  team, privacy, terms, blog index + posts, integrations index +
  details, dryfit tool + scenarios.
- `/llms.txt` curated index and `/llms-full.txt` concatenated bundle,
  both auto-generated. Replaces the previously hand-maintained
  `public/llms.txt`.
- `<link rel="alternate" type="text/markdown">` in every indexed page
  so AI agents can discover the markdown version.
- Vercel `rewrites` in `vercel.json` that route AI-crawler User-Agents
  (GPTBot, ClaudeBot, PerplexityBot, etc.) to the `.md` version of the
  page they requested. Humans and search engines hit the HTML
  unchanged. Replaces the need for a third-party prerender service like
  ostr.io — the site is fully static, so prerendering adds latency
  without value, while raw markdown is what AI bots actually want.
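
A sketch of how the two generated files could be assembled from per-page data. The `Entry` shape, the `## Pages` heading, and the `---` separator are assumptions for illustration; the PR does not show the actual layout of `/llms.txt` or `/llms-full.txt`.

```typescript
// Hypothetical per-page record shared by both generators.
type Entry = {
  title: string;
  url: string; // URL of the page's .md version
  description: string;
  markdown: string; // full Markdown body of the page
};

// Curated index: a title, a one-line summary, then one link per page.
function buildLlmsTxt(siteTitle: string, summary: string, entries: Entry[]): string {
  const links = entries.map((e) => `- [${e.title}](${e.url}): ${e.description}`);
  const head = [`# ${siteTitle}`, "", `> ${summary}`, "", "## Pages", ""];
  return head.concat(links, [""]).join("\n");
}

// Full bundle: every page body concatenated, separated by a rule.
function buildLlmsFullTxt(entries: Entry[]): string {
  return entries.map((e) => e.markdown.trim()).join("\n\n---\n\n") + "\n";
}

const sample: Entry[] = [
  {
    title: "About",
    url: "https://www.getbeton.ai/about.md",
    description: "Example description",
    markdown: "# About\n\nExample body.\n",
  },
];

console.log(buildLlmsTxt("Example site", "Example summary", sample));
console.log(buildLlmsFullTxt(sample));
```

Generating both files from the same entries keeps them in step with the per-page `.md` output, which is the point of dropping the hand-maintained `public/llms.txt`.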

Shared helpers in `src/lib/markdown-export.ts` render integration,
agent, competitor, and scenario JSON entries into clean Markdown with
consistent headers (title, description, canonical link).
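
To make the "consistent headers" idea concrete, here is a minimal sketch of a shared renderer. The type names, field names, and header layout are hypothetical and do not mirror the real `src/lib/markdown-export.ts`.

```typescript
// Hypothetical metadata every exported page carries.
type PageMeta = { title: string; description: string; canonicalUrl: string };

// Prepends the shared header (title, description, canonical link) to a body.
function renderWithHeader(meta: PageMeta, body: string): string {
  return [
    `# ${meta.title}`,
    "",
    `> ${meta.description}`,
    "",
    `Canonical: ${meta.canonicalUrl}`,
    "",
    body.trim(),
    "",
  ].join("\n");
}

// Hypothetical shape of one integration JSON entry.
type Integration = { name: string; summary: string; slug: string; features: string[] };

// Renders one integration entry as Markdown with the shared header.
function integrationToMarkdown(entry: Integration, origin: string): string {
  const features = entry.features.map((f) => `- ${f}`);
  const body = ["## Features", ""].concat(features).join("\n");
  return renderWithHeader(
    {
      title: entry.name,
      description: entry.summary,
      canonicalUrl: `${origin}/integrations/${entry.slug}/`,
    },
    body
  );
}

console.log(
  integrationToMarkdown(
    { name: "PostHog", summary: "Example summary", slug: "posthog", features: ["Example feature"] },
    "https://www.getbeton.ai"
  )
);
```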

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vercel Bot commented May 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| getbeton-ai-landings | Ready | Preview, Comment | May 2, 2026 9:58am |

@nadyyym nadyyym merged commit 6d3edce into master May 2, 2026
2 of 3 checks passed