Add JSON-LD, canonical URLs, robots.txt; pages in sitemap

#43 in Riparion/riparion-cms — merged 2026-06-02

Implements four additive SEO / crawlability features identified by reviewing the reader against Google's AI-optimization guide. The guide's core message is that AI features ride on the same crawl/index/rank systems — so solid, server-rendered SEO plus rich-result eligibility is what matters (explicitly not llms.txt or AI-specific files). The reader already had SSR content, full Open Graph/Twitter tags, an Atom feed, sitemap, and responsive images; these fill the remaining gaps.

What's in here

1. JSON-LD structured data (new src/pages/reader/jsonld.rs)

  • BlogPosting + BreadcrumbList per post
  • BreadcrumbList per page
  • Site-wide WebSite + Organization (emitted once in GlobalMeta)
  • Built with serde_json, rendered as <script type="application/ld+json"> in the SSR HTML so crawlers read it without running JS. </ is escaped to <\/ to prevent early </script> breakout. Skipped when SITE_URL is unknown (schema.org wants absolute URLs). All fields come from data editors already fill in (title, excerpt, author, date, featured image).
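The escaping and the SITE_URL guard described above can be sketched roughly as follows. This is a dependency-free illustration, not the code in jsonld.rs: the function names are hypothetical, and the JSON is assumed to be already serialized (the PR uses serde_json for that step). Here the site URL only gates rendering.

```rust
/// Escape `</` as `<\/` so a string value such as "</script>" cannot close
/// the surrounding script element early. `\/` is a valid JSON escape for `/`,
/// so the parsed value is unchanged.
fn escape_for_script(json: &str) -> String {
    json.replace("</", "<\\/")
}

/// Wrap already-serialized JSON-LD in a script tag for the SSR HTML.
/// Returns `None` when the site URL is unknown or blank, since schema.org
/// properties are expected to carry absolute URLs.
/// (Hypothetical helper; name and signature are assumptions.)
fn render_jsonld_script(site_url: Option<&str>, json: &str) -> Option<String> {
    let site_url = site_url?.trim();
    if site_url.is_empty() {
        return None;
    }
    Some(format!(
        "<script type=\"application/ld+json\">{}</script>",
        escape_for_script(json)
    ))
}
```

Calling `render_jsonld_script(None, "{}")` yields `None`, so the caller can simply emit nothing when no site URL is configured.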

2. Canonical URLs: <link rel="canonical"> is added to OgHead, so every reader page (post / page / help / home) declares its own absolute URL as canonical, alongside the existing og:url.
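A minimal sketch of how an absolute canonical URL and its tag might be assembled, assuming the site URL and the page path are available as plain strings. The helper names are hypothetical and do not reflect the actual OgHead component.

```rust
/// Join the site URL and a page path into one absolute URL, collapsing the
/// slash at the boundary. (Hypothetical helper.)
fn canonical_url(site_url: &str, path: &str) -> String {
    format!(
        "{}/{}",
        site_url.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}

/// Escape a value for use inside a double-quoted HTML attribute.
/// `&` is replaced first so later entities are not double-escaped.
fn escape_attr(value: &str) -> String {
    value
        .replace('&', "&amp;")
        .replace('"', "&quot;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Render the canonical link element for a reader page.
fn canonical_link_tag(site_url: &str, path: &str) -> String {
    format!(
        "<link rel=\"canonical\" href=\"{}\">",
        escape_attr(&canonical_url(site_url, path))
    )
}
```

Under these assumptions the same `canonical_url` value could also feed og:url, which keeps the two declarations from drifting apart.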

3. robots.txt: GET /robots.txt is permissive (Allow: /) and advertises the sitemap with an absolute Sitemap: line. The route is registered, and robots is now reserved as a page slug.
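For illustration, a permissive robots.txt body with an absolute Sitemap: line could be generated like this. The function name is hypothetical, and the `User-agent: *` line is an assumption: the PR text only mentions `Allow: /` and the sitemap line, but an Allow rule needs a user-agent group to belong to.

```rust
/// Build a permissive robots.txt body that points crawlers at the sitemap.
/// (Hypothetical helper; the actual handler's output may differ.)
fn robots_txt(site_url: &str) -> String {
    // Trim a trailing slash so the sitemap URL has exactly one separator.
    let base = site_url.trim_end_matches('/');
    format!("User-agent: *\nAllow: /\n\nSitemap: {}/sitemap.xml\n", base)
}
```

The Sitemap: line sits outside the user-agent group and must be an absolute URL, which is why the site URL is passed in rather than a relative path being hard-coded.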

4. Pages in the sitemap: published CMS pages now appear in /sitemap.xml (materialized path + lastmod) via a new feed_published_pages_db query. Previously, only posts, categories, tags, and authors were listed.
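As a rough sketch of what one page entry in the sitemap might look like: the `PublishedPage` struct below is a hypothetical stand-in for whatever feed_published_pages_db returns (its real shape is not shown in this PR), and lastmod is assumed to be pre-formatted as a W3C date.

```rust
/// Hypothetical row shape for a published CMS page.
struct PublishedPage {
    /// Materialized path, for example "about/team".
    path: String,
    /// Last-modified date, assumed already formatted as YYYY-MM-DD.
    lastmod: String,
}

/// Escape the five XML special characters; `&` goes first to avoid
/// double-escaping the entities produced by later replacements.
fn escape_xml(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
        .replace('\'', "&apos;")
}

/// Render one <url> element with an absolute <loc> and a <lastmod>.
fn page_sitemap_entry(site_url: &str, page: &PublishedPage) -> String {
    let loc = format!(
        "{}/{}",
        site_url.trim_end_matches('/'),
        page.path.trim_start_matches('/')
    );
    format!(
        "<url><loc>{}</loc><lastmod>{}</lastmod></url>",
        escape_xml(&loc),
        escape_xml(&page.lastmod)
    )
}
```

Entries like this would be appended next to the existing post, category, tag, and author entries inside the sitemap's urlset element.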

Docs

  • New help article: AI search & structured data (/help/ai-search) — covers JSON-LD, canonical, robots.txt, and Google's do's/don'ts.
  • Updated the existing SEO article's endpoint table (adds robots.txt, notes pages are now in the sitemap).

Verification

  • cargo fmt --all -- --check
  • cargo clippy --features server,sqlite --all-targets -- -D warnings
  • cargo clippy --features server,postgres --all-targets -- -D warnings
  • cargo clippy --features web --target wasm32-unknown-unknown -- -D warnings

🤖 Generated with Claude Code

Last updated 2026-06-03