Description
Problem
The state legislative tracker is embedded in policyengine.org via an iframe that loads from Modal (policyengine--state-legislative-tracker.modal.run). This causes SEO problems:
- Canonical URLs point to Modal, not policyengine.org — Google indexes the wrong domain
- Social sharing cards (Twitter, Facebook, LinkedIn) show generic app-v2 OG tags instead of per-bill metadata (title, description, revenue impact)
- Structured data (JSON-LD) and sitemap live on Modal and reference Modal URLs, so Google doesn't associate the tracker content with policyengine.org
- iframe content is not reliably indexed by all search engines
The tracker has rich pre-rendered HTML for every bill page (19 bills across 16 states) with unique titles, descriptions, and noscript content — but crawlers hitting policyengine.org never see it.
Solution
Add a crawler-only reverse proxy in Vercel middleware. When a search engine or social media bot requests /us/state-legislative-tracker/*, the middleware fetches the pre-rendered HTML from Modal and returns it directly. This HTML contains:
- Canonical URLs on policyengine.org
- Per-bill OG tags (title, description, image)
- JSON-LD structured data
- Noscript content with bill provisions and revenue impacts
- Sitemap and robots.txt referencing policyengine.org URLs
Regular users continue to see the existing iframe version with the full policyengine.org nav and footer — no visual change for humans.
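The bot check at the heart of this approach could look roughly like the sketch below. The User-Agent token list and the helper names (`isCrawler`, `isTrackerPath`, `shouldProxy`) are assumptions for illustration, not the list or names used in the actual middleware.

```typescript
// Assumed token list: common search and social crawlers. The real middleware
// may match a different set of User-Agent strings.
const BOT_UA_PATTERN =
  /googlebot|bingbot|duckduckbot|twitterbot|facebookexternalhit|linkedinbot|slackbot/i;

// Path prefix taken from the issue text.
const TRACKER_PREFIX = "/us/state-legislative-tracker";

/** True when the User-Agent header looks like a search or social crawler. */
function isCrawler(userAgent: string | null | undefined): boolean {
  return typeof userAgent === "string" && BOT_UA_PATTERN.test(userAgent);
}

/** True when the path is the tracker root or anything beneath it. */
function isTrackerPath(pathname: string): boolean {
  return pathname === TRACKER_PREFIX || pathname.startsWith(TRACKER_PREFIX + "/");
}

/** Only crawler requests under the tracker prefix get the pre-rendered HTML. */
function shouldProxy(pathname: string, userAgent: string | null | undefined): boolean {
  return isTrackerPath(pathname) && isCrawler(userAgent);
}
```

Matching on the prefix plus a trailing slash keeps unrelated paths that merely start with the same characters out of the proxy branch. Anything that fails the check falls through to the existing iframe page.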
How it works
Crawlers: policyengine.org/us/state-legislative-tracker/GA
→ middleware detects bot UA → fetches modal.run/GA
→ returns pre-rendered HTML with policyengine.org canonical URLs
Users: policyengine.org/us/state-legislative-tracker/GA
→ middleware returns nothing → catch-all rewrite → website.html → iframe
Assets: policyengine.org/_tracker/index-*.js (loaded by crawler-rendered pages)
→ Vercel rewrite → modal.run/_tracker/index-*.js
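The three flows above can be summarized as one routing decision. This is a minimal sketch: the Modal host comes from the Problem section, while `routeRequest`, the `Route` type, and the constant names are hypothetical. In the actual setup the asset case is handled by a Vercel rewrite rather than by middleware code; it is folded in here only to show the full decision table in one place.

```typescript
// Upstream host as named in the issue; all identifiers below are illustrative.
const MODAL_ORIGIN = "https://policyengine--state-legislative-tracker.modal.run";
const TRACKER_BASE = "/us/state-legislative-tracker";
const ASSET_PREFIX = "/_tracker/";

type Route =
  | { kind: "proxy"; upstream: string }   // crawler: return pre-rendered HTML from Modal
  | { kind: "rewrite"; upstream: string } // asset: served from Modal under /_tracker/
  | { kind: "passthrough" };              // human: fall through to the iframe page

function routeRequest(pathname: string, isBot: boolean): Route {
  // Assets referenced by the pre-rendered pages keep their path on Modal.
  if (pathname.startsWith(ASSET_PREFIX)) {
    return { kind: "rewrite", upstream: MODAL_ORIGIN + pathname };
  }

  const underTracker =
    pathname === TRACKER_BASE || pathname.startsWith(TRACKER_BASE + "/");

  if (underTracker && isBot) {
    // Strip the policyengine.org prefix: /us/state-legislative-tracker/GA -> /GA
    const rest = pathname.slice(TRACKER_BASE.length) || "/";
    return { kind: "proxy", upstream: MODAL_ORIGIN + rest };
  }

  // Everything else is left to the catch-all rewrite.
  return { kind: "passthrough" };
}
```

For example, `routeRequest("/us/state-legislative-tracker/GA", true)` yields a `proxy` route to the `/GA` path on Modal, while the same path with `isBot` set to `false` yields `passthrough`.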
Companion PR
Tracker-side changes (already merged): PolicyEngine/state-legislative-tracker#107
- Renamed asset directory from assets/ to _tracker/ to avoid path collisions
- Updated prerender to emit canonical URLs on policyengine.org