This file provides context for AI assistants (Claude, Copilot, etc.) working on this project.
Parsely CLI is a terminal-based recipe scraper that extracts structured data (ingredients, instructions, cook times) from recipe URLs. It features an interactive Ink TUI with a viewport-filling app shell, responsive panels, and alternate-screen terminal behavior.
- Runtime: Node.js (v18+)
- Language: TypeScript (ESM, `"type": "module"`)
- TUI Framework: Ink v5, a React renderer for the terminal
- React: v18 with hooks (`useState`, `useCallback`), plus Ink's `useInput` and `useApp`
- Browser Automation: Puppeteer-core, headless Chrome for scraping (auto-detects system Chrome; no bundled Chromium)
- HTML Parsing: Cheerio, a jQuery-like API used to extract JSON-LD
- AI Fallback: OpenAI SDK, using `gpt-4o-mini` for recipe extraction
- TypeScript Executor: tsx, which runs .tsx files directly
The app uses a phase-based state machine in src/app.tsx:
idle → scraping → display
scraping → error → idle (retry)
display → idle (new recipe)
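The transitions above can be written down as a small lookup table. This is a minimal sketch, not the code in `src/app.tsx`: the phase names come from the diagram, while the event names (`submit`, `success`, `failure`, `retry`, `newRecipe`) and the `nextPhase` helper are hypothetical.

```typescript
// Phase names come from the diagram; the event names are hypothetical.
type Phase = 'idle' | 'scraping' | 'display' | 'error';
type PhaseEvent = 'submit' | 'success' | 'failure' | 'retry' | 'newRecipe';

// Allowed transitions, keyed by current phase and then by event.
const transitions: Record<Phase, Partial<Record<PhaseEvent, Phase>>> = {
  idle: { submit: 'scraping' },
  scraping: { success: 'display', failure: 'error' },
  display: { newRecipe: 'idle' },
  error: { retry: 'idle' },
};

// Returns the next phase, or the current phase when the event is not valid here.
function nextPhase(current: Phase, event: PhaseEvent): Phase {
  return transitions[current][event] ?? current;
}
```

Keeping the table separate from the React tree means an invalid event (for example `retry` while `display` is active) is ignored rather than producing an unexpected screen.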
The state machine still drives the app, but the screen layout now changes per phase instead of swapping a single centered card.
<App>
{idle && <LandingScreen />}
{scraping && <LoadingScreen />}
{display && <RecipeCard />}
{error && <Banner /> + <ErrorDisplay /> + <Panel><URLInput /></Panel> + <PhaseRail /> + <Footer />}
- Puppeteer → Launch headless Chrome → navigate → extract HTML
- Cheerio → Parse HTML → find `<script type="application/ld+json">` → locate Recipe schema
- Parsing status → Report a dedicated `parsing` phase back to the UI
- OpenAI fallback → Send normalized page content to `gpt-4o-mini` → normalize the JSON response back into the shared `Recipe` shape
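The "locate Recipe schema" step can be sketched as a pure function. This is an illustration under assumptions, not the logic in `src/services/scraper.ts`: it assumes the text of each `ld+json` script tag has already been collected (for example with Cheerio), and the names `JsonLdNode`, `findRecipe`, and `findRecipeInJsonLd` are hypothetical.

```typescript
// Minimal JSON-LD node shape; real schema.org data carries many more fields.
type JsonLdNode = { '@type'?: string | string[]; '@graph'?: unknown; [key: string]: unknown };

// True when a node's @type is "Recipe" or an array that includes "Recipe".
function isRecipeNode(value: unknown): value is JsonLdNode {
  if (typeof value !== 'object' || value === null) return false;
  const type = (value as JsonLdNode)['@type'];
  return Array.isArray(type) ? type.includes('Recipe') : type === 'Recipe';
}

// Walks one parsed JSON-LD value; arrays and @graph containers are searched recursively.
function findRecipe(value: unknown): JsonLdNode | null {
  if (Array.isArray(value)) {
    for (const item of value) {
      const found = findRecipe(item);
      if (found) return found;
    }
    return null;
  }
  if (isRecipeNode(value)) return value;
  if (typeof value === 'object' && value !== null && '@graph' in value) {
    return findRecipe((value as JsonLdNode)['@graph']);
  }
  return null;
}

// Takes the raw text of each ld+json script tag and returns the first Recipe node, if any.
function findRecipeInJsonLd(scriptContents: string[]): JsonLdNode | null {
  for (const raw of scriptContents) {
    try {
      const found = findRecipe(JSON.parse(raw));
      if (found) return found;
    } catch {
      // Malformed JSON-LD does occur on real pages; skip the block and keep looking.
    }
  }
  return null;
}
```

A `null` result is the natural point at which a pipeline like the one above would hand off to the AI fallback.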
The browser path now uses a more browser-like Puppeteer configuration (userAgent, accept-language, and a small webdriver-masking shim) because some recipe sites return Cloudflare challenges to the default headless setup.
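Challenge detection can be approximated with a string heuristic over the fetched HTML. The sketch below is an assumption about how such a check might look, not Parsely's actual detector: the marker list is illustrative (an interstitial title and script path commonly seen on Cloudflare challenge pages), and `looksLikeChallengePage` is a hypothetical name.

```typescript
// Assumed markers commonly seen on Cloudflare interstitial pages.
// Illustrative only; the project's real detection rules may differ.
const CHALLENGE_MARKERS = ['just a moment...', '/cdn-cgi/challenge-platform/'];

// Returns true when the HTML contains any known challenge marker (case-insensitive).
function looksLikeChallengePage(html: string): boolean {
  const haystack = html.toLowerCase();
  return CHALLENGE_MARKERS.some((marker) => haystack.includes(marker));
}
```

A check like this lets the scraper distinguish "the site blocked us" from "the page has no recipe data", which are worth reporting differently.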
- `src/cli.tsx` enters the terminal alternate screen before rendering Ink, restores it after exit, and explicitly resets the terminal background color on shutdown
- `src/cli.tsx` enables synchronized output in terminals that are known to support DECSET 2026 reliably (`ghostty`, `WezTerm`, and Kitty-style `TERM` values)
- `useTerminalViewport()` reads live terminal width/height from Ink stdout and updates layout on resize
- `src/utils/terminal.ts` keeps the Ink tree one row shorter than the terminal so Ink stays on its incremental-update path instead of clearing the entire screen on each render
- `src/utils/terminal.ts` also gates advanced terminal features by environment so palette changes are enabled only for terminals we explicitly support, and are disabled inside multiplexers such as `tmux` and `screen`
- `src/app.tsx` collapses non-essential panels on shorter terminals so the URL input remains visible and usable
- `src/components/URLInput.tsx` strips pasted CR/LF characters, treats `Enter` as submit, and shows `Ctrl+C` / `Ctrl+T` hints directly under the field
- `src/app.tsx` owns an `AbortController` for the active scrape so `Ctrl+C` can abort browser or AI work before exiting
- `src/app.tsx` owns the active theme mode, auto-detects light/dark preference on startup, and lets `Ctrl+T` toggle the full app theme across all screens
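Two of the behaviors above, the one-row render buffer and synchronized output, reduce to very small helpers. This is a sketch under assumptions rather than the contents of `src/utils/terminal.ts`: the function names `renderHeight` and `wrapFrame` are hypothetical, while the escape sequences are the standard DECSET 2026 begin and end markers.

```typescript
// DECSET 2026 (synchronized output): the terminal defers painting between these two sequences.
const SYNC_BEGIN = '\x1b[?2026h';
const SYNC_END = '\x1b[?2026l';

// Leave one row free so a full-height frame never fills the viewport.
// Clamped to 1 so a very small terminal still gets a usable height.
function renderHeight(terminalRows: number): number {
  return Math.max(1, terminalRows - 1);
}

// Wraps one rendered frame in begin/end markers when synchronized output is enabled.
function wrapFrame(frame: string, syncEnabled: boolean): string {
  return syncEnabled ? `${SYNC_BEGIN}${frame}${SYNC_END}` : frame;
}
```

For a 24-row terminal, `renderHeight(24)` yields 23, which keeps the rendered output just short of the viewport.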
| File | Purpose |
|---|---|
| `src/cli.tsx` | Entry point: parses `--help`, `--version`, optional URL arg |
| `src/app.tsx` | Root component: theme mode, layout switching, phase orchestration |
| `src/theme.ts` | Theme registry, theme-mode detection, symbols; single source of truth for styling |
| `src/services/scraper.ts` | All scraping logic: Puppeteer, Cheerio, OpenAI |
| `src/utils/helpers.ts` | ISO duration parser, URL validation, env config, URL host formatting |
| `src/utils/terminal.ts` | Terminal capability detection, render-height policy, synchronized-output, palette helpers |
| `src/hooks/useTerminalViewport.ts` | Terminal resize tracking |
| `src/hooks/useDisplayPalette.ts` | Root-level hook that applies and resets the active terminal background color |
| `src/components/Banner.tsx` | Shell header with status badge and current host |
| `src/components/Panel.tsx` | Shared bordered panel primitive |
| `src/components/PhaseRail.tsx` | Pipeline view for browser, parsing, and AI stages |
| `src/components/URLInput.tsx` | Bordered text input with validation and newline-safe submit handling |
| `src/components/RecipeCard.tsx` | Responsive recipe deck: summary, timing, ingredients, instructions |
| `src/components/LandingScreen.tsx` | Idle landing screen with Parsely wordmark and centered URL input |
| `src/components/LoadingScreen.tsx` | Minimal loading state shown between landing and recipe display |
| `src/components/Footer.tsx` | Status line and dynamic keybind hints based on current phase |
| `src/components/ErrorDisplay.tsx` | Error recovery panel with troubleshooting guidance |
| `test/helpers.test.ts` | Input normalization and URL validation coverage |
| `test/scraper.test.ts` | Schema extraction and challenge detection coverage |
| `test/theme.test.ts` | Theme-mode detection and toggle coverage |
| `test/terminal.test.ts` | Terminal compatibility matrix and palette/sync helper coverage |
npm start # Run the CLI
npm run dev # Run with file watching (auto-reload)
npm run typecheck # Type-check without emitting
npm test # Run the unit tests (helpers, scraper, theme, terminal)
./run.sh # Quick-start (installs deps if needed)
./run.sh <url> # Scrape a specific URL immediately

- `OPENAI_API_KEY`: Required for AI fallback. Set in `.env.local` file.
- `PARSELY_SYNC_OUTPUT`: Optional override for synchronized-output terminal writes (`1` forces on, `0` forces off).
- `PARSELY_DISPLAY_PALETTE`: Optional override for terminal background palette writes (`1` forces on, `0` forces off).
- `PARSELY_THEME`: Optional startup theme override (`light` or `dark`).
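The override variables follow a tri-state pattern: an explicit value wins, otherwise detection decides. The helpers below are a minimal sketch of that pattern; `resolveFlag` and `resolveTheme` are hypothetical names, and treating any unrecognized value as "not set" is an assumption rather than documented behavior.

```typescript
// Tri-state override: "1" forces on, "0" forces off, anything else defers to detection.
// Falling back on unrecognized values is an assumption made for this sketch.
function resolveFlag(raw: string | undefined, detected: boolean): boolean {
  if (raw === '1') return true;
  if (raw === '0') return false;
  return detected;
}

type ThemeMode = 'light' | 'dark';

// Accepts only "light" or "dark"; other values fall back to the detected mode.
function resolveTheme(raw: string | undefined, detected: ThemeMode): ThemeMode {
  return raw === 'light' || raw === 'dark' ? raw : detected;
}
```

In use, the first argument would come from the environment, for example `resolveFlag(process.env.PARSELY_SYNC_OUTPUT, detectedSupport)`, where `detectedSupport` stands in for the result of terminal capability detection.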
- Ink over raw ANSI: Declarative React components are easier to maintain than imperative terminal output. Ink uses Yoga (Flexbox) for layout.
- Phase-based state: A simple state machine (`idle | scraping | display | error`) keeps the UI predictable.
- Alternate screen shell: Parsely behaves like a focused terminal app, not a command that leaves the UI in shell history. The screen switch is handled in `src/cli.tsx`, outside the React tree, to avoid corrupting Ink's render cycle.
- Stable repaint path: Ink v5 clears the whole terminal when rendered output fills the viewport, so Parsely deliberately leaves one row free to stay on incremental updates and reduce flicker in terminals like Ghostty.
- Theme at the app root: The active palette is applied once from `App` so all screens share the same background and theme toggle behavior.
- Conservative terminal capability detection: Palette updates are enabled only for terminals we explicitly recognize as safe defaults, and synchronized output is narrower still. Multiplexers and embedded terminals are treated as opt-in to avoid protocol regressions.
- Synchronized output: Parsely wraps Ink's stdout with DECSET 2026 escape codes only for terminals we explicitly trust for it. `PARSELY_SYNC_OUTPUT=1` forces this on and `PARSELY_SYNC_OUTPUT=0` disables it.
- Display palette override: Parsely writes OSC background-color sequences only in supported terminals or when explicitly forced with `PARSELY_DISPLAY_PALETTE=1`.
- Callback-driven scraping: The scraper accepts an `onStatus` callback so the TUI can show real-time progress without polling.
- Explicit pipeline UI: Browser fetch, parsing, and AI fallback are surfaced as distinct stages so users can see which path produced the recipe.
- Defensive input handling: URL submission is handled locally in `URLInput`, with CR/LF sanitization so paste-plus-enter works reliably in real terminals.
- Abortable scraping: The app keeps an `AbortController` per scrape so `Ctrl+C` or unmounting does not leave Puppeteer or OpenAI requests running in the background.
- Challenge-aware browser mode: The browser scraper uses a browser-like fingerprint and challenge detection to avoid falling back to AI on sites that block the default headless signature.
- Puppeteer first, AI second: Browser scraping is more reliable and doesn't require an API key. AI is the fallback, not the default. Uses `puppeteer-core` with auto-detection of system Chrome to avoid a heavy Chromium download.
- Theme module: All colors are centralized in `theme.ts` for easy customization and consistency.
- ESM throughout: The project uses ES modules (`"type": "module"`) for compatibility with Ink v5, which is ESM-only.
- Create `src/components/MyComponent.tsx`
- Import theme: `import { theme } from '../theme.js';`
- Prefer composing with `Panel` for bordered surfaces before introducing a new one-off container
- Use Ink primitives: `<Box>`, `<Text>`, `useInput()`, `useApp()`
- Import in `src/app.tsx` and add to the appropriate phase/layout
- Add the function to `src/services/scraper.ts`
- Call it from `scrapeRecipe()` with appropriate `onStatus()` updates
- Return a `Recipe` object with the `source` field set
- Update `PhaseRail` if the strategy should appear as a user-visible stage
Edit src/theme.ts — the module now defines both light and dark themes, startup detection, and toggle helpers. Keep the Parsely green accent stable across modes and make sure recipePaper remains suitable for terminal background application.
Run tests with `npm test`. The current suite focuses on pure helpers rather than Ink rendering:
- `test/helpers.test.ts` covers URL normalization and pasted newline sanitization
- `test/scraper.test.ts` covers JSON-LD extraction and challenge-page detection
- `test/theme.test.ts` covers theme detection and `Ctrl+T` toggle helpers
- `test/terminal.test.ts` covers render-height buffering, synchronized-output detection, palette detection, and env-matrix compatibility cases for common macOS and Linux terminals
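
As an illustration of the kind of pure helper this suite targets, pasted-newline sanitization and URL validation might look like the sketch below. The names `sanitizeUrlInput` and `isValidRecipeUrl` are hypothetical, and restricting input to `http` and `https` is an assumption; the actual helpers in `src/utils/helpers.ts` may differ.

```typescript
// Removes CR/LF characters that terminals often include in pasted text, then trims whitespace.
function sanitizeUrlInput(raw: string): string {
  return raw.replace(/[\r\n]+/g, '').trim();
}

// Accepts only absolute http(s) URLs; the URL constructor throws on malformed input.
function isValidRecipeUrl(input: string): boolean {
  try {
    const url = new URL(input);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}
```

Because both functions are free of Ink and terminal state, they can be exercised with plain assertions and no rendering harness.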