Add Wikipedia page download test for all supported engines #10
Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: undefined
Add comprehensive integration tests that verify the ability to download Wikipedia pages (https://en.wikipedia.org/wiki/Wikipedia) in all three formats (HTML, Markdown, and screenshot) using both Puppeteer and Playwright browser engines.

Test coverage includes:
- Downloading Wikipedia page as HTML (both engines)
- Converting Wikipedia page to Markdown (both engines)
- Capturing Wikipedia page as PNG screenshot (both engines)
- Comparing content download across both engines

Also updated jest.config.mjs to include integration tests in the test suite.

Fixes #8

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This reverts commit 3657e47.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
📎 Log file uploaded as GitHub Gist (279KB)
The working session has now ended; feel free to review and add any feedback on the solution draft.
Get the latest changes from the default branch. Make sure this feature is rethought and properly implemented for all APIs we have in the codebase, and that the docs are updated. We need to ensure all changes are correct, consistent, validated, tested, documented, logged, and fully meet each and every discussed requirement in the deepest and widest possible sense (check the issue description and all comments in the issue and in the pull request, and list each and every requirement before actually checking whether it was addressed). Nothing should be deferred or delayed. The scope is the entire repository: every change made in one place should also be applied everywhere else in the codebase and docs. Ensure all CI/CD checks pass. Please plan and execute everything in this single pull request. You have unlimited time and context, as context auto-compacts and you can continue indefinitely, until each and every requirement is fully addressed and everything is totally done.
🤖 AI Work Session Started
Starting automated work session at 2026-06-01T21:04:08.881Z
The PR has been converted to draft mode while work is in progress. This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
# Conflicts: # js/jest.config.mjs # js/tests/integration/wikipedia-download.test.js
…ines (#8)

Add live integration tests that actually download the Wikipedia article (https://en.wikipedia.org/wiki/Wikipedia) as both Markdown and a PNG image for every supported browser engine, across both implementations:
- JS: Puppeteer and Playwright (js/tests/integration/wikipedia-download.test.js)
- Rust: chromiumoxide (rust/tests/integration/wikipedia_download.rs)

The suites are gated behind WIKIPEDIA_INTEGRATION and wired into the JS and Rust CI workflows as dedicated live steps, mirroring the existing Habr and Google Docs integration suites. Docs (both READMEs) and a JS changeset are updated accordingly.
Working session summary
Done. Issue #8 is fully resolved and PR #10 is ready with all CI checks green.

Summary
PR #10 (…)

What was delivered
Tests that actually download the Wikipedia page (https://en.wikipedia.org/wiki/Wikipedia) as both Markdown and an image (PNG) in every supported engine, across both implementations in the monorepo:
Plus: CI live steps in both …

Verified on CI (not just locally)
Process notes
This summary was automatically extracted from the AI working session output.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost: $5.823867
📊 Context and tokens usage:
Claude Opus 4.8: (2 sub-sessions)
Total: (13.4K new + 194.7K cache writes + 7.1M cache reads) input tokens, 40.5K output tokens, $5.823867 cost
🤖 Models used:
📎 Log file uploaded as Gist (2610KB)
The working session has now ended; feel free to review and add any feedback on the solution draft.
✅ Ready to merge
This pull request is now ready to be merged:
Monitored by hive-mind with --auto-restart-until-mergeable flag
Summary
Closes #8.
Adds a live integration test proving web-capture can actually download the Wikipedia page (https://en.wikipedia.org/wiki/Wikipedia) as both Markdown and an image (PNG screenshot) in every supported engine, across both implementations in this monorepo:
- JS (js/): Puppeteer and Playwright, in js/tests/integration/wikipedia-download.test.js
- Rust (rust/): chromiumoxide (chrome), in rust/tests/integration/wikipedia_download.rs

The work was rethought and applied consistently everywhere in this single PR (implementation, CI wiring, docs, and changeset) after merging the latest main into the branch.

What's included
- describe.each(['puppeteer', 'playwright']) covers both engines for Markdown + PNG, plus an explicit engine-parity test that downloads with both. Uses the existing createBrowser, convertHtmlToMarkdown, and retry helpers; navigation retries with exponential backoff. Validates the PNG magic number and that the Markdown carries real content without raw HTML scaffolding.
- … (chromiumoxide; the URL routes through the generic browser path, not Google Docs) plus 2 live #[tokio::test]s that render Markdown via render_html + convert_html_to_markdown and capture a PNG via capture_screenshot. Live tests short-circuit when no Chrome/Chromium binary is present.
- Dedicated live steps in .github/workflows/js.yml and .github/workflows/rust.yml, mirroring the existing Habr / Google Docs integration steps.
- js/README.md and rust/README.md document the new WIKIPEDIA_INTEGRATION opt-in alongside the existing integration suites.
- js/.changeset/wikipedia-download-test.md (patch) for the JS package.

Gating
Live tests are skipped by default so offline/default CI stays deterministic. Enable them by setting the WIKIPEDIA_INTEGRATION environment variable.
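The PR text names the WIKIPEDIA_INTEGRATION variable but not how the test files read it. As a rough sketch of what such a gate might look like in JavaScript: the helper name `isLiveSuiteEnabled` and the set of values treated as "off" are assumptions made for illustration, not the repository's code.

```javascript
// Hypothetical helper: decides whether a live, network-dependent suite should run.
// Assumption: any non-empty value other than "0" or "false" enables the suite.
// The PR only names the variable; it does not state which values it accepts.
function isLiveSuiteEnabled(env, name = 'WIKIPEDIA_INTEGRATION') {
  const raw = String(env[name] ?? '').trim().toLowerCase();
  return raw !== '' && raw !== '0' && raw !== 'false';
}

// In a Jest file the result could pick between describe and describe.skip:
//   const maybeDescribe = isLiveSuiteEnabled(process.env) ? describe : describe.skip;
//   maybeDescribe('Wikipedia download (live)', () => { /* ... */ });
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps the gate itself testable without touching the real environment.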
Verification (local)
cargo fmt --check, cargo clippy, prettier --check, eslint (0 errors), and jscpd duplication check all clean.
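For readers unfamiliar with the checks named under "What's included" (the PNG magic number, Markdown that carries real content without raw HTML scaffolding, and retrying with exponential backoff), here is a minimal sketch of each. All three function names, the Markdown heuristic, and the default delays are hypothetical; the repository's own retry helper and assertions may differ.

```javascript
// Every PNG file starts with this fixed 8-byte signature (per the PNG specification).
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Returns true when the buffer begins with the PNG signature.
function isPng(buffer) {
  return (
    buffer.length >= PNG_SIGNATURE.length &&
    buffer.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)
  );
}

// Hypothetical heuristic: the Markdown mentions the article subject and
// contains no opening tags for document-level HTML elements.
function looksLikeCleanMarkdown(markdown) {
  const hasContent = /wikipedia/i.test(markdown);
  const hasScaffolding = /<\s*(html|head|body|script|style)\b/i.test(markdown);
  return hasContent && !hasScaffolding;
}

// Hypothetical stand-in for a retry helper: waits baseDelayMs, then 2x, 4x, ...
// between failed attempts, and rethrows the last error once attempts run out.
async function retryWithBackoff(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Checking the signature rather than only the buffer length catches the case where a browser returns an error page or an empty capture instead of an image.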