New doc-collector #25

Open
DenysKuchma wants to merge 2 commits into main from feature/24-documentation-researcher

@DenysKuchma
Collaborator

#24

Summary

Adds a new doc-collector boat that crawls website pages and generates lightweight user-facing documentation in spec form.

The new flow is exposed as:

  • explorbot docs collect
  • explorbot docs init

Generated output includes:

  • output/docs/spec.md
  • output/docs/pages/*.md
  • output/research/*.md

Each documented page is summarized as:

  • Purpose
  • User Can
  • User Might
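
As an illustration of the three-part page summary above, here is a minimal sketch of how one documented page could be modelled and rendered to Markdown. The `PageDoc` type, its field names, the `renderPageDoc` helper, and the heading layout are all assumptions made for this example; they are not taken from the PR's code.

```typescript
// Hypothetical shape of one documented page. Field names are assumptions.
interface PageDoc {
  url: string;
  purpose: string;
  userCan: string[];
  userMight: string[];
}

// Render a page summary as Markdown with the three sections named in the PR.
function renderPageDoc(doc: PageDoc): string {
  // Empty lists still produce a bullet so every section has a body.
  const bullets = (items: string[]): string =>
    items.length > 0
      ? items.map((item) => `- ${item}`).join("\n")
      : "- (none recorded)";

  return [
    `# ${doc.url}`,
    "",
    "## Purpose",
    doc.purpose,
    "",
    "## User Can",
    bullets(doc.userCan),
    "",
    "## User Might",
    bullets(doc.userMight),
    "",
  ].join("\n");
}
```

Calling `renderPageDoc` with a purpose string and two lists yields a short Markdown document with one heading per section, which is roughly the kind of content a file under output/docs/pages/ might hold.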

What’s Included

  • standalone boat/doc-collector subproject with its own CLI, config loader, crawl loop, renderer, and Documentarian
  • page traversal with visited-state tracking and dead-loop protection via the existing state manager
  • crawl target filtering and dynamic URL collapsing
  • optional crawl scope modes:
    • site
    • section
    • subtree
  • navigation discovery from both regular links and research-derived navigation targets for hash-based/OpenAPI-style docs
  • fallback document generation path when structured JSON generation fails on noisy research input
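
To make the filtering, URL collapsing, and scope bullets above more concrete, here is a rough sketch of how they could fit together. Everything in it is an assumption for illustration: the function names, the `:id` placeholder, the meaning given to `section` (same first path segment) and `subtree` (paths under the start path), and the plain `Set` standing in for the existing state manager.

```typescript
type CrawlScope = "site" | "section" | "subtree";

// Collapse dynamic path segments (numeric ids, UUIDs) so that
// /users/1 and /users/2 are treated as the same crawl target.
// The query string is dropped; the hash is kept for hash-routed docs.
function collapseDynamicUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  const uuid = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
  const segments = url.pathname
    .split("/")
    .map((segment) => (/^\d+$/.test(segment) || uuid.test(segment) ? ":id" : segment));
  const path = segments.join("/").replace(/\/+$/, "") || "/";
  return `${url.origin}${path}${url.hash}`;
}

// Assumed semantics: "site" = same origin, "section" = same first path
// segment as the start URL, "subtree" = at or below the start path.
function inScope(candidate: string, start: string, scope: CrawlScope): boolean {
  const c = new URL(candidate);
  const s = new URL(start);
  if (c.origin !== s.origin) return false;
  if (scope === "site") return true;
  if (scope === "subtree") {
    const startPath = s.pathname.replace(/\/+$/, "");
    return c.pathname === startPath || c.pathname.startsWith(`${startPath}/`);
  }
  const firstSegment = (pathname: string): string => pathname.split("/")[1] ?? "";
  return firstSegment(c.pathname) === firstSegment(s.pathname);
}

// Keep only in-scope links whose collapsed form has not been seen yet.
// The Set is a stand-in for the visited-state tracking in the state manager.
function selectTargets(
  links: string[],
  start: string,
  scope: CrawlScope,
  visited: Set<string>,
): string[] {
  const targets: string[] = [];
  for (const link of links) {
    if (!inScope(link, start, scope)) continue;
    const key = collapseDynamicUrl(link);
    if (visited.has(key)) continue;
    visited.add(key);
    targets.push(link);
  }
  return targets;
}
```

Keying the visited set on the collapsed URL rather than the raw one is what keeps a crawl from walking every record of a list page; the trade-off is that pages differing only by id are documented once.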

Notes

Most of the feature lives inside boat/doc-collector.

The non-boat changes are either:

  • integration changes required to expose the feature through the main CLI, or
  • small core fixes discovered while running the collector against real sites.

@DenysKuchma requested a review from DavertMik on May 11, 2026, 18:38