Version: 1.0
Date: December 16, 2024
Author: Shakes
Status: Draft
HtmlGraph is a lightweight graph database framework built entirely on web standards (HTML, CSS, JavaScript) that eliminates the need for external graph databases like Neo4j or Memgraph. It enables AI agents to coordinate work using HTML files as nodes, hyperlinks as edges, and CSS selectors as the query language.
Market Opportunity: With the explosion of AI coding agents (Claude Code, Copilot, Cursor, Aider), developers need simple infrastructure for agent coordination and observability. Current solutions require Docker, complex databases, and custom protocols. HtmlGraph provides a zero-dependency alternative that's human-readable and version-control friendly.
Target Launch: Q1 2025
For AI Agent Developers:
-
Infrastructure Complexity
- Need Docker for Neo4j/Memgraph
- Learn Cypher query language
- Manage database migrations
- Handle connection pooling
- Deploy and monitor database servers
-
Poor Human Observability
- Database queries needed to see agent state
- No visual representation without custom UI
- Hard to debug agent behavior
- Can't "view source" on agent decisions
-
Version Control Issues
- Binary database dumps don't diff well
- Can't see what changed in git
- Hard to code review agent state
- Difficult to rollback changes
-
Integration Friction
- Each agent needs database drivers
- Serialization/deserialization overhead
- Network latency to database
- Credential management
For Human Operators:
- Can't easily see what agents are doing
- Need specialized tools to query state
- Hard to manually intervene or correct
- No "open in browser and understand" option
- LangGraph (3.5k GitHub stars) - Custom state management
- AutoGPT (167k stars) - JSON files for memory
- Claude Code - Proprietary coordination
- Cursor/Windsurf - Opaque agent state
All of these reinvent coordination mechanisms. None use web standards.
Use HTML as the graph database format:
- HTML files = Graph nodes
<a href>= Graph edgesdata-*attributes = Node properties- CSS selectors = Query language
- Web browsers = Human interface
- justhtml/DOMParser = Machine interface
Dual Interface Design:
┌─────────────────┐
│ HTML Files │
│ (Source of │
│ Truth) │
└────────┬────────┘
│
┌─────────────┴──────────────┐
│ │
┌──────▼───────┐ ┌───────▼────────┐
│ Browsers │ │ AI Agents │
│ (Human │ │ (Machine │
│ View) │ │ View) │
└──────────────┘ └────────────────┘
CSS Pydantic
Rendering Serialization
For Developers:
- ✅ Zero dependencies (no Docker, no database servers)
- ✅ Zero learning curve (everyone knows HTML/CSS)
- ✅ Zero deployment complexity (just files)
- ✅ Version control native (git diff works perfectly)
For AI Agents:
- ✅ Lightweight context (Pydantic serialization)
- ✅ Simple queries (CSS selectors)
- ✅ Fast reads (no network latency)
- ✅ Safe writes (file system guarantees)
For Human Operators:
- ✅ Visual observability (open in browser)
- ✅ Easy debugging (view source)
- ✅ Simple intervention (edit HTML)
- ✅ Searchable history (grep works)
1. AI Agent Developer ("Alex")
- Profile: Software engineer building AI coding assistants
- Pain: Struggling with Neo4j complexity for agent coordination
- Goals: Simple, reliable agent state management
- Success Metric: Agents coordinate without database infrastructure
2. Solo Developer with ADHD ("Shakes")
- Profile: Needs visual observability of complex systems
- Pain: Can't see what agents are doing without queries
- Goals: "Open browser and understand" simplicity
- Success Metric: Can debug agent behavior visually
3. Open Source Maintainer ("Morgan")
- Profile: Building public AI agent tools/frameworks
- Pain: Users struggle with database setup in README
- Goals: Zero-dependency tool that "just works"
- Success Metric: Contributors can run project without Docker
4. Technical Writer ("Jordan")
- Use Case: Documentation site with graph structure
- Value: Hyperlinked docs, no static site generator needed
5. Knowledge Worker ("Taylor")
- Use Case: Personal knowledge base with connections
- Value: Visual note-taking, version controlled
Story 1: Create Graph Nodes
As an AI agent developer
I want to create graph nodes as HTML files
So that I can store agent state in version control
Acceptance Criteria:
- [ ] Can create Node with Python API
- [ ] Node.to_html() generates valid HTML
- [ ] HTML file opens in browser
- [ ] All properties visible in browser
Story 2: Query with CSS Selectors
As an AI agent developer
I want to query nodes using CSS selectors
So that I don't need to learn Cypher or SQL
Acceptance Criteria:
- [ ] graph.query("[data-status='blocked']") works
- [ ] Can combine multiple selectors
- [ ] Returns Python objects for processing
- [ ] Query performance <100ms for 100 nodes
Story 3: Visual Dashboard
As a human operator
I want to open a dashboard in my browser
So that I can see what agents are doing
Acceptance Criteria:
- [ ] Dashboard shows all nodes
- [ ] Can filter by status/priority
- [ ] Shows dependency graph
- [ ] Pure vanilla JS, no build step
Story 4: Agent Coordination
As an AI coding agent
I want to get lightweight task context
So that I don't waste tokens on HTML
Acceptance Criteria:
- [ ] node.to_context() returns <100 tokens
- [ ] Contains next action to take
- [ ] Shows blocking dependencies
- [ ] Can update and write back
Story 5: Full-Text Search
As a user with large graphs
I want to search across all nodes
So that I can find relevant information quickly
Solution: Optional SQLite FTS index
Story 6: Real-Time Updates
As a developer running multiple agents
I want the dashboard to update in real-time
So that I can see live progress
Solution: File watching + WebSocket/SSE
Story 7: Graph Visualization
As a visual thinker
I want to see the dependency graph
So that I can understand relationships
Solution: D3.js or Cytoscape.js integration
| Feature | Priority | Status | Notes |
|---|---|---|---|
| SQLite indexer | P2 | Backlog | For large graphs |
| File watching | P2 | Backlog | Real-time updates |
| Graph visualization | P2 | Backlog | D3.js integration |
| TypeScript definitions | P2 | Backlog | Better DX |
| VSCode extension | P3 | Idea | Graph explorer |
| CLI tool | P3 | Idea | Command-line queries |
Performance:
- Parse 1000 nodes in <1 second (Python)
- Query with CSS in <100ms (JavaScript)
- Dashboard loads in <500ms
- Memory usage <100MB for 1000 nodes
Reliability:
- 100% test coverage for core library
- Graceful handling of malformed HTML
- Data validation with Pydantic
- File corruption detection
Usability:
- Zero dependencies except justhtml + Pydantic
- No build step required
- Works offline
- Copy-paste examples that work
Compatibility:
- Python 3.10+
- Modern browsers (Chrome, Firefox, Safari)
- Windows, macOS, Linux
- Works in Pyodide (WASM)
┌─────────────────────────────────────────────────────────────┐
│ File System Layer │
│ features/feature-001.html sessions/session-abc.html │
└────────────────┬────────────────────────────────────────────┘
│
┌────────────────┴────────────────────────────────────────────┐
│ Python Library │
│ - Parser (justhtml wrapper) │
│ - Models (Pydantic schemas) │
│ - Graph (operations + algorithms) │
│ - Converter (HTML ↔ Pydantic) │
│ - Agents (lightweight interface) │
└────────────────┬────────────────────────────────────────────┘
│
┌────────────────┴────────────────────────────────────────────┐
│ JavaScript Library │
│ - HtmlGraph class │
│ - DOMParser integration │
│ - CSS selector queries │
│ - Graph algorithms (BFS, DFS) │
│ - Dashboard rendering │
└────────────────┬────────────────────────────────────────────┘
│
┌────────────────┴────────────────────────────────────────────┐
│ Optional: SQLite Index │
│ - Full-text search │
│ - Complex queries │
│ - Performance optimization │
└─────────────────────────────────────────────────────────────┘
Node (HTML File):
Node(
id: str, # Unique identifier
title: str, # Human-readable title
type: str, # "feature", "task", "note", etc.
status: str, # "todo", "in-progress", "done", etc.
priority: str, # "low", "medium", "high", "critical"
properties: dict, # Arbitrary key-value pairs
edges: dict, # {relationship_type: [node_ids]}
steps: list[Step], # Implementation steps
created: datetime, # Creation timestamp
updated: datetime, # Last update timestamp
)Edge (Hyperlink):
<a href="target-node.html"
data-relationship="blocks"
data-since="2024-12-16">
Target Node Title
</a>Python:
# Graph operations
graph = HtmlGraph('directory/')
graph.add(node)
graph.update(node)
graph.query(css_selector)
graph.get(node_id)
graph.shortest_path(from_id, to_id)
# Agent interface
agent = AgentInterface('directory/')
task = agent.get_next_task()
context = agent.get_context(task_id)
agent.complete_step(task_id, step_index)JavaScript:
// Graph operations
const graph = new HtmlGraph();
await graph.loadFrom('directory/');
const nodes = graph.query(css_selector);
const path = graph.findPath(from, to);
// Dashboard
graph.renderDashboard('#app', options);Adoption:
- 100 GitHub stars
- 10 PyPI downloads/day
- 5 real-world implementations
- 3 blog posts from others
Community:
- Front page of Hacker News
- r/programming discussion (100+ upvotes)
- Twitter thread (1000+ likes)
- 5+ contributors
Technical:
- 100% test coverage
- Zero critical bugs
- Documentation complete
- 3+ working examples
Adoption:
- 1000 GitHub stars
- 100 PyPI downloads/day
- Used in 3+ production systems
- 10+ blog posts/tutorials
Ecosystem:
- VSCode extension
- Integration with LangChain
- Integration with AutoGPT
- Featured in awesome-lists
Impact:
- "HTML is All You Need" becomes a meme
- Cited in academic papers
- Adopted by major AI agent projects
- Conference talk acceptances
Neo4j
- Strengths: Mature, powerful Cypher queries, enterprise support
- Weaknesses: Complex setup, expensive, binary format
- Differentiation: HtmlGraph has zero dependencies, human-readable
Memgraph
- Strengths: Fast, Cypher compatible, good for streaming
- Weaknesses: Requires Docker, learning curve
- Differentiation: HtmlGraph needs no server
ArangoDB
- Strengths: Multi-model (graph, document, key-value)
- Weaknesses: Heavy, complex, overkill for simple use cases
- Differentiation: HtmlGraph is lightweight and focused
JSON Files
- Strengths: Simple, ubiquitous
- Weaknesses: No native graph structure, no visual rendering
- Differentiation: HtmlGraph has hyperlinks + browser rendering
SQLite
- Strengths: Zero setup, fast queries
- Weaknesses: Not designed for graphs, no visual layer
- Differentiation: HtmlGraph is graph-native + human-readable
Markdown + Obsidian
- Strengths: Note-taking focused, backlinks, visual
- Weaknesses: Desktop app required, not agent-native
- Differentiation: HtmlGraph is agent-first + programmable
HtmlGraph positions itself as:
- Simpler than Neo4j/Memgraph (zero dependencies)
- More visual than JSON/SQLite (browsers render it)
- More programmable than Obsidian/Roam (API-first)
- More standard than custom agent protocols (web standards)
Tagline Options:
- "HTML is All You Need" (primary)
- "Graph Database on Web Standards"
- "Zero-Dependency AI Agent Coordination"
- "The Web is Already a Graph Database"
Activities:
- Daily Twitter threads on progress
- Stream coding sessions on Twitch
- Post to r/programming weekly
- Engage with AI agent communities
Content:
- "Why I'm building a graph DB in HTML" blog post
- "Zero to Dashboard in 10 minutes" video
- "HTML vs Neo4j: A Comparison" article
Primary Channels:
- Hacker News submission (aim for front page)
- r/programming (detailed post)
- r/MachineLearning (agent focus)
- Twitter thread with demos
- Dev.to article
Launch Assets:
- GitHub repo with README
- Live demo site
- 3 working examples
- Quickstart guide
- Comparison chart
- "HTML is All You Need" manifesto
Activities:
- Weekly office hours (Discord/GitHub Discussions)
- Guest blog posts on popular dev blogs
- Conference talk submissions
- Integration with popular AI tools
Content Calendar:
- Week 1: "Getting Started with HtmlGraph"
- Week 2: "Migrating from Neo4j to HtmlGraph"
- Week 3: "Building AI Agent Systems"
- Week 4: "HtmlGraph Performance Deep-Dive"
Integrations:
- LangChain plugin
- AutoGPT adapter
- Claude Code integration
- Cursor/Windsurf support
Tooling:
- VSCode extension
- CLI tool
- GitHub Action
- Vercel/Netlify deployment
Risk 1: Performance at Scale
- Issue: HTML parsing may be slow for 10k+ nodes
- Likelihood: Medium
- Impact: High
- Mitigation: Add optional SQLite indexer for large graphs
Risk 2: Concurrent Writes
- Issue: Multiple agents writing to same file
- Likelihood: High
- Impact: Medium
- Mitigation: Document as append-mostly pattern, add file locking
Risk 3: Browser Memory
- Issue: Dashboard loading 1000+ nodes may crash browser
- Likelihood: Low
- Impact: Medium
- Mitigation: Implement pagination, virtual scrolling
Risk 4: "Not Serious Enough"
- Issue: Developers may see HTML as toy solution
- Likelihood: Medium
- Impact: High
- Mitigation: Focus on simplicity as feature, not limitation
Risk 5: Limited Adoption
- Issue: Network effect of existing graph DBs
- Likelihood: Medium
- Impact: High
- Mitigation: Target niche (AI agents) where complexity is pain point
Risk 6: Competition
- Issue: Neo4j or others build similar solution
- Likelihood: Low
- Impact: Medium
- Mitigation: Open source + community-driven development
Risk 7: Scope Creep
- Issue: Adding too many features before launch
- Likelihood: High
- Impact: Medium
- Mitigation: Strict MVP scope, post-launch roadmap
Risk 8: Documentation Debt
- Issue: Code without good docs won't get adopted
- Likelihood: Medium
- Impact: High
- Mitigation: Write docs alongside code, not after
- Project setup (structure, config)
- Python core (parser, models, graph)
- Unit tests (>90% coverage)
- Simple todo example
- JavaScript library
- Dashboard (vanilla JS + CSS)
- Agent interface
- Agent coordination example
- Documentation (README, quickstart, cookbook)
- Performance optimization
- Launch materials (blog post, demos)
- Submit to HN/Reddit
- Office hours
- Guest blog posts
- Conference submissions
- Integration work
- VSCode extension
- LangChain integration
- Performance improvements
- Version 1.0 release
-
Naming: Is "HtmlGraph" the right name? Alternatives:
- WebGraph
- HtmlDB
- GraphML (might conflict with existing GraphML format)
- HyperGraph (too generic)
-
Licensing: MIT vs Apache 2.0?
- MIT is simpler, more permissive
- Apache has patent protection
-
Monetization: Is there a business model here?
- Consulting/training?
- Enterprise features (SQLite indexer, hosting)?
- SaaS dashboard?
-
Scope: Should we include graph visualization in MVP?
- Pro: More impressive demos
- Con: Adds complexity, dependencies
-
Standards: Should we propose this as a W3C standard?
- Pro: Legitimacy, wider adoption
- Con: Slow process, may constrain evolution
Product Owner: Shakes
Technical Lead: Shakes
Target Launch: Q1 2025
Approved by:
- Technical Review
- Design Review
- Marketing Review
- Legal Review (open source license)
See CLAUDE.md section "HTML File Format Specification"
See CLAUDE.md section "Python API Design" and "JavaScript API Design"
See CLAUDE.md section "Comparison to Alternatives"
See CLAUDE.md section "Use Cases"
Document Version History:
- v1.0 (2024-12-16): Initial PRD draft
Next Review Date: 2024-12-30
Contact: github.com/shakestzd/htmlgraph