CodeSage is a code documentation assistant that uses large language models to generate human-readable documentation from source code. It supports multiple programming languages and provides a command-line interface, with a Web API planned.
- Multi-Language Support: Parse code structures across different programming languages using Tree-sitter
- AI-Powered Documentation: Generate comprehensive docstrings using Large Language Models
- Flexible Output Formats: Insert docstrings directly into source code or export to markdown
- Command-Line Interface: Easily integrate into development workflows and CI/CD pipelines
- Web API (Planned): Access documentation capabilities through a web interface
- Semantic Code Q&A (Optional): Ask questions about your codebase using embeddings and semantic search
# Clone the repository
git clone https://github.com/yourusername/CodeSage-Service.git
cd CodeSage-Service
# Create a virtual environment
python -m venv codesage-venv
# Activate the environment
# On Windows:
codesage-venv\Scripts\activate
# On Unix/macOS:
# source codesage-venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Initialize Tree-sitter grammars (if needed)
# Instructions will be provided

# Generate documentation for a Python file
python -m cli.cli document path/to/your/file.py
# Generate documentation for an entire directory
python -m cli.cli document path/to/your/project/ --recursive
# Export documentation as markdown
python -m cli.cli document path/to/your/file.py --output markdown --output-path docs/
# Get help
python -m cli.cli --help

A web interface is planned for future releases, which will provide:
- RESTful API endpoints for documentation generation
- Interactive UI for exploring and managing documentation
- Team collaboration features
CodeSage follows a modular, extensible architecture:
CodeSage-Service/
│
├── parsers/           # Code parsers for different languages
│   ├── __init__.py
│   └── ...
│
├── agents/            # LLM-based documentation generators
│   ├── __init__.py
│   └── ...
│
├── outputs/           # Documentation formatters and writers
│   ├── __init__.py
│   └── ...
│
├── cli/               # Command-line interface
│   ├── cli.py
│   └── ...
│
├── web/               # Future web API and interface
│   └── ...
│
├── embeddings/        # Vector storage and semantic search (planned)
│   └── ...
│
└── vendor/            # Tree-sitter grammars and external dependencies
    └── ...
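To make the division of responsibilities concrete, here is a minimal sketch of how the parsers, agents, and outputs packages might hand data to one another. The real interfaces are not documented here, so `DocUnit`, `document_source`, and the three stage callables are hypothetical, and the generator is a stub standing in for an LLM call.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class DocUnit:
    """Hypothetical record for one documentable symbol."""
    name: str
    source: str
    docstring: str = ""


def document_source(
    code: str,
    parse: Callable[[str], list[DocUnit]],     # role of parsers/
    generate: Callable[[DocUnit], str],        # role of agents/
    render: Callable[[list[DocUnit]], str],    # role of outputs/
) -> str:
    """Run parse -> generate -> render and return the rendered documentation."""
    units = parse(code)
    for unit in units:
        unit.docstring = generate(unit)
    return render(units)


# Stub stages so the sketch runs without Tree-sitter or an LLM.
def parse_stub(code: str) -> list[DocUnit]:
    # Treat every top-level "def name(" line as one unit.
    return [
        DocUnit(name=line.split("(")[0][4:], source=line)
        for line in code.splitlines()
        if line.startswith("def ")
    ]


def generate_stub(unit: DocUnit) -> str:
    return f"Describe what {unit.name} does."


def render_markdown(units: list[DocUnit]) -> str:
    return "\n".join(f"### {u.name}\n{u.docstring}" for u in units)


result = document_source(
    "def add(a, b):\n    return a + b\n",
    parse_stub,
    generate_stub,
    render_markdown,
)
print(result)
```

Passing the stages in as callables keeps each one replaceable, which is one plausible way to get the modularity described above; the actual project may wire them together differently.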
To add support for a new programming language:
- Add the appropriate Tree-sitter grammar to the vendor/ directory
- Create a new parser in the parsers/ module that implements the common parser interface
- Register the new parser in the parser factory
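The steps for adding a language could look roughly like the sketch below. The actual parser interface and factory are not shown in this README, so every name here (`BaseParser`, `register_parser`, `get_parser`, `RubyParser`, the extension-keyed registry) is hypothetical, and the `parse` body is a line-based placeholder rather than real Tree-sitter calls.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from pathlib import Path

# Hypothetical registry: maps a file extension to a parser class.
_PARSERS: dict[str, type[BaseParser]] = {}


class BaseParser(ABC):
    """Assumed common interface; the real one in parsers/ may differ."""

    @abstractmethod
    def parse(self, source: str) -> list[dict]:
        """Return one dict per documentable unit (function, class, ...)."""


def register_parser(*extensions: str):
    """Class decorator that registers a parser for the given extensions."""
    def decorator(cls: type[BaseParser]) -> type[BaseParser]:
        for ext in extensions:
            _PARSERS[ext.lower()] = cls
        return cls
    return decorator


def get_parser(path: str) -> BaseParser:
    """Factory: pick a parser based on the file extension."""
    ext = Path(path).suffix.lower()
    try:
        return _PARSERS[ext]()
    except KeyError:
        raise ValueError(f"No parser registered for '{ext}' files") from None


@register_parser(".rb")
class RubyParser(BaseParser):
    def parse(self, source: str) -> list[dict]:
        # Placeholder: a real parser would walk the Tree-sitter syntax tree.
        # Here we only pick out lines that start a method definition.
        units = []
        for lineno, line in enumerate(source.splitlines(), start=1):
            stripped = line.strip()
            if stripped.startswith("def "):
                name = stripped[4:].split("(")[0].strip()
                units.append({"kind": "function", "name": name, "line": lineno})
        return units
```

With a decorator-based registry, defining the class also registers it, so the rest of the code only needs to call `get_parser(path)`.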
To add a new output format:
- Create a new writer class in the outputs/ module
- Implement the required methods to format and write documentation
- Register the new format in the output format factory
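As an illustration of these steps, the sketch below adds a JSON output format. The real writer interface and format factory are not described in this README, so `DocWriter`, `JsonWriter`, `OUTPUT_FORMATS`, and `create_writer` are hypothetical names, and the `{symbol name: docstring}` input shape is an assumption.

```python
from __future__ import annotations

import json
from abc import ABC, abstractmethod
from pathlib import Path


class DocWriter(ABC):
    """Assumed writer interface; the real one in outputs/ may differ."""

    @abstractmethod
    def format(self, docs: dict[str, str]) -> str:
        """Render {symbol name: docstring} into the target format."""

    def write(self, docs: dict[str, str], destination: Path) -> Path:
        """Write the rendered text to destination and return the path."""
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(self.format(docs), encoding="utf-8")
        return destination


class JsonWriter(DocWriter):
    """New format: one JSON object mapping symbol names to docstrings."""

    def format(self, docs: dict[str, str]) -> str:
        return json.dumps(docs, indent=2, sort_keys=True)


# Hypothetical format factory, keyed by the value passed to --output.
OUTPUT_FORMATS: dict[str, type[DocWriter]] = {"json": JsonWriter}


def create_writer(output_format: str) -> DocWriter:
    """Look up and instantiate the writer for the requested format."""
    try:
        return OUTPUT_FORMATS[output_format]()
    except KeyError:
        known = ", ".join(sorted(OUTPUT_FORMATS))
        raise ValueError(
            f"Unknown format '{output_format}' (known: {known})"
        ) from None
```

Under these assumptions, `create_writer("json").write(docs, Path("docs/api.json"))` would render and save the documentation in one call.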
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the LLM capabilities
- Tree-sitter for robust code parsing