Pyragify

Pyragify turns a code repository into plain-text chunks that are easier to load into NotebookLM and other LLM tools. It extracts semantic units from source files, writes .txt output grouped by file type, and stores metadata for incremental re-runs.

What It Does

Chunks Python code into functions, classes, and comments
Splits Markdown files by header sections
Processes common repository files into LLM-friendly text output
Respects .gitignore and .dockerignore patterns
Tracks file hashes so unchanged files can be skipped on later runs
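
To illustrate the kind of semantic chunking described above, here is a minimal sketch that splits Python source into top-level functions and classes with the standard-library ast module. It is not Pyragify's actual implementation: the function name chunk_python_source and the returned dictionary layout are assumptions made for this example, and comment chunks are left out for brevity.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Return one chunk per top-level function or class in `source`.

    Hypothetical helper for illustration; not part of Pyragify's API.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            chunks.append(
                {
                    "kind": kind,
                    "name": node.name,
                    # Recover the source text spanned by this definition.
                    "text": ast.get_source_segment(source, node),
                }
            )
    return chunks
```

Parsing with ast rather than splitting on lines keeps each definition intact, which is the property that makes chunks useful as LLM context.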

Supported Inputs

Pyragify has dedicated handling for:

Python: .py
Markdown: .md, .markdown
HTML: .html
CSS: .css
Other common repository files are included as plain text when they can be read as UTF-8
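
The UTF-8 fallback for files without a dedicated parser can be pictured with the sketch below. The helper name read_as_utf8 is hypothetical, and how Pyragify itself detects unreadable files is an assumption here; the sketch simply attempts a strict UTF-8 decode and reports failure.

```python
from pathlib import Path
from typing import Optional


def read_as_utf8(path: Path) -> Optional[str]:
    """Return the file's text, or None if its bytes are not valid UTF-8.

    Hypothetical helper for illustration; not part of Pyragify's API.
    """
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        # Binary or differently encoded files end up here and can be skipped.
        return None
```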

Installation

Install From PyPI

uv pip install pyragify

or

pip install pyragify

Install From Source

git clone https://github.com/ThomasBury/pyragify.git
cd pyragify
uv sync --group dev

Quick Start

Run With A Config File

The default entrypoint is pyragify.

uv run pyragify --config-file config.yaml

You can also run it as a module:

python -m pyragify --config-file config.yaml

Run Without A Config File

If you do not use config.yaml, pass every setting you want to rely on directly on the command line.

uv run pyragify \
  --repo-path /path/to/repository \
  --output-dir /path/to/output \
  --max-words 200000 \
  --max-file-size 10485760 \
  --skip-patterns "*.log" \
  --skip-patterns "*.tmp" \
  --skip-dirs "__pycache__" \
  --skip-dirs "node_modules" \
  --verbose

CLI Notes

Use pyragify --help for the full option list
Command-line options override values loaded from config.yaml
Repeat --skip-patterns once per pattern
Repeat --skip-dirs once per directory name
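
The override rule in the notes above amounts to a simple overlay of settings. The following sketch assumes that options not passed on the command line arrive as None; the function name merge_settings is hypothetical, and whether repeated list options replace or extend the config-file lists is not stated here, so the sketch replaces them.

```python
def merge_settings(file_values: dict, cli_values: dict) -> dict:
    """Overlay CLI values on config-file values.

    Assumption for this sketch: a CLI value of None means "not passed".
    """
    merged = dict(file_values)  # copy so the loaded config is left untouched
    for key, value in cli_values.items():
        if value is not None:
            merged[key] = value
    return merged
```

For example, merging a config with max_words: 200000 and a command line that passes --max-words 50000 would yield max_words 50000 while leaving every other setting at its config-file value.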

Configuration

Example config.yaml:

repo_path: /path/to/repository
output_dir: /path/to/output
max_words: 200000
max_file_size: 10485760  # 10 MB
skip_patterns:
  - "*.log"
  - "*.tmp"
skip_dirs:
  - "__pycache__"
  - "node_modules"
verbose: false
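
As a rough picture of how skip_patterns and skip_dirs might be applied, the sketch below matches file names against glob patterns and checks parent directory names against the skip list. The helper should_skip and the exact matching semantics are assumptions for illustration; Pyragify's own matching, including its .gitignore handling, may differ.

```python
import fnmatch
from pathlib import PurePosixPath


def should_skip(relative_path: str, skip_patterns: list[str], skip_dirs: list[str]) -> bool:
    """Decide whether a repository-relative path should be ignored.

    Hypothetical helper for illustration; not part of Pyragify's API.
    """
    path = PurePosixPath(relative_path)
    # Skip when any parent directory name is listed in skip_dirs.
    if any(part in skip_dirs for part in path.parts[:-1]):
        return True
    # Skip when the file name matches any glob in skip_patterns (case-sensitive).
    return any(fnmatch.fnmatchcase(path.name, pattern) for pattern in skip_patterns)
```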

Example Workflow

Point repo_path at the repository you want to process.
Choose an output_dir where generated chunks and metadata should be written.
Run uv run pyragify --config-file config.yaml or pass the same settings on the command line.
Open the generated files under your output_dir (shown here as output/), especially output/remaining/chunk_0.txt, in NotebookLM or another LLM workflow.

Output Structure

The generated output is grouped by content type:

python/: Python functions, classes, and comment chunks
markdown/: Markdown sections split by headers
html/: HTML script and style chunks
css/: CSS rule chunks
other/: Readable files that do not have a dedicated parser
remaining/: Overflow chunks once grouped outputs reach the word limit
metadata.json: Summary of processed files
hashes.json: MD5 hashes used for incremental processing
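
The incremental behaviour backed by hashes.json can be sketched as follows. Only the use of MD5 comes from the description above; the JSON layout (a flat mapping from path to digest) and the function names md5_of and changed_files are assumptions for this example.

```python
import hashlib
import json
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def changed_files(paths: list[Path], hashes_file: Path) -> list[Path]:
    """Return paths whose MD5 differs from the stored one, then update the store.

    Assumed layout of hashes_file: {"<path>": "<md5 hex digest>", ...}
    """
    stored = json.loads(hashes_file.read_text()) if hashes_file.exists() else {}
    changed = []
    for path in paths:
        digest = md5_of(path)
        if stored.get(str(path)) != digest:
            changed.append(path)
            stored[str(path)] = digest
    hashes_file.write_text(json.dumps(stored, indent=2))
    return changed
```

On a first run every file counts as changed; on later runs only files whose contents differ are returned, so unchanged files can be skipped.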

NotebookLM Workflow

Run Pyragify on the repository you care about.
Upload one or more generated .txt chunks to a NotebookLM notebook.
Ask questions about the codebase and use the generated citations to trace answers back to the source text.

Development

Set up the local environment:

uv sync --group dev

Run the test suite:

uv run pytest

Run a focused test slice while iterating:

uv run pytest tests/test_processor.py -k markdown

Contributing

Contributions are welcome. Open an issue for bugs or feature requests, then send a pull request with focused changes and matching tests.

License

This project is licensed under the MIT License. See LICENSE.
