Why use many token when few do trick.
A Kimi Code CLI skill that makes your agent talk like caveman — cutting ~60-75% of output tokens while keeping full technical accuracy.
Based on the viral observation that terse, telegraphic communication dramatically reduces LLM token usage without losing substance.
- 🪶 Lite / 🪨 Full / 🔥 Ultra / 📜 文言文 — pick your grunt level
- 🎯 Same accuracy — all technical info kept, only fluff dropped
- ⚡ Faster responses — less tokens to generate = speed go brrr
- 🗜️ caveman-compress — rewrite markdown/memory files into caveman-speak (~46% input token savings)
- 💬 caveman-commit — terse commit messages (≤50 chars)
- 🔍 caveman-review — one-line code review comments
- 📊 Stats tracking — token savings estimation
# Clone to your skills directory
git clone https://github.com/theretech/kimi-caveman.git ~/.kimi/skills/caveman-mode

Or install via pip:

pip install kimi-caveman

Just say to Kimi:
- "caveman mode"
- "talk like caveman"
- "less tokens please"
- "modo caveman"
Deactivate with: "stop caveman" or "normal mode"
| Level | Trigger | Style |
|---|---|---|
| 🪶 Lite | caveman lite | Drop filler, keep grammar |
| 🪨 Full | caveman full | Default caveman. No articles, fragments |
| 🔥 Ultra | caveman ultra | Maximum compression, telegraphic |
| 📜 文言文 | caveman wenyan | Classical Chinese literary compression |
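The trigger phrases above amount to a small lookup table. The sketch below is only an illustration of that mapping: the skill itself is driven by the instructions in SKILL.md, and the function `detect_level`, the return values, and the choice to treat the general activation phrases as Full (the table calls Full the default) are assumptions made for this example.

```python
from typing import Optional

# Phrases copied from this README; the lookup logic itself is hypothetical.
STOP_TRIGGERS = ("stop caveman", "normal mode")
LEVEL_TRIGGERS = {
    "caveman lite": "lite",
    "caveman full": "full",
    "caveman ultra": "ultra",
    "caveman wenyan": "wenyan",
}
GENERAL_TRIGGERS = (
    "caveman mode",
    "talk like caveman",
    "less tokens please",
    "modo caveman",
)


def detect_level(message: str) -> Optional[str]:
    """Return 'off', a level name, or None when no trigger phrase is present."""
    text = message.lower()
    if any(phrase in text for phrase in STOP_TRIGGERS):
        return "off"
    for phrase, level in LEVEL_TRIGGERS.items():
        if phrase in text:
            return level
    # Assumption: a general activation phrase selects the default (Full) level.
    if any(phrase in text for phrase in GENERAL_TRIGGERS):
        return "full"
    return None
```

Checking the deactivation phrases first keeps a message like "stop caveman" from being read as an activation request.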
Normal Kimi (69 tokens):
"The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. I'd recommend using useMemo to memoize the object."
🪨 Caveman Kimi (19 tokens):
"New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."
🔥 Ultra (12 tokens):
"Inline obj prop → new ref → re-render. useMemo."
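The savings for the example above follow from simple arithmetic on the quoted token counts. A minimal sketch, where `percent_saved` is a hypothetical helper for illustration and not the skill's stats tracker:

```python
def percent_saved(original_tokens: int, compressed_tokens: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    saved = original_tokens - compressed_tokens
    return round(100 * saved / original_tokens, 1)


# Token counts taken from the example above.
print(percent_saved(69, 19))  # Full:  72.5
print(percent_saved(69, 12))  # Ultra: 82.6
```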
Compress markdown/memory files into caveman-speak. Preserves code blocks, URLs, and paths byte-for-byte.
# Compress a file
caveman-compress my-notes.md
# Output: my-notes.caveman.md (backup saved as .original.md)

| File | Original | Compressed | Saved |
|---|---|---|---|
| claude-md-preferences.md | 706 | 285 | 59.6% |
| project-notes.md | 1145 | 535 | 53.3% |
| Average | 898 | 481 | 46% |
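One way to compress prose while leaving code blocks, URLs, and paths untouched is to split the document on fenced blocks and only drop filler words that stand alone. The sketch below illustrates that idea under stated assumptions: the filler list, the regular expressions, and the function names are hypothetical and are not taken from `compress.py`.

```python
import re

TICKS = "`" * 3  # fence marker, built indirectly to keep this example readable

# Fenced code blocks; the capturing group makes re.split keep them in the result.
FENCE = re.compile("(" + TICKS + ".*?" + TICKS + ")", re.DOTALL)

# Hypothetical filler list. The lookarounds skip words glued to path or URL
# characters, so "/a/the/file" and "https://x.org/a" are left alone.
FILLER = re.compile(
    r"(?<![\w/.:-])(?:the|a|an|just|really|basically|simply)(?![\w/.:-])[ \t]*",
    re.IGNORECASE,
)


def compress_prose(text: str) -> str:
    """Drop standalone filler words from a prose segment."""
    return FILLER.sub("", text)


def compress_markdown(markdown: str) -> str:
    """Compress prose segments and copy fenced code blocks through unchanged."""
    parts = FENCE.split(markdown)  # odd indices hold the fenced blocks
    return "".join(
        part if index % 2 else compress_prose(part)
        for index, part in enumerate(parts)
    )
```

This handles only fenced blocks and a tiny word list; inline code spans, indented code blocks, and smarter rewriting are out of scope for the sketch.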
kimi-caveman/
├── caveman_mode/
│ ├── SKILL.md # Skill instructions for Kimi
│ ├── scripts/
│ │ └── compress.py # Markdown compression tool
│ └── references/
│ └── modes.md # Mode reference card
├── tests/
├── README.md
└── pyproject.toml
- Fork the repo
- Create a feature branch
- Make your changes
- Run tests: pytest
- Run linter: ruff check .
- Submit a PR
PIX (Brazil):
54802231000148 — THE RETECH LTDA - EPP
MIT — see LICENSE for details.
kimi-caveman is a token-efficient communication skill for Kimi Code CLI. It reduces agent output verbosity by 60-75% while maintaining 100% technical accuracy, making sessions faster, cheaper, and more readable.
Part of the caveman ecosystem: less tokens, same brain.
Built with 🪨 by The Retech and friends.