Prompt Compression

Compress your docs, prompts, and context into minimal tokens for AGENTS.md. Get 80% fewer tokens with zero information loss.

Vercel's research proved it: an 8KB compressed docs index in AGENTS.md hits a 100% eval pass rate, compared to 53% with no docs and 79% with skills that carry explicit trigger instructions. Passive context beats active retrieval -- but only if it fits. This plugin teaches your agent the compression techniques that make it fit.

The Problem

You want your AI agent to follow project conventions, use the right API versions, and reference your docs. You have two options:

  1. Skills (active retrieval) -- agent must decide to invoke them. Vercel's evals show agents skip this 56% of the time. Result: 53% pass rate, same as no docs at all.
  2. AGENTS.md (passive context) -- always loaded, no decision point. 100% pass rate. But raw docs are 40KB+, eating your context window on every request.

The answer is compression. Vercel compressed 40KB of Next.js docs down to 8KB (80% reduction) and maintained the perfect pass rate. This plugin packages those compression techniques so your agent applies them to any framework, any project.
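The size figures above can be checked mechanically. The following is a minimal sketch, not part of the plugin: the function names are hypothetical, and the 8 KB budget is an assumption taken from the compressed index size quoted above rather than a limit enforced by any tool.

```python
from pathlib import Path

# Assumption: 8 KB budget, borrowed from the compressed index size cited above.
BUDGET_BYTES = 8 * 1024


def reduction_pct(before: float, after: float) -> float:
    """Percentage saved when content shrinks from `before` to `after` (same unit)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


def context_cost(text: str, budget: int = BUDGET_BYTES) -> dict:
    """Report the UTF-8 size of always-loaded context and whether it fits the budget."""
    size = len(text.encode("utf-8"))
    return {
        "bytes": size,
        "budget": budget,
        "fits": size <= budget,
        "over_by": max(0, size - budget),
    }


def check_file(path: str = "AGENTS.md") -> dict:
    """Convenience wrapper: measure a file on disk (path is an example default)."""
    return context_cost(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    print(reduction_pct(40, 8))              # 40KB -> 8KB
    print(context_cost("x" * 40 * 1024))     # raw docs of 40KB exceed the budget
```

Measuring bytes rather than tokens keeps the sketch dependency-free; a real tokenizer would give a closer estimate of context-window cost.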

Install

Cursor / Claude Code / Windsurf

npx skills add ofershap/prompt-compression

Or copy skills/ into your .cursor/skills/ or .claude/skills/ directory.

What's Included

| Type | Name | Description |
| --- | --- | --- |
| Skill | prompt-compression | 8 compression rules with before/after examples and output formats |
| Rule | prompt-compression | Always-on rule that enforces token-efficient AGENTS.md formatting |
| Command | /compress-prompt | Compress any content for AGENTS.md (guides you if no input provided) |
| Command | /audit | Scan your existing AGENTS.md for token waste and compression opportunities |

How It Works

Run /compress-prompt with your docs, guidelines, or API reference. The agent compresses it using these techniques:

| Technique | Before | After | Reduction |
| --- | --- | --- | --- |
| Pipe-delimited file index | Nested markdown headings with prose descriptions | `\|routing:{defining,dynamic,middleware}` | ~75% |
| Single-line directives | Multi-paragraph rule explanation | `\|imports: builtin > external > internal > types` | ~90% |
| Abbreviated keys | `Required: true, Type: string, Default: "prod"` | `\|env: req str="prod"` | ~80% |
| Brace expansion | Separate entries per file | `{Button,Input,Modal}.tsx` | ~60% |
| API shorthand | Full endpoint documentation | `GET /users?page,limit -> User[]\|auth:bearer` | ~70% |
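Two of these techniques, brace expansion and abbreviated keys, are mechanical enough to sketch in code. The example below is illustrative only: in the plugin the agent performs the compression, and the helper names, the `opt` marker for non-required fields, and every abbreviation other than `str` are assumptions made for this sketch.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical abbreviation map; only "str" appears in the table above.
TYPE_ABBREV = {"string": "str", "number": "num", "boolean": "bool"}


def brace_group(paths: list[str]) -> list[str]:
    """Collapse files that share a directory and extension into one brace entry."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        groups[(str(path.parent), path.suffix)].append(path.stem)

    entries = []
    for (parent, suffix), stems in groups.items():
        # A single file needs no braces; several files are joined with commas.
        names = stems[0] if len(stems) == 1 else "{" + ",".join(stems) + "}"
        prefix = "" if parent == "." else parent + "/"
        entries.append(f"{prefix}{names}{suffix}")
    return entries


def abbreviate_field(name: str, required: bool, type_: str, default=None) -> str:
    """Render one config field as a single pipe-prefixed line."""
    flag = "req" if required else "opt"  # "opt" is an assumed counterpart to "req"
    line = f"|{name}: {flag} {TYPE_ABBREV.get(type_, type_)}"
    if default is not None:
        line += f'="{default}"'
    return line


if __name__ == "__main__":
    print(brace_group(["Button.tsx", "Input.tsx", "Modal.tsx"]))
    print(abbreviate_field("env", True, "string", "prod"))
```

Grouping on the (directory, extension) pair keeps unrelated files from being merged into one entry, so each compressed line still maps back to an unambiguous set of paths.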

No input? Just run /compress-prompt and it walks you through choosing what to compress -- framework docs, coding standards, API references, or an existing AGENTS.md.

Real-World Example

Before (640 tokens):

## Authentication

Our application uses JWT-based authentication. When a user logs in, the server generates a JWT token
containing the user's ID and role. This token is sent back to the client in an httpOnly cookie.

On subsequent requests, the auth middleware extracts the token from the cookie, verifies the
signature, and attaches the user object to the request. The token expires after 24 hours.

## Project Structure

### Source Code

#### Components

##### UI Components

- Button.tsx
- Input.tsx
- Modal.tsx

##### Layout Components

- Header.tsx
- Footer.tsx
- Sidebar.tsx

After (120 tokens):

[Auth]|JWT in httpOnly cookie|24h expiry
|middleware: extract token > verify > attach user to req
|login: validate creds > generate JWT(id,role) > set cookie

[Structure]|src/
|components/ui:{Button,Input,Modal}.tsx
|components/layout:{Header,Footer,Sidebar}.tsx

81% token reduction. Same information. Agent produces identical code from both versions.
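One way to sanity-check the "same information" claim for the structure entries is to expand the compressed lines back into full paths and compare them with the original file list. This is a rough sketch under assumptions: the function names are hypothetical, and it reads the `:` in a line such as `|components/ui:{Button,Input,Modal}.tsx` as separating a directory from its file group, which is an interpretation of the example above rather than a documented grammar.

```python
import re

# Matches an innermost {a,b,c} group.
_BRACE = re.compile(r"\{([^{}]*)\}")


def expand_braces(entry: str) -> list[str]:
    """Expand every {a,b,c} group in `entry` into the concrete strings it stands for."""
    match = _BRACE.search(entry)
    if match is None:
        return [entry]
    head, tail = entry[:match.start()], entry[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so entries with several brace groups are fully expanded.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def expand_index_line(line: str) -> list[str]:
    """Turn '|dir:{A,B}.ext' into ['dir/A.ext', 'dir/B.ext'].

    Assumption: ':' separates the directory from the file group.
    """
    directory, _, files = line.lstrip("|").partition(":")
    return [f"{directory}/{name}" for name in expand_braces(files)]


if __name__ == "__main__":
    for line in (
        "|components/ui:{Button,Input,Modal}.tsx",
        "|components/layout:{Header,Footer,Sidebar}.tsx",
    ):
        print(expand_index_line(line))
```

If the expanded paths match the six files listed in the "Before" version, the structure section has survived compression intact; the prose-to-directive lines in the auth section still need a human or agent review.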

Why Passive Context Wins

From Vercel's AGENTS.md research (January 2026):

| Approach | Eval Pass Rate |
| --- | --- |
| No documentation | 53% |
| Skills (default, agent must invoke) | 53% |
| Skills (with explicit trigger instructions) | 79% |
| Compressed AGENTS.md index | 100% |

Three factors explain the gap:

  1. No decision point -- AGENTS.md is always in context. No moment where the agent must decide "should I look this up?"
  2. Consistent availability -- Skills load asynchronously and only when invoked. Passive context is present on every turn.
  3. No ordering issues -- Skills create sequencing decisions (read docs first vs. explore project first). Passive context avoids this entirely.

Related Plugins

  • think-first -- Plan-first workflow (pairs well with compressed context)
  • ai-humanizer -- Prevent AI-detectable patterns in generated content

If this helped your workflow, a star helps others find it.

Author

Made by ofershap

License

MIT
