Prompt Compression

Compress your docs, prompts, and context into minimal tokens for AGENTS.md. Get 80% fewer tokens with zero information loss.

Vercel's research showed it: an 8KB compressed docs index in AGENTS.md hit a 100% eval pass rate, compared to 53% with no docs and 79% with skills that carry explicit trigger instructions. Passive context beats active retrieval -- but only if it fits. This plugin teaches your agent the compression techniques that make it fit.

The Problem

You want your AI agent to follow project conventions, use the right API versions, and reference your docs. You have two options:

Skills (active retrieval) -- agent must decide to invoke them. Vercel's evals show agents skip this 56% of the time. Result: 53% pass rate, same as no docs at all.
AGENTS.md (passive context) -- always loaded, no decision point. 100% pass rate. But raw docs are 40KB+, eating your context window on every request.

The answer is compression. Vercel compressed 40KB of Next.js docs down to 8KB (80% reduction) and maintained the perfect pass rate. This plugin packages those compression techniques so your agent applies them to any framework, any project.
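
To make the "only if it fits" constraint concrete, here is a minimal Python sketch that checks the reduction figure quoted above and tests a piece of text against a byte budget. The helper names and the 8KB default are illustrative assumptions, not part of the plugin.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage saved when a document shrinks from `before` to `after` units."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


def fits_budget(text: str, budget_bytes: int = 8 * 1024) -> bool:
    """True if the UTF-8 encoding of `text` is within the byte budget.

    The 8KB default mirrors the compressed index size quoted above; it is
    an assumed budget, not a limit enforced by any tool.
    """
    return len(text.encode("utf-8")) <= budget_bytes


# 40KB of raw docs compressed to 8KB
print(reduction_percent(40 * 1024, 8 * 1024))  # 80.0
```

Bytes are only a rough proxy for tokens, so treat the budget check as a quick sanity test rather than an exact measurement.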

Install

Cursor / Claude Code / Windsurf

npx skills add ofershap/prompt-compression

Or copy skills/ into your .cursor/skills/ or .claude/skills/ directory.

What's Included

Type	Name	Description
Skill	`prompt-compression`	8 compression rules with before/after examples and output formats
Rule	`prompt-compression`	Always-on rule that enforces token-efficient AGENTS.md formatting
Command	`/compress-prompt`	Compress any content for AGENTS.md (guides you if no input provided)
Command	`/audit`	Scan your existing AGENTS.md for token waste and compression opportunities

How It Works

Run /compress-prompt with your docs, guidelines, or API reference. The agent compresses it using these techniques:

Technique	Before	After	Reduction
Pipe-delimited file index	Nested markdown headings with prose descriptions	`|routing:{defining,dynamic,middleware}`	~75%
Single-line directives	Multi-paragraph rule explanation	`|imports: builtin > external > internal > types`	~90%
Abbreviated keys	`Required: true, Type: string, Default: "prod"`	`|env: req str="prod"`	~80%
Brace expansion	Separate entries per file	`{Button,Input,Modal}.tsx`	~60%
API shorthand	Full endpoint documentation	`GET /users?page,limit -> User[]|auth:bearer`	~70%
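
As an illustration of the API shorthand row, the sketch below renders a structured endpoint description as a single line. The `Endpoint` dataclass and `shorthand` function are hypothetical names used only for this example; the plugin itself instructs the agent in prose and does not ship this code.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical structured form of one documented endpoint."""
    method: str
    path: str
    query: list[str] = field(default_factory=list)
    returns: str = ""
    auth: str = ""


def shorthand(ep: Endpoint) -> str:
    """Render an endpoint as `METHOD /path?params -> Type|auth:scheme`."""
    line = f"{ep.method} {ep.path}"
    if ep.query:
        line += "?" + ",".join(ep.query)
    if ep.returns:
        line += f" -> {ep.returns}"
    if ep.auth:
        line += f"|auth:{ep.auth}"
    return line


users = Endpoint("GET", "/users", ["page", "limit"], "User[]", "bearer")
print(shorthand(users))  # GET /users?page,limit -> User[]|auth:bearer
```

Optional fields are simply omitted when empty, which keeps short endpoints short.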

No input? Just run /compress-prompt and it walks you through choosing what to compress -- framework docs, coding standards, API references, or an existing AGENTS.md.

Real-World Example

Before (640 tokens):

## Authentication

Our application uses JWT-based authentication. When a user logs in, the server generates a JWT token
containing the user's ID and role. This token is sent back to the client in an httpOnly cookie.

On subsequent requests, the auth middleware extracts the token from the cookie, verifies the
signature, and attaches the user object to the request. The token expires after 24 hours.

## Project Structure

### Source Code

#### Components

##### UI Components

- Button.tsx
- Input.tsx
- Modal.tsx

##### Layout Components

- Header.tsx
- Footer.tsx
- Sidebar.tsx

After (120 tokens):

[Auth]|JWT in httpOnly cookie|24h expiry
|middleware: extract token > verify > attach user to req
|login: validate creds > generate JWT(id,role) > set cookie

[Structure]|src/
|components/ui:{Button,Input,Modal}.tsx
|components/layout:{Header,Footer,Sidebar}.tsx

81% token reduction. Same information. Agent produces identical code from both versions.
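
The [Structure] block above can also be generated mechanically. The following Python sketch combines brace expansion with the pipe-delimited index; `index_line` and `index_section` are hypothetical helpers, and the sketch assumes file names contain no commas or braces.

```python
def index_line(path: str, files: list[str]) -> str:
    """Render one directory as `|path:{a,b}.ext`, or a plain comma list."""
    parts = [f.rpartition(".") for f in files]
    exts = {ext for _, _, ext in parts}
    # Brace expansion only applies when several files share one extension.
    if len(files) > 1 and len(exts) == 1 and all(stem for stem, _, _ in parts):
        stems = ",".join(stem for stem, _, _ in parts)
        return f"|{path}:{{{stems}}}.{exts.pop()}"
    return f"|{path}:" + ",".join(files)


def index_section(title: str, root: str, dirs: dict[str, list[str]]) -> str:
    """Build a `[Title]|root` header followed by one line per directory."""
    header = f"[{title}]|{root}"
    return "\n".join([header] + [index_line(p, fs) for p, fs in dirs.items()])


print(index_section("Structure", "src/", {
    "components/ui": ["Button.tsx", "Input.tsx", "Modal.tsx"],
    "components/layout": ["Header.tsx", "Footer.tsx", "Sidebar.tsx"],
}))
```

Directories with mixed extensions fall back to a plain comma-separated list, so no file name is dropped.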

Why Passive Context Wins

From Vercel's AGENTS.md research (January 2026):

Approach	Eval Pass Rate
No documentation	53%
Skills (default, agent must invoke)	53%
Skills (with explicit trigger instructions)	79%
Compressed AGENTS.md index	100%

Three factors explain the gap:

No decision point -- AGENTS.md is always in context. No moment where the agent must decide "should I look this up?"
Consistent availability -- Skills load asynchronously and only when invoked. Passive context is present on every turn.
No ordering issues -- Skills create sequencing decisions (read docs first vs. explore project first). Passive context avoids this entirely.

Related Plugins

think-first -- Plan-first workflow (pairs well with compressed context)
ai-humanizer -- Prevent AI-detectable patterns in generated content

If this helped your workflow, a star helps others find it.

Author

License

MIT

