diff --git a/README.md b/README.md
index a65d288..167a87f 100644
--- a/README.md
+++ b/README.md
@@ -18,23 +18,6 @@
-## Why mgrep?
-- Natural-language search that feels as immediate as `grep`.
-- Semantic, multilingual & multimodal (audio, video support coming soon!)
-- Web search built-in — query the web alongside your local files with `--web`.
-- Smooth background indexing via `mgrep watch`, designed to detect and keep up-to-date everything that matters inside any git repository.
-- Friendly device-login flow and first-class coding agent integrations.
-- Built for agents and humans alike, and **designed to be a helpful tool**, not a restrictive harness: quiet output, thoughtful defaults, and escape hatches everywhere.
-- Reduces the token usage of your agent by 2x while maintaining superior performance
-
-```bash
-# index once
-mgrep watch
-
-# then ask your repo things in natural language
-mgrep "where do we set up auth?"
-```
-
 ## Quick Start
 1. **Install**
@@ -69,51 +52,30 @@ mgrep "where do we set up auth?"
 ```
 Searches default to the current working directory unless you pass a path.
-**Today, `mgrep` works great on:** code, text, PDFs, images.
+**Today, `mgrep` works great on:** code, text, PDFs, images.
 **Coming soon:** audio & video.
-## Using it with Coding Agents
-
-`mgrep` supports assisted installation commands for many agents:
-- `mgrep install-claude-code` for Claude Code
-- `mgrep install-opencode` for OpenCode
-- `mgrep install-codex` for Codex
-- `mgrep install-droid` for Factory Droid
-
-These commands sign you in (if needed) and add Mixedbread `mgrep` support to the
-agent. After that you only have to start the agent in your project folder, thats
-it.
-
-### More Agents Coming Soon
-
-More agents (Cursor, Windsurf, etc.) are on the way—this section will grow as soon as each integration lands.
-
-## Making your agent smarter
-
-We plugged `mgrep` into Claude Code and ran a benchmark of 50 QA tasks to evaluate the economics of `mgrep` against `grep`.
-
-
-
-In our 50-task benchmark, `mgrep`+Claude Code used ~2x fewer tokens than grep-based workflows at similar or better judged quality.
-
-`mgrep` finds the relevant snippets in a few semantic queries first, and the model spends its capacity on reasoning instead of scanning through irrelevant code from endless `grep` attempts. You can [Try it yourself](http://demo.mgrep.mixedbread.com).
-
-*Note: Win Rate (%) was calculated by using an LLM as a judge.*
+## Why mgrep?
-## Why we built mgrep
+- Natural-language search that feels as immediate as `grep`.
+- Semantic, multilingual & multimodal (audio, video support coming soon!)
+- Web search built-in — query the web alongside your local files with `--web`.
+- Smooth background indexing via `mgrep watch`, designed to detect and keep up-to-date everything that matters inside any git repository.
+- Friendly device-login flow and first-class coding agent integrations.
+- Built for agents and humans alike, and **designed to be a helpful tool**, not a restrictive harness: quiet output, thoughtful defaults, and escape hatches everywhere.
+- Reduces the token usage of your agent by 2x while maintaining superior performance
 `grep` is an amazing tool. It's lightweight, compatible with just about every machine on the planet, and will reliably surface any potential match within any target folder. But grep is **from 1973**, and it carries the limitations of its era: you need exact patterns and it slows down considerably in the cases where you need it most, on large codebases.
-Worst of all, if you're looking for deeply-buried critical business logic, you cannot describe it: you have to be able to accurately guess what kind of naming patterns would have been used by the previous generations of engineers at your workplace for `grep` to find it. This will often result in watching a coding agent desperately try hundreds of patterns, filling its token window, and your upcoming invoice, with thousands of tokens.
-
-But it doesn't have to be this way. Everything else in our toolkit is increasingly tailored to understand us, and so should our search tools. `mgrep` is our way to bring `grep` to 2025, integrating all of the advances in semantic understanding and code-search, without sacrificing anything that has made `grep` such a useful tool.
+Worst of all, if you're looking for deeply-buried critical business logic, you cannot describe it: you have to be able to accurately guess what kind of naming patterns would have been used by the previous generations of engineers at your workplace for `grep` to find it. This will often result in watching a coding agent desperately try hundreds of patterns, filling its token window, and your upcoming invoice, with thousands of tokens.
-Under the hood, `mgrep` is powered by [Mixedbread Search](https://www.mixedbread.com/blog/mixedbread-search), our full-featured search solution. It combines state-of-the-art semantic retrieval models with context-aware parsing and optimized inference methods to provide you with a natural language companion to `grep`. We believe both tools belong in your toolkit: use `grep` for exact matches, `mgrep` for semantic understanding and intent.
+But it doesn't have to be this way. Everything else in our toolkit is increasingly tailored to understand us, and so should our search tools. `mgrep` is our way to bring `grep` to 2025, integrating all of the advances in semantic understanding and code-search, without sacrificing anything that has made `grep` such a useful tool.
+Under the hood, `mgrep` is powered by [Mixedbread Search](https://www.mixedbread.com/blog/mixedbread-search), our full-featured search solution. It combines state-of-the-art semantic retrieval models with context-aware parsing and optimized inference methods to provide you with a natural language companion to `grep`.
-## When to use what
+### When to use what
 We designed `mgrep` to complement `grep`, not replace it. The best code search combines `mgrep` with `grep`.
@@ -122,23 +84,35 @@ We designed `mgrep` to complement `grep`, not replace it. The best code search c
 | **Exact Matches** | **Intent Search** |
 | Symbol tracing, Refactoring, Regex | Code exploration, Feature discovery, Onboarding |
-## Web Search
+## Benchmarks
-`mgrep` can also search the web alongside your local files. This is useful when
-you need to find documentation, tutorials, or answers to programming questions
-without leaving your terminal.
+We plugged `mgrep` into Claude Code and ran a benchmark of 50 QA tasks to evaluate the economics of `mgrep` against `grep`.
-```bash
-# Search the web and get a summarized answer
-mgrep --web --answer "How do I integrate a JavaScript runtime into Deno?"
+
-# Get the urls of the search
-mgrep --web "best practices for error handling in TypeScript"
-```
+In our 50-task benchmark, `mgrep`+Claude Code used ~2x fewer tokens than grep-based workflows at similar or better judged quality.
-Web search queries the `mixedbread/web` store in addition to your local store, merging results based on relevance. Use `--answer` (or `-a`) to get a concise summary instead of raw results.
+`mgrep` finds the relevant snippets in a few semantic queries first, and the model spends its capacity on reasoning instead of scanning through irrelevant code from endless `grep` attempts. You can [Try it yourself](http://demo.mgrep.mixedbread.com).
+
+*Note: Win Rate (%) was calculated by using an LLM as a judge.*
+
+## Using it with Coding Agents
+
+`mgrep` supports assisted installation commands for many agents:
+- `mgrep install-claude-code` for Claude Code
+- `mgrep install-opencode` for OpenCode
+- `mgrep install-codex` for Codex
+- `mgrep install-droid` for Factory Droid
+
+These commands sign you in (if needed) and add Mixedbread `mgrep` support to the
+agent. After that you only have to start the agent in your project folder, thats
+it.
+
+### More Agents Coming Soon
+
+More agents (Cursor, Windsurf, etc.) are on the way—this section will grow as soon as each integration lands.
-## Commands at a Glance
+## Command Reference
 | Command | Purpose |
 | --- | --- |
@@ -168,7 +142,7 @@ directory for a pattern.
 | `--max-file-count