Summary
There is a direct contradiction between the Terms of Service and the Privacy Policy (both last updated April 13, 2026) regarding whether free tier user data is used for model training. This creates ambiguity that users — especially developers indexing proprietary codebases — cannot resolve without clarification from the team.
The Contradiction
Terms & Conditions §3.1.1 states:
"Under the Free Tier, the Company collects all Usage Data, including but not limited to queries and any other input provided by the User. This data is stored in its native form and may be used for a variety of purposes, including, without limitation, model training, analytics, service improvement, and research activities."
Privacy Policy §2.3 (Free Tier) states:
"Such data is used for service improvement, analytics, and research. We do not use any data from Free Tier users for model training."
One of these statements is inaccurate. Users have no way to determine which reflects actual practice.
Why This Matters
mgrep watch uploads the full contents of a user's codebase to Mixedbread's cloud storage. This can include:
- Proprietary source code
- Internal business logic
- Configuration files with sensitive structure
Under the free tier, this data is retained indefinitely with no guaranteed deletion on request (ToS §3.1.2, Privacy Policy §5). If that data is also used for model training, users indexing private or confidential repositories are unknowingly consenting to training a third-party commercial model on their intellectual property.
This is a meaningful risk that is currently undisclosed at the point of installation or first use (mgrep login, mgrep watch). The README's caution block mentions background sync but makes no mention of data retention or potential training use.
Requests
- Resolve the contradiction: update either the ToS or the Privacy Policy so they agree on whether free tier data is used for model training.
- Surface the policy at onboarding: consider printing a one-line notice during mgrep login or mgrep watch that links to the relevant policy section, so users are informed before any data is uploaded.
- Clarify deletion: either provide a reliable deletion mechanism or clearly state that none exists, so users can make an informed choice before indexing sensitive repos.
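To make the onboarding request concrete, here is a minimal sketch of what such a notice could look like. Everything in it is an assumption for illustration: `buildUploadNotice`, `printUploadNotice`, and the example.com URL are hypothetical and are not part of mgrep's actual code or Mixedbread's actual policy links.

```typescript
// Hypothetical sketch; these names do not come from the mgrep codebase.
type Command = "login" | "watch";

// Builds a single-line notice that links to the relevant policy section.
// policyUrl is a placeholder supplied by the caller.
function buildUploadNotice(command: Command, policyUrl: string): string {
  return `[mgrep ${command}] Indexed files are uploaded to Mixedbread's cloud storage. Data retention and usage terms: ${policyUrl}`;
}

// Writes the notice to stderr so it does not mix into piped command output.
function printUploadNotice(command: Command, policyUrl: string): void {
  console.error(buildUploadNotice(command, policyUrl));
}

// Example: would run before the first upload in a watch session.
printUploadNotice("watch", "https://example.com/privacy#free-tier");
```

Keeping the message construction separate from the printing makes the wording easy to test and to change once the ToS and Privacy Policy agree.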
Additional Note
ToS §3.2.3 states that if a user upgrades from free to paid, previously collected free-tier data remains governed by free-tier rules. This means users who upgrade after indexing do not get the stronger paid-tier protections retroactively. This should be called out explicitly in the upgrade flow.
I'm genuinely interested in this tool and want to continue using it. Resolving this ambiguity would make it much easier to recommend mgrep to teams working with sensitive codebases. Thanks for considering this.