Implement branch-aware hybrid transcrypt secrets encryption and document upgrade steps #1742
…ent upgrade steps
# Conflicts: # eval/README.md
a5aebd6 to
009f30c
Would dropping this reduce complexity? I think if it's just the two of us working on the evals, we should be able to deal with the transition, since it's just a one-off?
Yes, it would reduce it a little. I just wanted to make it easy to keep working on "old" branches that are still in progress when this lands. Having to merge into an older branch from main once this lands will be a little annoying, since you have to (in the middle of a merge) do the dance to re-decrypt.
Summary
This PR introduces branch-aware hybrid encryption and legacy-fallback decryption to the local secrets management filter (transcrypt), enabling a smooth transition from legacy MD5 key derivation to the modern PBKDF2 standard without breaking cross-branch compatibility. It also documents the necessary manual steps for developers to upgrade their local filters.
Changes
transcrypt Filter (eval/bin/transcrypt):
- Updated the clean filter (git_clean) to inspect the current branch's transcrypt script. If the active branch supports PBKDF2, it encrypts secrets using the more secure -pbkdf2 -iter 100000 settings; otherwise, it falls back to legacy -md MD5.
- Updated the smudge (git_smudge) and textconv (git_textconv) filters to attempt decryption using PBKDF2 first. If that fails, they fall back to legacy MD5, and finally to the raw ciphertext. This ensures that checkouts containing a mix of old (MD5) and new (PBKDF2) encrypted files do not fail.
- … /dev/null and using temporary files to safely check decryption success before piping to stdout.
Documentation (eval/README.md):
- Documents the upgrade command (bin/transcrypt --upgrade) and the checkout command required to force Git to re-decrypt files using the upgraded filters.
Impact & Risks
- Developers need to run bin/transcrypt --upgrade and re-smudge their files to benefit from the upgraded security settings and avoid OpenSSL deprecation warnings.
Testing
- Running bin/transcrypt --upgrade followed by git checkout HEAD -- $(git ls-crypt) successfully upgrades the local filters and re-decrypts the datasets.
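For reviewers who want a concrete picture of the decryption fallback described under Changes, here is a minimal sketch. It is not the code in eval/bin/transcrypt: the function name decrypt_with_fallback, the default cipher (aes-256-cbc), and passing the password through the ENC_PASS environment variable are all assumptions made for illustration. Only the -pbkdf2 -iter 100000 and -md MD5 settings, the stderr redirect, and the temporary-file check come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: try PBKDF2 first, then legacy MD5, then pass the input
# through unchanged. Names and defaults are illustrative, not the PR's code.

decrypt_with_fallback() {
  local password=$1
  local cipher=${2:-aes-256-cbc}   # assumed default cipher
  local input output

  input=$(mktemp) || return 1
  output=$(mktemp) || { rm -f "$input"; return 1; }

  # Buffer stdin so every attempt rereads the same ciphertext.
  cat >"$input"

  # Attempt 1: PBKDF2 key derivation. Output goes to a temp file so that a
  # failed attempt never leaks partial bytes to stdout.
  if ENC_PASS=$password openssl enc -d "-$cipher" -pbkdf2 -iter 100000 \
       -pass env:ENC_PASS -a -in "$input" -out "$output" 2>/dev/null; then
    cat "$output"
  # Attempt 2: legacy MD5 key derivation; stderr is silenced because newer
  # OpenSSL builds warn about the deprecated derivation.
  elif ENC_PASS=$password openssl enc -d "-$cipher" -md MD5 \
       -pass env:ENC_PASS -a -in "$input" -out "$output" 2>/dev/null; then
    cat "$output"
  # Attempt 3: neither worked, so emit the input exactly as received.
  else
    cat "$input"
  fi

  rm -f "$input" "$output"
}

# Example call (file name is hypothetical):
#   decrypt_with_fallback "$PASSWORD" < secrets.enc
```

One caveat with any exit-status check of this kind: with a CBC cipher, a wrong key can occasionally produce valid-looking padding, so OpenSSL may report success on garbage. A stricter filter would need an additional integrity check, which this sketch leaves out.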