feat: add OS keyring support for session tokens #307
Switch the CLI manager to provide session tokens through the CODER_SESSION_TOKEN environment variable when running `coder login` instead of appending `--token <value>` to the process arguments. We’re making this change for security reasons. Command-line arguments are more likely to be exposed through process listings, shell history, CI/job logs, and command auditing, while environment variables have a smaller exposure surface on typical systems. This aligns the plugin with the Coder CLI guidance that prefers CODER_SESSION_TOKEN over `--token`.
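As a rough illustration of the change described above, the sketch below builds a `coder login` invocation with the token in the child's environment rather than in its arguments. The function name `buildLoginProcess` and its parameters are hypothetical and are not taken from the plugin's code.

```kotlin
// Sketch only: `buildLoginProcess` and its parameters are hypothetical names,
// not the plugin's actual API.
fun buildLoginProcess(binaryPath: String, deploymentUrl: String, token: String): ProcessBuilder {
    // The command line carries no secret, so process listings never show the token.
    val builder = ProcessBuilder(binaryPath, "login", deploymentUrl)
    // The token travels in the child's environment instead of its arguments.
    builder.environment()["CODER_SESSION_TOKEN"] = token
    return builder
}
```

Calling `buildLoginProcess(...).start()` would then launch the CLI. The builder is returned unstarted here so the command and environment can be inspected separately.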
Replace zt-exec with a small internal ProcessBuilder-based runner so subprocess failures stay under our control. We needed to remove ProcessExecutor because its exception messages include the spawned process environment. Now that CLI auth is passed through CODER_SESSION_TOKEN, failures could spill tokens into logs before our own sanitization had a chance to run. That made reliable redaction impossible at the logging layer. The new runner uses Java 21 ProcessBuilder from Kotlin, captures stdout and stderr, supports expected exit codes and stderr discard-on-success behavior, and throws project-owned process exceptions with centralized secret sanitization. This keeps process execution behavior explicit while avoiding a dependency that could format sensitive environment values into exceptions.
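A minimal sketch of what such a runner could look like, under stated assumptions: `ProcessResult`, `ProcessFailedException`, `redacted`, and the `[redacted]` placeholder are hypothetical names chosen for illustration, and the stderr discard-on-success behavior is left out for brevity. Only the `CODER_SESSION_TOKEN=` pattern and the `runProcess(command, environment = ...)` call shape appear in the review snippets.

```kotlin
import java.util.concurrent.CompletableFuture

// Hypothetical result type; the real runner's API is not shown in this PR text.
data class ProcessResult(val exitCode: Int, val stdout: String, val stderr: String)

// Masks the value of CODER_SESSION_TOKEN=... wherever it appears in a string.
val sessionTokenPattern = Regex("""(CODER_SESSION_TOKEN=)([^,\s}]+)""")

fun String.redacted(): String =
    sessionTokenPattern.replace(this) { "${it.groupValues[1]}[redacted]" }

// Project-owned exception: the message is sanitized before anything can log it.
class ProcessFailedException(command: List<String>, exitCode: Int, stderr: String) :
    RuntimeException("`${command.joinToString(" ")}` exited with $exitCode: $stderr".redacted())

fun runProcess(
    command: List<String>,
    environment: Map<String, String> = emptyMap(),
    expectedExitCodes: Set<Int> = setOf(0),
): ProcessResult {
    val builder = ProcessBuilder(command)
    builder.environment().putAll(environment)
    val process = builder.start()
    // Drain stderr on another thread so a full pipe cannot stall the child.
    val stderr = CompletableFuture.supplyAsync { process.errorStream.bufferedReader().readText() }
    val stdout = process.inputStream.bufferedReader().readText()
    val exitCode = process.waitFor() // unbounded wait
    if (exitCode !in expectedExitCodes) {
        throw ProcessFailedException(command, exitCode, stderr.get())
    }
    return ProcessResult(exitCode, stdout, stderr.get())
}
```

The design point is that the environment map is never formatted into the exception, and whatever text does reach the message passes through one sanitizer first.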
Pass --use-token-as-session when logging the CLI in so the CLI persists the same token the plugin already uses for REST API calls. Without this flag, a supplied token is only used to bootstrap login and the CLI mints and stores a different session token of its own. That leaves the plugin and CLI using different credentials, which makes auth state harder to reason about and prevents Toolbox from managing a single shared token consistently. Using --use-token-as-session keeps both sides on the same credential, improves consistency between REST and CLI behavior, and opens the door to revoking that shared token cleanly from Toolbox later on. Tests were updated to reflect the new login command shape.
Add a new useKeyring setting so Toolbox can let the Coder CLI persist the shared session token in the OS keyring when that storage mode is actually supported. When enabled, Toolbox stops forcing --global-config for authenticated CLI commands only if the CLI supports keyring-backed auth and the platform is macOS or Windows. That keeps REST and CLI auth on the same persisted token without changing Linux behavior, where the plugin continues to use its deployment-specific config directory.
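The gating described above could be sketched roughly as follows. The `Features` type, the `osName` string check, and `configArgs` are simplified stand-ins invented for this example; the plugin's own `shouldUseKeyringAuth` reads the setting and OS from its context instead of taking them as parameters.

```kotlin
// Illustrative stand-in for the plugin's feature-detection type.
data class Features(val keyringAuth: Boolean)

// Keyring storage is only considered on macOS and Windows.
// `osName` is assumed to look like the JVM's "os.name" system property.
fun supportsKeyringStorage(osName: String): Boolean {
    val name = osName.lowercase()
    return name.contains("mac") || name.contains("windows")
}

fun shouldUseKeyringAuth(useKeyring: Boolean, feats: Features, osName: String): Boolean =
    useKeyring && feats.keyringAuth && supportsKeyringStorage(osName)

// Returns the config-related arguments for an authenticated CLI command:
// none when the keyring is in use, otherwise the deployment-specific config directory.
fun configArgs(useKeyring: Boolean, feats: Features, osName: String, configDir: String): List<String> =
    if (shouldUseKeyringAuth(useKeyring, feats, osName)) emptyList()
    else listOf("--global-config", configDir)
```

All three conditions must hold before `--global-config` is dropped, so Linux and older CLI versions keep the existing file-based behavior.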
}

private fun shouldUseKeyringAuth(feats: Features): Boolean =
    context.settingsStore.useKeyring && feats.keyringAuth && supportsKeyringStorage(currentOs)
Is there any visibility for users that enable the feature/setting, but it isn't enabled due to this logic? I am mainly thinking about how we avoid silently falling back to file storage.
On top of your recommendations, I added another commit that hides the checkbox field on unsupported operating systems (Linux).
EhabY left a comment
Overall the shape matches what we did in VS Code, nice!
maybeWarnAboutKeyringFallback(feats)
val args = mutableListOf(
    "login",
    "--use-token-as-session",
Should we only pass this when keyring is enabled and supported to maintain the old behavior?
From what I've discussed with the team, I think the old behavior is somewhat wrong. The user generates an API token via /cli-auth, which the REST API client then uses for polling workspaces and other things, and which is also used for the CLI login. In the old approach, however, the CLI used the passed API token to mint a new token, which it then used for workspace start and SSH. So there is a mismatch between the token passed by the user and the one the CLI is actually using. It would also be harder for us to implement the logout feature (which today doesn't do anything serious in Coder Toolbox). And of course the other argument is that there is a requirement (I can find the ticket) to have the plugin aligned as much as possible with the VS Code extension.
fun login(token: String): String {
    context.logger.info("Storing CLI credentials in $coderConfigPath")
    return exec(
fun login(token: String, feats: Features = features): String {
This adds the write side of keyring but not the read side. VS Code has a second feature flag keyringTokenRead for CLI 2.31.0+ that calls coder login token --url <url> so the plugin can pick up tokens a user wrote with coder login in their terminal. Worth adding here, or filing a follow-up?
I would raise a new ticket for this part honestly. Sounds like a new feature, probably some code will have to be reworked. I will check with @matifali if this feature makes sense for Coder Toolbox, but good point - thank you for bringing that up.
@EhabY I talked with Atif and raised https://linear.app/codercom/issue/DEVEX-403/add-terminal-login-token-pickup-to-coder-toolbox
    context.logger.info("Storing CLI credentials in $coderConfigPath")
    return exec(
fun login(token: String, feats: Features = features): String {
    maybeWarnAboutKeyringFallback(feats)
maybeWarnAboutKeyringFallback runs on every login and every startWorkspace. If keyring is on but the platform or CLI can't support it, the user gets the snackbar on every workspace start. Could we warn once per session, or only at login and on settings save? (same for L295)
Good point. Warning on settings change makes more sense. Done
    else -> null
} ?: return

runBlocking {
runBlocking { context.logAndShowWarning(...) } inside login makes me nervous. If login is ever called from a coroutine on the same dispatcher, this can deadlock. Could login be suspend, or could we fire-and-forget on a known scope instead? At the very least we need a timeout here 🤔
Removed this, but just a couple of notes:
- nobody should have called login twice in parallel.
- login runs in coroutines that use the Default dispatcher, which is backed by a thread pool. So technically, if one login call blocks a thread, the second login call should be scheduled on a different thread. Regardless, this is a somewhat dangerous case, and you can easily end up in deadlocks.
- while revisiting the code, it made more sense to run the process execution on the IO dispatcher, which is better suited for this kind of task. I'll revisit this problem in a separate PR.
/**
 * Runs a process and waits for it to finish.
 *
 * The wait is intentionally unbounded. Only exit code 0 is accepted by default.
For reference, the VS Code side has a 60s timeout
🤔 even when you start workspaces?
private val sensitivePatterns = listOf(
    Regex("""(CODER_SESSION_TOKEN=)([^,\s}]+)"""),
    Regex("""(Coder-Session-Token:\s*)([^\s,]+)""", RegexOption.IGNORE_CASE),
    Regex("""(--token\s+)(\S+)"""),
This pattern only catches --token <value>, not --token=value. We don't pass it that way anywhere today, but matching both is one extra (?:\s+|=) and costs nothing.
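The suggested widening can be sketched like this. The names `tokenFlagPattern` and `redactTokenFlag` and the `[redacted]` placeholder are made up for the example; only the `(?:\s+|=)` alternation comes from the comment above.

```kotlin
// Matches both `--token value` and `--token=value`.
// Group 1 keeps the flag and its separator; group 2 is the secret.
val tokenFlagPattern = Regex("""(--token(?:\s+|=))(\S+)""")

// Replaces only the secret, leaving the flag and separator intact.
fun redactTokenFlag(line: String): String =
    tokenFlagPattern.replace(line) { "${it.groupValues[1]}[redacted]" }
```

Because the separator is required, flags that merely start with `--token` (for example a hypothetical `--tokenizer`) are left alone.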
val stdout = runProcess(command, environment = processEnv).stdout
val sanitizedStdout = stdout.sanitizeSecrets()
context.logger.info("`$localBinaryPath ${listOf(*args).joinToString(" ")}`: $sanitizedStdout")
args are logged unredacted alongside sanitizedStdout. The token isn't in args anymore so this is fine today, but running .sanitizeSecrets() on the joined args line as well would make it safe by construction if anything else ever lands there
only when the settings are changed, instead of every login/ws startup.
Delegate session token persistence to the Coder CLI's OS keyring backend (macOS Keychain, Windows Credential Manager) instead of always writing them in plaintext under the plugin's data directory. This brings the Toolbox plugin in line with the VS Code extension and reduces on-disk exposure of long-lived credentials.
This change is gated on two prerequisite refactors of how we currently hand the token to the CLI, both of which land in this commit:
- Switch the CLI manager to provide session tokens through the CODER_SESSION_TOKEN environment variable when running `coder login` instead of appending `--token <value>` to the process arguments. We’re making this change for security reasons. Command-line arguments are more likely to be exposed through process listings, shell history, CI/job logs, and command auditing, while environment variables have a smaller exposure surface on typical systems. This aligns the plugin with the Coder CLI guidance that prefers CODER_SESSION_TOKEN over `--token`.
- By default, `coder login` treats a supplied token as a bootstrap credential: it authenticates with it once and then mints a brand-new API key via CreateAPIKey, persisting that new key as the session. The token the user (or our OAuth2 flow) handed us is discarded. For Toolbox this is wrong: we want the CLI and the plugin to share the exact same session token so that revocation, logout, and token lifetime stay consistent across both sides.

An additional thing we had to change was to replace zt-exec with a small internal ProcessBuilder-based runner so subprocess failures stay under our control. We needed to remove ProcessExecutor because its exception messages include the spawned process environment. Now that CLI auth is passed through CODER_SESSION_TOKEN, failures could spill tokens into logs before our own sanitization had a chance to run. That made reliable redaction impossible at the logging layer.