Skip to content

checkpoint: into wallentx/termux-target from release/0.129.0 @ b187e52ce92d#93

Merged
wallentx merged 33 commits intowallentx/termux-targetfrom
checkpoint/wallentx_termux-target_from_release_0.129.0_b187e52ce92d
May 6, 2026
Merged

checkpoint: into wallentx/termux-target from release/0.129.0 @ b187e52ce92d#93
wallentx merged 33 commits intowallentx/termux-targetfrom
checkpoint/wallentx_termux-target_from_release_0.129.0_b187e52ce92d

Conversation

@unemployabot
Copy link
Copy Markdown

@unemployabot unemployabot Bot commented May 6, 2026

Termux release checkpoint

  • Source branch: release/0.129.0
  • Source hash: b187e52ce92d6fbfd5eb8c986c594f6ed715c2d4
  • Destination branch: wallentx/termux-target
  • Remaining first-parent commits on source: 0

This PR carries release-train conflict fixes and follow-up changes back into the reusable Termux patch branch.

Release-only workflow files and metadata under .github were restored to the destination branch versions before opening this PR.

fcoury-oai and others added 30 commits May 5, 2026 13:32
## Why

The resume/fork picker is becoming the main way users recover previous
work, but the old fixed table made sessions hard to scan once thread
names, branches, working directories, and timestamps all mattered. This
redesign makes the picker denser by default, easier to search, and safer
to inspect before resuming or forking.

<table>
<tr>
<td>
<img width="1660" height="1103" alt="CleanShot 2026-05-03 at 12 34 10"
src="https://github.com/user-attachments/assets/313ede1d-1da4-4863-acd2-56b3e27e9703"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 34 15"
src="https://github.com/user-attachments/assets/cfde7d5c-bab0-4994-a807-254e53f344ea"
/>
</td>
</tr>
<tr>
<td>
<img width="1664" height="1107" alt="CleanShot 2026-05-03 at 12 39 22"
src="https://github.com/user-attachments/assets/e1ee58ca-4dc5-4a35-ae0f-47562da3974c"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 35 09"
src="https://github.com/user-attachments/assets/9c888072-eedf-4f45-985c-0c14df28bcc7"
/>
</td>
</tr>
</table>

## What Changed

- Replaces the old session table with responsive session rows that
prioritize the session name or preview, then show timestamp, cwd, and
branch metadata.
- Makes dense view the default while keeping comfortable view available
through `Ctrl+O`.
- Persists the picker view preference in `[tui].session_picker_view`,
including active profile-scoped config.
- Adds sort/filter controls for updated time, created time, cwd, and all
sessions.
- Expands search matching across session name, preview, thread id,
branch, and cwd.
- Makes `Esc` safer in search mode: it clears an active query before
starting a new session.
- Adds lazy transcript inspection:
  - `Space` expands recent transcript context inline.
  - `Ctrl+T` opens a transcript overlay.
  - raw reasoning visibility follows `show_raw_agent_reasoning`.
- Keeps remote cwd filtering server-side for remote app-server sessions
so local path normalization does not incorrectly hide remote results.
- Updates snapshots and config schema for the new picker states and
config option.

## How to Test

1. Start Codex in a repo with several saved sessions.
2. Press `Ctrl+R` / resume picker entry point.
3. Confirm the picker opens in dense mode and shows session name or
preview, timestamp, cwd, and branch metadata.
4. Press `Ctrl+O` and confirm it switches between dense and comfortable
views.
5. Restart Codex and confirm the selected view persists.
6. Type a query that matches a branch, cwd, thread id, or session name;
confirm matching sessions appear.
7. Press `Esc` while the query is non-empty and confirm it clears search
instead of starting a new session.
8. Select a session and press `Space`; confirm recent transcript context
expands inline.
9. Press `Ctrl+T`; confirm the transcript overlay opens and respects
raw-reasoning visibility settings.

Targeted tests:
- `cargo test -p codex-tui resume_picker --no-fail-fast`
- `cargo test -p codex-core
runtime_config_resolves_session_picker_view_default_and_override`
- `cargo test -p codex-core profile_tui_rejects_unsupported_settings`
- `cargo check -p codex-thread-manager-sample`
- `cargo insta pending-snapshots`
## Summary

- Propagate Linux bubblewrap argument-construction failures instead of
panicking in the helper
- Keep mutable-symlink carveouts fail-closed while reporting them as
ordinary sandbox build failures
- Add regression coverage for a protected `.codex` symlink inside a
writable workspace root

## Root cause

Linux bubblewrap intentionally rejects read-only carveouts that cross a
symlink the sandboxed process can still rewrite. That is the correct
security behavior for protected metadata paths such as `.codex`.

The bug was one layer higher: `linux_run_main` treated the expected
build failure as impossible and panicked while constructing the
bubblewrap argv. For issue openai#20716, that turned a normal fail-closed
sandbox outcome into a noisy panic in the transcript.

## User impact

Users with a project-local `.codex` symlink inside a writable workspace
still get the conservative sandbox decision, but they no longer see a
Rust panic for that condition. The helper now exits with the concise
sandbox-build error so the normal denial / escalation path can handle
it.


Fixes openai#20716
…r user for shared codex use case (openai#21234)

## Summary
- make the Linux sandbox synthetic mount registry path unique per
effective UID
- keep same-user coordination intact while avoiding collisions between
users sharing `/tmp`
- add a regression test for the registry path contract

## Why
Issue openai#21192 reports that the Linux sandbox currently uses one global
temp path at `/tmp/codex-bwrap-synthetic-mount-targets`. If another user
creates that directory first, later users can fail to open the shared
lock file with `Permission denied`.

## Validation
- `just fmt`
- `cargo test -p codex-linux-sandbox`
- `cargo clippy -p codex-linux-sandbox --all-targets`

Fixes openai#21192
## Why

Tool registration used to bind a tool name to a handler externally,
which left ownership split between the registry plan and the handler
implementation. Some built-in handlers also multiplexed multiple in-core
tools by switching on the invoked tool name internally.

This moves the registry identity onto the handler itself and makes
built-in multi-tool areas use separate concrete handlers, so each
registered handler instance owns exactly one tool name and one dispatch
path.

## What Changed

- Added `ToolHandler::tool_name()` and changed
`ToolRegistryBuilder::register_handler` to derive the registry key from
the handler.
- Split built-in multiplexed handlers into concrete per-tool handlers
for unified exec, shell/local shell/container exec, MCP resources, goal
tools, and agent job tools.
- Kept name-carrying handler instances only where the runtime target is
inherently external or dynamic, such as MCP tools, dynamic tools, and
unavailable placeholders.
- Updated `ToolHandlerKind` and registry-plan construction so plan
entries map directly to concrete handler registrations.

## Verification

- `cargo test -p codex-tools tool_registry_plan`
- `cargo test -p codex-core --lib tools::registry_tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`
## Summary

Xcode 26.4 was built against app-server behavior from before MCP
elicitation requests became client-visible in CLI 0.120.0 via openai#17043.
That client line does not expect the new events/messages, so this PR
restores the old behavior for exactly that client/version combination.

The compatibility handling stays in the app-server layer: when the
initialized client is `Xcode` and its version starts with `26.4`, the
app server marks the live Codex thread so MCP elicitations are
auto-denied. The flag is applied on thread start/resume/fork/turn
attachment, carried through `Codex`/`CodexThread`, and stored on
`McpConnectionManager` so refreshed MCP managers preserve the behavior.

## Notes

This is intentionally narrow and includes a TODO to remove the
compatibility path once Xcode 26.4 ages out.
## Summary

Adds the required `items_view` field to the three session picker `Turn`
test fixtures that populate full turn item lists.

## Root Cause

`openai#21063` added `Turn.items_view` to the app-server protocol type. The
later session picker merge added three test-only
`codex_app_server_protocol::Turn` literals without the new field, which
broke Bazel compilation on `main` with `E0063: missing field
items_view`.

## Validation

- `just fmt`
- `cargo test -p codex-tui resume_picker --no-fail-fast`
- `just argument-comment-lint`

I also ran `cargo test -p codex-tui`; it compiled and ran the suite, but
this local machine failed two pre-existing status permission-profile
tests because `/etc/codex/requirements.toml` disallows
`DangerFullAccess`.
## Summary

This is the first PR in the V8 in-process sandboxing rollout.

It adds the build-system and Rust feature plumbing needed to support
sandboxed V8 builds, then enables sandboxing by default for the
source-built Bazel V8 path that we control directly. It deliberately
keeps the published `rusty_v8` artifact workflows on their current
non-sandboxed contract so this PR can land and ship independently before
we change any released artifacts.

## Rollout plan

- [x] **PR 1: land sandbox plumbing and default source-built Bazel V8 to
sandboxed mode**

- [ ] **PR 2: publish sandbox-enabled release artifacts and add
compatibility validation**
- Produce sandboxed artifact pairs for every released Cargo target that
does not already use the source-built Bazel path.
- Add CI coverage that consumes those sandboxed artifacts and verifies:
    - `codex-v8-poc` reports sandbox enabled
    - `codex-code-mode` builds/tests against the sandboxed path

- [ ] **PR 3: switch release consumers to sandboxed artifacts by
default**
  - Update released artifact selectors/checksums.
- Enable the Rust `v8_enable_sandbox` feature in the default release
path.
- Make the sandboxed artifact family the normal path for published
builds.

- [ ] **PR 4: remove rollout-only compatibility paths**
- Remove the temporary non-sandbox release compatibility config once the
new default has shipped and baked.
  - Keep the invariant tests permanently.
## Why

We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.

This will let us inject the agent graph store as a real dependency and
support implementations other than the local SQLite-backed one. Right
now most code instantiates a state DB and an agent graph store
just-in-time. Ideally, we would not depend on the state DB directly but
only read through the higher-level interfaces.

This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.

## What changed

- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
  - `LocalThreadStore`
  - `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.

## Verification

- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`
## Summary
This PR adds the first `codex-rs` milestone for remote-exec e2e: a local
`codex exec-server` can now register itself with
`codex-cloud-environments` and attach to the returned rendezvous
websocket.

At a high level, `codex exec-server --cloud ...` now:
- loads ChatGPT auth from normal Codex config
- registers an executor with `codex-cloud-environments`
- receives a signed rendezvous websocket URL
- serves the existing exec-server JSON-RPC protocol over that websocket

## What Changed
- Added `--cloud`, `--cloud-base-url`, `--cloud-environment-id`, and
`--cloud-name` to `codex exec-server`
- Added a new `exec-server/src/cloud.rs` module that handles:
  - registration requests
  - auth/header setup
  - bounded auth retry on `401/403`
  - reconnect/backoff after websocket disconnects
- Reused the existing `ConnectionProcessor` / `ExecServerHandler` path
so cloud mode serves the same exec/filesystem RPC surface as local
websocket mode
- Added cloud-specific error variants and minimal docs for the new mode

## Testing
Manual e2e test that fully goes through exec server flow with our codex
cloud agent as orchestrator
- fork loaded parent threads from `ThreadStore` history in core agent
control paths
- migrate guardian review fork history to loaded session history instead
of rereading rollout files

## Verification
- `cargo test -p codex-core spawn_agent_fork`
I believe a merge race in openai#20689
broke the build, so this is a quick fix.

`cargo check --tests` passed locally.
…enai#21251)

## Why

`codex-rs/app-server-protocol/src/protocol/v2.rs` had grown into a
single ~12k-line definition file for the entire app-server v2 API.

This is purely a mechanical refactor to break up the monolithic `v2.rs`
file that contains all app-server API v2 types into more modular files,
grouped by resource (e.g. account, thread, turn, etc.).

`just write-app-server-schema` shows no real changes, so we can be sure
that this is purely an internal organizational change.

## What changed

- Replaced the monolithic `protocol/v2.rs` with a `protocol/v2/` module
tree and a small `mod.rs` that only declares and reexports modules.
- Grouped v2 API definitions by conceptual owner, including `account`,
`apps`, `collaboration_mode`, `command_exec`, `config`, `device_key`,
`experimental_feature`, `feedback`, `fs`, `hook`, `item`, `mcp`,
`model`, `notification`, `permissions`, `plugin`, `process`, `realtime`,
`review`, `thread`, `thread_data`, `turn`, and `windows_sandbox`.
- Moved v2 tests into `protocol/v2/tests.rs` so `mod.rs` stays small.
- Kept shared protocol helpers in `protocol/v2/shared.rs`, including the
enum mirroring macro and common cross-resource types.
- Co-located resource-specific notifications and server-request payloads
with the modules that own those resources.
- Regenerated app-server protocol schema fixtures. The schema diffs are
non-semantic newline-only changes after the refactor.

## Verification

- `cargo check -p codex-app-server-protocol`
- `cargo test -p codex-app-server-protocol`
- `just write-app-server-schema`
Swap to tag based releasing and allow tags of type `rusty-v8-v*.*.*`
**Summary**
- Add `codex-bwrap`, a standalone `bwrap` binary built from the existing
vendored bubblewrap sources.
- Remove the linked vendored bwrap path from `codex-linux-sandbox`;
runtime now prefers system `bwrap` and falls back to bundled
`codex-resources/bwrap`.
- Add bundled SHA-256 verification with missing/all-zero digest as the
dev-mode skip value, then exec the verified file through
`/proc/self/fd`.
- Keep `launcher.rs` focused on choosing and dispatching the preferred
launcher. Bundled lookup, digest verification, and bundled exec now live
in `linux-sandbox/src/bundled_bwrap.rs`; Bazel runfiles lookup lives in
`linux-sandbox/src/bazel_bwrap.rs`; shared argv/fd exec helpers live in
`linux-sandbox/src/exec_util.rs`.
- Teach Bazel tests to surface the Bazel-built `//codex-rs/bwrap:bwrap`
through `CARGO_BIN_EXE_bwrap`; `codex-linux-sandbox` only honors that
fallback in debug Bazel runfiles environments so release/user runtime
lookup stays tied to `codex-resources/bwrap`.
- Allow `codex-exec-server` filesystem helpers to preserve just the
Bazel bwrap/runfiles variables they need in debug Bazel builds, since
those helpers intentionally rebuild a small environment before spawning
`codex-linux-sandbox`.
- Verify the Bazel bwrap target in Linux release CI with a build-only
check. Running `bwrap --version` is too strong for GitHub runners
because bubblewrap still attempts namespace setup there.

**Verification**
- Latest update: `cargo test -p codex-linux-sandbox`
- Latest update: `just fix -p codex-linux-sandbox`
- `cargo check --target x86_64-unknown-linux-gnu -p codex-linux-sandbox`
could not run locally because this macOS machine does not have
`x86_64-linux-gnu-gcc`; GitHub Linux Bazel CI is expected to cover the
Linux-only modules.
- Earlier in this PR: `cargo test -p codex-bwrap`
- Earlier in this PR: `cargo test -p codex-exec-server`
- Earlier in this PR: `cargo check --release -p codex-exec-server`
- Earlier in this PR: `just fix -p codex-linux-sandbox -p
codex-exec-server`
- Earlier in this PR: `bazel test --nobuild
//codex-rs/linux-sandbox:linux-sandbox-all-test
//codex-rs/core:core-all-test
//codex-rs/exec-server:exec-server-file_system-test
//codex-rs/app-server:app-server-all-test` (analysis completed; Bazel
then refuses to run tests under `--nobuild`)
- Earlier in this PR: `bazel build --nobuild //codex-rs/bwrap:bwrap`
- Prior to this update: `just bazel-lock-update`, `just
bazel-lock-check`, and YAML parse check for
`.github/workflows/bazel.yml`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21255).
* openai#21257
* openai#21256
* __->__ openai#21255
**Summary**
- Build Linux `bwrap` before the main release binaries.
- Export the release `bwrap` SHA-256 as `CODEX_BWRAP_SHA256` so the
Codex binary can verify the bundled fallback.
- Sign, stage, and upload `bwrap` alongside the primary Linux release
artifacts.

**Verification**
- YAML parse check for `.github/workflows/rust-release.yml`











---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21256).
* openai#21257
* __->__ openai#21256
## Why

Thread names are app-server metadata now, backed by the thread store and
sqlite state database. Keeping a core `SetThreadName` op plus a rollout
`thread_name_updated` event made rename persistence live in the wrong
layer and required historical replay support for an event that new
app-server flows should not write.

## What changed

- Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the
core protocol and deleted the core handler path that appended rename
events to rollouts.
- Updated app-server `thread/name/set` so both loaded and unloaded
threads write through thread-store metadata and app-server emits
`thread/name/updated` notifications.
- Updated local thread-store name metadata updates to write sqlite title
metadata and the legacy thread-name index without appending rollout
events.
- Removed state extraction and rollout handling for the deleted
thread-name event.

## Validation

- `cargo test -p codex-app-server thread_name_updated_broadcasts`
- `cargo test -p codex-app-server
thread_name_set_is_reflected_in_read_list_and_resume`
- `cargo test -p codex-thread-store
update_thread_metadata_sets_name_on_active_rollout_and_indexes_name`
- `cargo test -p codex-state`
- `cargo check -p codex-mcp-server -p codex-rollout-trace`
- `just fix -p codex-app-server -p codex-thread-store -p codex-state -p
codex-mcp-server -p codex-rollout-trace`

## Docs

No external documentation update is expected for this internal ownership
change.
## Why
- Similar change as openai#19473.
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with `session_id`, `turn_id`, and
`turn_started_at_unix_ms`.
- Issue: MCP servers may want the model and reasoning effort to better
understand tool-call behavior and latency relative to turn start.

## What Changed
- With change: MCP turn metadata now includes `model` and
`reasoning_effort`, propagated in `_meta["x-codex-turn-metadata"]`.
- Normal `/responses` turn metadata headers are unchanged.

## Verification
- `codex-rs/core/src/mcp_tool_call_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
## Why

BUGB-15601 showed that the Windows safe-command path had drifted from
the generic Git classifier. The Windows-specific Git parser could
classify a PowerShell-wrapped `git` command as safe as soon as it found
a safelisted subcommand, without applying the generic checks for unsafe
subcommand options such as `--output`, `--ext-diff`, `--textconv`,
`--paginate`, or `cat-file --filters`.

The generic classifier already models the Git command boundary and the
read-only argument checks more carefully, so Windows should reuse that
logic instead of maintaining a smaller parallel parser.

## What Changed

- Extracted the existing generic Git classification logic into
`is_safe_git_command`.
- Updated `windows_safe_commands.rs` to call that shared helper for
parsed PowerShell `git` commands.
- Removed the Windows-only Git subcommand safelist, including the
`cat-file` allowance that was part of the reported bypass.
- Added a Windows regression test that keeps PowerShell-wrapped Git
commands with side-effecting options classified unsafe.
- Made the full-path PowerShell test discover the installed PowerShell
executable instead of depending on one hard-coded `pwsh.exe` path.

## Verification

- `cargo test -p codex-shell-command
rejects_git_subcommand_options_with_side_effects`
- `cargo test -p codex-shell-command
git_global_override_flags_are_not_safe`
- `cargo test -p codex-shell-command
windows_powershell_full_path_is_safe -- --nocapture`

Co-authored-by: Codex <codex@openai.com>
## Why

The core protocol still exposed a `ListModels` submission op even though
no client sends it and the core submission loop treated it as an ignored
unknown op. Keeping the dead variant made the protocol surface look
supported while the active model listing API is the app-server
`model/list` JSON-RPC request.

## What Changed

- Removed the unused `Op::ListModels` variant from `codex-rs/protocol`.
- Removed its `Op::kind()` mapping.

The existing app-server `model/list` endpoint is unchanged.

## Verification

- `cargo test -p codex-protocol`
## Why

`skills/list` is already exposed through app-server v2 and covered by
the app-server test suite. Keeping the separate core `Op::ListSkills`
path leaves a duplicate legacy protocol surface that no longer needs to
be maintained.

## What Changed

- Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the
core protocol.
- Deleted the corresponding core session handler and stale core
integration tests.
- Removed rollout/MCP ignore branches and protocol v1 docs references
for the deleted event/op.
- Left app-server `skills/list` and its existing coverage intact.

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-core --test all suite::skills`
- `cargo check -p codex-mcp-server -p codex-rollout -p
codex-rollout-trace`
- `just fix -p codex-core`
## Summary
- Add plugin manifest keywords to core plugin marketplace/detail models
- Expose keywords on app-server v2 PluginSummary and generated
schema/types
- Populate keywords in plugin/list and plugin/read responses for local
plugins

Depends on openai/openai#891087

## Validation
- just fmt
- just write-app-server-schema
- cargo test -p codex-app-server-protocol
- cargo test -p codex-core-plugins
- cargo test -p codex-app-server
plugin_list_keeps_valid_marketplaces_when_another_marketplace_fails_to_load
- cargo test -p codex-app-server
plugin_read_returns_plugin_details_with_bundle_contents
…0949)

## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
Extends `plugin/share/save` to accept optional discoverability and
shareTargets while uploading plugin contents, and adds
`plugin/share/updateTargets` for share-only target updates without
re-uploading.
…#20724)

## Why

Codex currently accepts dynamic tool names and namespaces that the
upstream Responses function-tool path does not actually support. In
practice, that means app-server can register a dynamic tool successfully
and only discover later that the LLM-facing tool contract will reject or
mishandle it.

This PR tightens the app-server-side dynamic tool contract to match the
Responses API before we stack dynamic tool hook support on top of it.

## What changed

- validate dynamic tool `name` against the Responses function-tool
identifier contract: `^[a-zA-Z0-9_-]+$`, length `1..128`
- validate dynamic tool `namespace` the same way, with the Responses
namespace length limit `1..64`
- reject namespaces that collide with the always-reserved Responses
runtime namespaces such as `functions`, `multi_tool_use`, `file_search`,
`web`, `browser`, `image_gen`, `computer`, `container`, `terminal`,
`python`, `python_user_visible`, `api_tool`, `tool_search`, and
`submodel_delegator`
- escape invalid identifiers in error messages so control characters do
not spill raw into logs or client-visible error text
- document the tightened dynamic tool identifier contract in
`codex-rs/app-server/README.md`
- add both unit coverage for the validator and an app-server integration
test that rejects a `thread/start` request with Responses-incompatible
dynamic tool identifiers

## Verification

- `cargo test -p codex-app-server validate_dynamic_tools_`
- `cargo test -p codex-app-server --test all
thread_start_rejects_dynamic_tools_not_supported_by_responses`
# Overview
MCP refreshes were rebuilding active threads from fresh disk-backed
config only, which dropped thread-start session overlays such as
app-injected MCP servers. This keeps refreshes current with disk config
while preserving the thread-local config that only the active thread
knows about.

# Changes
- Rebuild refreshed config per active thread using that thread's current
`cwd`, rather than fanning out one app-server config to every thread.
- Preserve each thread's `SessionFlags` layer while replacing reloadable
config layers with freshly loaded config, then derive the MCP refresh
payload from the rebuilt result.
- Move MCP refresh orchestration into app-server so manual refreshes
fail loudly while background refreshes remain best-effort, and route
plugin-triggered refreshes through the same per-thread reload path.
- Add regression coverage for session overlays, fresh project config,
plugin-derived MCP config, current requirements, and strict vs
best-effort refresh behavior.

# Verification
- Passed focused Rust coverage for the thread-config rebuild behavior
and deferred MCP refresh flow, plus `cargo test -p codex-app-server
--lib`.
- Verified end to end in the Codex dev app against the locally built
CLI: registered an MCP via thread config, verified that it could be used
successfully before refresh, manually triggered MCP refresh, and
verified that it continued to be available afterward.
- [x] Return Accept early when auto_deny is enabled per feedback.
## Why

openai#21255 added the standalone `codex-bwrap` binary. In the Cargo build,
[`pkg_config::probe("libcap")`](https://github.com/openai/codex/blob/a736cb55a2bce57b4c8e5a4fe56f70c2b2ad892b/codex-rs/bwrap/build.rs#L37-L39)
emits `-lcap` before
[`cc::Build::compile("standalone_bwrap")`](https://github.com/openai/codex/blob/a736cb55a2bce57b4c8e5a4fe56f70c2b2ad892b/codex-rs/bwrap/build.rs#L50-L67)
adds the static bwrap archive. The Linux musl link then sees `-lcap
-lstandalone_bwrap`; because static archives are resolved left-to-right,
`cap_from_name` is still undefined once `standalone_bwrap` introduces
that reference.

The musl setup already builds `libcap.a` and exposes it through
[`libcap.pc`](https://github.com/openai/codex/blob/a736cb55a2bce57b4c8e5a4fe56f70c2b2ad892b/.github/scripts/install-musl-build-tools.sh#L78-L88),
so the failure is link ordering rather than a missing dependency.

## What changed

- probe `libcap` with `cargo_metadata(false)` so `pkg-config` does not
emit its link flags early
- emit the discovered `libcap` search paths and libraries after
`standalone_bwrap` is compiled, preserving the needed static-link order

## Verification

- `cargo test -p codex-bwrap`
- `cargo clippy -p codex-bwrap --all-targets`

The affected Linux musl release link is exercised by CI, which is the
path this fix targets.
github-actions Bot and others added 3 commits May 6, 2026 07:19
…nt/wallentx_termux-target_from_release_0.129.0_b187e52ce92d

# Conflicts:
#	.github/workflows/rust-release.yml
#	.github/workflows/shell-tool-mcp.yml
#	codex-rs/Cargo.toml
#	scripts/termux-create-checkpoint-pr.sh
@unemployabot unemployabot Bot requested a review from wallentx May 6, 2026 10:10
@unemployabot unemployabot Bot added checkpoint Checkpoint merge termux-release Termux release automation labels May 6, 2026
@wallentx wallentx merged commit a862d1c into wallentx/termux-target May 6, 2026
1 check passed
@wallentx wallentx deleted the checkpoint/wallentx_termux-target_from_release_0.129.0_b187e52ce92d branch May 6, 2026 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

checkpoint Checkpoint merge termux-release Termux release automation

Projects

None yet

Development

Successfully merging this pull request may close these issues.