solx 0.5.0: thin spine — stdlib argparse dispatch, 6–13× faster startup#30
Merged
The legacy keep-list fallback stays supported through the 0.x line; one release was not enough migration runway, so removal now lands with 1.0.0. The deprecation nudge in solx keep names the new version via SOLKEEP_REMOVED_IN. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Completion no longer shells back into solx: one data structure describes the CLI surface (commands, subcommands, flags, descriptions) and bash/zsh/fish scripts are rendered from it as fully static text, so the first Tab of a session costs no interpreter start — which on Sol's NFS home is the difference between instant and a ~1s stall. The zsh script keeps the dual-mode footer (eval/source registers via compdef; fpath autoload calls the completer directly) so both install modes complete on the first Tab. Tests assert the script shapes, that every command is listed in all three shells, and run each shell's syntax checker over the emitted script when the shell is installed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
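A minimal sketch of the "one data structure, static script" idea described in this commit. The `SURFACE` table and `render_bash` are hypothetical names and a much smaller surface than solx's real one; the point is only that the emitted text contains no call back into the program.

```python
# Hypothetical, trimmed description of a CLI surface (not solx's real table).
SURFACE = {
    "job": ["list", "start", "time", "-h", "--help"],
    "keep": ["-j", "--jobs", "-h", "--help"],
    "version": [],
}


def render_bash(prog: str, surface: dict) -> str:
    """Render a fully static bash completion script for `prog`."""
    top_level = " ".join(surface)
    lines = [
        f"_{prog}() {{",
        '    local cur="${COMP_WORDS[COMP_CWORD]}"',
        f'    local words="{top_level}"',
        "    if (( COMP_CWORD > 1 )); then",
        '        case "${COMP_WORDS[1]}" in',
    ]
    for name, candidates in surface.items():
        joined = " ".join(candidates)
        lines.append(f'            {name}) words="{joined}" ;;')
    lines += [
        '            *) words="" ;;',
        "        esac",
        "    fi",
        # mapfile avoids IFS word splitting of the candidate list.
        '    mapfile -t COMPREPLY < <(compgen -W "$words" -- "$cur")',
        "}",
        f"complete -F _{prog} {prog}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_bash("solx", SURFACE), end="")
```

Because every candidate is baked into the script at generation time, pressing Tab runs only shell builtins.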
On Sol's NFS home, importing typer alone costs ~1s per invocation and is the entire startup-latency floor. The new solx.main builds an argparse tree (allow_abbrev=False everywhere, matching the old no-prefix-match behavior) and keeps module-level imports to the stdlib plus __version__; command bodies, rich, and pathspec load only inside the handler that needs them. typer, click, and shellingham drop out of the dependency set entirely. Dispatch preserves the v0.4.0 surface byte-for-byte on the parity matrix: bare --version/version fast path, hidden jobs/ls aliases rewritten before parsing, help-on-stdout + exit 2 for the root and bare groups, and a hand-rolled 'job start' tail parser (options consumed anywhere, first unconsumed token names the template — including after '--' — first '--' swallowed, everything else passed through to salloc in order). One deliberate superset: every leaf except 'job start' now also accepts a trailing --json; after 'job start' it remains salloc passthrough. CLI dispatch tests are ported off CliRunner onto main([...]) + SystemExit with the same monkeypatch seams, plus import-hygiene checks (no typer in sys.modules after a dispatch; importing solx.main never pulls rich). The zipapp entry point follows to solx.main:main. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
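The dispatch shape described above can be sketched roughly as follows. This is an illustration under assumptions, not solx's code: the `demo` program, the `show` command, and `_cmd_show` are hypothetical, and `json` stands in for a heavier dependency such as rich.

```python
import sys

__version__ = "0.0.0"  # placeholder version for the sketch


def _cmd_show(args) -> int:
    # Stand-in for a heavy dependency: imported only when this handler runs.
    import json

    print(json.dumps({"name": args.name}) if args.json else args.name)
    return 0


def build_parser():
    import argparse  # imported where the tree is built, not at module scope

    parser = argparse.ArgumentParser(prog="demo", allow_abbrev=False)
    sub = parser.add_subparsers(dest="command")
    show = sub.add_parser("show", allow_abbrev=False)
    show.add_argument("name")
    show.add_argument("--json", action="store_true")
    show.set_defaults(handler=_cmd_show)
    return parser


def main(argv=None) -> int:
    argv = sys.argv[1:] if argv is None else list(argv)
    # Fast path: bare --version/version never builds the parser tree.
    if argv in (["--version"], ["version"]):
        print(__version__)
        return 0
    parser = build_parser()
    args = parser.parse_args(argv)
    if not hasattr(args, "handler"):
        parser.print_help()  # help goes to stdout
        return 2
    return args.handler(args)
```

With `allow_abbrev=False` on both the root parser and the subparser, a prefix such as `--js` is rejected instead of being matched to `--json`.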
zipapp -p makes the artifact directly executable in place (./dist/solx.pyz) instead of requiring an explicit interpreter. install.sh strips that build-machine shebang before stamping one bound to the destination machine's uv-resolved interpreter, so the installed binary carries exactly one shebang and stays bound to the interpreter version the bytecode was compiled for. Shebang-less artifacts still install unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
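The strip-then-stamp step is done by install.sh (a shell script); the following is only a Python restatement of the idea, with a hypothetical `restamp_shebang` helper operating on the archive bytes. It relies on the fact that a zipapp is a zip archive with an optional `#!` line prepended.

```python
def restamp_shebang(archive: bytes, interpreter: str) -> bytes:
    """Return `archive` carrying exactly one shebang, bound to `interpreter`."""
    body = archive
    if body.startswith(b"#!"):
        # Drop the build-machine shebang: everything through the first newline.
        _, _, body = body.partition(b"\n")
    return b"#!" + interpreter.encode() + b"\n" + body
```

A shebang-less artifact takes the same path and simply gains the new line, which matches the "still install unchanged" note above.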
solx --version and solx version short-circuit in main() before the parser tree is built, but importing solx.main still paid module-scope argparse, pathlib, and typing -- on Sol's NFS home, and under CPU contention, that import cost is most of the command's wall time. Import argparse and pathlib where the parser tree is built (and in the one handler that uses Path), and replace typing.TYPE_CHECKING with a module-level constant, so importing solx.main loads nothing beyond the interpreter's startup set. Measured on a 4-core allocation at load ~126: warm venv solx --version median 0.23s over 56 runs (batch medians 0.11-0.27s), down from 0.41-0.54s before this change. python -X importtime -c "import solx.main" now lists only the solx package and __future__. Parity matrix vs golden-v040: 65/67 pass, 2 expected-diff, 0 fail; all 220 tests pass; ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
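An import-hygiene check of the kind described here could look like the sketch below. Everything in it is hypothetical: `lazy_demo` is a throwaway module written to a temp directory, and the probe simply diffs `sys.modules` around the import in a fresh, isolated interpreter.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# A throwaway module that follows the pattern from the commit message.
LAZY_MODULE = textwrap.dedent('''
    TYPE_CHECKING = False  # module-level stand-in for typing.TYPE_CHECKING
    if TYPE_CHECKING:
        import argparse

    def build_parser():
        import argparse  # paid only when a parser is actually needed
        return argparse.ArgumentParser(prog="demo", allow_abbrev=False)
''')

# Runs in a child interpreter: report which modules the import pulled in.
PROBE = textwrap.dedent('''
    import sys
    sys.path.insert(0, sys.argv[1])
    before = set(sys.modules)
    import lazy_demo
    print("\\n".join(sorted(set(sys.modules) - before)))
''')


def modules_loaded_by_import() -> list:
    """Names of modules newly loaded by `import lazy_demo` in a fresh process."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "lazy_demo.py").write_text(LAZY_MODULE)
        result = subprocess.run(
            [sys.executable, "-I", "-c", PROBE, tmp],
            check=True,
            capture_output=True,
            text=True,
        )
    return result.stdout.split()
```

Running the probe in a child process matters: the test runner itself has usually imported argparse already, so checking `sys.modules` in-process would prove nothing.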
Port the harness that verified the dispatch-layer rewrite into the repo as a durable eval asset: 67 cases over the full command surface, each in an isolated fake HOME with deterministic SLURM mocks, captured as stdout/stderr/exit code and compared byte-for-byte between two solx builds. Future surface-preserving rewrites (the native single-binary port is already in development) need the same proof, and the harness is useless if it lives in /tmp. Goldens stay uncommitted because they are environment-captured; the README documents the capture/compare workflow. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The thin-spine rewrite changed what the docs should tell people, in three ways:
- Latency guidance inverted. A warm solx job read now costs ~0.13s with the .pyz install vs ~0.08s for raw squeue (measured on a Sol compute node, warm median of 9), so SKILL.md, references/solx.md, docs/solx.md, coverage.md, and the bench script's takeaway stop steering agents to raw squeue/scancel for one-off reads; raw commands stay documented as equivalents and as the no-solx fallback. The full measured table (v0.4.0 vs v0.5.0, venv vs pyz, with the cold-ish and filesystem-placement caveats) lives in ROADMAP.md.
- Roadmap: Stage 4 is shipped; Stage 5 is the native single-binary rewrite (Rust) targeting v1.0: cold-start immunity on NFS, no Python/uv runtime, single-file install. Decisions confirmed now record argparse as the CLI framework, static generated completion scripts, and the solkeep removal moving from 0.5.0 to 1.0.0 (every doc that named 0.5.0 as the removal release is updated to match keep.SOLKEEP_REMOVED_IN).
- Behavior/manual updates: --json is accepted after the subcommand too (except job start, whose tail passes through to salloc); completions are fully static, with the zsh fpath install mode documented alongside eval/source; solx/DEVELOPMENT.md's architecture, aliases, and coverage sections describe main.py/_completions.py and the runtime dependency list (rich + pathspec). CHANGELOG records all of it under Unreleased.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pin the review-confirmed dispatch divergences with goldens: `--` shielding for sbatch passthrough, repeated `--`, bundled short flags, `--dry-run=true`, version-command junk args, `keep -j 0`, `help job`, and `-h`. The js-* shielding cases are STRICT (byte parity); the error-wording cases are RELAXED (exit-code parity only, wording may differ from Click's); `-h` is EXPECTED_DIFF as a documented v0.5.0 superset (v0.4.0 exits 2, v0.5.0 prints help and exits 0). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
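A rough sketch of how a STRICT versus RELAXED comparison between two builds can work, assuming each build is just an argv prefix. `run_case` and `case_passes` are hypothetical names; the SLURM mocks are omitted, and EXPECTED_DIFF is left out because the text does not say how the harness scores it.

```python
import os
import subprocess
import tempfile
from typing import NamedTuple


class Capture(NamedTuple):
    stdout: bytes
    stderr: bytes
    exit_code: int


def run_case(build: list, args: list) -> Capture:
    """Run one case with a throwaway HOME and capture its observable output."""
    with tempfile.TemporaryDirectory() as fake_home:
        env = dict(os.environ, HOME=fake_home)  # inherit, but isolate HOME
        proc = subprocess.run(build + args, env=env, capture_output=True, check=False)
    return Capture(proc.stdout, proc.stderr, proc.returncode)


def case_passes(old: list, new: list, args: list, mode: str) -> bool:
    """Compare two builds on one case under the given parity mode."""
    a, b = run_case(old, args), run_case(new, args)
    if mode == "STRICT":
        return a == b  # stdout, stderr, and exit code must match byte-for-byte
    if mode == "RELAXED":
        return a.exit_code == b.exit_code  # wording may differ
    raise ValueError(f"unknown mode: {mode}")
```

For example, two builds that both exit 2 but word the usage error differently fail STRICT and pass RELAXED.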
- job start: after the first `--` no token is parsed as a flag (the first leftover still names the template, later `--` tokens forward literally), and pre-`--` short bundles (-nn) expand when every letter is a known short flag; --dry-run=X is a usage error. Without the shield, `gpu -- -n` silently flipped a submit into a dry-run and -n/--timeout could not be forwarded to salloc at all.
- version: fast-path only bare --version/version; everything else goes through argparse with a deferred --version action, and the version subcommand takes no arguments, so junk argv exits 2 again instead of printing the version.
- keep: -j/--jobs must be >= 1 (exit 2), restoring the min=1 bound the Typer option enforced.
- _SOLX_COMPLETE: exit 0 silently, so completion scripts generated by solx <= 0.4.0 (Typer's runtime protocol, persisted by fpath installs) get zero candidates instead of parsing help text as completions.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
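The `job start` tail rules can be sketched as a small state machine. This is one reading of the commit text, not the shipped parser: it assumes `-n` is the short form of `--dry-run`, models no other options (so `--timeout` is ignored), and raises `ValueError` where the real CLI exits 2. `parse_job_start_tail` and `SHORT_FLAGS` are hypothetical names.

```python
SHORT_FLAGS = {"n"}  # assumption: -n is the only short flag modeled here


def parse_job_start_tail(tokens: list) -> tuple:
    """Return (dry_run, template, passthrough) for the tokens after `job start`."""
    dry_run = False
    leftovers = []
    shielded = False
    for tok in tokens:
        if shielded:
            leftovers.append(tok)  # after the first `--`, nothing is a flag
        elif tok == "--":
            shielded = True  # the first `--` is swallowed
        elif tok == "--dry-run":
            dry_run = True
        elif tok.startswith("--dry-run="):
            raise ValueError("--dry-run takes no value")
        elif (
            len(tok) >= 2
            and tok[0] == "-"
            and tok[1] != "-"
            and set(tok[1:]) <= SHORT_FLAGS
        ):
            dry_run = True  # a bundle such as -nn expands to known short flags
        else:
            leftovers.append(tok)
    # The first unconsumed token names the template; the rest is forwarded in order.
    template = leftovers[0] if leftovers else None
    return dry_run, template, leftovers[1:]
```

Under this sketch, `["gpu", "--", "-n"]` stays a real submit with `-n` forwarded, which is the case the shield was added for.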
Parser/table drift: config edit, completions, version, and help no longer take a trailing --json (matching the COMMANDS table, which never offered it; their output is one fixed text). A pinning test now walks the argparse tree and asserts COMMANDS mirrors it (commands, subcommands, flag forms, positionals), with the --stage/shell choice tuples pinned to their owners, so the two can never drift silently again. The module docstring claims correspondence, not byte-equal help strings.
bash script: COMPREPLY is filled via mapfile (no IFS word splitting, no glob expansion; a path with spaces is one candidate) with a guarded 'compopt -o filenames' on the --solkeep/--csv-dir sites so readline escapes inserted paths; mid-word Tab completes against the part of the word left of the cursor (COMP_LINE/COMP_POINT); leaf and subcommand flag lists include -h; group commands offer -h/--help; positional choices are not re-offered once filled.
zsh: group functions (_solx_job/_solx_config) offer -h/--help.
fish: group level and every leaf offer -h/--help; the completions shell argument is guarded so bash/zsh/fish are not re-offered.
Functional bash probes (simulated COMP_WORDS/COMP_LINE) cover the space/glob/mid-word/re-offer cases; fish behavior verified with complete -C.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
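A pinning test of this kind can be sketched by walking the parser and comparing the result with a hand-written table. The table shape, `build_parser`, and `surface` below are hypothetical and far smaller than solx's COMMANDS; note that the walk relies on argparse's private `_actions`, `_SubParsersAction`, and `_HelpAction`, which is common practice but not a public API.

```python
import argparse

# Hypothetical table in the same shape the walker produces.
COMMANDS = {
    "flags": [],
    "commands": {
        "job": {
            "flags": [],
            "commands": {"list": {"flags": ["--json"], "commands": {}}},
        },
        "version": {"flags": [], "commands": {}},
    },
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo", allow_abbrev=False)
    sub = parser.add_subparsers(dest="command")
    job = sub.add_parser("job", allow_abbrev=False)
    job_sub = job.add_subparsers(dest="job_command")
    job_list = job_sub.add_parser("list", allow_abbrev=False)
    job_list.add_argument("--json", action="store_true")
    sub.add_parser("version", allow_abbrev=False)
    return parser


def surface(parser: argparse.ArgumentParser) -> dict:
    """Describe a parser as {"flags": [...], "commands": {...}}, ignoring -h/--help."""
    flags, commands = [], {}
    for action in parser._actions:
        if isinstance(action, argparse._SubParsersAction):
            commands = {name: surface(sub) for name, sub in action.choices.items()}
        elif action.option_strings and not isinstance(action, argparse._HelpAction):
            flags.extend(action.option_strings)
    return {"flags": sorted(flags), "commands": commands}


def test_commands_table_mirrors_parser() -> None:
    assert surface(build_parser()) == COMMANDS
```

Adding a flag to either side without the other makes the equality fail, which is the drift the commit wants to catch.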
CHANGELOG: an Upgrading note saying that file-installed completion scripts (zsh fpath) must be regenerated after upgrading, because <=0.4.0 scripts use the _SOLX_COMPLETE runtime protocol that 0.5.0 answers with zero candidates; a superset entry for -h alongside the --json placement one, and the parity claim now points at both; --json's no-trailing-flag commands named; parity matrix case count 67 -> 80 (here and in evals/parity/README.md).
docs/solx.md: the completions section tells fpath installs to rerun the redirect after every upgrade, and the scripting section names the four commands that take no trailing --json.
keep.py: the comment above SOLKEEP_REMOVED_IN said the fallback loses support 'in this release line' while the constant says 1.0.0; it now describes the current schedule.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a Highlights speedup table at the top of the unreleased section and align the latency figures across CHANGELOG and ROADMAP on the apples-to-apples NFS-home install (both .pyz on ~/.local/bin), the location install.sh actually writes to. The node-local /tmp figure stays documented as the best case rather than as the recommended install. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The deferred client side isn't laptop-specific; "local machine" covers any workstation a user SSHes to Sol from. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a build job that runs build-pyz.sh and uploads dist/solx.pyz, so a reviewer can install and test a PR's solx on Sol without building it. DEVELOPMENT.md documents building/installing locally and fetching the artifact from a PR (install via install.sh, which re-stamps the shebang for the local interpreter). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v0.5.0 — thin spine
Rewrites the CLI's dispatch layer on the Python standard library (`argparse`, replacing Typer/`click`/`rich`) so startup latency drops to the same order as a raw SLURM call. The skill no longer steers agents to raw `squeue` for one-off reads: `solx` and raw SLURM reads are now treated as equivalent.
Startup latency
Warm median on a Sol compute node (NFS `$HOME`, single-file `.pyz` install):
[Table: timings for `squeue`, `solx --version`, `solx job list`, and `solx job time`; the measured values are not preserved in this copy.]
The win is removing the Typer/`click`/`rich` import tree: `--version`/`version` short-circuit before the parser tree is built, command bodies import inside their handlers, and `--json`/piped runs never load `rich`.
Changes
- `argparse` dispatch: entry point `solx.main:main` (replacing `solx.cli:app`). Command surface, aliases, exit codes, and the output contract are unchanged apart from two documented supersets (`--json` placement and `-h`).
- Static completions: `solx completions <bash|zsh|fish>` renders fully static scripts from one description; nothing execs `solx` at completion time, so the first Tab of a session costs no interpreter start.
- Parity matrix (`evals/parity/`): 80 cases over the full command surface, each run in an isolated fake `$HOME` against deterministic SLURM mocks and compared byte-for-byte against a captured golden run. Used to verify the rewrite reproduces v0.4.0 behavior.
- Docs updated for the `solx`/raw-SLURM equivalence; `~/.solkeep` removal deferred to 1.0.0.
Upgrading
Completion scripts installed as files must be regenerated after upgrading (e.g. zsh fpath: `solx completions zsh > ~/.zfunc/_solx`). Scripts from solx ≤ 0.4.0 use the Typer completion protocol, which 0.5.0 answers with zero candidates. Eval/source install modes regenerate the script in every new shell and need no action.
Release
Merge this PR, then push the unprefixed `v0.5.0` tag to trigger the CI release.
🤖 Generated with Claude Code