Single [[bin]] crate named solx at version 1.0.0-dev, with the dependency set the port needs (clap derive for the command tree, serde_json with preserve_order for byte-stable JSON, toml with preserve_order so [jobs.*] tables keep file order, ignore for gitignore-semantics matching and enumeration, csv/filetime/shlex for the keep and config-edit paths). Cargo.lock is committed and stable is pinned via rust-toolchain.toml so builds resolve identically on Sol and in CI. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
output.rs renders JSON byte-identically to Python's json.dumps(indent=2) — two-space indent, \uXXXX escapes for everything outside printable ASCII, insertion-ordered keys — because agents diff solx output across implementations and the goldens compare byte-strict. Diagnostics always go to stderr; prompting is gated on stdin being a TTY, separately from the stdout format choice. config.rs ports the TOML schema with the exact user-facing validation messages, [keep] rules compiled to gitignore matchers rooted at /, the .solkeep loader/splitter with the order-sensitivity probe, and the starter-config text verbatim. slurm.rs ports verb-aware jobid resolution (read/attach verbs auto-pick most recent, stop never auto-picks, inside an allocation is the default target with a self-action flag), the argv builders, and salloc execution with a wall-clock timeout. Unit tests port the Python suite's vectors for row parsing, resolution branches, argv shapes, durations, validation errors, solkeep import, and keep matching. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
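The string-escaping half of that output contract can be sketched with the standard library alone. This is a minimal illustration, not the code in output.rs: the function name `escape_json_ascii` is hypothetical, and the rules follow Python's `json.dumps` default (`ensure_ascii=True`), where anything outside the printable ASCII range becomes a lowercase `\uXXXX` escape and characters above U+FFFF become a surrogate pair.

```rust
/// Render `s` as a JSON string literal the way Python's json.dumps does
/// with ensure_ascii=True. Hypothetical helper, for illustration only.
fn escape_json_ascii(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            // Short escapes that json.dumps emits by name.
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '\u{08}' => out.push_str("\\b"),
            '\u{0c}' => out.push_str("\\f"),
            // Printable ASCII (space through tilde) passes through untouched.
            ' '..='~' => out.push(c),
            // Everything else, including DEL and all non-ASCII, is escaped
            // per UTF-16 code unit, so astral characters yield two escapes.
            _ => {
                let mut buf = [0u16; 2];
                for unit in c.encode_utf16(&mut buf).iter() {
                    out.push_str(&format!("\\u{:04x}", *unit));
                }
            }
        }
    }
    out.push('"');
    out
}
```

Escaping per UTF-16 code unit is what keeps the bytes identical for characters outside the Basic Multilingual Plane, which a naive `{:04x}` of the scalar value would get wrong.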
list/start/stop/jump/time mirror the Python bodies: identical JSON payloads and stderr strings, exit 1 for runtime failures, exit 2 for refusals (ambiguous stop, non-interactive without -y). jump exec-replaces the process with `srun --jobid=N --overlap --pty SHELL`. `job start` gets a hand-written tail parser because its grammar predates clap conventions: -n/--dry-run and --timeout V are consumed wherever they appear before the first `--`, the first `--` is dropped, and the first unconsumed bare token — even one after `--` that looks like a flag — becomes the template, with every other leftover token passed through to salloc in original order. The unit tests pin each branch of that split. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
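The `job start` split described above can be sketched as follows. This is a reading of the commit message under stated assumptions, not the shipped parser: the type `StartTail`, the function `split_start_tail`, and the error text are hypothetical, and a "bare" token is taken to mean one that does not start with `-` before the separator, or any token at all after it.

```rust
/// Result of splitting the `job start` tail. Hypothetical type.
#[derive(Debug, Default, PartialEq)]
struct StartTail {
    dry_run: bool,
    timeout: Option<String>,
    template: Option<String>,
    passthrough: Vec<String>,
}

/// Split the tokens after `job start` into solx options, the template,
/// and salloc passthrough, preserving passthrough order.
fn split_start_tail(args: &[&str]) -> Result<StartTail, String> {
    let mut out = StartTail::default();
    let mut after_sep = false;
    let mut i = 0;
    while i < args.len() {
        let tok = args[i];
        i += 1;
        if !after_sep {
            match tok {
                // Only the first `--` is dropped; later ones fall through.
                "--" => {
                    after_sep = true;
                    continue;
                }
                "-n" | "--dry-run" => {
                    out.dry_run = true;
                    continue;
                }
                "--timeout" => {
                    // Error text is an assumption, not the reference message.
                    let v = args.get(i).ok_or("--timeout requires a value")?;
                    out.timeout = Some(v.to_string());
                    i += 1;
                    continue;
                }
                _ => {}
            }
        }
        // After the separator even a flag-looking token counts as bare.
        let bare = after_sep || !tok.starts_with('-');
        if out.template.is_none() && bare {
            out.template = Some(tok.to_string());
        } else {
            out.passthrough.push(tok.to_string());
        }
    }
    Ok(out)
}
```

Under these assumptions, `["--gres=gpu:1", "gpu", "-n", "--", "--mem=8G"]` yields the template `gpu`, a dry run, and the passthrough `--gres=gpu:1 --mem=8G` in original order.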
build_plan reads the Directory column of Sol's warning CSVs (missing file = empty stage), dedupes across stages, and intersects with the [keep] rules; only flagged directories are ever renewed. Enumeration is an in-process walk with every ignore facility disabled so the file set equals `find DIR -type f` — skipping hidden or git-ignored files would silently under-protect them. Touch sets atime+mtime to now with touch -c semantics: a vanished path is a silent skip and nothing is ever created. -j runs a worker pool over one task queue holding both enumerate and touch tasks; a huge directory shards into BATCH-sized touch tasks so the whole run scales with -j, not the directory count. JSON plan/summary documents cap inlined lists at JSON_LIST_CAP with exact counts and spill the complete plan to a temp file. The ~/.solkeep fallback stays, with a deprecation notice naming 1.0.0 as the removal version. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
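The sharding step can be illustrated with a small sketch. The `Task` enum and `shard_touch_tasks` are hypothetical names, and the batch size is passed in rather than read from the BATCH constant the commit mentions. The point is only that one large directory turns into many independent touch tasks that a worker pool can drain in parallel.

```rust
use std::path::PathBuf;

/// The two task kinds that share one queue. Hypothetical type.
#[derive(Debug, PartialEq)]
enum Task {
    /// Walk one flagged directory and list its files.
    Enumerate(PathBuf),
    /// Touch one batch of already-enumerated files.
    Touch(Vec<PathBuf>),
}

/// Split an enumerated file list into touch tasks of at most `batch` files.
/// A batch of 0 is treated as 1 so the split can never panic.
fn shard_touch_tasks(files: &[PathBuf], batch: usize) -> Vec<Task> {
    files
        .chunks(batch.max(1))
        .map(|chunk| Task::Touch(chunk.to_vec()))
        .collect()
}
```

With five files and a batch size of two this produces three touch tasks, so three workers can make progress on what was a single directory.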
main.rs builds the clap tree (job/jobs groups, hidden ls alias, top-level jump, config show/edit/import-solkeep) with a raw pre-pass for the pieces clap can't express: eager leading --version printing the bare version string, leading --json, no-args group help on stdout with exit 2, and interception of `job start` so its tail reaches the hand-written parser with the `--` separator intact. --json is global, so it is accepted trailing on every leaf; on `job start` a non-leading --json is salloc passthrough by design. init writes the starter config at mode 0600 (interactive walkthrough picks the shell and offers the ~/.solkeep import; non-interactive runs write defaults with no prompts), and config import-solkeep performs the validated, lossiness-checked migration. completions embeds static bash/zsh/fish scripts from assets/ — the zsh script carries the dual-mode footer so both eval/source and fpath/autoload installs work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
tests/cli.rs runs the compiled solx in an isolated fake HOME with the deterministic SLURM mocks committed under tests/mocks/bin (the crate's tests are self-contained), asserting stdout, stderr, and exit codes for the core flows: version, list, time, stop preview/refusal, the start template/passthrough split, jump exec, keep planning and a real renewal over stale files, config show key order, config edit argv handling, init, the import-solkeep lossy refusal, and completions validation. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
rust-ci runs fmt --check, clippy -D warnings, and the test suite on every push/PR touching solx-rs/, with rust-cache keeping the dependency graph warm. README covers build/install and the output contract; DEVELOPMENT maps each Rust module to its Python counterpart and explains the parity-first workflow (the golden matrix is the spec, completion scripts are synced from the Python generator). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The embedded bash, zsh, and fish completion scripts now match the golden-v050 `solx completions <shell>` output byte for byte. The zsh script names its entry function _solx, completes per-subcommand flags via _arguments, and adds _solx_job/_solx_config helpers for the nested command groups; the embedded-asset test asserts the matching `compdef _solx solx` footer. Verified: cargo test (126 passed) and the parity matrix against golden-v050 (67/67). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The command reference lives at docs/solx.md in the repo root, not under the Python package directory, and rust-toolchain.toml selects the stable channel rather than pinning a version. Also state the no-color rendering as current behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…r config errors
The keep matcher must make byte-identical include/exclude decisions to the
Python reference, which compiles [keep]/.solkeep rules with pathspec's
GitIgnoreSpec. The ignore-crate matcher diverged on real inputs: it expanded
{a,b} brace alternates (renewing directories the rules never selected — and
skipping a literal {} directory the user opted in), matched every path for a
stray '/' include line, matched unclosed-'[' patterns that git discards, and
dropped a flagged directory written with a trailing slash, silently leaving
opted-in data to age into the purge. src/gitwild.rs is a faithful port of
pathspec 1.1.1 (pattern translation + last-match/exact-over-ancestor
resolution), pinned by vectors generated from the reference implementation.
Config diagnostics now match the reference's plain (non-TTY) renderer, which
strips style-tag lookalikes such as [jobs.default]/[keep] from messages
(output::strip_markup), and TOML parse errors collapse to the one-line
'message (at line L, column C)' form everywhere solx reports them — every
solx diagnostic is a single stderr line.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
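Last-match resolution, the rule that the final matching line decides, can be shown with a deliberately simplified sketch. It is not src/gitwild.rs: rules here are literal directory prefixes with no wildcard translation and no exact-over-ancestor refinement, and the names `Rule`, `parse_rule`, and `is_kept` are hypothetical. It does reflect two of the cases above: a stray `/` line selects nothing, and a trailing slash on a path does not change the decision.

```rust
/// One keep rule: a literal directory prefix, included or negated.
struct Rule {
    prefix: String,
    include: bool,
}

/// Parse one rule line; `!` negates. A line that reduces to an empty
/// prefix (such as a stray "/") is discarded instead of matching everything.
fn parse_rule(line: &str) -> Option<Rule> {
    let (body, include) = match line.strip_prefix('!') {
        Some(rest) => (rest, false),
        None => (line, true),
    };
    let prefix = body.trim_end_matches('/');
    if prefix.is_empty() {
        None
    } else {
        Some(Rule { prefix: prefix.to_string(), include })
    }
}

/// True when `path` is the rule's directory or lies beneath it.
fn rule_matches(rule: &Rule, path: &str) -> bool {
    path == rule.prefix
        || path
            .strip_prefix(rule.prefix.as_str())
            .map_or(false, |rest| rest.starts_with('/'))
}

/// Last-match-wins: scan from the end, the first hit decides.
fn is_kept(rules: &[Rule], path: &str) -> bool {
    let path = path.trim_end_matches('/');
    rules
        .iter()
        .rev()
        .find(|r| rule_matches(r, path))
        .map_or(false, |r| r.include)
}
```

A path no rule matches is not kept, which is the conservative default for a list that opts directories in.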
… -j default
An existing warning CSV that can't be opened or decoded now exits 1 with one line naming the file instead of planning zero directories: for the command whose job is preventing scratch deletion, a read failure must never look like 'nothing flagged'. A BOM-prefixed header keeps the BOM as part of the first header cell's name (the reference CSV reader's behavior), so a BOM'd Directory header yields no directories rather than a divergent plan.
The full-plan spill goes through the tempfile crate: created 0600 (the document enumerates the user's flagged scratch layout and lands in shared /tmp), bounded creation retries instead of an unbounded loop that spins forever on an unwritable temp dir, and surfaced write errors so a truncated spill is never advertised as complete.
The -j default derives from ONLINE system CPUs (sysconf(_SC_NPROCESSORS_ONLN), os.cpu_count semantics) rather than cgroup-aware available_parallelism, so a run inside a Slurm allocation still defaults to the same worker count as the reference, not a serial fallback on the exact nodes keep is meant for.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
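The BOM behavior is easy to see in a toy reader. The sketch below is an assumption-heavy illustration rather than the crate's CSV path: `directory_column` is a hypothetical name, and it splits on bare commas with no quoting support, which a real CSV reader would handle. What it shows is that a header cell of BOM plus `Directory` is not equal to `Directory`, so the column is never found.

```rust
/// Return the values of the `Directory` column from CSV text.
/// Naive comma split, no quoting; for illustration only.
fn directory_column(csv_text: &str) -> Vec<String> {
    let mut lines = csv_text.lines();
    let header = match lines.next() {
        Some(h) => h,
        None => return Vec::new(),
    };
    // The comparison is exact, so "\u{feff}Directory" does not match.
    let idx = match header.split(',').position(|cell| cell == "Directory") {
        Some(i) => i,
        None => return Vec::new(),
    };
    lines
        .filter_map(|row| row.split(',').nth(idx))
        .map(str::to_string)
        .collect()
}
```

Note that an empty result here means "no Directory column or no rows"; the open and decode failures the commit describes are a separate path that must surface as an error, not as an empty list.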
shlex::try_join quotes any token containing '=' or '%', which broke stderr byte-parity on every job start whose argv carries --gres/--mem/--cpus-style tokens (the submitting: line, dry-run renders, and the salloc-timeout Argv tail). A token is now quoted only when it contains a character outside [A-Za-z0-9_@%+=:,./-], with single-quote wrapping and '"'"' for embedded quotes — exactly shlex.join. Pinned with the gpu-template argv from the parity goldens. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
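The quoting rule stated above is small enough to write out. This sketch uses hypothetical names (`shell_quote`, `shell_join`) and adds one detail the commit message does not mention but Python's `shlex.quote` has: an empty token renders as `''`.

```rust
/// Quote one token the way Python's shlex.quote does.
fn shell_quote(tok: &str) -> String {
    if tok.is_empty() {
        // Python renders the empty string as a pair of single quotes.
        return "''".to_string();
    }
    // Safe set: [A-Za-z0-9_@%+=:,./-]
    let safe = tok
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || "_@%+=:,./-".contains(c));
    if safe {
        tok.to_string()
    } else {
        // Wrap in single quotes; an embedded ' becomes '"'"'.
        format!("'{}'", tok.replace('\'', "'\"'\"'"))
    }
}

/// Join an argv into one shell-safe line, like shlex.join.
fn shell_join(argv: &[&str]) -> String {
    argv.iter()
        .map(|t| shell_quote(t))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Tokens such as `--gres=gpu:a100:1` and `--mem=8G` stay unquoted because `=` and `:` are in the safe set, which is the case that broke byte-parity.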
…back
The kernel hostname on Sol compute nodes is the short name (scc041), so a missing/failing/hanging `hostname -a` plus the bare kernel-name fallback locked the gate on a genuine Sol node. The fallback now mirrors Python socket.getfqdn: forward-resolve the kernel hostname, reverse-resolve the address (gethostbyaddr, whose aliases carry the .sol.rc.asu.edu form), and take the first dotted name, falling back to the kernel hostname only when resolution itself fails.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
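The name-selection step of that fallback can be sketched in isolation from the resolver calls. `pick_fqdn` is a hypothetical helper: it takes the kernel hostname plus the optional result of the reverse lookup (canonical name and aliases) and applies the same order as Python's `socket.getfqdn`, which checks the canonical name first, then the aliases, and keeps the canonical name if none contains a dot.

```rust
/// Choose the name to gate on. `reverse` is None when resolution failed,
/// otherwise (canonical_name, aliases) from the reverse lookup.
fn pick_fqdn(kernel_hostname: &str, reverse: Option<(&str, &[&str])>) -> String {
    match reverse {
        // Resolution itself failed: the kernel hostname is all we have.
        None => kernel_hostname.to_string(),
        Some((canonical, aliases)) => std::iter::once(canonical)
            .chain(aliases.iter().copied())
            // The first dotted name wins.
            .find(|name| name.contains('.'))
            // No dotted name at all: keep the canonical name.
            .unwrap_or(canonical)
            .to_string(),
    }
}
```

Keeping the selection pure makes it testable without DNS; the actual lookups (and any timeout around them) would live in the caller.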
…t usage parity
- Only bare '--version' / 'version' print the version: the eager fast path
is gone, so junk around either form is a clap usage error (exit 2), like
'version bogus', '--bogus --version', and '--version --bogus'.
- 'help' is solx's own argument-less subcommand (clap's auto help subcommand
is disabled), so 'solx help job' exits 2 instead of succeeding.
- 'job start --dry-run=VALUE' is a usage error (exit 2,
"Option '--dry-run' does not take a value.") instead of being read as a
template name; the tail parser also accepts the bare '-h' help token
(outside '--', matching the rest of the surface).
- Group help renders with the binary name ('Usage: solx job ...'), and
'job start --help' is bespoke text documenting -n/--dry-run, --timeout,
TEMPLATE, and the salloc passthrough under the full 'solx job start'
usage line. The Sol gate runs before the tail parse, in reference order.
- An unparseable $EDITOR is a runtime failure: exit 1 with one clean line.
- An invocation with _SOLX_COMPLETE set (the runtime-completion callback
protocol installed completion scripts use) exits 0 silently instead of
running a command.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The bash, zsh, and fish completion scripts embedded in the Rust binary must match the output of the v0.5.0 Python CLI, which is the behavioral spec for the port. Regenerated from the fixed branch A binary. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Split the README for two audiences: users install a prebuilt static binary from a CI release (no cargo, Python, or uv on the box), while contributors get a Toolchain on Sol section — sudo-free rustup user-install, CARGO_TARGET_DIR on node-local storage to avoid NFS build artifacts, a real-GET crates.io connectivity check (bare HEAD probes 403), and the glibc-2.28-on-Sol vs musl-in-CI split. DEVELOPMENT.md cross-references the README so the Sol notes have one home. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a Highlights speedup table comparing the Rust build against the v0.5.0 Python baseline (warm medians, NFS home), plus Changed/Added entries for the rewrite and the prebuilt static-binary release. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a build job that compiles the statically linked x86_64-unknown-linux-musl target and uploads it, so a reviewer can download the binary from a PR and run it on Sol with no Rust toolchain and no glibc-version dependency. DEVELOPMENT.md documents the native and musl builds and fetching the artifact from a PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Rust rewrite becomes the only solx. Rename the crate solx-rs/ → solx/ and delete the Python package (CLI, tests, .pyz build, install.sh, uv channel); bump to 1.0.0 (Cargo + SKILL.md).
- CI: fold Rust lint/test/build into ci.yml; release.yml now builds and publishes the static musl binary on a vX.Y.Z tag (verifies the tag against Cargo.toml + SKILL.md).
- keep: drop the implicit ~/.solkeep fallback. The config [keep] block is the only automatic keep-list source; `config import-solkeep` migrates an existing file and `--solkeep <file>` is an explicit per-run override.
- Docs: binary-only install (download + chmod) everywhere; CHANGELOG [1.0.0]; ROADMAP made forward-facing with the laptop-side promoted to the next focus; strip retired-tooling narration from the skill.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the clean-slate cut, solx no longer touches a legacy ~/.solkeep at all. The config [keep] block is the single keep-list source.
- Remove the `solx config import-solkeep` subcommand and the `solx keep --solkeep <file>` flag; `solx keep` reads only `[keep]` and errors (pointing at `solx config edit`) when it's absent. The `solx init` walkthrough no longer offers to import ~/.solkeep.
- Drop the now-dead config helpers (load_solkeep, import_solkeep, solkeep_is_order_sensitive, render_keep_block) and GitIgnoreSpec::empty; simplify starter_config_text.
- Regenerate the embedded bash/zsh/fish completion scripts (no import-solkeep / --solkeep).
- Tests: replace the fallback/import tests with a regression test that a ~/.solkeep on disk is ignored; rewrite the negation test to build rules from the include list.
- Parity matrix: drop the solkeep/import cases and fixtures; the run_case helper loses its HOME_SOLKEEP parameter.
- Eval mocks: the fake $HOME now ships a config.toml with a [keep] block instead of a ~/.solkeep.
- Docs: purge import-solkeep / --solkeep / migration guidance from the manual, skill, and references; the ROADMAP records the removal.
cargo fmt/clippy/test green (103 unit + 39 e2e); bash/zsh/fish completions syntax-check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…docs
Address review on PR #32:
- release.yml now runs `cargo test --locked` before building/publishing, so a `vX.Y.Z` tag can't ship a binary that skipped the suite (ci.yml only runs on main pushes/PRs, not tags). Use `--locked` for build/clippy/test in both workflows so CI fails on a stale Cargo.lock rather than silently updating it.
- Every documented install snippet now runs `mkdir -p ~/.local/bin` before the download: a fresh Sol account may not have the directory, which made `curl -fLo ~/.local/bin/solx` (and the `mv` variant) fail. Affects README, docs/solx.md, SKILL.md, references/solx.md, solx/README.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The behavioral-parity matrix (evals/parity/) was cross-version migration scaffolding: it diffed a new build against a captured golden of the prior implementation. With Python retired there's no second implementation to compare against, and the crate's own end-to-end + unit tests (which run in CI) now lock behavior. Remove it; git history keeps it recoverable if a future refactor wants the capture-and-diff approach again.
- Delete evals/parity/ (harness, fixtures, duplicate SLURM mocks).
- Repoint the "spec" references to solx/tests/cli.rs + unit vectors: solx/DEVELOPMENT.md (a behavior contract instead of "parity is the spec"), solx/README.md, docs/coverage.md, evals/README.md, CHANGELOG v1.0 entry.
Fresh-start docs pass: v1.0 is the starting point, so the user- and contributor-facing docs no longer narrate the v0.x/Python lineage.
- ROADMAP: replace the version-by-version stages table and the v0.5.0 Python latency deep-dive with a "What solx does today" overview; drop v0.4.0/v0.5.0 stamps from the design principles and confirmed decisions.
- coverage.md: reframe the header around the current Rust suite; drop "New in v0.4.0 / Updated for v0.5.0" provenance from the cells.
- solx/README.md + solx/DEVELOPMENT.md: present a native CLI, not a port.
- CHANGELOG keeps the history (per prior call); only dead cross-refs to the removed ROADMAP table were trimmed from the 0.5.0 entry.
cargo fmt/clippy/test green (103 unit + 39 e2e); user-facing docs carry no version lineage.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Fixes eval #4 (it encoded the bug: expected -p public for a 4h GPU job) and adds #8 (30-min ablation -> htc), #9 (multi-day -> public/general), #10 (smoke test -> debug QOS on public/general). Adds an L3 l3_sbatch_test_only check that validates the recommended header against the live scheduler, catching invalid combos like -p htc -q debug that regex alone misses. Regexes hardened: canonical 4h forms, a100:1 vs a100:10, day and HH:MM:SS walls.
v1.0: the native Rust binary is now the only solx

This promotes the Rust rewrite to v1.0 and makes a clean break: the Python implementation is gone and a single static binary is the entire product. Install is download + `chmod +x`: no Python, no `uv`, no toolchain.

Built on the 18 `solx-rs` commits (rebased onto `main` @ 0.5.1); the commits on top do the promotion, docs refresh, and the `~/.solkeep` removal.

Code
- … `solx-rs/` → `solx/` and deleted the Python package (CLI, tests, `.pyz` build, `install.sh`, `pyproject.toml`/`uv.lock`, `uv tool` channel).
- … `1.0.0` (`Cargo.toml` + `SKILL.md`; lock synced).
- … `~/.solkeep` end to end. The config `[keep]` block is the only keep-list source. `solx keep` never reads a legacy `~/.solkeep`; the `config import-solkeep` subcommand and the `keep --solkeep <file>` flag are gone; `solx init` no longer offers to import one. Dead config helpers removed, embedded completions regenerated.

CI
- … `ci.yml` (replaces the Python workflow).
- `release.yml` builds + publishes the static `x86_64-unknown-linux-musl` binary on a `vX.Y.Z` tag, verifying the tag against `Cargo.toml` + `SKILL.md`.

Docs
- … `references/`, `docs/solx.md`); Python badge → Rust.
- … `[1.0.0]` entry (Python retired + `~/.solkeep` removed). Historical Python/`.pyz` references kept as release history.
- … `DEVELOPMENT.md`, `docs/coverage.md`, eval harness docs + mocks (the fake `$HOME` now ships a `[keep]` config instead of a `.solkeep`), and the parity matrix (solkeep cases/fixtures dropped).
- … `sol_renew.py`, deprecation/removal history) from the skill.

Verification
- `cargo fmt --check`, `cargo clippy -D warnings`, `cargo test` all green: 103 unit + 39 end-to-end tests pass.
- … `solkeep` references.

🤖 Generated with Claude Code