Merged
23 commits
7a0b028
solx-rs: scaffold the Rust crate
Shu-Wan Jun 9, 2026
9938d2e
solx-rs: output, side, config, and slurm core
Shu-Wan Jun 9, 2026
f74cf07
solx-rs: job command bodies and the start-tail parser
Shu-Wan Jun 9, 2026
fcdccdb
solx-rs: keep — CSV plan, enumeration, touch pipeline
Shu-Wan Jun 9, 2026
2b4f1ec
solx-rs: CLI dispatch, init, completions
Shu-Wan Jun 9, 2026
ea07d59
solx-rs: end-to-end tests over the real binary
Shu-Wan Jun 9, 2026
7aef94a
solx-rs: CI workflow and docs
Shu-Wan Jun 9, 2026
c6d1f6e
solx-rs: sync completion assets with golden-v050 scripts
Shu-Wan Jun 9, 2026
407cea0
solx-rs: correct README doc path and toolchain wording
Shu-Wan Jun 9, 2026
8a40a28
solx-rs: port pathspec GitIgnoreSpec for keep matching; plain-rendere…
Shu-Wan Jun 10, 2026
5817bc8
solx-rs: keep — fail loudly on unreadable CSVs, 0600 plan spill, ONLN…
Shu-Wan Jun 10, 2026
bdb80fc
solx-rs: shell_join — quote like Python shlex.join
Shu-Wan Jun 10, 2026
9ae13a4
solx-rs: Sol gate — DNS-resolved FQDN before the kernel-hostname fall…
Shu-Wan Jun 10, 2026
44e9aaa
solx-rs: dispatch — strict version forms, real help command, job-star…
Shu-Wan Jun 10, 2026
46d2a15
solx-rs: resync completion assets with fixed v0.5.0 scripts
Shu-Wan Jun 10, 2026
7ee613c
solx-rs: document install and Sol toolchain setup
Shu-Wan Jun 10, 2026
97c1445
docs: lead the changelog with the v0.5.0 → v1.0 latency table
Shu-Wan Jun 10, 2026
d937ab4
ci: build and attach the musl binary as a per-PR artifact
Shu-Wan Jun 10, 2026
9e18daa
v1.0: retire Python, ship the native Rust binary
Shu-Wan Jun 11, 2026
7cbb2ee
v1.0: remove ~/.solkeep entirely — config [keep] is the only keep-list
Shu-Wan Jun 11, 2026
ecb2a6b
v1.0: gate releases on the test suite; mkdir ~/.local/bin in install …
Shu-Wan Jun 11, 2026
28f6208
v1.0: drop the parity matrix; treat v1.0 as a fresh start in the docs
Shu-Wan Jun 11, 2026
c8e971f
Route GPU jobs by wall-time + Sol cheat sheet (CLI / PDF / skill) (#34)
Shu-Wan Jun 11, 2026
91 changes: 31 additions & 60 deletions .github/workflows/ci.yml
@@ -1,8 +1,9 @@
name: ci

# Lint + test the solx CLI on every push to main and every PR. The agent
# skill itself is prose + references (no build step); its evals run out of
# band (see DEVELOPMENT.md), so CI guards the code that ships as an artifact.
# Lint, test, and build the solx binary on every push to main and every PR.
# The agent skill itself is prose + references (no build step); its evals run
# out of band (see DEVELOPMENT.md), so CI guards the code that ships as the
# release artifact.

on:
push:
@@ -17,80 +18,50 @@ concurrency:
cancel-in-progress: true

jobs:
test:
name: test + lint (py${{ matrix.python }})
check:
name: test + lint
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python: ["3.10", "3.11", "3.12", "3.13"]
defaults:
run:
working-directory: solx
steps:
- uses: actions/checkout@v5

- name: Install uv (Python ${{ matrix.python }})
uses: astral-sh/setup-uv@v7
- uses: dtolnay/rust-toolchain@stable
with:
python-version: ${{ matrix.python }}
enable-cache: true
working-directory: solx # the uv project + lockfile live here

- name: Sync dependencies (frozen)
run: uv sync --frozen

- name: Lint
run: uv run --frozen ruff check src tests # pinned via uv.lock (dev group)

components: rustfmt, clippy
- uses: Swatinem/rust-cache@v2
with:
workspaces: solx
- name: Format
run: cargo fmt --all --check
- name: Clippy
run: cargo clippy --locked --all-targets -- -D warnings
- name: Test
run: uv run --frozen pytest -q
run: cargo test --locked

build:
# Build the single-file zipapp and attach it to the run, so a reviewer
# can install and test the PR's solx on Sol without building it: download
# solx.pyz from Checks -> Artifacts, then run it through install.sh (which
# re-stamps the shebang for the local interpreter — see DEVELOPMENT.md).
name: build solx.pyz
# Build the portable release binary and attach it to the run, so a
# reviewer can download it from Checks -> Artifacts and test the PR's
# solx on Sol without a Rust toolchain. The musl target links libc
# statically, so the artifact runs on any x86-64 Linux — Sol's RHEL 8
# included — regardless of the host glibc.
runs-on: ubuntu-latest
env:
SOLX_PYTHON: "3.11" # the zipapp's embedded bytecode is 3.11-specific
defaults:
run:
working-directory: solx
steps:
- uses: actions/checkout@v5

- name: Install uv (Python 3.11)
uses: astral-sh/setup-uv@v7
- uses: dtolnay/rust-toolchain@stable
with:
python-version: "3.11"
enable-cache: true
working-directory: solx

- name: Build zipapp
run: bash scripts/build-pyz.sh

- name: Smoke-test the artifact and installer
# The build's only check used to be that zipfile could open the
# archive — which tolerates the corruption the 0.5.0 installer
# produced. Actually *run* it: in place, and end-to-end through
# install.sh. The installer rebinds the interpreter, so point it at a
# path whose length differs from the build shebang (a symlink under
# $RUNNER_TEMP) — that is the condition under which an in-place shebang
# swap corrupts the offsets, so a regression here fails the build
# instead of shipping. Same runner, same interpreter, so no fallback.
run: |
set -eux
./dist/solx.pyz --version
ln -sf "$(uv python find 3.11)" "$RUNNER_TEMP/python3.11"
SOLX_INSTALL_DIR="$RUNNER_TEMP/bin" SOLX_PYTHON="$RUNNER_TEMP/python3.11" \
sh scripts/install.sh ./dist/solx.pyz
"$RUNNER_TEMP/bin/solx" --version

- name: Upload zipapp
targets: x86_64-unknown-linux-musl
- uses: Swatinem/rust-cache@v2
with:
workspaces: solx
- name: Build (musl static)
run: cargo build --locked --release --target x86_64-unknown-linux-musl
- name: Upload binary
uses: actions/upload-artifact@v4
with:
name: solx-pyz
path: solx/dist/solx.pyz
name: solx-x86_64-linux-musl
path: solx/target/x86_64-unknown-linux-musl/release/solx
if-no-files-found: error
60 changes: 30 additions & 30 deletions .github/workflows/release.yml
@@ -1,8 +1,8 @@
name: release

# CLI-first release. A pushed `vX.Y.Z` tag builds the single-file zipapp
# (solx.pyz) and publishes a GitHub Release with it + install.sh attached, so
# curl -fsSL .../releases/latest/download/install.sh | sh
# A pushed `vX.Y.Z` tag builds the static `solx` binary and publishes a
# GitHub Release with it attached, so
# curl -fLo solx .../releases/latest/download/solx-x86_64-unknown-linux-musl
# always fetches the build matching the tag. The skill rides the same tag
# (one version line — see CHANGELOG.md), installed from the repo tree.

@@ -12,31 +12,30 @@ on:
workflow_dispatch:
inputs:
tag:
description: "Existing tag to (re)build a release for, e.g. v0.4.0"
description: "Existing tag to (re)build a release for, e.g. v1.0.0"
required: true

permissions:
contents: write

jobs:
release:
name: build .pyz + publish release
name: build binary + publish release
runs-on: ubuntu-latest
env:
# Must match build-pyz.sh / install.sh: the embedded bytecode is
# interpreter-specific, so the build and the install shebang agree.
SOLX_PYTHON: "3.11"
defaults:
run:
working-directory: solx
steps:
- uses: actions/checkout@v5
with:
ref: ${{ github.event.inputs.tag || github.ref }}

- name: Install uv (Python 3.11)
uses: astral-sh/setup-uv@v7
- uses: dtolnay/rust-toolchain@stable
with:
targets: x86_64-unknown-linux-musl
- uses: Swatinem/rust-cache@v2
with:
python-version: "3.11"
enable-cache: true
working-directory: solx # the uv project + lockfile live here
workspaces: solx

- name: Resolve tag
id: tag
@@ -47,28 +46,30 @@ jobs:
REF_NAME: ${{ github.ref_name }}
run: echo "tag=${INPUT_TAG:-$REF_NAME}" >> "$GITHUB_OUTPUT"

- name: Verify the tag matches the one version line (CLI, package, skill)
working-directory: solx
- name: Verify the tag matches the one version line (crate + skill)
env:
TAG: ${{ steps.tag.outputs.tag }}
run: |
uv sync --frozen
want="${TAG#v}"
cli="$(uv run --frozen solx --version)"
pkg="$(sed -nE 's/^version = "([^"]+)".*/\1/p' pyproject.toml | head -1)"
crate="$(sed -nE 's/^version = "([^"]+)".*/\1/p' Cargo.toml | head -1)"
skill="$(sed -nE 's/^version:[[:space:]]*([^[:space:]]+).*/\1/p' ../skills/sol-skill/SKILL.md | head -1)"
echo "tag=$want solx=$cli pyproject=$pkg SKILL.md=$skill"
if [ "$cli" != "$want" ] || [ "$pkg" != "$want" ] || [ "$skill" != "$want" ]; then
echo "::error::version mismatch — tag=$want solx=$cli pyproject=$pkg SKILL.md=$skill. Bump all three (and uv lock) or retag." >&2
echo "tag=$want Cargo.toml=$crate SKILL.md=$skill"
if [ "$crate" != "$want" ] || [ "$skill" != "$want" ]; then
echo "::error::version mismatch — tag=$want Cargo.toml=$crate SKILL.md=$skill. Bump both (and the lockfile) or retag." >&2
exit 1
fi

- name: Run tests
working-directory: solx
run: uv run --frozen pytest -q
- name: Test (locked)
run: cargo test --locked

- name: Build (musl static)
run: cargo build --locked --release --target x86_64-unknown-linux-musl

- name: Build single-file zipapp
run: bash solx/scripts/build-pyz.sh
- name: Stage the release asset
run: |
install -m 755 \
target/x86_64-unknown-linux-musl/release/solx \
solx-x86_64-unknown-linux-musl

- name: Publish GitHub Release
env:
@@ -77,11 +78,10 @@ jobs:
run: |
# Create on first run; on a re-run (workflow_dispatch) refresh assets.
if gh release view "$TAG" >/dev/null 2>&1; then
gh release upload "$TAG" solx/dist/solx.pyz solx/scripts/install.sh --clobber
gh release upload "$TAG" solx-x86_64-unknown-linux-musl --clobber
else
gh release create "$TAG" \
--title "$TAG" \
--generate-notes \
solx/dist/solx.pyz \
solx/scripts/install.sh
solx-x86_64-unknown-linux-musl
fi
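
Review note: the version-line check in the release workflow above can be tried locally before pushing a tag. The following is a minimal sketch, not the workflow's own script. `check_version_line` is a hypothetical helper name, and the file paths are passed in rather than assumed. The two `sed` expressions are copied from the workflow step. The sketch assumes the first `version = "..."` line in `Cargo.toml` is the package version.

```shell
#!/bin/sh
# Local sketch of the "one version line" check from the release workflow.
# check_version_line is a hypothetical helper, not part of solx.
check_version_line() {
  tag="$1"        # tag as pushed, e.g. v1.0.0
  cargo_toml="$2" # path to the crate's Cargo.toml
  skill_md="$3"   # path to the skill's SKILL.md

  # Strip the leading "v" so the tag compares against bare version strings.
  want="${tag#v}"

  # First `version = "..."` line in Cargo.toml (assumed to be the package version).
  crate="$(sed -nE 's/^version = "([^"]+)".*/\1/p' "$cargo_toml" | head -1)"

  # First `version: ...` line in SKILL.md.
  skill="$(sed -nE 's/^version:[[:space:]]*([^[:space:]]+).*/\1/p' "$skill_md" | head -1)"

  echo "tag=$want Cargo.toml=$crate SKILL.md=$skill"
  if [ "$crate" != "$want" ] || [ "$skill" != "$want" ]; then
    echo "version mismatch: bump both files or retag" >&2
    return 1
  fi
}
```

From the repository root, a call such as `check_version_line v1.0.0 solx/Cargo.toml skills/sol-skill/SKILL.md` returns non-zero when either file disagrees with the tag, which is the condition under which the workflow step exits with an error.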
92 changes: 86 additions & 6 deletions CHANGELOG.md
@@ -7,9 +7,88 @@ This project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
and the [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) format.

From v0.4.0 the CLI and the skill share **one version line**: each entry's
version matches `solx/src/solx/__init__.py`, the `version` field in
[`skills/sol-skill/SKILL.md`](skills/sol-skill/SKILL.md), and the git tag,
and a pushed `vX.Y.Z` tag builds and publishes the release.
version matches the `version` field in [`solx/Cargo.toml`](solx/Cargo.toml)
and in [`skills/sol-skill/SKILL.md`](skills/sol-skill/SKILL.md), and the git
tag; a pushed `vX.Y.Z` tag builds and publishes the release.

## [1.0.0] — 2026-06-10

solx is now a single native binary (Rust); the Python implementation is
retired. Every command starts in ~1ms with no Python interpreter and no
per-module NFS reads, so startup no longer degrades under node load or a
cold NFS cache. Install is one static file — download and `chmod +x` — with
no `uv`, no Python, and no toolchain on the box.

### Highlights

Startup latency, warm median on a Sol compute node (NFS `$HOME`):

| command | raw `squeue` | v0.5.0 (Python) | **v1.0 (Rust)** | speedup |
|---|---|---|---|---|
| `solx --version` | — | 0.10s | **0.010s** | 10× |
| `solx job list` | 0.08s | 0.39s | **0.12s** | 3.3× |
| `solx job time` | 0.08s | 0.31s | **0.12s** | 2.6× |

The binary tracks raw `squeue` — its residual over `squeue` is just the
`squeue` subprocess it spawns — and, unlike the Python builds, its startup
is flat regardless of node load or cache state. ~4.9MB, no runtime
dependencies (no Python, `uv`, or `rustc` on the target).

### Added

- **`solx cheatsheet`** — prints the Sol quick reference (SLURM basics,
`solx` ↔ raw SLURM, the partition/QOS table, Sol's `my*`/`show*`
wrappers, laptop tunnels) as text. It's embedded from the skill's single
source `skills/sol-skill/references/cheatsheet.md`, so the CLI, the
rendered [`docs/cheatsheet.pdf`](docs/cheatsheet.pdf), and the skill
reference can't drift. Wired into the bash/zsh/fish completions.
- **The Sol cheat sheet** in the skill —
`skills/sol-skill/references/cheatsheet.md`, with a centered README nav
and a `scripts/build-cheatsheet.sh` PDF build.
- **Eval-harness L3 grader `l3_sbatch_test_only`** — validates an agent's
recommended `#SBATCH` header against the live scheduler (`sbatch
--test-only`), catching partition/QOS combos that read plausibly but the
scheduler rejects (e.g. `-p htc -q debug`).

### Changed

- **The CLI is rewritten in Rust** (the `solx/` crate), preserving the
v0.5.0 command surface, output contract, and exit codes; behavioral
parity was verified during the port and is locked going forward by the
crate's test suite (`solx/tests/cli.rs` + unit vectors). The agent
skill's operational guidance is unchanged apart from the install steps,
the dropped `~/.solkeep` fallback (below), and the partition/QOS rework
(next).
- **SLURM partition/QOS guidance reworked.** The skill routes jobs by
wall-time and priority, not CPU-vs-GPU: ≤4h work (GPUs included) → `htc`;
a ≤15-minute urgent check → `-p public -q debug`; longer runs → `public`
(or `general` with `-q private` for preemptible buy-in nodes). This
fixes the "GPU → `public`" reflex that parked short GPU jobs behind
multi-day ones. The Submitting-Jobs section is promoted ahead of storage
and gains a personalized "know your access" step (`sacctmgr show assoc`).
Factual corrections verified against the live scheduler: `htc` carries
H200 nodes; `highmem`'s wall is 7 days; there is no `myquota` wrapper
(use `beegfs-ctl --getquota`); `sq` is the whole-cluster queue, not
`squeue --me`.
- **Install is a prebuilt static binary.** Download
`solx-x86_64-unknown-linux-musl` from the release, `chmod +x`, and drop
it on `PATH`. The `curl install.sh | sh` and `uv tool install` channels
are gone, along with their `uv`/Python requirement. See
[`solx/README.md`](solx/README.md).
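
The install step described above (download, `chmod +x`, drop on `PATH`) can be sketched as a small shell helper. This is an illustration under assumptions, not the documented installer: `install_solx` is a hypothetical name, the default destination `~/.local/bin` is an assumption, and the release asset is expected to have been downloaded already.

```shell
#!/bin/sh
# Sketch of a manual install of the prebuilt static binary.
# install_solx is a hypothetical helper, not shipped with solx.
install_solx() {
  asset="$1"                        # path to the downloaded release asset
  dest_dir="${2:-$HOME/.local/bin}" # assumed default; any directory on PATH works

  mkdir -p "$dest_dir"
  # Copy the asset into place as `solx` and mark it executable in one step.
  install -m 755 "$asset" "$dest_dir/solx"
}

# Usage, once the asset is on disk:
#   install_solx ./solx-x86_64-unknown-linux-musl
#   solx --version
```

Using `install -m 755` does the copy and the `chmod` together, so the destination never holds a non-executable file.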

### Removed

- **The Python implementation.** The Typer-then-`argparse` CLI that lived
at `solx/` — its test suite, the `.pyz` zipapp build (`build-pyz.sh`),
`install.sh`, and the `uv tool` install channel — is deleted. `solx/`
now holds the Rust crate, the only solx; the `.pyz` and `uv` install
channels no longer exist.
- **`~/.solkeep` support, end to end.** The config `[keep]` block is now
the only keep-list source: `solx keep` never reads a `~/.solkeep` (the
implicit fallback, deprecated since 0.4.0, was slated for 1.0.0), and the
`solx config import-solkeep` command and the `--solkeep <file>` flag are
removed with it. With no `[keep]` block, `keep` errors and points at
`solx config edit`.

## [0.5.1] — 2026-06-10

@@ -51,7 +130,7 @@ A `solx job` read now costs the same order as a raw SLURM call. Absolute
startup over NFS scales with node load — Python pays a per-module open
storm, so v0.4.0 can reach ~2.5s under contention — and the win is
removing that import tree. On node-local disk the floor is lower still
(`--version` ~0.02s). Full table in `docs/ROADMAP.md`.
(`--version` ~0.02s).

### Upgrading

@@ -85,7 +164,7 @@ removing that import tree. On node-local disk the floor is lower still
output contract are unchanged apart from the two documented supersets
below (`--json` placement and `-h`); verified with `evals/parity/`.
- **Startup latency** drops to the order of a raw SLURM call (see
Highlights above; full table in `docs/ROADMAP.md`): removing the
Highlights above): removing the
Typer/`click`/`rich` import tree cuts a `solx job` read from seconds to
~0.1–0.4s warm on the NFS `$HOME` install, ~13× / 6.4× / 8.1× over
v0.4.0 on `--version` / `job list` / `job time`. On node-local disk the
@@ -427,7 +506,8 @@ agentskills.io-compatible layout (skill content under
CSV-driven `/scratch` renewal, and shipped the original references
(`module.md`, `scratch.md`, `sharing.md`, `slurm.md`).

[Unreleased]: https://github.com/Shu-Wan/solx/compare/v0.5.1...HEAD
[Unreleased]: https://github.com/Shu-Wan/solx/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/Shu-Wan/solx/releases/tag/v1.0.0
[0.5.1]: https://github.com/Shu-Wan/solx/releases/tag/v0.5.1
[0.5.0]: https://github.com/Shu-Wan/solx/releases/tag/v0.5.0
[0.4.0]: https://github.com/Shu-Wan/solx/releases/tag/v0.4.0