Skip to content

Add Linux/macOS test matrix and portability fallbacks#37

Open
jstar0 wants to merge 2 commits into
nano-step:mainfrom
jstar0:ci/test-matrix
Open

Add Linux/macOS test matrix and portability fallbacks#37
jstar0 wants to merge 2 commits into
nano-step:mainfrom
jstar0:ci/test-matrix

Conversation

@jstar0

@jstar0 jstar0 commented Jun 4, 2026

Copy link
Copy Markdown

Add Linux/macOS test matrix and portability fallbacks

Summary

Fixes #24.

  • Add a GitHub Actions workflow that runs every scripts/eval/tests/*.sh suite on ubuntu-latest and macos-latest.
  • Make the harness test path work on macOS Bash 3.2 by replacing mapfile usage and removing grep -P from the pre-push hook.
  • Keep the documented python3 stdlib path working without PyYAML by adding stdlib fallback coverage for the yq shim.
  • Add portable timeout and SHA-256 helpers so the test path can use python3/shasum fallbacks when GNU tools are unavailable.

Changes

  • Add .github/workflows/tests.yml with a Linux/macOS matrix for the shell test suite.
  • Add scripts/eval/lib/portable.sh for SHA-256 and timeout fallbacks.
  • Update manifest, stability, spawn, run, and hook paths to use portable helpers and Bash 3-compatible array loading.
  • Extend test coverage for stdlib YAML parsing, portable tool fallbacks, and pre-push hook skill detection.

Why

Issue #24 asks for a CI matrix that verifies the full shell test suite on Linux and macOS. Running the suite locally on macOS exposed a few portability blockers that would make that matrix fail or depend on non-base tools:

  • macOS /bin/bash 3.2 does not provide mapfile.
  • grep -P is not portable to BSD grep.
  • a clean Python environment may not have PyYAML even though the project documents a python3 stdlib fallback.
  • timeout/sha256sum availability differs across macOS environments.

The workflow and portability fixes are kept together so the new macOS job has a realistic chance of validating the existing suite instead of immediately failing on known toolchain assumptions.

Verification

for t in ./scripts/eval/tests/*.sh; do echo "== $(basename "$t")"; bash "$t"; done
PATH=/usr/bin:/bin:/usr/sbin:/sbin bash -lc 'for t in ./scripts/eval/tests/*.sh; do echo "== $(basename "$t")"; bash "$t"; done'
npm test

Risk

The largest change is the stdlib fallback path in the yq shim. I added focused coverage for the YAML shapes the harness currently relies on: nested maps, inline lists, block scalars, inline comments, list iteration, indexed objects, and default expressions. The workflow provisions jq when missing; timeout and SHA-256 portability are handled inside the harness rather than hidden by installing GNU coreutils in CI.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces portability improvements to the evaluation harness scripts, making them compatible with macOS and environments lacking GNU coreutils or PyYAML. Key changes include replacing GNU-specific commands (like timeout and mapfile) with portable shell alternatives, implementing a fallback YAML parser in Python's standard library, and introducing a portable.sh utility library. The review feedback highlights a few remaining compatibility and parsing issues: the use of the GNU-specific sort -z option in manifest.sh which will fail on macOS, and edge cases in the custom YAML parser (_yq.py) where inline comments on keys or list items without values are not correctly stripped, leading to parsing errors.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 38 to 42
skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Since this PR specifically aims to support macOS without requiring GNU coreutils, using sort -z will cause the manifest generation to fail on macOS.

A highly portable alternative to sort null-terminated lines is to translate null bytes to newlines, sort them, and translate them back using tr.

Suggested change
skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"
skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \
| tr '\0' '\n' | sort | tr '\n' '\0' \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Comment on lines 50 to 54
skill_sha="$(cd "$skill_dir" && find . -type f -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Use tr to portably sort null-terminated lines.

Suggested change
skill_sha="$(cd "$skill_dir" && find . -type f -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"
skill_sha="$(cd "$skill_dir" && find . -type f -print0 \
| tr '\0' '\n' | sort | tr '\n' '\0' \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Comment on lines 60 to 64
fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Use tr to portably sort null-terminated lines.

Suggested change
fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \
| sort -z \
| xargs -0 sha256sum 2>/dev/null \
| sha256sum \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"
fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \
| tr '\0' '\n' | sort | tr '\n' '\0' \
| while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \
| portable_sha256_stdin \
| cut -d' ' -f1)"

Comment thread scripts/eval/lib/_yq.py
Comment on lines +216 to +226
key, raw_value = split_key_value(text)
if key is None:
break
key = parse_scalar(key)
if raw_value in ("|", ">"):
out[key], i = parse_block_scalar(lines, i + 1, ind, raw_value)
continue
if raw_value != "":
out[key] = parse_scalar(raw_value)
i += 1
continue

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If a key has an inline comment but no value (e.g., unsafe_shell: # comment), raw_value will contain the comment string (e.g., "# comment"). Because raw_value != "" is true, it will parse the comment as an empty string instead of None/null.

We should strip the inline comment from raw_value before checking if it is empty.

Suggested change
key, raw_value = split_key_value(text)
if key is None:
break
key = parse_scalar(key)
if raw_value in ("|", ">"):
out[key], i = parse_block_scalar(lines, i + 1, ind, raw_value)
continue
if raw_value != "":
out[key] = parse_scalar(raw_value)
i += 1
continue
key, raw_value = split_key_value(text)
if key is None:
break
key = parse_scalar(key)
val_clean = strip_inline_comment(raw_value).strip()
if val_clean in ("|", ">"):
out[key], i = parse_block_scalar(lines, i + 1, ind, val_clean)
continue
if val_clean != "":
out[key] = parse_scalar(val_clean)
i += 1
continue

Comment thread scripts/eval/lib/_yq.py
Comment on lines +271 to +279
if raw_value == "":
item = {key: None}
if j < len(lines) and indent_of(lines[j]) > ind:
item[key], i = parse_block(lines, j, indent_of(lines[j]))
else:
item = {key: parse_scalar(raw_value)}
if j < len(lines) and indent_of(lines[j]) > ind:
rest, i = parse_dict(lines, j, indent_of(lines[j]))
item.update(rest)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If a list item key has an inline comment (e.g., - shell: # comment), raw_value will contain the comment string. This causes the parser to incorrectly execute the else block, setting the value to "" and flattening any nested block structures (e.g., cmd: echo under shell will be merged into the parent dictionary instead of nested under shell).

We should strip the inline comment from raw_value before checking if it is empty.

Suggested change
if raw_value == "":
item = {key: None}
if j < len(lines) and indent_of(lines[j]) > ind:
item[key], i = parse_block(lines, j, indent_of(lines[j]))
else:
item = {key: parse_scalar(raw_value)}
if j < len(lines) and indent_of(lines[j]) > ind:
rest, i = parse_dict(lines, j, indent_of(lines[j]))
item.update(rest)
val_clean = strip_inline_comment(raw_value).strip()
if val_clean == "":
item = {key: None}
if j < len(lines) and indent_of(lines[j]) > ind:
item[key], i = parse_block(lines, j, indent_of(lines[j]))
else:
item = {key: parse_scalar(val_clean)}
if j < len(lines) and indent_of(lines[j]) > ind:
rest, i = parse_dict(lines, j, indent_of(lines[j]))
item.update(rest)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Verify portability on macOS (BSD coreutils) — set up a CI matrix

1 participant