Add Linux/macOS test matrix and portability fallbacks#37
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces portability improvements to the evaluation harness scripts, making them compatible with macOS and environments lacking GNU coreutils or PyYAML. Key changes include replacing GNU-specific commands (like timeout and mapfile) with portable shell alternatives, implementing a fallback YAML parser in Python's standard library, and introducing a portable.sh utility library. The review feedback highlights a few remaining compatibility and parsing issues: the use of the GNU-specific sort -z option in manifest.sh which will fail on macOS, and edge cases in the custom YAML parser (_yq.py) where inline comments on keys or list items without values are not correctly stripped, leading to parsing errors.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \ | ||
| | sort -z \ | ||
| | xargs -0 sha256sum 2>/dev/null \ | ||
| | sha256sum \ | ||
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | ||
| | portable_sha256_stdin \ | ||
| | cut -d' ' -f1)" |
There was a problem hiding this comment.
The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Since this PR specifically aims to support macOS without requiring GNU coreutils, using sort -z will cause the manifest generation to fail on macOS.
A highly portable alternative to sort null-terminated lines is to translate null bytes to newlines, sort them, and translate them back using tr.
| skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \ | |
| | sort -z \ | |
| | xargs -0 sha256sum 2>/dev/null \ | |
| | sha256sum \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" | |
| skill_bundle_sha="$(cd "$skills_root" && find . -type f \( -name "*.md" -o -name "*.sh" -o -name "*.yaml" -o -name "*.json" \) -print0 \ | |
| | tr '\0' '\n' | sort | tr '\n' '\0' \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" |
| skill_sha="$(cd "$skill_dir" && find . -type f -print0 \ | ||
| | sort -z \ | ||
| | xargs -0 sha256sum 2>/dev/null \ | ||
| | sha256sum \ | ||
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | ||
| | portable_sha256_stdin \ | ||
| | cut -d' ' -f1)" |
There was a problem hiding this comment.
The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Use tr to portably sort null-terminated lines.
| skill_sha="$(cd "$skill_dir" && find . -type f -print0 \ | |
| | sort -z \ | |
| | xargs -0 sha256sum 2>/dev/null \ | |
| | sha256sum \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" | |
| skill_sha="$(cd "$skill_dir" && find . -type f -print0 \ | |
| | tr '\0' '\n' | sort | tr '\n' '\0' \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" |
| fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \ | ||
| | sort -z \ | ||
| | xargs -0 sha256sum 2>/dev/null \ | ||
| | sha256sum \ | ||
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | ||
| | portable_sha256_stdin \ | ||
| | cut -d' ' -f1)" |
There was a problem hiding this comment.
The sort -z option is a GNU extension and is not supported by the default BSD sort utility on macOS. Use tr to portably sort null-terminated lines.
| fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \ | |
| | sort -z \ | |
| | xargs -0 sha256sum 2>/dev/null \ | |
| | sha256sum \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" | |
| fixture_sha="$(cd "$EVAL_FIXTURE_DIR" && find . -type f -print0 \ | |
| | tr '\0' '\n' | sort | tr '\n' '\0' \ | |
| | while IFS= read -r -d '' file; do portable_sha256_file "$file"; done \ | |
| | portable_sha256_stdin \ | |
| | cut -d' ' -f1)" |
| key, raw_value = split_key_value(text) | ||
| if key is None: | ||
| break | ||
| key = parse_scalar(key) | ||
| if raw_value in ("|", ">"): | ||
| out[key], i = parse_block_scalar(lines, i + 1, ind, raw_value) | ||
| continue | ||
| if raw_value != "": | ||
| out[key] = parse_scalar(raw_value) | ||
| i += 1 | ||
| continue |
There was a problem hiding this comment.
If a key has an inline comment but no value (e.g., unsafe_shell: # comment), raw_value will contain the comment string (e.g., "# comment"). Because raw_value != "" is true, it will parse the comment as an empty string instead of None/null.
We should strip the inline comment from raw_value before checking if it is empty.
| key, raw_value = split_key_value(text) | |
| if key is None: | |
| break | |
| key = parse_scalar(key) | |
| if raw_value in ("|", ">"): | |
| out[key], i = parse_block_scalar(lines, i + 1, ind, raw_value) | |
| continue | |
| if raw_value != "": | |
| out[key] = parse_scalar(raw_value) | |
| i += 1 | |
| continue | |
| key, raw_value = split_key_value(text) | |
| if key is None: | |
| break | |
| key = parse_scalar(key) | |
| val_clean = strip_inline_comment(raw_value).strip() | |
| if val_clean in ("|", ">"): | |
| out[key], i = parse_block_scalar(lines, i + 1, ind, val_clean) | |
| continue | |
| if val_clean != "": | |
| out[key] = parse_scalar(val_clean) | |
| i += 1 | |
| continue |
| if raw_value == "": | ||
| item = {key: None} | ||
| if j < len(lines) and indent_of(lines[j]) > ind: | ||
| item[key], i = parse_block(lines, j, indent_of(lines[j])) | ||
| else: | ||
| item = {key: parse_scalar(raw_value)} | ||
| if j < len(lines) and indent_of(lines[j]) > ind: | ||
| rest, i = parse_dict(lines, j, indent_of(lines[j])) | ||
| item.update(rest) |
There was a problem hiding this comment.
If a list item key has an inline comment (e.g., - shell: # comment), raw_value will contain the comment string. This causes the parser to incorrectly execute the else block, setting the value to "" and flattening any nested block structures (e.g., cmd: echo under shell will be merged into the parent dictionary instead of nested under shell).
We should strip the inline comment from raw_value before checking if it is empty.
| if raw_value == "": | |
| item = {key: None} | |
| if j < len(lines) and indent_of(lines[j]) > ind: | |
| item[key], i = parse_block(lines, j, indent_of(lines[j])) | |
| else: | |
| item = {key: parse_scalar(raw_value)} | |
| if j < len(lines) and indent_of(lines[j]) > ind: | |
| rest, i = parse_dict(lines, j, indent_of(lines[j])) | |
| item.update(rest) | |
| val_clean = strip_inline_comment(raw_value).strip() | |
| if val_clean == "": | |
| item = {key: None} | |
| if j < len(lines) and indent_of(lines[j]) > ind: | |
| item[key], i = parse_block(lines, j, indent_of(lines[j])) | |
| else: | |
| item = {key: parse_scalar(val_clean)} | |
| if j < len(lines) and indent_of(lines[j]) > ind: | |
| rest, i = parse_dict(lines, j, indent_of(lines[j])) | |
| item.update(rest) |
Add Linux/macOS test matrix and portability fallbacks
Summary
Fixes #24.
scripts/eval/tests/*.shsuite onubuntu-latestandmacos-latest.mapfileusage and removinggrep -Pfrom the pre-push hook.python3stdlib path working without PyYAML by adding stdlib fallback coverage for the yq shim.python3/shasumfallbacks when GNU tools are unavailable.Changes
.github/workflows/tests.ymlwith a Linux/macOS matrix for the shell test suite.scripts/eval/lib/portable.shfor SHA-256 and timeout fallbacks.Why
Issue #24 asks for a CI matrix that verifies the full shell test suite on Linux and macOS. Running the suite locally on macOS exposed a few portability blockers that would make that matrix fail or depend on non-base tools:
/bin/bash3.2 does not providemapfile.grep -Pis not portable to BSD grep.python3stdlib fallback.timeout/sha256sumavailability differs across macOS environments.The workflow and portability fixes are kept together so the new macOS job has a realistic chance of validating the existing suite instead of immediately failing on known toolchain assumptions.
Verification
PATH=/usr/bin:/bin:/usr/sbin:/sbin bash -lc 'for t in ./scripts/eval/tests/*.sh; do echo "== $(basename "$t")"; bash "$t"; done'npm testRisk
The largest change is the stdlib fallback path in the yq shim. I added focused coverage for the YAML shapes the harness currently relies on: nested maps, inline lists, block scalars, inline comments, list iteration, indexed objects, and default expressions. The workflow provisions
jqwhen missing; timeout and SHA-256 portability are handled inside the harness rather than hidden by installing GNU coreutils in CI.