Skip to content

Integrate Rust crates with docs and CI/CD#7

Open
countradooku wants to merge 2 commits into
unjs:mainfrom
countradooku:feat/rust-crate-integration
Open

Integrate Rust crates with docs and CI/CD#7
countradooku wants to merge 2 commits into
unjs:mainfrom
countradooku:feat/rust-crate-integration

Conversation

@countradooku

@countradooku countradooku commented Mar 7, 2026

Copy link
Copy Markdown

Summary by CodeRabbit

  • New Features
    • Rust crate providing safe bindings for markdown rendering (HTML, AST, ANSI, text, metadata) and a bundled native dependency (no system install required)
  • CI/CD
    • Rust workspace tests added to CI and automated publish steps to crates.io (guarded by registry token)
  • Documentation
    • README and changelog updated to document the Rust bindings
  • Chores
    • Ignore rules updated for Rust build artifacts

@coderabbitai

coderabbitai Bot commented Mar 7, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Adds a Rust workspace with safe and sys crates for md4x, vendors libyaml 0.2.5 (C sources, build system, tests, docs), and updates CI/release workflows to build, test, and optionally publish the Rust crates to crates.io; also updates ignores and documentation to reflect the new Rust bindings.

Changes

Cohort / File(s) Summary
CI/CD Workflows
./github/workflows/ci.yml, ./github/workflows/release.yml
Inserted Rust toolchain setup, rust dependency cache, workspace tests (cargo test --workspace --locked in rust/) and crates.io publish steps gated by CARGO_REGISTRY_TOKEN (publish md4x-sys then md4x, with index wait).
Repository ignores & formatting
.gitignore, .prettierignore
Added ignores for rust/target/ and rust/md4x-sys/vendor/ paths; minor cleanup of a stray line.
Docs & metadata
AGENTS.md, CHANGELOG.md, README.md
Documented Rust workspace, build/test instructions, changelog entries for v0.0.18, and README examples/usage for the Rust crate.
Rust workspace manifests
rust/Cargo.toml, rust/md4x-sys/Cargo.toml, rust/md4x/Cargo.toml
Added workspace root and individual crate manifests (md4x-sys, md4x) with package metadata and dependencies.
md4x-sys: build & FFI
rust/md4x-sys/build.rs, rust/md4x-sys/src/lib.rs
Added build script to compile bundled C sources (including vendored libyaml) and comprehensive #[repr(C)] FFI bindings (types, flags, structs, and extern "C" functions) exposing md4x APIs.
md4x: safe Rust wrapper
rust/md4x/src/lib.rs
Added safe high-level API: Error type, bitflags for parser/renderer options, HtmlOpts, and public rendering functions (render_html, render_html_full, render_ast, render_ansi, render_text, render_meta, heal) with unit tests.
Vendored libyaml sources
rust/md4x-sys/vendor/libyaml/...
Full libyaml 0.2.5 vendor tree: public header (yaml.h), C sources (api.c, parser.c, emitter.c, loader.c, reader.c, dumper.c, writer.c), build scripts (Autotools, CMake), CI configs (.travis.yml, .appveyor.yml, GH Actions), tests, docs, examples, Dockerfiles, and packaging templates.
Publishing helper
rust/publish.sh
Script to vendor C sources into md4x-sys, publish md4x-sys and md4x to crates.io and wait for index propagation.

Sequence Diagram(s)

sequenceDiagram
  participant Dev as Developer
  participant CI as GitHub Actions (ci.yml)
  participant Rust as Rust Toolchain
  participant Cargo as Cargo (workspace)
  participant Crates as crates.io

  Dev->>CI: push PR / tag
  CI->>Rust: setup toolchain (dtolnay/rust-toolchain)
  CI->>CI: cache Rust deps (swatinem/rust-cache)
  CI->>Cargo: run `cargo test --workspace --locked` (in rust/)
  alt release workflow with token
    CI->>Cargo: publish md4x-sys (`cargo publish --locked`)
    Cargo->>Crates: push crate index entry
    CI->>Crates: poll/wait for index propagation
    CI->>Cargo: publish md4x (`cargo publish --locked`)
    Cargo->>Crates: push crate index entry
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I bundled C and Rust with cheer,

vendored yaml, tests appear.
From FFI roots to safe crate song,
CI sings publish all night long.
A hopping rabbit seals the tag—so strong!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main changes: adding Rust crate integration with documentation and CI/CD pipeline support, which are the primary objectives of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 91.35% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
AGENTS.md (1)

137-150: ⚠️ Potential issue | 🟡 Minor

Update the build-output count in this section.

Line 137 still says “four static libraries”, but the list below now includes seven C libraries before the executable/WASM/NAPI/Rust targets. The heading is stale and will mislead contributors reading the build docs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 137 - 150, The heading "Produces four static
libraries" is stale; update that summary to reflect the actual number of C
libraries listed (libmd4x, libmd4x-html, libmd4x-ast, libmd4x-ansi,
libmd4x-meta, libmd4x-text, libmd4x-heal) so the doc accurately states seven C
libraries are produced before the executable/WASM/NAPI/Rust targets; locate the
heading above the list and change the count wording accordingly while leaving
the individual items (md4x, md4x.wasm, md4x.{platform}-{arch}[-musl].node,
rust/md4x-sys, rust/md4x) unchanged.
🟡 Minor comments (15)
rust/md4x-sys/vendor/libyaml/examples/strings.yaml-1-1 (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Typo in key name: "unqouted" → "unquoted".

The key name has a spelling error. Since this is an example file demonstrating YAML string types, it should use the correct spelling for clarity.

📝 Proposed fix
-unqouted: string
+unquoted: string
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/examples/strings.yaml` at line 1, Fix the
spelling mistake in the YAML example by renaming the key "unqouted" to
"unquoted" in strings.yaml; update the example entry so the key is correctly
spelled ("unquoted: string") to keep the documentation/example consistent and
clear.
rust/md4x-sys/vendor/libyaml/docker/README.mkd-6-16 (1)

6-16: ⚠️ Potential issue | 🟡 Minor

Fix the README example path.

The example uses dockerfiles/ubuntu-16.04, but this tree is under docker/, so the command as written does not match the files added in this PR. Line 6 also reads awkwardly (The current list is supports:), so this snippet could use a quick docs pass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/docker/README.mkd` around lines 6 - 16, Update
the README.mkd to correct the awkward header and fix the example Dockerfile
path: replace the sentence "The current list is supports:" with a clear heading
like "Supported platforms:" and change the example docker build command to point
at the repo's docker directory (e.g., use the docker/dockerfiles/ubuntu-16.04
path in the -f argument instead of dockerfiles/ubuntu-16.04) so the example
matches the files added in this PR.
rust/md4x-sys/vendor/libyaml/.makefile-31-37 (1)

31-37: ⚠️ Potential issue | 🟡 Minor

Point SOURCE_FILES at the real source tree, not a generated test artifact.

find tests/run-test-suite fails on a fresh checkout since that directory is missing. The current implementation leaves indent with no inputs, defeating the "pre-bootstrap" helper use case described at the top of the file. The commented original line or a pattern like find . -type f -name '*.c' would correctly capture the source files in src/ and tests/.

Suggested fix
-# SOURCE_FILES := $(shell find . | grep '\.c$$')
-SOURCE_FILES := $(shell find tests/run-test-suite | grep '\.c$$')
+SOURCE_FILES := $(shell find . -type f -name '*.c')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/.makefile` around lines 31 - 37, The MAKEFILE
sets SOURCE_FILES to the wrong tree (tests/run-test-suite) causing fresh
checkouts to miss sources; change the SOURCE_FILES assignment to search the real
source tree (e.g., use find . -type f -name '*.c' or the original commented
pattern) so it captures src/ and tests/, and keep the INDENT fallback behavior
intact by ensuring SOURCE_FILES is non-empty before invoking INDENT; update the
SOURCE_FILES variable (and remove the hard-coded tests/run-test-suite) and
verify functions referencing SOURCE_FILES and INDENT still work during
pre-bootstrap.
rust/md4x-sys/vendor/libyaml/CMakeLists.txt-84-88 (1)

84-88: ⚠️ Potential issue | 🟡 Minor

Honor INSTALL_INCLUDE_DIR in the header install rule.

The target's install interface uses ${INSTALL_INCLUDE_DIR}, but the actual header install step hard-codes include. Any non-default include-dir override produces an inconsistent package layout.

♻️ Suggested fix
 install(
   FILES
     include/yaml.h
-  DESTINATION include COMPONENT Development
+  DESTINATION "${INSTALL_INCLUDE_DIR}" COMPONENT Development
   )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/CMakeLists.txt` around lines 84 - 88, The
install rule currently hard-codes "include" instead of using the same variable
as the install interface; update the install(FILES ... DESTINATION ...) call
that installs include/yaml.h to use ${INSTALL_INCLUDE_DIR} (the project install
include dir variable) so header installation honors non-default include-dir
overrides and matches the target's install interface.
rust/md4x-sys/vendor/libyaml/src/emitter.c-483-485 (1)

483-485: ⚠️ Potential issue | 🟡 Minor

assert(1) always passes and provides no protection for invalid emitter states.

This assertion is meant to catch invalid states but always succeeds. Use assert(0) to trigger a failure when an unexpected state is encountered.

Proposed fix
         default:
-            assert(1);      /* Invalid state. */
+            assert(0);      /* Invalid state. */
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/emitter.c` around lines 483 - 485, The
default branch in the emitter state switch uses assert(1) which never fails;
update the invalid-state guard in emitter.c (the switch's default case in the
emitter state handling, e.g., within the function containing that switch) to
assert(0) so an unexpected emitter state triggers a failure (or alternatively
call abort()/yaml_emitter_set_error with context if you prefer explicit error
handling), ensuring the invalid-state path actually fails during debugging.
rust/md4x-sys/vendor/libyaml/src/reader.c-396-398 (1)

396-398: ⚠️ Potential issue | 🟡 Minor

assert(1) always passes and provides no protection for invalid encoding states.

The comment says "Impossible" but the assertion assert(1) always succeeds. Use assert(0) to catch unexpected encoding values during debugging.

Proposed fix
                 default:
-                    assert(1);      /* Impossible. */
+                    assert(0);      /* Impossible. */
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/reader.c` around lines 396 - 398, The switch
default in reader.c currently uses assert(1) which always succeeds; update the
default case's assertion to assert(0) (or an equivalent failing assertion) so
unexpected encoding states are caught during debugging—locate the default:
branch in the switch where assert(1) is used and replace it with assert(0) to
ensure the "Impossible" path actually triggers an assertion failure.
rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c-77-77 (1)

77-77: ⚠️ Potential issue | 🟡 Minor

Typo in error message: "inialize" should be "initialize".

Proposed fix
-        fprintf(stderr, "Could not inialize the emitter object\n");
+        fprintf(stderr, "Could not initialize the emitter object\n");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c` at line 77, Fix
the typo in the fprintf error message: change the string passed to
fprintf(stderr, ...) that currently reads "Could not inialize the emitter
object\n" to use the correct spelling "initialize" so the message becomes "Could
not initialize the emitter object\n"; update the fprintf call in
example-deconstructor.c (the error logging for the emitter object) accordingly.
rust/md4x-sys/vendor/libyaml/tests/run-emitter.c-71-73 (1)

71-73: ⚠️ Potential issue | 🟡 Minor

assert(1) always passes and provides no protection for unexpected event types.

Similar to other files, this assertion is meant to catch unexpected event types but assert(1) always succeeds. Use assert(0) to fail on unexpected cases during testing.

Proposed fix
         default:
-            assert(1);
+            assert(0);      /* Unexpected event type. */
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter.c` around lines 71 - 73, The
default branch in the switch handling event types in run-emitter.c currently
uses assert(1) which always succeeds; change that assertion to assert(0) (or an
equivalent failing assertion such as assert(!"unexpected event type")) so the
test fails on unexpected event types — locate the default: case in the switch
and replace the assert(1) call accordingly.
rust/md4x-sys/vendor/libyaml/src/parser.c-299-301 (1)

299-301: ⚠️ Potential issue | 🟡 Minor

assert(1) always passes and provides no protection for invalid states.

This assertion is meant to catch invalid parser states during debugging, but assert(1) always evaluates to true and does nothing. It should be assert(0) to trigger an assertion failure when an unexpected state is encountered.

Proposed fix
         default:
-            assert(1);      /* Invalid state. */
+            assert(0);      /* Invalid state. */
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/parser.c` around lines 299 - 301, The
default case in the switch inside parser.c currently uses assert(1) which is a
no-op; change that assertion to fail on invalid states by replacing assert(1)
with assert(0) (or an equivalent failing assertion like assert(!"Invalid
state")) in the default: branch so unexpected parser states trigger a
diagnostic; locate the default case in the parser.c switch and update the assert
accordingly.
rust/md4x-sys/vendor/libyaml/src/emitter.c-1340-1342 (1)

1340-1342: ⚠️ Potential issue | 🟡 Minor

assert(1) always passes and provides no protection for unexpected scalar styles.

Same issue as above - the assertion does nothing. Use assert(0) for impossible cases.

Proposed fix
         default:
-            assert(1);      /* Impossible. */
+            assert(0);      /* Impossible. */
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/emitter.c` around lines 1340 - 1342, The
default branch in the switch handling scalar styles uses assert(1), which always
passes; replace that with assert(0) (or an explicit failing assertion) so the
impossible case actually triggers a failure—locate the default: assert(1);
occurrence in emitter.c (the switch over scalar styles) and change it to
assert(0) to enforce the impossible-case check.
rust/md4x-sys/vendor/libyaml/src/dumper.c-117-118 (1)

117-118: ⚠️ Potential issue | 🟡 Minor

Comment typo: refers to "emitter" but checks document parameter.

The assertion comment on Line 118 appears to be copy-pasted from Line 117. It should refer to the document parameter being checked.

Proposed fix
     assert(emitter);            /* Non-NULL emitter object is required. */
-    assert(document);           /* Non-NULL emitter object is expected. */
+    assert(document);           /* Non-NULL document object is expected. */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/dumper.c` around lines 117 - 118, The
comment on the second assert is incorrect: it refers to "emitter" while the
check is for the document parameter; update the comment next to
assert(document); to mention "document" (e.g., "Non-NULL document object is
required."), keeping the surrounding asserts and semantics in dumper.c intact so
the assertions read consistently for emitters and documents.
rust/md4x-sys/vendor/libyaml/include/Makefile.am-16-17 (1)

16-17: ⚠️ Potential issue | 🟡 Minor

dist-hook depends on html target which is only defined when DOXYGEN is enabled.

If DOXYGEN is not enabled (line 6 conditional), the html target won't exist, causing make dist to fail. The dependency should be conditional or the html prerequisite should be moved inside the conditional block.

Proposed fix
-dist-hook: html
-	cp -a $(top_builddir)/doc/html $(top_distdir)/doc
+if DOXYGEN
+dist-hook: html
+	cp -a $(top_builddir)/doc/html $(top_distdir)/doc
+endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am` around lines 16 - 17, The
dist-hook target currently lists html as a prerequisite even when DOXYGEN is not
enabled, causing make dist to fail; update the Makefile.am so that the cp -a
$(top_builddir)/doc/html $(top_distdir)/doc step is only added when the DOXYGEN
conditional is active (i.e., move the html prerequisite and the dist-hook
install/copy command inside the same DOXYGEN conditional block) or make the html
dependency conditional (using Automake conditionals) so dist-hook does not
depend on a non-existent html target when DOXYGEN is disabled; locate the
dist-hook target and the DOXYGEN conditional in the file and adjust accordingly.
rust/md4x/src/lib.rs-264-277 (1)

264-277: ⚠️ Potential issue | 🟡 Minor

CString::new().unwrap_or_default() silently handles interior NUL bytes.

If title or css_url contains an interior NUL byte, CString::new() returns Err, and unwrap_or_default() silently produces an empty CString. This may be surprising to users. Consider either documenting this behavior or returning an error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 264 - 277, The current use of
std::ffi::CString::new(...).unwrap_or_default() for title_cstr and css_cstr
(constructed from opts.title and opts.css_url) silently converts strings
containing interior NULs into empty CStrings; instead, explicitly handle the
Result from CString::new for both title_cstr and css_cstr: propagate an error
(update the enclosing function to return Result and return a descriptive error)
or at minimum map the Err into a clear runtime error/log message rather than
unwrap_or_default; update the title_cstr/css_cstr construction and any function
signature that needs to change to surface this validation failure to callers.
rust/md4x/src/lib.rs-229-238 (1)

229-238: ⚠️ Potential issue | 🟡 Minor

Potential truncation when casting input.len() to u32.

For inputs larger than u32::MAX (4 GiB), this cast will silently truncate, causing only a portion of the input to be parsed. While extremely rare in practice, consider adding a length check or documenting the limitation.

🛡️ Optional: Add length validation
 pub fn render_html(
     input: &str,
     parser_flags: ParserFlags,
     renderer_flags: HtmlFlags,
 ) -> Result<String, Error> {
+    if input.len() > u32::MAX as usize {
+        return Err(Error);
+    }
     let mut buf: Vec<u8> = Vec::new();
     let rc = unsafe {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 229 - 238, The call to sys::md_html casts
input.len() to u32 which can truncate for inputs > u32::MAX; update the code
around the sys::md_html invocation to validate input.len() before casting (e.g.,
check if input.len() > u32::MAX) and return or propagate a clear error instead
of calling sys::md_html with a truncated length; reference the input variable
and the sys::md_html call so you add the length check immediately before
creating rc and only cast to u32 after confirming the size is within bounds.
rust/md4x-sys/vendor/libyaml/src/api.c-404-409 (1)

404-409: ⚠️ Potential issue | 🟡 Minor

Typo: empty should be emitter in STACK_EMPTY macro call.

The first argument to STACK_EMPTY should be emitter, not empty. This is inconsistent with the original libyaml source code and other uses of the macro throughout the function. Although the macro doesn't use this parameter in its expansion, the typo should be corrected for consistency with the upstream source.

Fix
-    while (!STACK_EMPTY(empty, emitter->tag_directives)) {
+    while (!STACK_EMPTY(emitter, emitter->tag_directives)) {
         yaml_tag_directive_t tag_directive = POP(emitter, emitter->tag_directives);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/api.c` around lines 404 - 409, The macro
call uses a typo: replace the incorrect first argument `empty` with `emitter` in
the STACK_EMPTY invocation that guards popping tag_directives; specifically
update the line containing STACK_EMPTY(empty, emitter->tag_directives) so it
reads STACK_EMPTY(emitter, emitter->tag_directives) to match the other uses (see
POP(emitter, emitter->tag_directives), yaml_tag_directive_t handling, and
STACK_DEL(emitter, emitter->tag_directives)).
🧹 Nitpick comments (12)
.github/workflows/release.yml (1)

63-65: Consider increasing the crates.io index wait time.

A 20-second sleep may not be sufficient for the crates.io index to propagate, especially under load. The md4x crate depends on md4x-sys, and if the index hasn't updated, the publish will fail.

♻️ Suggested improvement with retry logic or longer wait
      - name: Wait for crates.io index
        if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
-       run: sleep 20
+       run: sleep 30

Or use a more robust polling approach:

      - name: Wait for crates.io index
        if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
        run: |
          for i in {1..10}; do
            cargo search md4x-sys 2>/dev/null | grep -q "^md4x-sys" && break
            echo "Waiting for crates.io index... (attempt $i)"
            sleep 10
          done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 63 - 65, The "Wait for crates.io
index" step currently uses a fixed 20s sleep (if: ${{
secrets.CARGO_REGISTRY_TOKEN != '' }}), which is often too short for index
propagation and causes downstream publishes of md4x/md4x-sys to fail; replace
the single sleep with a robust polling/retry loop that runs cargo search
md4x-sys (or similar index probe) up to N attempts with short sleeps between
attempts, or at minimum increase the wait time substantially (e.g., multiple 10s
intervals), and ensure the step still guards on the same token condition.
rust/md4x-sys/Cargo.toml (1)

1-10: Consider adding recommended crates.io metadata.

The core Cargo.toml configuration is correct with proper links directive and build dependency. However, for better discoverability on crates.io, consider adding:

📦 Suggested metadata additions
 [package]
 name = "md4x-sys"
 version = "0.0.18"
 edition = "2021"
 description = "Raw FFI bindings to md4x, a CommonMark markdown parser"
 license = "MIT"
 links = "md4x"
+repository = "https://github.com/unjs/md4x"
+homepage = "https://github.com/unjs/md4x"
+readme = "README.md"
+keywords = ["markdown", "commonmark", "parser", "ffi"]
+categories = ["parsing", "text-processing"]

 [build-dependencies]
 cc = "1"

You may also want to add an include or exclude directive to control what gets published to crates.io, especially given the vendored libyaml directory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/Cargo.toml` around lines 1 - 10, Add recommended crates.io
metadata to Cargo.toml for the md4x-sys package: include fields such as
repository, homepage, documentation, readme, keywords, categories, and
license-file to improve discoverability and linking; also add an include or
exclude directive to control published files (exclude the vendored libyaml
directory or explicitly include only src, Cargo.toml, README.md, LICENSE) so the
crate archive doesn't ship unnecessary vendored sources. Ensure you update the
[package] section in Cargo.toml and keep names like md4x-sys and the existing
links field intact.
rust/md4x-sys/build.rs (2)

13-25: Consider adding optimization flag to libyaml build for consistency.

The md4x build (line 41) includes -O2 optimization, but the libyaml build does not have any optimization flags. For consistent performance characteristics, consider adding the same optimization level to the libyaml build.

♻️ Proposed fix
     cc::Build::new()
         .include(libyaml.join("include"))
         .include(libyaml.join("src"))
         .define("YAML_DECLARE_STATIC", None)
         .define("YAML_VERSION_MAJOR", "0")
         .define("YAML_VERSION_MINOR", "2")
         .define("YAML_VERSION_PATCH", "5")
         .define("YAML_VERSION_STRING", "\"0.2.5\"")
+        .flag_if_supported("-O2")
         .file(libyaml.join("src/api.c"))
         .file(libyaml.join("src/reader.c"))
         .file(libyaml.join("src/scanner.c"))
         .file(libyaml.join("src/parser.c"))
         .compile("yaml");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/build.rs` around lines 13 - 25, The libyaml cc build invocation
(cc::Build::new() ... .compile("yaml")) lacks an optimization setting; add the
same optimization level used elsewhere (e.g., -O2) by configuring the cc build
(use .opt_level("2") or add .flag("-O2")) on the cc::Build chain before
.compile("yaml") so libyaml is compiled with consistent optimization as md4x's
build.

52-53: cargo:rerun-if-changed on directories may not trigger rebuilds as expected.

Using cargo:rerun-if-changed with directory paths only detects changes to the directory's metadata (e.g., adding/removing files), not changes to file contents within. For more reliable rebuild detection, consider using a build dependency like glob to enumerate files, or list specific source files.

♻️ Suggested improvement using explicit file tracking
-    println!("cargo:rerun-if-changed={}", src.display());
-    println!("cargo:rerun-if-changed={}", libyaml.display());
+    // Track specific source files for reliable rebuild detection
+    for entry in std::fs::read_dir(&src).unwrap() {
+        let path = entry.unwrap().path();
+        if path.extension().map_or(false, |e| e == "c" || e == "h") {
+            println!("cargo:rerun-if-changed={}", path.display());
+        }
+    }
+    for entry in std::fs::read_dir(&renderers).unwrap() {
+        let path = entry.unwrap().path();
+        if path.extension().map_or(false, |e| e == "c" || e == "h") {
+            println!("cargo:rerun-if-changed={}", path.display());
+        }
+    }
+    println!("cargo:rerun-if-changed={}", libyaml.join("src").display());
+    println!("cargo:rerun-if-changed={}", libyaml.join("include").display());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/build.rs` around lines 52 - 53, The two
println!("cargo:rerun-if-changed={}", src.display()) and
println!("cargo:rerun-if-changed={}", libyaml.display()) emit directory paths
which only track metadata changes; update build.rs (e.g., in the main build
script where src and libyaml are used) to enumerate the actual source files
(using a glob or walking the directories) and call
println!("cargo:rerun-if-changed={}", path.display()) for each file found so
that Cargo will rerun the build when file contents change rather than only when
directory metadata changes.
rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c (1)

2-5: Add the direct <string.h> include.

Lines 22-24 and Lines 30-41 use memset and strcmp, so depending on yaml.h to provide those declarations is brittle and can break once indirect includes change.

Proposed fix
 `#include` <yaml.h>
 
 `#include` <stdlib.h>
 `#include` <stdio.h>
+#include <string.h>

Also applies to: 22-24, 30-41

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c` around lines 2 - 5,
The file example-reformatter.c relies on C string functions but doesn't include
<string.h>; add a direct `#include` <string.h> alongside the existing includes at
the top of example-reformatter.c so declarations for memset and strcmp are
guaranteed (these functions are used in the code paths around the memset and
strcmp calls).
rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml (1)

1-57: If this is meant to add CI here, this workflow is inert from this path.

GitHub Actions only discovers workflows from the repository-root .github/workflows, so the copy under rust/md4x-sys/vendor/libyaml/... will never execute in this repo. Mirror the job into the top-level workflow if you want libyaml coverage in md4x.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml` around lines 1 - 57,
The workflow file (name: linux/mac) lives under a nested vendor directory so
GitHub Actions won't discover it; to fix, move or duplicate this workflow into
the repository-root workflows directory (.github/workflows/) or add an
equivalent top-level workflow that contains the same "jobs: build" job and steps
(bootstrap/configure/make/test, matrix strategy, install steps and compiler
matrix) so libyaml tests run in CI; ensure the workflow name, matrix settings
(matrix.compiler, matrix.os), and triggers (on: push/pull_request) are preserved
in the top-level main.yml.
rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt (1)

7-26: Register the new suite runners with CTest or another explicit CI entrypoint.

ctest only knows about test-version and test-reader here. Unless another job invokes them, run-parser-test-suite and run-emitter-test-suite are build-only in CMake-based CI, so regressions in the new harnesses can slip through.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt` around lines 7 - 26, CTest
is only registering test-version and test-reader, so the new harness runners
(run-parser-test-suite, run-emitter-test-suite, and any other run-*-test-suite
executables) are not executed by CI; update the CMakeLists.txt to call add_test
for each runner you want executed (e.g., add_test(NAME run-parser-test-suite
COMMAND run-parser-test-suite) and add_test(NAME run-emitter-test-suite COMMAND
run-emitter-test-suite)), and ensure any other run-*-test-suite executables
produced by add_yaml_executable are likewise registered so CTest will run them.
rust/md4x-sys/vendor/libyaml/Makefile.am (1)

24-33: Use $(MAKE) for recursive invocations.

The bare make calls here won't reliably forward the parent make's jobserver and flags, so -j, V=1, DESTDIR, and similar CI/release options can get lost.

♻️ Suggested fix
 bootstrap: maintainer-clean
 	./bootstrap
 	./configure
-	make
+	$(MAKE)
@@
 test: all
-	make -C tests check-TESTS
+	$(MAKE) -C tests check-TESTS
@@
 test-suite: tests/run-test-suite all
-	make -C $< test
+	$(MAKE) -C $< test
@@
 docker-dist: packaging
-	make -C $</docker libyaml-dist
+	$(MAKE) -C $</docker libyaml-dist
@@
 docker-test-pyyaml: packaging
-	make -C $</docker test-pyyaml
+	$(MAKE) -C $</docker test-pyyaml

Also applies to: 50-54

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/Makefile.am` around lines 24 - 33, Replace bare
make invocations in the Makefile targets with $(MAKE) so jobserver and make
flags propagate: in the bootstrap target use $(MAKE) instead of make; in the
test target use $(MAKE) -C tests check-TESTS; and in the test-suite target use
$(MAKE) -C $< test. Update any other occurrences (lines noted around the
bootstrap, test, and test-suite targets) to use $(MAKE) consistently.
rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c (1)

2-5: Missing #include <string.h> for memset and strcmp.

The code uses memset (lines 23-26) and strcmp (lines 32, 37, 42) but doesn't explicitly include <string.h>. While this may work if yaml.h transitively includes it, it's better to explicitly include required headers for portability.

Proposed fix
 `#include` <yaml.h>

 `#include` <stdlib.h>
 `#include` <stdio.h>
+#include <string.h>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c` around lines 2 -
5, Add an explicit `#include` <string.h> to example-deconstructor.c so the
declarations for memset and strcmp are available; place it with the other
includes (near `#include` <stdlib.h> and `#include` <stdio.h>) to ensure uses of
memset (lines invoking buffer zeroing) and strcmp (lines comparing YAML event
strings) are properly declared.
rust/md4x-sys/vendor/libyaml/include/Makefile.am (1)

1-4: Misleading variable name: AM_CPPFLAGS typically holds preprocessor flags, not file lists.

AM_CPPFLAGS conventionally holds C preprocessor flags like -I and -D. Using it to store a header file name is confusing and non-idiomatic. Consider renaming to something like LIBYAML_HEADERS for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am` around lines 1 - 4, The
variable AM_CPPFLAGS is misused to list a header file; rename it to a clear
identifier like LIBYAML_HEADERS and update usages accordingly: replace the
definition AM_CPPFLAGS = yaml.h with LIBYAML_HEADERS = yaml.h and change
nobase_include_HEADERS = $(AM_CPPFLAGS) to nobase_include_HEADERS =
$(LIBYAML_HEADERS), making sure no other rules or files rely on AM_CPPFLAGS for
preprocessor flags (if they do, move actual flags into AM_CPPFLAGS and keep
header list in LIBYAML_HEADERS).
rust/md4x/src/lib.rs (2)

33-43: Consider adding context to the Error type.

The current Error is a unit struct with no information about what failed. Consider adding optional context (e.g., function name or error code) to aid debugging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 33 - 43, Change the unit `Error` into a
contextual error struct (e.g., `pub struct Error { context: Option<String> }`)
and update the `Display` impl to include the context when present (so `fmt`
prints "md4x rendering failed" plus the context); add small helpers like
`Error::new()` and `Error::with_context<C: Into<String>>(c: C)` to construct
errors with optional context, and keep `impl std::error::Error for Error {}`
as-is; update any call sites that construct `Error` to use the new constructors.

426-430: collect_heal duplicates collect_output.

The collect_heal function is identical to collect_output. Consider reusing the existing callback.

♻️ Reuse existing callback
 pub fn heal(input: &str) -> Result<String, Error> {
     let mut buf: Vec<u8> = Vec::new();
-
-    unsafe extern "C" fn collect_heal(data: *const c_char, size: u32, userdata: *mut c_void) {
-        let buf = &mut *(userdata as *mut Vec<u8>);
-        let slice = std::slice::from_raw_parts(data as *const u8, size as usize);
-        buf.extend_from_slice(slice);
-    }

     let rc = unsafe {
         sys::md_heal(
             input.as_ptr() as *const c_char,
             input.len() as u32,
-            collect_heal,
+            collect_output,
             &mut buf as *mut Vec<u8> as *mut c_void,
         )
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 426 - 430, collect_heal is a duplicate of
collect_output; remove the duplicated unsafe extern "C" fn collect_heal and
reuse the existing collect_output callback instead: delete collect_heal
definition and update any places that passed collect_heal (callback
registrations or FFI calls) to pass collect_output, making sure the signature
matches (unsafe extern "C" fn(data: *const c_char, size: u32, userdata: *mut
c_void)) and adjust imports/visibility if necessary so collect_output is visible
where the callback is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release.yml:
- Around line 56-72: Update the three steps named "Publish md4x-sys to
crates.io", "Wait for crates.io index", and "Publish md4x to crates.io" to stop
using the secrets context in their if conditions; instead set
CARGO_REGISTRY_TOKEN as an env at the job level or in a prior step and change
the if expressions to use env (e.g. if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}),
and remove the direct use of ${{ secrets.CARGO_REGISTRY_TOKEN }} inside the if
so the conditional works correctly in GitHub Actions.

In `@rust/md4x-sys/vendor/libyaml/src/api.c`:
- Around line 1129-1146: In yaml_document_delete, add the missing local
declaration for the variable named context (use the same declaration used in
yaml_document_add_scalar); place that declaration at the start of the
yaml_document_delete function so the STACK_EMPTY, POP, and STACK_DEL macros can
reference it (i.e., copy the context variable declaration from
yaml_document_add_scalar into yaml_document_delete).

In `@rust/md4x-sys/vendor/libyaml/src/writer.c`:
- Around line 116-120: The low-surrogate high byte is taking the wrong bits (it
uses ((value >> 8) & 0xFF)) so most scalars > U+10000 produce invalid UTF-16;
update the assignment that writes the low-surrogate high byte (the
emitter->raw_buffer.last[high+2] write using value) to mask down to the top 2
bits of the 10-bit low-surrogate payload (use ((value >> 8) & 0x03) instead of &
0xFF) so the low surrogate becomes 0xDC00 + (value & 0x3FF) and U+10400 will end
with 0xDC00.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.c`:
- Around line 426-435: The code is creating the scalar implicitness mapping but
uses the wrong top-level key "version"; change the yaml_document_add_scalar call
that sets the key to use "implicit" instead of "version" so the mapping appended
with yaml_document_append_mapping_pair(&output_document, properties, key, map)
correctly names the section; update the scalar creation where key is assigned
(yaml_document_add_scalar) to "implicit" to restore the expected schema for
scalar events.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c`:
- Around line 58-60: The code sets foundfile immediately after calling
fopen(argv[i], "rb") even when fopen returns NULL; change the logic in
run-emitter-test-suite.c so you first call fopen and check its return (input)
for NULL, only then set foundfile and proceed; update both occurrences where
fopen/ foundfile are used (the block with input = fopen(argv[i], "rb") and the
similar block at the other occurrence around the 72-line area) to avoid marking
the file as found when fopen fails and to handle/propagate the fopen error
appropriately.
- Around line 85-86: The test harness extracts token spans into fixed-size stack
buffers anchor[256] and tag[256] without bounds checking; ensure any computed
length for these copies is validated to be >=0 and < sizeof(anchor) (and
sizeof(tag)), clamp the copy length to sizeof-1, perform the copy with a bounded
routine and explicitly null-terminate, and handle the case where a '>' appears
before '<' by treating it as an error/empty span; apply the same fixes to the
other copy site around the 206-230 region so both anchor and tag extractions are
safe.
- Around line 131-138: get_value currently may write embedded NULs but the
caller in run-emitter-test-suite.c passes -1 to yaml_scalar_event_initialize,
causing truncation; change get_value to return the actual scalar length (update
its prototype and definition) and capture that length at the call site, then
pass that length instead of -1 to yaml_scalar_event_initialize (keep using
get_anchor, get_tag and the same value buffer); ensure any callers of get_value
use the new return value.

In `@rust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.c`:
- Around line 36-38: The code sets foundfile before checking fopen's result, so
a failed fopen can leave input NULL and later call
yaml_parser_set_input_file(NULL); change the logic in the main loop (where
variables input and foundfile are used and where yaml_parser_set_input_file is
called) to only set foundfile = 1 after verifying input != NULL, and if fopen
fails handle the error (e.g., print a message and continue or exit) rather than
proceeding; ensure both occurrences (the block around the current fopen and the
similar block at lines 43-46) perform the same
NULL-check-before-setting-foundfile and avoid passing a NULL FILE* to
yaml_parser_set_input_file.

---

Outside diff comments:
In `@AGENTS.md`:
- Around line 137-150: The heading "Produces four static libraries" is stale;
update that summary to reflect the actual number of C libraries listed (libmd4x,
libmd4x-html, libmd4x-ast, libmd4x-ansi, libmd4x-meta, libmd4x-text,
libmd4x-heal) so the doc accurately states seven C libraries are produced before
the executable/WASM/NAPI/Rust targets; locate the heading above the list and
change the count wording accordingly while leaving the individual items (md4x,
md4x.wasm, md4x.{platform}-{arch}[-musl].node, rust/md4x-sys, rust/md4x)
unchanged.

---

Minor comments:
In `@rust/md4x-sys/vendor/libyaml/.makefile`:
- Around line 31-37: The MAKEFILE sets SOURCE_FILES to the wrong tree
(tests/run-test-suite) causing fresh checkouts to miss sources; change the
SOURCE_FILES assignment to search the real source tree (e.g., use find . -type f
-name '*.c' or the original commented pattern) so it captures src/ and tests/,
and keep the INDENT fallback behavior intact by ensuring SOURCE_FILES is
non-empty before invoking INDENT; update the SOURCE_FILES variable (and remove
the hard-coded tests/run-test-suite) and verify functions referencing
SOURCE_FILES and INDENT still work during pre-bootstrap.

In `@rust/md4x-sys/vendor/libyaml/CMakeLists.txt`:
- Around line 84-88: The install rule currently hard-codes "include" instead of
using the same variable as the install interface; update the install(FILES ...
DESTINATION ...) call that installs include/yaml.h to use ${INSTALL_INCLUDE_DIR}
(the project install include dir variable) so header installation honors
non-default include-dir overrides and matches the target's install interface.

In `@rust/md4x-sys/vendor/libyaml/docker/README.mkd`:
- Around line 6-16: Update the README.mkd to correct the awkward header and fix
the example Dockerfile path: replace the sentence "The current list is
supports:" with a clear heading like "Supported platforms:" and change the
example docker build command to point at the repo's docker directory (e.g., use
the docker/dockerfiles/ubuntu-16.04 path in the -f argument instead of
dockerfiles/ubuntu-16.04) so the example matches the files added in this PR.

In `@rust/md4x-sys/vendor/libyaml/examples/strings.yaml`:
- Line 1: Fix the spelling mistake in the YAML example by renaming the key
"unqouted" to "unquoted" in strings.yaml; update the example entry so the key is
correctly spelled ("unquoted: string") to keep the documentation/example
consistent and clear.

In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am`:
- Around line 16-17: The dist-hook target currently lists html as a prerequisite
even when DOXYGEN is not enabled, causing make dist to fail; update the
Makefile.am so that the cp -a $(top_builddir)/doc/html $(top_distdir)/doc step
is only added when the DOXYGEN conditional is active (i.e., move the html
prerequisite and the dist-hook install/copy command inside the same DOXYGEN
conditional block) or make the html dependency conditional (using Automake
conditionals) so dist-hook does not depend on a non-existent html target when
DOXYGEN is disabled; locate the dist-hook target and the DOXYGEN conditional in
the file and adjust accordingly.

In `@rust/md4x-sys/vendor/libyaml/src/api.c`:
- Around line 404-409: The macro call uses a typo: replace the incorrect first
argument `empty` with `emitter` in the STACK_EMPTY invocation that guards
popping tag_directives; specifically update the line containing
STACK_EMPTY(empty, emitter->tag_directives) so it reads STACK_EMPTY(emitter,
emitter->tag_directives) to match the other uses (see POP(emitter,
emitter->tag_directives), yaml_tag_directive_t handling, and STACK_DEL(emitter,
emitter->tag_directives)).

In `@rust/md4x-sys/vendor/libyaml/src/dumper.c`:
- Around line 117-118: The comment on the second assert is incorrect: it refers
to "emitter" while the check is for the document parameter; update the comment
next to assert(document); to mention "document" (e.g., "Non-NULL document object
is required."), keeping the surrounding asserts and semantics in dumper.c intact
so the assertions read consistently for emitters and documents.

In `@rust/md4x-sys/vendor/libyaml/src/emitter.c`:
- Around line 483-485: The default branch in the emitter state switch uses
assert(1) which never fails; update the invalid-state guard in emitter.c (the
switch's default case in the emitter state handling, e.g., within the function
containing that switch) to assert(0) so an unexpected emitter state triggers a
failure (or alternatively call abort()/yaml_emitter_set_error with context if
you prefer explicit error handling), ensuring the invalid-state path actually
fails during debugging.
- Around line 1340-1342: The default branch in the switch handling scalar styles
uses assert(1), which always passes; replace that with assert(0) (or an explicit
failing assertion) so the impossible case actually triggers a failure—locate the
default: assert(1); occurrence in emitter.c (the switch over scalar styles) and
change it to assert(0) to enforce the impossible-case check.

In `@rust/md4x-sys/vendor/libyaml/src/parser.c`:
- Around line 299-301: The default case in the switch inside parser.c currently
uses assert(1) which is a no-op; change that assertion to fail on invalid states
by replacing assert(1) with assert(0) (or an equivalent failing assertion like
assert(!"Invalid state")) in the default: branch so unexpected parser states
trigger a diagnostic; locate the default case in the parser.c switch and update
the assert accordingly.

In `@rust/md4x-sys/vendor/libyaml/src/reader.c`:
- Around line 396-398: The switch default in reader.c currently uses assert(1)
which always succeeds; update the default case's assertion to assert(0) (or an
equivalent failing assertion) so unexpected encoding states are caught during
debugging—locate the default: branch in the switch where assert(1) is used and
replace it with assert(0) to ensure the "Impossible" path actually triggers an
assertion failure.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c`:
- Line 77: Fix the typo in the fprintf error message: change the string passed
to fprintf(stderr, ...) that currently reads "Could not inialize the emitter
object\n" to use the correct spelling "initialize" so the message becomes "Could
not initialize the emitter object\n"; update the fprintf call in
example-deconstructor.c (the error logging for the emitter object) accordingly.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter.c`:
- Around line 71-73: The default branch in the switch handling event types in
run-emitter.c currently uses assert(1) which always succeeds; change that
assertion to assert(0) (or an equivalent failing assertion such as
assert(!"unexpected event type")) so the test fails on unexpected event types —
locate the default: case in the switch and replace the assert(1) call
accordingly.

In `@rust/md4x/src/lib.rs`:
- Around line 264-277: The current use of
std::ffi::CString::new(...).unwrap_or_default() for title_cstr and css_cstr
(constructed from opts.title and opts.css_url) silently converts strings
containing interior NULs into empty CStrings; instead, explicitly handle the
Result from CString::new for both title_cstr and css_cstr: propagate an error
(update the enclosing function to return Result and return a descriptive error)
or at minimum map the Err into a clear runtime error/log message rather than
unwrap_or_default; update the title_cstr/css_cstr construction and any function
signature that needs to change to surface this validation failure to callers.
- Around line 229-238: The call to sys::md_html casts input.len() to u32 which
can truncate for inputs > u32::MAX; update the code around the sys::md_html
invocation to validate input.len() before casting (e.g., check if input.len() >
u32::MAX) and return or propagate a clear error instead of calling sys::md_html
with a truncated length; reference the input variable and the sys::md_html call
so you add the length check immediately before creating rc and only cast to u32
after confirming the size is within bounds.

---

Nitpick comments:
In @.github/workflows/release.yml:
- Around line 63-65: The "Wait for crates.io index" step currently uses a fixed
20s sleep (if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}), which is often too
short for index propagation and causes downstream publishes of md4x/md4x-sys to
fail; replace the single sleep with a robust polling/retry loop that runs cargo
search md4x-sys (or similar index probe) up to N attempts with short sleeps
between attempts, or at minimum increase the wait time substantially (e.g.,
multiple 10s intervals), and ensure the step still guards on the same token
condition.

In `@rust/md4x-sys/build.rs`:
- Around line 13-25: The libyaml cc build invocation (cc::Build::new() ...
.compile("yaml")) lacks an optimization setting; add the same optimization level
used elsewhere (e.g., -O2) by configuring the cc build (use .opt_level("2") or
add .flag("-O2")) on the cc::Build chain before .compile("yaml") so libyaml is
compiled with consistent optimization as md4x's build.
- Around line 52-53: The two println!("cargo:rerun-if-changed={}",
src.display()) and println!("cargo:rerun-if-changed={}", libyaml.display()) emit
directory paths which only track metadata changes; update build.rs (e.g., in the
main build script where src and libyaml are used) to enumerate the actual source
files (using a glob or walking the directories) and call
println!("cargo:rerun-if-changed={}", path.display()) for each file found so
that Cargo will rerun the build when file contents change rather than only when
directory metadata changes.

In `@rust/md4x-sys/Cargo.toml`:
- Around line 1-10: Add recommended crates.io metadata to Cargo.toml for the
md4x-sys package: include fields such as repository, homepage, documentation,
readme, keywords, categories, and license-file to improve discoverability and
linking; also add an include or exclude directive to control published files
(exclude the vendored libyaml directory or explicitly include only src,
Cargo.toml, README.md, LICENSE) so the crate archive doesn't ship unnecessary
vendored sources. Ensure you update the [package] section in Cargo.toml and keep
names like md4x-sys and the existing links field intact.

In `@rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml`:
- Around line 1-57: The workflow file (name: linux/mac) lives under a nested
vendor directory so GitHub Actions won't discover it; to fix, move or duplicate
this workflow into the repository-root workflows directory (.github/workflows/)
or add an equivalent top-level workflow that contains the same "jobs: build" job
and steps (bootstrap/configure/make/test, matrix strategy, install steps and
compiler matrix) so libyaml tests run in CI; ensure the workflow name, matrix
settings (matrix.compiler, matrix.os), and triggers (on: push/pull_request) are
preserved in the top-level main.yml.

In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am`:
- Around line 1-4: The variable AM_CPPFLAGS is misused to list a header file;
rename it to a clear identifier like LIBYAML_HEADERS and update usages
accordingly: replace the definition AM_CPPFLAGS = yaml.h with LIBYAML_HEADERS =
yaml.h and change nobase_include_HEADERS = $(AM_CPPFLAGS) to
nobase_include_HEADERS = $(LIBYAML_HEADERS), making sure no other rules or files
rely on AM_CPPFLAGS for preprocessor flags (if they do, move actual flags into
AM_CPPFLAGS and keep header list in LIBYAML_HEADERS).

In `@rust/md4x-sys/vendor/libyaml/Makefile.am`:
- Around line 24-33: Replace bare make invocations in the Makefile targets with
$(MAKE) so jobserver and make flags propagate: in the bootstrap target use
$(MAKE) instead of make; in the test target use $(MAKE) -C tests check-TESTS;
and in the test-suite target use $(MAKE) -C $< test. Update any other
occurrences (lines noted around the bootstrap, test, and test-suite targets) to
use $(MAKE) consistently.

In `@rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt`:
- Around line 7-26: CTest is only registering test-version and test-reader, so
the new harness runners (run-parser-test-suite, run-emitter-test-suite, and any
other run-*-test-suite executables) are not executed by CI; update the
CMakeLists.txt to call add_test for each runner you want executed (e.g.,
add_test(NAME run-parser-test-suite COMMAND run-parser-test-suite) and
add_test(NAME run-emitter-test-suite COMMAND run-emitter-test-suite)), and
ensure any other run-*-test-suite executables produced by add_yaml_executable
are likewise registered so CTest will run them.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c`:
- Around line 2-5: Add an explicit `#include` <string.h> to
example-deconstructor.c so the declarations for memset and strcmp are available;
place it with the other includes (near `#include` <stdlib.h> and `#include`
<stdio.h>) to ensure uses of memset (lines invoking buffer zeroing) and strcmp
(lines comparing YAML event strings) are properly declared.

In `@rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c`:
- Around line 2-5: The file example-reformatter.c relies on C string functions
but doesn't include <string.h>; add a direct `#include` <string.h> alongside the
existing includes at the top of example-reformatter.c so declarations for memset
and strcmp are guaranteed (these functions are used in the code paths around the
memset and strcmp calls).

In `@rust/md4x/src/lib.rs`:
- Around line 33-43: Change the unit `Error` into a contextual error struct
(e.g., `pub struct Error { context: Option<String> }`) and update the `Display`
impl to include the context when present (so `fmt` prints "md4x rendering
failed" plus the context); add small helpers like `Error::new()` and
`Error::with_context<C: Into<String>>(c: C)` to construct errors with optional
context, and keep `impl std::error::Error for Error {}` as-is; update any call
sites that construct `Error` to use the new constructors.
- Around line 426-430: collect_heal is a duplicate of collect_output; remove the
duplicated unsafe extern "C" fn collect_heal and reuse the existing
collect_output callback instead: delete collect_heal definition and update any
places that passed collect_heal (callback registrations or FFI calls) to pass
collect_output, making sure the signature matches (unsafe extern "C" fn(data:
*const c_char, size: u32, userdata: *mut c_void)) and adjust imports/visibility
if necessary so collect_output is visible where the callback is used.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f4a689f-7daa-4b06-9578-f704c0cc5669

📥 Commits

Reviewing files that changed from the base of the PR and between bf6c08f and 1083f97.

⛔ Files ignored due to path filters (1)
  • rust/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (74)
  • .github/workflows/ci.yml
  • .github/workflows/release.yml
  • .gitignore
  • .prettierignore
  • AGENTS.md
  • CHANGELOG.md
  • README.md
  • rust/Cargo.toml
  • rust/md4x-sys/Cargo.toml
  • rust/md4x-sys/build.rs
  • rust/md4x-sys/src/lib.rs
  • rust/md4x-sys/vendor/libyaml/.appveyor.yml
  • rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml
  • rust/md4x-sys/vendor/libyaml/.gitignore
  • rust/md4x-sys/vendor/libyaml/.indent.pro
  • rust/md4x-sys/vendor/libyaml/.makefile
  • rust/md4x-sys/vendor/libyaml/.travis.yml
  • rust/md4x-sys/vendor/libyaml/CMakeLists.txt
  • rust/md4x-sys/vendor/libyaml/Changes
  • rust/md4x-sys/vendor/libyaml/License
  • rust/md4x-sys/vendor/libyaml/Makefile.am
  • rust/md4x-sys/vendor/libyaml/ReadMe.md
  • rust/md4x-sys/vendor/libyaml/announcement.msg
  • rust/md4x-sys/vendor/libyaml/bootstrap
  • rust/md4x-sys/vendor/libyaml/configure.ac
  • rust/md4x-sys/vendor/libyaml/doc/doxygen.cfg
  • rust/md4x-sys/vendor/libyaml/docker/README.mkd
  • rust/md4x-sys/vendor/libyaml/docker/alpine-3.7
  • rust/md4x-sys/vendor/libyaml/docker/fedora-25
  • rust/md4x-sys/vendor/libyaml/docker/ubuntu-14.04
  • rust/md4x-sys/vendor/libyaml/docker/ubuntu-16.04
  • rust/md4x-sys/vendor/libyaml/examples/anchors.yaml
  • rust/md4x-sys/vendor/libyaml/examples/array.yaml
  • rust/md4x-sys/vendor/libyaml/examples/global-tag.yaml
  • rust/md4x-sys/vendor/libyaml/examples/json.yaml
  • rust/md4x-sys/vendor/libyaml/examples/mapping.yaml
  • rust/md4x-sys/vendor/libyaml/examples/numbers.yaml
  • rust/md4x-sys/vendor/libyaml/examples/strings.yaml
  • rust/md4x-sys/vendor/libyaml/examples/tags.yaml
  • rust/md4x-sys/vendor/libyaml/examples/yaml-version.yaml
  • rust/md4x-sys/vendor/libyaml/include/Makefile.am
  • rust/md4x-sys/vendor/libyaml/include/yaml.h
  • rust/md4x-sys/vendor/libyaml/regression-inputs/clusterfuzz-testcase-minimized-5607885063061504.yml
  • rust/md4x-sys/vendor/libyaml/src/Makefile.am
  • rust/md4x-sys/vendor/libyaml/src/api.c
  • rust/md4x-sys/vendor/libyaml/src/dumper.c
  • rust/md4x-sys/vendor/libyaml/src/emitter.c
  • rust/md4x-sys/vendor/libyaml/src/loader.c
  • rust/md4x-sys/vendor/libyaml/src/parser.c
  • rust/md4x-sys/vendor/libyaml/src/reader.c
  • rust/md4x-sys/vendor/libyaml/src/scanner.c
  • rust/md4x-sys/vendor/libyaml/src/writer.c
  • rust/md4x-sys/vendor/libyaml/src/yaml_private.h
  • rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt
  • rust/md4x-sys/vendor/libyaml/tests/Makefile.am
  • rust/md4x-sys/vendor/libyaml/tests/ReadMe.md
  • rust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.c
  • rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c
  • rust/md4x-sys/vendor/libyaml/tests/example-reformatter-alt.c
  • rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c
  • rust/md4x-sys/vendor/libyaml/tests/run-all-tests.sh
  • rust/md4x-sys/vendor/libyaml/tests/run-dumper.c
  • rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
  • rust/md4x-sys/vendor/libyaml/tests/run-emitter.c
  • rust/md4x-sys/vendor/libyaml/tests/run-loader.c
  • rust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.c
  • rust/md4x-sys/vendor/libyaml/tests/run-parser.c
  • rust/md4x-sys/vendor/libyaml/tests/run-scanner.c
  • rust/md4x-sys/vendor/libyaml/tests/test-reader.c
  • rust/md4x-sys/vendor/libyaml/tests/test-version.c
  • rust/md4x-sys/vendor/libyaml/yaml-0.1.pc.in
  • rust/md4x-sys/vendor/libyaml/yamlConfig.cmake.in
  • rust/md4x/Cargo.toml
  • rust/md4x/src/lib.rs

Comment on lines +56 to +72
- name: Publish md4x-sys to crates.io
if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
run: cargo publish -p md4x-sys --locked --token "${CARGO_REGISTRY_TOKEN}"
working-directory: rust
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

- name: Wait for crates.io index
if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
run: sleep 20

- name: Publish md4x to crates.io
if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
run: cargo publish -p md4x --locked --token "${CARGO_REGISTRY_TOKEN}"
working-directory: rust
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Invalid use of secrets context in if expressions.

The secrets context is not available in if conditions. GitHub Actions only allows env, github, inputs, job, matrix, needs, runner, steps, strategy, and vars contexts in if expressions. These conditions will not work as intended.

A common workaround is to set the secret as an environment variable in a prior step and check for its presence, or use env context instead:

🐛 Proposed fix using env context
      - name: Publish md4x-sys to crates.io
-       if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+       if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
        run: cargo publish -p md4x-sys --locked --token "${CARGO_REGISTRY_TOKEN}"
        working-directory: rust
        env:
          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

      - name: Wait for crates.io index
-       if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+       if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
-       run: sleep 20
+       run: sleep 30
+       env:
+         CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

      - name: Publish md4x to crates.io
-       if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+       if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
        run: cargo publish -p md4x --locked --token "${CARGO_REGISTRY_TOKEN}"
        working-directory: rust
        env:
          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

Alternatively, you can define the env at the job level to avoid repetition:

jobs:
  release:
    env:
      CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}

Then use if: ${{ env.CARGO_REGISTRY_TOKEN != '' }} in each step without repeating the env block.

🧰 Tools
🪛 actionlint (1.7.11)

[error] 57-57: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)


[error] 64-64: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)


[error] 68-68: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 56 - 72, Update the three steps
named "Publish md4x-sys to crates.io", "Wait for crates.io index", and "Publish
md4x to crates.io" to stop using the secrets context in their if conditions;
instead set CARGO_REGISTRY_TOKEN as an env at the job level or in a prior step
and change the if expressions to use env (e.g. if: ${{ env.CARGO_REGISTRY_TOKEN
!= '' }}), and remove the direct use of ${{ secrets.CARGO_REGISTRY_TOKEN }}
inside the if so the conditional works correctly in GitHub Actions.

Comment on lines +1129 to +1146
while (!STACK_EMPTY(&context, document->nodes)) {
yaml_node_t node = POP(&context, document->nodes);
yaml_free(node.tag);
switch (node.type) {
case YAML_SCALAR_NODE:
yaml_free(node.data.scalar.value);
break;
case YAML_SEQUENCE_NODE:
STACK_DEL(&context, node.data.sequence.items);
break;
case YAML_MAPPING_NODE:
STACK_DEL(&context, node.data.mapping.pairs);
break;
default:
assert(0); /* Should not happen. */
}
}
STACK_DEL(&context, document->nodes);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if yaml_document_delete compiles correctly
rg -n "yaml_document_delete" --type c -A 10

Repository: unjs/md4x

Length of output: 10444


🏁 Script executed:

# Get the full yaml_document_delete function
sed -n '1123,1150p' rust/md4x-sys/vendor/libyaml/src/api.c

Repository: unjs/md4x

Length of output: 1030


🏁 Script executed:

# Check yaml_document_add_scalar function mentioned in the review
sed -n '1199,1230p' rust/md4x-sys/vendor/libyaml/src/api.c

Repository: unjs/md4x

Length of output: 979


🏁 Script executed:

# Search for context variable declarations in api.c to understand the pattern
rg -n "struct.*context|yaml_error_type_t" rust/md4x-sys/vendor/libyaml/src/api.c | head -20

Repository: unjs/md4x

Length of output: 317


Missing context variable declaration in yaml_document_delete.

The variable context is used in STACK_EMPTY, POP, and STACK_DEL macro calls but is never declared in this function, causing a compilation error. This pattern is inconsistently applied—other functions in the same file (e.g., yaml_document_add_scalar at line 1199) properly declare context. Add the missing declaration at the beginning of the function.

Add missing context declaration
 YAML_DECLARE(void)
 yaml_document_delete(yaml_document_t *document)
 {
+    struct {
+        yaml_error_type_t error;
+    } context;
     yaml_tag_directive_t *tag_directive;

     assert(document);   /* Non-NULL document object is expected. */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/api.c` around lines 1129 - 1146, In
yaml_document_delete, add the missing local declaration for the variable named
context (use the same declaration used in yaml_document_add_scalar); place that
declaration at the start of the yaml_document_delete function so the
STACK_EMPTY, POP, and STACK_DEL macros can reference it (i.e., copy the context
variable declaration from yaml_document_add_scalar into yaml_document_delete).

Comment on lines +116 to +120
value -= 0x10000;
emitter->raw_buffer.last[high] = 0xD8 + (value >> 18);
emitter->raw_buffer.last[low] = (value >> 10) & 0xFF;
emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF);
emitter->raw_buffer.last[low+2] = value & 0xFF;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Mask the low-surrogate high byte to 2 bits.

Line 119 currently mixes bits 10-15 of value into the second UTF-16 word, so most > U+10000 scalars are emitted as invalid UTF-16. U+10400, for example, should end with 0xDC00, not 0xE000.

Proposed fix
-            emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF);
+            emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0x03);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value -= 0x10000;
emitter->raw_buffer.last[high] = 0xD8 + (value >> 18);
emitter->raw_buffer.last[low] = (value >> 10) & 0xFF;
emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF);
emitter->raw_buffer.last[low+2] = value & 0xFF;
value -= 0x10000;
emitter->raw_buffer.last[high] = 0xD8 + (value >> 18);
emitter->raw_buffer.last[low] = (value >> 10) & 0xFF;
emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0x03);
emitter->raw_buffer.last[low+2] = value & 0xFF;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/src/writer.c` around lines 116 - 120, The
low-surrogate high byte is taking the wrong bits (it uses ((value >> 8) & 0xFF))
so most scalars > U+10000 produce invalid UTF-16; update the assignment that
writes the low-surrogate high byte (the emitter->raw_buffer.last[high+2] write
using value) to mask down to the top 2 bits of the 10-bit low-surrogate payload
(use ((value >> 8) & 0x03) instead of & 0xFF) so the low surrogate becomes
0xDC00 + (value & 0x3FF) and U+10400 will end with 0xDC00.

Comment on lines +426 to +435
/* Add 'implicit': {} */

key = yaml_document_add_scalar(&output_document, NULL,
(yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE);
if (!key) goto document_error;
map = yaml_document_add_mapping(&output_document, NULL,
YAML_FLOW_MAPPING_STYLE);
if (!map) goto document_error;
if (!yaml_document_append_mapping_pair(&output_document,
properties, key, map)) goto document_error;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Rename this mapping from version to implicit.

This block is building the scalar implicitness map (plain / quoted), but the top-level key is "version". That changes the emitted schema for scalar events and will break consumers expecting an "implicit" section.

🐛 Suggested fix
                 key = yaml_document_add_scalar(&output_document, NULL,
-                    (yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE);
+                    (yaml_char_t *)"implicit", -1, YAML_PLAIN_SCALAR_STYLE);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* Add 'implicit': {} */
key = yaml_document_add_scalar(&output_document, NULL,
(yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE);
if (!key) goto document_error;
map = yaml_document_add_mapping(&output_document, NULL,
YAML_FLOW_MAPPING_STYLE);
if (!map) goto document_error;
if (!yaml_document_append_mapping_pair(&output_document,
properties, key, map)) goto document_error;
/* Add 'implicit': {} */
key = yaml_document_add_scalar(&output_document, NULL,
(yaml_char_t *)"implicit", -1, YAML_PLAIN_SCALAR_STYLE);
if (!key) goto document_error;
map = yaml_document_add_mapping(&output_document, NULL,
YAML_FLOW_MAPPING_STYLE);
if (!map) goto document_error;
if (!yaml_document_append_mapping_pair(&output_document,
properties, key, map)) goto document_error;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.c` around lines
426 - 435, The code is creating the scalar implicitness mapping but uses the
wrong top-level key "version"; change the yaml_document_add_scalar call that
sets the key to use "implicit" instead of "version" so the mapping appended with
yaml_document_append_mapping_pair(&output_document, properties, key, map)
correctly names the section; update the scalar creation where key is assigned
(yaml_document_add_scalar) to "implicit" to restore the expected schema for
scalar events.

Comment on lines +58 to +60
else if (!foundfile) {
input = fopen(argv[i], "rb");
foundfile = 1;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle fopen() failures before marking the file as found.

foundfile is set even when fopen() returns NULL. In release builds the assert(input) check disappears, so the next read path can dereference a null FILE *.

🐛 Suggested fix
         else if (!foundfile) {
-            input = fopen(argv[i], "rb");
-            foundfile = 1;
+            input = fopen(argv[i], "rb");
+            if (!input) {
+                perror(argv[i]);
+                return 1;
+            }
+            foundfile = 1;
         }
@@
-    assert(input);

Also applies to: 72-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 58
- 60, The code sets foundfile immediately after calling fopen(argv[i], "rb")
even when fopen returns NULL; change the logic in run-emitter-test-suite.c so
you first call fopen and check its return (input) for NULL, only then set
foundfile and proceed; update both occurrences where fopen/ foundfile are used
(the block with input = fopen(argv[i], "rb") and the similar block at the other
occurrence around the 72-line area) to avoid marking the file as found when
fopen fails and to handle/propagate the fopen error appropriately.

Comment on lines +85 to +86
char anchor[256];
char tag[256];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bound-check anchor and tag extraction before copying.

Both helpers copy unbounded spans into 256-byte stack buffers. A long token, or a > that appears before <, can turn the computed length into an oversized copy and overrun the stack.

🛡️ Suggested fix
-char *get_anchor(char sigil, char *line, char *anchor);
-char *get_tag(char *line, char *tag);
+char *get_anchor(char sigil, char *line, char *anchor, size_t anchor_cap);
+char *get_tag(char *line, char *tag, size_t tag_cap);
@@
-char *get_anchor(char sigil, char *line, char *anchor)
+char *get_anchor(char sigil, char *line, char *anchor, size_t anchor_cap)
 {
     char *start;
     char *end;
@@
-    memcpy(anchor, start, end - start);
-    anchor[end - start] = '\0';
+    size_t len = (size_t)(end - start);
+    if (len >= anchor_cap)
+        return NULL;
+    memcpy(anchor, start, len);
+    anchor[len] = '\0';
     return anchor;
 }
@@
-char *get_tag(char *line, char *tag)
+char *get_tag(char *line, char *tag, size_t tag_cap)
 {
     char *start;
     char *end;
@@
-    if ((end = strchr(line, '>')) == NULL)
+    if ((end = strchr(start + 1, '>')) == NULL)
         return NULL;
-    memcpy(tag, start + 1, end - start - 1);
-    tag[end - start - 1] = '\0';
+    size_t len = (size_t)(end - start - 1);
+    if (len >= tag_cap)
+        return NULL;
+    memcpy(tag, start + 1, len);
+    tag[len] = '\0';
     return tag;
 }

Also applies to: 206-230

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 85
- 86, The test harness extracts token spans into fixed-size stack buffers
anchor[256] and tag[256] without bounds checking; ensure any computed length for
these copies is validated to be >=0 and < sizeof(anchor) (and sizeof(tag)),
clamp the copy length to sizeof-1, perform the copy with a bounded routine and
explicitly null-terminate, and handle the case where a '>' appears before '<' by
treating it as an error/empty span; apply the same fixes to the other copy site
around the 206-230 region so both anchor and tag extractions are safe.

Comment on lines +131 to +138
char value[1024];
int style;

get_value(line, value, &style);
implicit = (get_tag(line, tag) == NULL);

ok = yaml_scalar_event_initialize(&event, (yaml_char_t *)
get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, -1, implicit, implicit, style);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '131,138p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
sed -n '264,285p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
rg -n -C2 'yaml_scalar_event_initialize' rust/md4x-sys/vendor/libyaml/include/yaml.h

Repository: unjs/md4x

Length of output: 1221


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find get_value declaration
rg -n 'get_value.*\(' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c | head -20

# Count lines in file to understand structure
wc -l rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c

# Check around line 268-269 for declaration
sed -n '265,275p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c

Repository: unjs/md4x

Length of output: 635


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# More comprehensive search for get_value calls
rg -n 'get_value\(' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c

# Look at context around line 134
sed -n '128,142p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c

# Check complete function definition
sed -n '233,290p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c

Repository: unjs/md4x

Length of output: 2529


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify yaml_scalar_event_initialize signature accepts int length
rg -n -A3 'yaml_scalar_event_initialize' rust/md4x-sys/vendor/libyaml/include/yaml.h | head -20

Repository: unjs/md4x

Length of output: 272


Pass the scalar length explicitly when \0 escapes are supported.

get_value() can materialize embedded NUL bytes (line 273: else if (*c == '0') value[i++] = '\0';), but yaml_scalar_event_initialize(..., -1, ...) at line 134 makes libyaml treat value as a C string and stop at the first NUL. Any test case containing \0 will be truncated on emit. The yaml_scalar_event_initialize() signature (line 580–582 in yaml.h) exposes an explicit int length parameter to handle this.

Fix by changing get_value() to return the scalar length (line 11 declaration, line 233 definition), then pass this length instead of -1 at the call site (line 134).

Suggested fix
-void get_value(char *line, char *value, int *style);
+size_t get_value(char *line, char *value, int *style);
@@
-            get_value(line, value, &style);
+            size_t value_len = get_value(line, value, &style);
             implicit = (get_tag(line, tag) == NULL);
 
             ok = yaml_scalar_event_initialize(&event, (yaml_char_t *)
-                                              get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, -1, implicit, implicit, style);
+                                              get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, (int)value_len, implicit, implicit, style);
@@
-void get_value(char *line, char *value, int *style)
+size_t get_value(char *line, char *value, int *style)
 {
@@
     }
     value[i] = '\0';
+    return (size_t)i;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 131
- 138, get_value currently may write embedded NULs but the caller in
run-emitter-test-suite.c passes -1 to yaml_scalar_event_initialize, causing
truncation; change get_value to return the actual scalar length (update its
prototype and definition) and capture that length at the call site, then pass
that length instead of -1 to yaml_scalar_event_initialize (keep using
get_anchor, get_tag and the same value buffer); ensure any callers of get_value
use the new return value.

Comment on lines +36 to +38
else if (!foundfile) {
input = fopen(argv[i], "rb");
foundfile = 1;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Check fopen() before setting foundfile.

A failed fopen() still marks the file as consumed. With NDEBUG, the later assert(input) disappears and yaml_parser_set_input_file() receives a null FILE *.

🐛 Suggested fix
         else if (!foundfile) {
-            input = fopen(argv[i], "rb");
-            foundfile = 1;
+            input = fopen(argv[i], "rb");
+            if (!input) {
+                perror(argv[i]);
+                return 1;
+            }
+            foundfile = 1;
         }
@@
-    assert(input);

Also applies to: 43-46

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.c` around lines 36 -
38, The code sets foundfile before checking fopen's result, so a failed fopen
can leave input NULL and later call yaml_parser_set_input_file(NULL); change the
logic in the main loop (where variables input and foundfile are used and where
yaml_parser_set_input_file is called) to only set foundfile = 1 after verifying
input != NULL, and if fopen fails handle the error (e.g., print a message and
continue or exit) rather than proceeding; ensure both occurrences (the block
around the current fopen and the similar block at lines 43-46) perform the same
NULL-check-before-setting-foundfile and avoid passing a NULL FILE* to
yaml_parser_set_input_file.

@pi0 pi0 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Rust Bindings

The FFI bindings are well done — all types, structs, flags, and function signatures correctly match the C headers. Nice work!

Critical: crates.io publishing is broken

md4x-sys/build.rs references C sources at ../../src/ (relative to crate dir). cargo publish only packages files within the crate root, so anyone installing from crates.io will get a build failure — the C sources won't be in the tarball.

Suggested minimal fix (2 files changed, no Docker needed):

  1. Update build.rs to try a local vendor/md4x/ first, falling back to ../../src/ for local dev:
// In build.rs, replace the hardcoded ../../src/ with:
let src_dir = {
    let vendored = manifest_dir.join("vendor/md4x");
    if vendored.exists() {
        vendored
    } else {
        manifest_dir.join("../../src")
    }
};
  1. Add a small publish script (e.g. rust/publish.sh) that temporarily vendors the C sources before publishing:
#!/bin/bash
set -e
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
SYS_DIR="$SCRIPT_DIR/md4x-sys"
VENDOR="$SYS_DIR/vendor/md4x"

# Copy C sources for crates.io packaging
mkdir -p "$VENDOR" "$VENDOR/renderers"
cp "$SCRIPT_DIR/../src/md4x.c" "$SCRIPT_DIR/../src/md4x.h" \
   "$SCRIPT_DIR/../src/entity.c" "$SCRIPT_DIR/../src/entity.h" "$VENDOR/"
cp "$SCRIPT_DIR/../src/renderers/"*.{c,h} "$VENDOR/renderers/"

# Publish
cargo publish -p md4x-sys --locked "$@"
sleep 30
cargo publish -p md4x --locked "$@"

# Cleanup
rm -rf "$VENDOR"

Then update .gitignore to add rust/md4x-sys/vendor/md4x/ so the copied sources aren't accidentally committed.

This is the standard -sys crate pattern. The vendored libyaml already follows it — just need the same for the md4x sources.

Minor issues

  • Missing crates.io metadata in both Cargo.toml: consider adding repository, readme, homepage
  • The 20s sleep between publishes in CI is fragile — crates.io index can take longer. The publish script above uses 30s, but a retry loop would be more robust
  • input.len() as u32 could use u32::try_from() to avoid silent truncation on >4GB inputs (unlikely but good practice)

@countradooku

Copy link
Copy Markdown
Author

I also would like ot make a full port in a separate repo if thats ok

- build.rs now checks vendor/md4x/ first (populated at publish time),
  falling back to ../../src/ for monorepo dev builds
- Add rust/publish.sh to vendor C sources, publish both crates with
  crates.io index polling, then clean up
- Gitignore rust/md4x-sys/vendor/md4x/ (transient, not committed)
- Add repository/homepage/readme metadata to both Cargo.toml files
- Replace input.len() as u32 with u32::try_from() in all render fns

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
.gitignore (1)

73-77: LGTM! Rust ignore patterns are appropriate.

The ignore patterns correctly handle Rust build outputs (rust/target/) and transient vendored sources (rust/md4x-sys/vendor/md4x/). The comment clarifies that vendored C sources are populated only during crates.io publish, which aligns with the build.rs fallback strategy described in the PR objectives.

Optional: Consider broader vendor pattern

If additional dependencies might be vendored in the future, you could consider ignoring the entire vendor directory:

-rust/md4x-sys/vendor/md4x/
+rust/md4x-sys/vendor/

However, the current specific pattern is fine if you intend to commit other files under vendor/ or want explicit control over what gets ignored.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore around lines 73 - 77, The .gitignore currently excludes Rust
build outputs and the specific vendored C sources (patterns rust/target/ and
rust/md4x-sys/vendor/md4x/) which is correct; if you want broader coverage for
any future vendored dependencies, add a more general vendor ignore (e.g., a
pattern covering vendor directories under rust like rust/**/vendor/ or
rust/vendor/) to complement or replace the specific md4x entry, otherwise leave
the existing specific patterns as-is.
rust/md4x/src/lib.rs (2)

426-430: Minor duplication: collect_heal is identical to collect_output.

The inner function duplicates the existing collect_output callback. Consider reusing the outer function.

♻️ Reuse existing callback
 pub fn heal(input: &str) -> Result<String, Error> {
     let mut buf: Vec<u8> = Vec::new();

-    unsafe extern "C" fn collect_heal(data: *const c_char, size: u32, userdata: *mut c_void) {
-        let buf = &mut *(userdata as *mut Vec<u8>);
-        let slice = std::slice::from_raw_parts(data as *const u8, size as usize);
-        buf.extend_from_slice(slice);
-    }
-
     let rc = unsafe {
         sys::md_heal(
             input.as_ptr() as *const c_char,
             u32::try_from(input.len()).map_err(|_| Error)?,
-            collect_heal,
+            collect_output,
             &mut buf as *mut Vec<u8> as *mut c_void,
         )
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 426 - 430, collect_heal is a duplicate of
collect_output; remove the redundant unsafe extern "C" fn collect_heal and reuse
collect_output where collect_heal is currently passed; ensure the call sites
that previously referenced collect_heal now pass collect_output (and that the
signature matches: fn(*const c_char, u32, *mut c_void)) so the userdata cast to
*mut Vec<u8> and the from_raw_parts/extend_from_slice logic remains unchanged.

264-277: Silent handling of NUL bytes in strings may surprise callers.

CString::new().unwrap_or_default() silently converts strings containing interior NUL bytes to empty strings. If a user passes title: Some("My\0Title"), it becomes an empty string without any indication.

Consider returning an error or documenting this behavior:

Option 1: Return error on invalid input
 let c_title = match opts.title {
     Some(t) => {
-        title_cstr = std::ffi::CString::new(t).unwrap_or_default();
+        title_cstr = std::ffi::CString::new(t).map_err(|_| Error)?;
         title_cstr.as_ptr()
     }
     None => ptr::null(),
 };
 let c_css = match opts.css_url {
     Some(u) => {
-        css_cstr = std::ffi::CString::new(u).unwrap_or_default();
+        css_cstr = std::ffi::CString::new(u).map_err(|_| Error)?;
         css_cstr.as_ptr()
     }
     None => ptr::null(),
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/md4x/src/lib.rs` around lines 264 - 277, The current code silently
swallows interior NUL errors by calling CString::new(...).unwrap_or_default()
for opts.title and opts.css_url (producing title_cstr and css_cstr and then
c_title/c_css), which turns invalid inputs into empty strings; instead validate
and propagate the error: call CString::new on opts.title and opts.css_url,
handle the Result (map or match) and return or propagate an Err with a clear
message if CString::new fails (include which field failed), ensuring callers are
informed of invalid NUL-containing strings rather than silently replacing them
with empty values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust/publish.sh`:
- Around line 23-24: The cargo publish calls run from the caller's CWD and must
be anchored to the Rust workspace; update both occurrences of the publish
command (the line with "cargo publish -p md4x-sys --locked \"$@\"" and the other
publish call around lines 36-37) to specify the workspace manifest with
--manifest-path rust/Cargo.toml (or equivalently run the command from the rust
directory) so Cargo resolves the workspace correctly when invoked from the repo
root.
- Around line 26-37: The polling loop that checks crates.io for md4x-sys (using
SYS_DIR and the cargo search/cargo metadata command) currently proceeds even if
indexing never succeeds; modify the script so that after the for-loop finishes
without finding the indexed version you print a clear error (including the
version fetched via cargo metadata) and exit non-zero (e.g., exit 1) to abort
publishing md4x; keep the existing success branch (echo "md4x-sys is indexed."
and break) unchanged and only add the failure check immediately after the loop
to fail fast when md4x-sys isn't indexed.

---

Nitpick comments:
In @.gitignore:
- Around line 73-77: The .gitignore currently excludes Rust build outputs and
the specific vendored C sources (patterns rust/target/ and
rust/md4x-sys/vendor/md4x/) which is correct; if you want broader coverage for
any future vendored dependencies, add a more general vendor ignore (e.g., a
pattern covering vendor directories under rust like rust/**/vendor/ or
rust/vendor/) to complement or replace the specific md4x entry, otherwise leave
the existing specific patterns as-is.

In `@rust/md4x/src/lib.rs`:
- Around line 426-430: collect_heal is a duplicate of collect_output; remove the
redundant unsafe extern "C" fn collect_heal and reuse collect_output where
collect_heal is currently passed; ensure the call sites that previously
referenced collect_heal now pass collect_output (and that the signature matches:
fn(*const c_char, u32, *mut c_void)) so the userdata cast to *mut Vec<u8> and
the from_raw_parts/extend_from_slice logic remains unchanged.
- Around line 264-277: The current code silently swallows interior NUL errors by
calling CString::new(...).unwrap_or_default() for opts.title and opts.css_url
(producing title_cstr and css_cstr and then c_title/c_css), which turns invalid
inputs into empty strings; instead validate and propagate the error: call
CString::new on opts.title and opts.css_url, handle the Result (map or match)
and return or propagate an Err with a clear message if CString::new fails
(include which field failed), ensuring callers are informed of invalid
NUL-containing strings rather than silently replacing them with empty values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dbd1c7a4-6d51-435f-818c-d5c3ea779954

📥 Commits

Reviewing files that changed from the base of the PR and between 1083f97 and b3aceaa.

📒 Files selected for processing (6)
  • .gitignore
  • rust/md4x-sys/Cargo.toml
  • rust/md4x-sys/build.rs
  • rust/md4x/Cargo.toml
  • rust/md4x/src/lib.rs
  • rust/publish.sh
🚧 Files skipped from review as they are similar to previous changes (3)
  • rust/md4x-sys/build.rs
  • rust/md4x-sys/Cargo.toml
  • rust/md4x/Cargo.toml

Comment thread rust/publish.sh
Comment on lines +23 to +24
echo "==> Publishing md4x-sys"
cargo publish -p md4x-sys --locked "$@"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

printf 'Cargo manifests found in this repo:\n'
fd '^Cargo\.toml$' -p

printf '\nRelevant lines from rust/publish.sh:\n'
nl -ba rust/publish.sh | sed -n '1,40p'

Repository: unjs/md4x

Length of output: 186


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

printf 'Cargo manifests found in this repo:\n'
fd '^Cargo\.toml$'

printf '\nContent of rust/publish.sh:\n'
cat -n rust/publish.sh

printf '\nChecking workspace setup:\n'
if [ -f "rust/Cargo.toml" ]; then
  echo "✓ rust/Cargo.toml exists"
  head -20 rust/Cargo.toml
fi

Repository: unjs/md4x

Length of output: 1775


Anchor both cargo publish calls to the Rust workspace.

Line 24 and Line 37 run from the caller's current directory, but Line 3 documents bash rust/publish.sh. If invoked from the repo root (as documented), Cargo will search for Cargo.toml starting from the repo root and find nothing, since the workspace manifest is at rust/Cargo.toml. This will cause the publish step to fail.

Suggested fix
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+WORKSPACE_MANIFEST="$SCRIPT_DIR/Cargo.toml"
 SYS_DIR="$SCRIPT_DIR/md4x-sys"
 VENDOR="$SYS_DIR/vendor/md4x"
 SRC="$SCRIPT_DIR/../src"
@@
 echo "==> Publishing md4x-sys"
-cargo publish -p md4x-sys --locked "$@"
+cargo publish --manifest-path "$WORKSPACE_MANIFEST" -p md4x-sys --locked "$@"
@@
 echo "==> Publishing md4x"
-cargo publish -p md4x --locked "$@"
+cargo publish --manifest-path "$WORKSPACE_MANIFEST" -p md4x --locked "$@"

Also applies to: 36-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/publish.sh` around lines 23 - 24, The cargo publish calls run from the
caller's CWD and must be anchored to the Rust workspace; update both occurrences
of the publish command (the line with "cargo publish -p md4x-sys --locked
\"$@\"" and the other publish call around lines 36-37) to specify the workspace
manifest with --manifest-path rust/Cargo.toml (or equivalently run the command
from the rust directory) so Cargo resolves the workspace correctly when invoked
from the repo root.

Comment thread rust/publish.sh
Comment on lines +26 to +37
echo "==> Waiting for crates.io index to update..."
for i in $(seq 1 12); do
sleep 10
if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then
echo " md4x-sys is indexed."
break
fi
echo " Not indexed yet (attempt $i/12)..."
done

echo "==> Publishing md4x"
cargo publish -p md4x --locked "$@"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

printf 'Polling/publish section:\n'
nl -ba rust/publish.sh | sed -n '23,39p'

python3 - <<'PY'
from pathlib import Path
text = Path("rust/publish.sh").read_text()
before_second_publish = text.split('echo "==> Publishing md4x"')[0]
print("\nFound explicit abort before second publish:", "exit 1" in before_second_publish)
PY

Repository: unjs/md4x

Length of output: 137


🏁 Script executed:

#!/bin/bash
# Check the polling section and look for exit statements
echo "=== Checking rust/publish.sh lines 20-45 ==="
sed -n '20,45p' rust/publish.sh

echo -e "\n=== Checking for 'exit' statements in entire script ==="
grep -n "exit" rust/publish.sh || echo "No exit statements found"

echo -e "\n=== Checking for abort/error handling after polling loop ==="
grep -n -A 5 "Not indexed yet" rust/publish.sh

Repository: unjs/md4x

Length of output: 1065


🏁 Script executed:

head -20 rust/publish.sh

Repository: unjs/md4x

Length of output: 593


Add explicit abort if md4x-sys fails to index.

The polling loop completes normally whether successful or not, allowing the script to proceed directly to publishing md4x without verifying the dependency is indexed. This turns a clear propagation delay into a confusing crates.io dependency resolution failure.

Suggested fix
 echo "==> Waiting for crates.io index to update..."
+indexed=false
 for i in $(seq 1 12); do
     sleep 10
     if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then
         echo "    md4x-sys is indexed."
+        indexed=true
         break
     fi
     echo "    Not indexed yet (attempt $i/12)..."
 done
+
+if [[ "$indexed" != true ]]; then
+    echo "md4x-sys did not appear in the crates.io index after 120 seconds; aborting md4x publish." >&2
+    exit 1
+fi
 
 echo "==> Publishing md4x"
 cargo publish -p md4x --locked "$@"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
echo "==> Waiting for crates.io index to update..."
for i in $(seq 1 12); do
sleep 10
if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then
echo " md4x-sys is indexed."
break
fi
echo " Not indexed yet (attempt $i/12)..."
done
echo "==> Publishing md4x"
cargo publish -p md4x --locked "$@"
echo "==> Waiting for crates.io index to update..."
indexed=false
for i in $(seq 1 12); do
sleep 10
if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then
echo " md4x-sys is indexed."
indexed=true
break
fi
echo " Not indexed yet (attempt $i/12)..."
done
if [[ "$indexed" != true ]]; then
echo "md4x-sys did not appear in the crates.io index after 120 seconds; aborting md4x publish." >&2
exit 1
fi
echo "==> Publishing md4x"
cargo publish -p md4x --locked "$@"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/publish.sh` around lines 26 - 37, The polling loop that checks crates.io
for md4x-sys (using SYS_DIR and the cargo search/cargo metadata command)
currently proceeds even if indexing never succeeds; modify the script so that
after the for-loop finishes without finding the indexed version you print a
clear error (including the version fetched via cargo metadata) and exit non-zero
(e.g., exit 1) to abort publishing md4x; keep the existing success branch (echo
"md4x-sys is indexed." and break) unchanged and only add the failure check
immediately after the loop to fail fast when md4x-sys isn't indexed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants