Integrate Rust crates with docs and CI/CD#7
Conversation
📝 WalkthroughWalkthroughAdds a Rust workspace with safe and sys crates for md4x, vendors libyaml 0.2.5 (C sources, build system, tests, docs), and updates CI/release workflows to build, test, and optionally publish the Rust crates to crates.io; also updates ignores and documentation to reflect the new Rust bindings. Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer
participant CI as GitHub Actions (ci.yml)
participant Rust as Rust Toolchain
participant Cargo as Cargo (workspace)
participant Crates as crates.io
Dev->>CI: push PR / tag
CI->>Rust: setup toolchain (dtolnay/rust-toolchain)
CI->>CI: cache Rust deps (swatinem/rust-cache)
CI->>Cargo: run `cargo test --workspace --locked` (in rust/)
alt release workflow with token
CI->>Cargo: publish md4x-sys (`cargo publish --locked`)
Cargo->>Crates: push crate index entry
CI->>Crates: poll/wait for index propagation
CI->>Cargo: publish md4x (`cargo publish --locked`)
Cargo->>Crates: push crate index entry
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
AGENTS.md (1)
137-150:⚠️ Potential issue | 🟡 MinorUpdate the build-output count in this section.
Line 137 still says “four static libraries”, but the list below now includes seven C libraries before the executable/WASM/NAPI/Rust targets. The heading is stale and will mislead contributors reading the build docs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AGENTS.md` around lines 137 - 150, The heading "Produces four static libraries" is stale; update that summary to reflect the actual number of C libraries listed (libmd4x, libmd4x-html, libmd4x-ast, libmd4x-ansi, libmd4x-meta, libmd4x-text, libmd4x-heal) so the doc accurately states seven C libraries are produced before the executable/WASM/NAPI/Rust targets; locate the heading above the list and change the count wording accordingly while leaving the individual items (md4x, md4x.wasm, md4x.{platform}-{arch}[-musl].node, rust/md4x-sys, rust/md4x) unchanged.
🟡 Minor comments (15)
rust/md4x-sys/vendor/libyaml/examples/strings.yaml-1-1 (1)
1-1:⚠️ Potential issue | 🟡 MinorTypo in key name: "unqouted" → "unquoted".
The key name has a spelling error. Since this is an example file demonstrating YAML string types, it should use the correct spelling for clarity.
📝 Proposed fix
-unqouted: string +unquoted: string🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/examples/strings.yaml` at line 1, Fix the spelling mistake in the YAML example by renaming the key "unqouted" to "unquoted" in strings.yaml; update the example entry so the key is correctly spelled ("unquoted: string") to keep the documentation/example consistent and clear.rust/md4x-sys/vendor/libyaml/docker/README.mkd-6-16 (1)
6-16:⚠️ Potential issue | 🟡 MinorFix the README example path.
The example uses
dockerfiles/ubuntu-16.04, but this tree is underdocker/, so the command as written does not match the files added in this PR. Line 6 also reads awkwardly (The current list is supports:), so this snippet could use a quick docs pass.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/docker/README.mkd` around lines 6 - 16, Update the README.mkd to correct the awkward header and fix the example Dockerfile path: replace the sentence "The current list is supports:" with a clear heading like "Supported platforms:" and change the example docker build command to point at the repo's docker directory (e.g., use the docker/dockerfiles/ubuntu-16.04 path in the -f argument instead of dockerfiles/ubuntu-16.04) so the example matches the files added in this PR.rust/md4x-sys/vendor/libyaml/.makefile-31-37 (1)
31-37:⚠️ Potential issue | 🟡 MinorPoint
SOURCE_FILESat the real source tree, not a generated test artifact.
find tests/run-test-suitefails on a fresh checkout since that directory is missing. The current implementation leavesindentwith no inputs, defeating the "pre-bootstrap" helper use case described at the top of the file. The commented original line or a pattern likefind . -type f -name '*.c'would correctly capture the source files insrc/andtests/.Suggested fix
-# SOURCE_FILES := $(shell find . | grep '\.c$$') -SOURCE_FILES := $(shell find tests/run-test-suite | grep '\.c$$') +SOURCE_FILES := $(shell find . -type f -name '*.c')🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/.makefile` around lines 31 - 37, The MAKEFILE sets SOURCE_FILES to the wrong tree (tests/run-test-suite) causing fresh checkouts to miss sources; change the SOURCE_FILES assignment to search the real source tree (e.g., use find . -type f -name '*.c' or the original commented pattern) so it captures src/ and tests/, and keep the INDENT fallback behavior intact by ensuring SOURCE_FILES is non-empty before invoking INDENT; update the SOURCE_FILES variable (and remove the hard-coded tests/run-test-suite) and verify functions referencing SOURCE_FILES and INDENT still work during pre-bootstrap.rust/md4x-sys/vendor/libyaml/CMakeLists.txt-84-88 (1)
84-88:⚠️ Potential issue | 🟡 MinorHonor
INSTALL_INCLUDE_DIRin the header install rule.The target's install interface uses
${INSTALL_INCLUDE_DIR}, but the actual header install step hard-codesinclude. Any non-default include-dir override produces an inconsistent package layout.♻️ Suggested fix
install( FILES include/yaml.h - DESTINATION include COMPONENT Development + DESTINATION "${INSTALL_INCLUDE_DIR}" COMPONENT Development )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/CMakeLists.txt` around lines 84 - 88, The install rule currently hard-codes "include" instead of using the same variable as the install interface; update the install(FILES ... DESTINATION ...) call that installs include/yaml.h to use ${INSTALL_INCLUDE_DIR} (the project install include dir variable) so header installation honors non-default include-dir overrides and matches the target's install interface.rust/md4x-sys/vendor/libyaml/src/emitter.c-483-485 (1)
483-485:⚠️ Potential issue | 🟡 Minor
assert(1)always passes and provides no protection for invalid emitter states.This assertion is meant to catch invalid states but always succeeds. Use
assert(0)to trigger a failure when an unexpected state is encountered.Proposed fix
default: - assert(1); /* Invalid state. */ + assert(0); /* Invalid state. */ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/emitter.c` around lines 483 - 485, The default branch in the emitter state switch uses assert(1) which never fails; update the invalid-state guard in emitter.c (the switch's default case in the emitter state handling, e.g., within the function containing that switch) to assert(0) so an unexpected emitter state triggers a failure (or alternatively call abort()/yaml_emitter_set_error with context if you prefer explicit error handling), ensuring the invalid-state path actually fails during debugging.rust/md4x-sys/vendor/libyaml/src/reader.c-396-398 (1)
396-398:⚠️ Potential issue | 🟡 Minor
assert(1)always passes and provides no protection for invalid encoding states.The comment says "Impossible" but the assertion
assert(1)always succeeds. Useassert(0)to catch unexpected encoding values during debugging.Proposed fix
default: - assert(1); /* Impossible. */ + assert(0); /* Impossible. */ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/reader.c` around lines 396 - 398, The switch default in reader.c currently uses assert(1) which always succeeds; update the default case's assertion to assert(0) (or an equivalent failing assertion) so unexpected encoding states are caught during debugging—locate the default: branch in the switch where assert(1) is used and replace it with assert(0) to ensure the "Impossible" path actually triggers an assertion failure.rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c-77-77 (1)
77-77:⚠️ Potential issue | 🟡 MinorTypo in error message: "inialize" should be "initialize".
Proposed fix
- fprintf(stderr, "Could not inialize the emitter object\n"); + fprintf(stderr, "Could not initialize the emitter object\n");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c` at line 77, Fix the typo in the fprintf error message: change the string passed to fprintf(stderr, ...) that currently reads "Could not inialize the emitter object\n" to use the correct spelling "initialize" so the message becomes "Could not initialize the emitter object\n"; update the fprintf call in example-deconstructor.c (the error logging for the emitter object) accordingly.rust/md4x-sys/vendor/libyaml/tests/run-emitter.c-71-73 (1)
71-73:⚠️ Potential issue | 🟡 Minor
assert(1)always passes and provides no protection for unexpected event types.Similar to other files, this assertion is meant to catch unexpected event types but
assert(1)always succeeds. Useassert(0)to fail on unexpected cases during testing.Proposed fix
default: - assert(1); + assert(0); /* Unexpected event type. */ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter.c` around lines 71 - 73, The default branch in the switch handling event types in run-emitter.c currently uses assert(1) which always succeeds; change that assertion to assert(0) (or an equivalent failing assertion such as assert(!"unexpected event type")) so the test fails on unexpected event types — locate the default: case in the switch and replace the assert(1) call accordingly.rust/md4x-sys/vendor/libyaml/src/parser.c-299-301 (1)
299-301:⚠️ Potential issue | 🟡 Minor
assert(1)always passes and provides no protection for invalid states.This assertion is meant to catch invalid parser states during debugging, but
assert(1)always evaluates to true and does nothing. It should beassert(0)to trigger an assertion failure when an unexpected state is encountered.Proposed fix
default: - assert(1); /* Invalid state. */ + assert(0); /* Invalid state. */ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/parser.c` around lines 299 - 301, The default case in the switch inside parser.c currently uses assert(1) which is a no-op; change that assertion to fail on invalid states by replacing assert(1) with assert(0) (or an equivalent failing assertion like assert(!"Invalid state")) in the default: branch so unexpected parser states trigger a diagnostic; locate the default case in the parser.c switch and update the assert accordingly.rust/md4x-sys/vendor/libyaml/src/emitter.c-1340-1342 (1)
1340-1342:⚠️ Potential issue | 🟡 Minor
assert(1)always passes and provides no protection for unexpected scalar styles.Same issue as above - the assertion does nothing. Use
assert(0)for impossible cases.Proposed fix
default: - assert(1); /* Impossible. */ + assert(0); /* Impossible. */ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/emitter.c` around lines 1340 - 1342, The default branch in the switch handling scalar styles uses assert(1), which always passes; replace that with assert(0) (or an explicit failing assertion) so the impossible case actually triggers a failure—locate the default: assert(1); occurrence in emitter.c (the switch over scalar styles) and change it to assert(0) to enforce the impossible-case check.rust/md4x-sys/vendor/libyaml/src/dumper.c-117-118 (1)
117-118:⚠️ Potential issue | 🟡 MinorComment typo: refers to "emitter" but checks
documentparameter.The assertion comment on Line 118 appears to be copy-pasted from Line 117. It should refer to the
documentparameter being checked.Proposed fix
assert(emitter); /* Non-NULL emitter object is required. */ - assert(document); /* Non-NULL emitter object is expected. */ + assert(document); /* Non-NULL document object is expected. */🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/dumper.c` around lines 117 - 118, The comment on the second assert is incorrect: it refers to "emitter" while the check is for the document parameter; update the comment next to assert(document); to mention "document" (e.g., "Non-NULL document object is required."), keeping the surrounding asserts and semantics in dumper.c intact so the assertions read consistently for emitters and documents.rust/md4x-sys/vendor/libyaml/include/Makefile.am-16-17 (1)
16-17:⚠️ Potential issue | 🟡 Minor
dist-hookdepends onhtmltarget which is only defined when DOXYGEN is enabled.If DOXYGEN is not enabled (line 6 conditional), the
htmltarget won't exist, causingmake distto fail. The dependency should be conditional or thehtmlprerequisite should be moved inside the conditional block.Proposed fix
-dist-hook: html - cp -a $(top_builddir)/doc/html $(top_distdir)/doc +if DOXYGEN +dist-hook: html + cp -a $(top_builddir)/doc/html $(top_distdir)/doc +endif🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am` around lines 16 - 17, The dist-hook target currently lists html as a prerequisite even when DOXYGEN is not enabled, causing make dist to fail; update the Makefile.am so that the cp -a $(top_builddir)/doc/html $(top_distdir)/doc step is only added when the DOXYGEN conditional is active (i.e., move the html prerequisite and the dist-hook install/copy command inside the same DOXYGEN conditional block) or make the html dependency conditional (using Automake conditionals) so dist-hook does not depend on a non-existent html target when DOXYGEN is disabled; locate the dist-hook target and the DOXYGEN conditional in the file and adjust accordingly.rust/md4x/src/lib.rs-264-277 (1)
264-277:⚠️ Potential issue | 🟡 Minor
CString::new().unwrap_or_default()silently handles interior NUL bytes.If
titleorcss_urlcontains an interior NUL byte,CString::new()returnsErr, andunwrap_or_default()silently produces an empty CString. This may be surprising to users. Consider either documenting this behavior or returning an error.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 264 - 277, The current use of std::ffi::CString::new(...).unwrap_or_default() for title_cstr and css_cstr (constructed from opts.title and opts.css_url) silently converts strings containing interior NULs into empty CStrings; instead, explicitly handle the Result from CString::new for both title_cstr and css_cstr: propagate an error (update the enclosing function to return Result and return a descriptive error) or at minimum map the Err into a clear runtime error/log message rather than unwrap_or_default; update the title_cstr/css_cstr construction and any function signature that needs to change to surface this validation failure to callers.rust/md4x/src/lib.rs-229-238 (1)
229-238:⚠️ Potential issue | 🟡 MinorPotential truncation when casting
input.len()tou32.For inputs larger than
u32::MAX(4 GiB), this cast will silently truncate, causing only a portion of the input to be parsed. While extremely rare in practice, consider adding a length check or documenting the limitation.🛡️ Optional: Add length validation
pub fn render_html( input: &str, parser_flags: ParserFlags, renderer_flags: HtmlFlags, ) -> Result<String, Error> { + if input.len() > u32::MAX as usize { + return Err(Error); + } let mut buf: Vec<u8> = Vec::new(); let rc = unsafe {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 229 - 238, The call to sys::md_html casts input.len() to u32 which can truncate for inputs > u32::MAX; update the code around the sys::md_html invocation to validate input.len() before casting (e.g., check if input.len() > u32::MAX) and return or propagate a clear error instead of calling sys::md_html with a truncated length; reference the input variable and the sys::md_html call so you add the length check immediately before creating rc and only cast to u32 after confirming the size is within bounds.rust/md4x-sys/vendor/libyaml/src/api.c-404-409 (1)
404-409:⚠️ Potential issue | 🟡 MinorTypo:
emptyshould beemitterinSTACK_EMPTYmacro call.The first argument to
STACK_EMPTYshould beemitter, notempty. This is inconsistent with the original libyaml source code and other uses of the macro throughout the function. Although the macro doesn't use this parameter in its expansion, the typo should be corrected for consistency with the upstream source.Fix
- while (!STACK_EMPTY(empty, emitter->tag_directives)) { + while (!STACK_EMPTY(emitter, emitter->tag_directives)) { yaml_tag_directive_t tag_directive = POP(emitter, emitter->tag_directives);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/src/api.c` around lines 404 - 409, The macro call uses a typo: replace the incorrect first argument `empty` with `emitter` in the STACK_EMPTY invocation that guards popping tag_directives; specifically update the line containing STACK_EMPTY(empty, emitter->tag_directives) so it reads STACK_EMPTY(emitter, emitter->tag_directives) to match the other uses (see POP(emitter, emitter->tag_directives), yaml_tag_directive_t handling, and STACK_DEL(emitter, emitter->tag_directives)).
🧹 Nitpick comments (12)
.github/workflows/release.yml (1)
63-65: Consider increasing the crates.io index wait time.A 20-second sleep may not be sufficient for the crates.io index to propagate, especially under load. The
md4xcrate depends onmd4x-sys, and if the index hasn't updated, the publish will fail.♻️ Suggested improvement with retry logic or longer wait
- name: Wait for crates.io index if: ${{ env.CARGO_REGISTRY_TOKEN != '' }} - run: sleep 20 + run: sleep 30Or use a more robust polling approach:
- name: Wait for crates.io index if: ${{ env.CARGO_REGISTRY_TOKEN != '' }} run: | for i in {1..10}; do cargo search md4x-sys 2>/dev/null | grep -q "^md4x-sys" && break echo "Waiting for crates.io index... (attempt $i)" sleep 10 done🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/release.yml around lines 63 - 65, The "Wait for crates.io index" step currently uses a fixed 20s sleep (if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}), which is often too short for index propagation and causes downstream publishes of md4x/md4x-sys to fail; replace the single sleep with a robust polling/retry loop that runs cargo search md4x-sys (or similar index probe) up to N attempts with short sleeps between attempts, or at minimum increase the wait time substantially (e.g., multiple 10s intervals), and ensure the step still guards on the same token condition.rust/md4x-sys/Cargo.toml (1)
1-10: Consider adding recommended crates.io metadata.The core Cargo.toml configuration is correct with proper
linksdirective and build dependency. However, for better discoverability on crates.io, consider adding:📦 Suggested metadata additions
[package] name = "md4x-sys" version = "0.0.18" edition = "2021" description = "Raw FFI bindings to md4x, a CommonMark markdown parser" license = "MIT" links = "md4x" +repository = "https://github.com/unjs/md4x" +homepage = "https://github.com/unjs/md4x" +readme = "README.md" +keywords = ["markdown", "commonmark", "parser", "ffi"] +categories = ["parsing", "text-processing"] [build-dependencies] cc = "1"You may also want to add an
includeorexcludedirective to control what gets published to crates.io, especially given the vendored libyaml directory.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/Cargo.toml` around lines 1 - 10, Add recommended crates.io metadata to Cargo.toml for the md4x-sys package: include fields such as repository, homepage, documentation, readme, keywords, categories, and license-file to improve discoverability and linking; also add an include or exclude directive to control published files (exclude the vendored libyaml directory or explicitly include only src, Cargo.toml, README.md, LICENSE) so the crate archive doesn't ship unnecessary vendored sources. Ensure you update the [package] section in Cargo.toml and keep names like md4x-sys and the existing links field intact.rust/md4x-sys/build.rs (2)
13-25: Consider adding optimization flag to libyaml build for consistency.The md4x build (line 41) includes
-O2optimization, but the libyaml build does not have any optimization flags. For consistent performance characteristics, consider adding the same optimization level to the libyaml build.♻️ Proposed fix
cc::Build::new() .include(libyaml.join("include")) .include(libyaml.join("src")) .define("YAML_DECLARE_STATIC", None) .define("YAML_VERSION_MAJOR", "0") .define("YAML_VERSION_MINOR", "2") .define("YAML_VERSION_PATCH", "5") .define("YAML_VERSION_STRING", "\"0.2.5\"") + .flag_if_supported("-O2") .file(libyaml.join("src/api.c")) .file(libyaml.join("src/reader.c")) .file(libyaml.join("src/scanner.c")) .file(libyaml.join("src/parser.c")) .compile("yaml");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/build.rs` around lines 13 - 25, The libyaml cc build invocation (cc::Build::new() ... .compile("yaml")) lacks an optimization setting; add the same optimization level used elsewhere (e.g., -O2) by configuring the cc build (use .opt_level("2") or add .flag("-O2")) on the cc::Build chain before .compile("yaml") so libyaml is compiled with consistent optimization as md4x's build.
52-53:cargo:rerun-if-changedon directories may not trigger rebuilds as expected.Using
cargo:rerun-if-changedwith directory paths only detects changes to the directory's metadata (e.g., adding/removing files), not changes to file contents within. For more reliable rebuild detection, consider using a build dependency likeglobto enumerate files, or list specific source files.♻️ Suggested improvement using explicit file tracking
- println!("cargo:rerun-if-changed={}", src.display()); - println!("cargo:rerun-if-changed={}", libyaml.display()); + // Track specific source files for reliable rebuild detection + for entry in std::fs::read_dir(&src).unwrap() { + let path = entry.unwrap().path(); + if path.extension().map_or(false, |e| e == "c" || e == "h") { + println!("cargo:rerun-if-changed={}", path.display()); + } + } + for entry in std::fs::read_dir(&renderers).unwrap() { + let path = entry.unwrap().path(); + if path.extension().map_or(false, |e| e == "c" || e == "h") { + println!("cargo:rerun-if-changed={}", path.display()); + } + } + println!("cargo:rerun-if-changed={}", libyaml.join("src").display()); + println!("cargo:rerun-if-changed={}", libyaml.join("include").display());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/build.rs` around lines 52 - 53, The two println!("cargo:rerun-if-changed={}", src.display()) and println!("cargo:rerun-if-changed={}", libyaml.display()) emit directory paths which only track metadata changes; update build.rs (e.g., in the main build script where src and libyaml are used) to enumerate the actual source files (using a glob or walking the directories) and call println!("cargo:rerun-if-changed={}", path.display()) for each file found so that Cargo will rerun the build when file contents change rather than only when directory metadata changes.rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c (1)
2-5: Add the direct<string.h>include.Lines 22-24 and Lines 30-41 use
memsetandstrcmp, so depending onyaml.hto provide those declarations is brittle and can break once indirect includes change.Proposed fix
`#include` <yaml.h> `#include` <stdlib.h> `#include` <stdio.h> +#include <string.h>Also applies to: 22-24, 30-41
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c` around lines 2 - 5, The file example-reformatter.c relies on C string functions but doesn't include <string.h>; add a direct `#include` <string.h> alongside the existing includes at the top of example-reformatter.c so declarations for memset and strcmp are guaranteed (these functions are used in the code paths around the memset and strcmp calls).rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml (1)
1-57: If this is meant to add CI here, this workflow is inert from this path.GitHub Actions only discovers workflows from the repository-root
.github/workflows, so the copy underrust/md4x-sys/vendor/libyaml/...will never execute in this repo. Mirror the job into the top-level workflow if you want libyaml coverage inmd4x.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml` around lines 1 - 57, The workflow file (name: linux/mac) lives under a nested vendor directory so GitHub Actions won't discover it; to fix, move or duplicate this workflow into the repository-root workflows directory (.github/workflows/) or add an equivalent top-level workflow that contains the same "jobs: build" job and steps (bootstrap/configure/make/test, matrix strategy, install steps and compiler matrix) so libyaml tests run in CI; ensure the workflow name, matrix settings (matrix.compiler, matrix.os), and triggers (on: push/pull_request) are preserved in the top-level main.yml.rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt (1)
7-26: Register the new suite runners with CTest or another explicit CI entrypoint.
ctestonly knows abouttest-versionandtest-readerhere. Unless another job invokes them,run-parser-test-suiteandrun-emitter-test-suiteare build-only in CMake-based CI, so regressions in the new harnesses can slip through.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt` around lines 7 - 26, CTest is only registering test-version and test-reader, so the new harness runners (run-parser-test-suite, run-emitter-test-suite, and any other run-*-test-suite executables) are not executed by CI; update the CMakeLists.txt to call add_test for each runner you want executed (e.g., add_test(NAME run-parser-test-suite COMMAND run-parser-test-suite) and add_test(NAME run-emitter-test-suite COMMAND run-emitter-test-suite)), and ensure any other run-*-test-suite executables produced by add_yaml_executable are likewise registered so CTest will run them.rust/md4x-sys/vendor/libyaml/Makefile.am (1)
24-33: Use$(MAKE)for recursive invocations.The bare
makecalls here won't reliably forward the parent make's jobserver and flags, so-j,V=1,DESTDIR, and similar CI/release options can get lost.♻️ Suggested fix
bootstrap: maintainer-clean ./bootstrap ./configure - make + $(MAKE) @@ test: all - make -C tests check-TESTS + $(MAKE) -C tests check-TESTS @@ test-suite: tests/run-test-suite all - make -C $< test + $(MAKE) -C $< test @@ docker-dist: packaging - make -C $</docker libyaml-dist + $(MAKE) -C $</docker libyaml-dist @@ docker-test-pyyaml: packaging - make -C $</docker test-pyyaml + $(MAKE) -C $</docker test-pyyamlAlso applies to: 50-54
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/Makefile.am` around lines 24 - 33, Replace bare make invocations in the Makefile targets with $(MAKE) so jobserver and make flags propagate: in the bootstrap target use $(MAKE) instead of make; in the test target use $(MAKE) -C tests check-TESTS; and in the test-suite target use $(MAKE) -C $< test. Update any other occurrences (lines noted around the bootstrap, test, and test-suite targets) to use $(MAKE) consistently.rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c (1)
2-5: Missing#include <string.h>formemsetandstrcmp.The code uses
memset(lines 23-26) andstrcmp(lines 32, 37, 42) but doesn't explicitly include<string.h>. While this may work ifyaml.htransitively includes it, it's better to explicitly include required headers for portability.Proposed fix
`#include` <yaml.h> `#include` <stdlib.h> `#include` <stdio.h> +#include <string.h>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c` around lines 2 - 5, Add an explicit `#include` <string.h> to example-deconstructor.c so the declarations for memset and strcmp are available; place it with the other includes (near `#include` <stdlib.h> and `#include` <stdio.h>) to ensure uses of memset (lines invoking buffer zeroing) and strcmp (lines comparing YAML event strings) are properly declared.rust/md4x-sys/vendor/libyaml/include/Makefile.am (1)
1-4: Misleading variable name:AM_CPPFLAGStypically holds preprocessor flags, not file lists.
AM_CPPFLAGSconventionally holds C preprocessor flags like-Iand-D. Using it to store a header file name is confusing and non-idiomatic. Consider renaming to something likeLIBYAML_HEADERSfor clarity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am` around lines 1 - 4, The variable AM_CPPFLAGS is misused to list a header file; rename it to a clear identifier like LIBYAML_HEADERS and update usages accordingly: replace the definition AM_CPPFLAGS = yaml.h with LIBYAML_HEADERS = yaml.h and change nobase_include_HEADERS = $(AM_CPPFLAGS) to nobase_include_HEADERS = $(LIBYAML_HEADERS), making sure no other rules or files rely on AM_CPPFLAGS for preprocessor flags (if they do, move actual flags into AM_CPPFLAGS and keep header list in LIBYAML_HEADERS).rust/md4x/src/lib.rs (2)
33-43: Consider adding context to theErrortype.The current
Erroris a unit struct with no information about what failed. Consider adding optional context (e.g., function name or error code) to aid debugging.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 33 - 43, Change the unit `Error` into a contextual error struct (e.g., `pub struct Error { context: Option<String> }`) and update the `Display` impl to include the context when present (so `fmt` prints "md4x rendering failed" plus the context); add small helpers like `Error::new()` and `Error::with_context<C: Into<String>>(c: C)` to construct errors with optional context, and keep `impl std::error::Error for Error {}` as-is; update any call sites that construct `Error` to use the new constructors.
426-430:collect_healduplicatescollect_output.The
collect_healfunction is identical tocollect_output. Consider reusing the existing callback.♻️ Reuse existing callback
pub fn heal(input: &str) -> Result<String, Error> { let mut buf: Vec<u8> = Vec::new(); - - unsafe extern "C" fn collect_heal(data: *const c_char, size: u32, userdata: *mut c_void) { - let buf = &mut *(userdata as *mut Vec<u8>); - let slice = std::slice::from_raw_parts(data as *const u8, size as usize); - buf.extend_from_slice(slice); - } let rc = unsafe { sys::md_heal( input.as_ptr() as *const c_char, input.len() as u32, - collect_heal, + collect_output, &mut buf as *mut Vec<u8> as *mut c_void, ) };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 426 - 430, collect_heal is a duplicate of collect_output; remove the duplicated unsafe extern "C" fn collect_heal and reuse the existing collect_output callback instead: delete collect_heal definition and update any places that passed collect_heal (callback registrations or FFI calls) to pass collect_output, making sure the signature matches (unsafe extern "C" fn(data: *const c_char, size: u32, userdata: *mut c_void)) and adjust imports/visibility if necessary so collect_output is visible where the callback is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/release.yml:
- Around line 56-72: Update the three steps named "Publish md4x-sys to
crates.io", "Wait for crates.io index", and "Publish md4x to crates.io" to stop
using the secrets context in their if conditions; instead set
CARGO_REGISTRY_TOKEN as an env at the job level or in a prior step and change
the if expressions to use env (e.g. if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}),
and remove the direct use of ${{ secrets.CARGO_REGISTRY_TOKEN }} inside the if
so the conditional works correctly in GitHub Actions.
In `@rust/md4x-sys/vendor/libyaml/src/api.c`:
- Around line 1129-1146: In yaml_document_delete, add the missing local
declaration for the variable named context (use the same declaration used in
yaml_document_add_scalar); place that declaration at the start of the
yaml_document_delete function so the STACK_EMPTY, POP, and STACK_DEL macros can
reference it (i.e., copy the context variable declaration from
yaml_document_add_scalar into yaml_document_delete).
In `@rust/md4x-sys/vendor/libyaml/src/writer.c`:
- Around line 116-120: The low-surrogate high byte is taking the wrong bits (it
uses ((value >> 8) & 0xFF)) so most scalars > U+10000 produce invalid UTF-16;
update the assignment that writes the low-surrogate high byte (the
emitter->raw_buffer.last[high+2] write using value) to mask down to the top 2
bits of the 10-bit low-surrogate payload (use ((value >> 8) & 0x03) instead of &
0xFF) so the low surrogate becomes 0xDC00 + (value & 0x3FF) and U+10400 will end
with 0xDC00.
In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.c`:
- Around line 426-435: The code is creating the scalar implicitness mapping but
uses the wrong top-level key "version"; change the yaml_document_add_scalar call
that sets the key to use "implicit" instead of "version" so the mapping appended
with yaml_document_append_mapping_pair(&output_document, properties, key, map)
correctly names the section; update the scalar creation where key is assigned
(yaml_document_add_scalar) to "implicit" to restore the expected schema for
scalar events.
In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c`:
- Around line 58-60: The code sets foundfile immediately after calling
fopen(argv[i], "rb") even when fopen returns NULL; change the logic in
run-emitter-test-suite.c so you first call fopen and check its return (input)
for NULL, only then set foundfile and proceed; update both occurrences where
fopen/ foundfile are used (the block with input = fopen(argv[i], "rb") and the
similar block at the other occurrence around the 72-line area) to avoid marking
the file as found when fopen fails and to handle/propagate the fopen error
appropriately.
- Around line 85-86: The test harness extracts token spans into fixed-size stack
buffers anchor[256] and tag[256] without bounds checking; ensure any computed
length for these copies is validated to be >=0 and < sizeof(anchor) (and
sizeof(tag)), clamp the copy length to sizeof-1, perform the copy with a bounded
routine and explicitly null-terminate, and handle the case where a '>' appears
before '<' by treating it as an error/empty span; apply the same fixes to the
other copy site around the 206-230 region so both anchor and tag extractions are
safe.
- Around line 131-138: get_value currently may write embedded NULs but the
caller in run-emitter-test-suite.c passes -1 to yaml_scalar_event_initialize,
causing truncation; change get_value to return the actual scalar length (update
its prototype and definition) and capture that length at the call site, then
pass that length instead of -1 to yaml_scalar_event_initialize (keep using
get_anchor, get_tag and the same value buffer); ensure any callers of get_value
use the new return value.
In `@rust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.c`:
- Around line 36-38: The code sets foundfile before checking fopen's result, so
a failed fopen can leave input NULL and later call
yaml_parser_set_input_file(NULL); change the logic in the main loop (where
variables input and foundfile are used and where yaml_parser_set_input_file is
called) to only set foundfile = 1 after verifying input != NULL, and if fopen
fails handle the error (e.g., print a message and continue or exit) rather than
proceeding; ensure both occurrences (the block around the current fopen and the
similar block at lines 43-46) perform the same
NULL-check-before-setting-foundfile and avoid passing a NULL FILE* to
yaml_parser_set_input_file.
---
Outside diff comments:
In `@AGENTS.md`:
- Around line 137-150: The heading "Produces four static libraries" is stale;
update that summary to reflect the actual number of C libraries listed (libmd4x,
libmd4x-html, libmd4x-ast, libmd4x-ansi, libmd4x-meta, libmd4x-text,
libmd4x-heal) so the doc accurately states seven C libraries are produced before
the executable/WASM/NAPI/Rust targets; locate the heading above the list and
change the count wording accordingly while leaving the individual items (md4x,
md4x.wasm, md4x.{platform}-{arch}[-musl].node, rust/md4x-sys, rust/md4x)
unchanged.
---
Minor comments:
In `@rust/md4x-sys/vendor/libyaml/.makefile`:
- Around line 31-37: The MAKEFILE sets SOURCE_FILES to the wrong tree
(tests/run-test-suite) causing fresh checkouts to miss sources; change the
SOURCE_FILES assignment to search the real source tree (e.g., use find . -type f
-name '*.c' or the original commented pattern) so it captures src/ and tests/,
and keep the INDENT fallback behavior intact by ensuring SOURCE_FILES is
non-empty before invoking INDENT; update the SOURCE_FILES variable (and remove
the hard-coded tests/run-test-suite) and verify functions referencing
SOURCE_FILES and INDENT still work during pre-bootstrap.
In `@rust/md4x-sys/vendor/libyaml/CMakeLists.txt`:
- Around line 84-88: The install rule currently hard-codes "include" instead of
using the same variable as the install interface; update the install(FILES ...
DESTINATION ...) call that installs include/yaml.h to use ${INSTALL_INCLUDE_DIR}
(the project install include dir variable) so header installation honors
non-default include-dir overrides and matches the target's install interface.
In `@rust/md4x-sys/vendor/libyaml/docker/README.mkd`:
- Around line 6-16: Update the README.mkd to correct the awkward header and fix
the example Dockerfile path: replace the sentence "The current list is
supports:" with a clear heading like "Supported platforms:" and change the
example docker build command to point at the repo's docker directory (e.g., use
the docker/dockerfiles/ubuntu-16.04 path in the -f argument instead of
dockerfiles/ubuntu-16.04) so the example matches the files added in this PR.
In `@rust/md4x-sys/vendor/libyaml/examples/strings.yaml`:
- Line 1: Fix the spelling mistake in the YAML example by renaming the key
"unqouted" to "unquoted" in strings.yaml; update the example entry so the key is
correctly spelled ("unquoted: string") to keep the documentation/example
consistent and clear.
In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am`:
- Around line 16-17: The dist-hook target currently lists html as a prerequisite
even when DOXYGEN is not enabled, causing make dist to fail; update the
Makefile.am so that the cp -a $(top_builddir)/doc/html $(top_distdir)/doc step
is only added when the DOXYGEN conditional is active (i.e., move the html
prerequisite and the dist-hook install/copy command inside the same DOXYGEN
conditional block) or make the html dependency conditional (using Automake
conditionals) so dist-hook does not depend on a non-existent html target when
DOXYGEN is disabled; locate the dist-hook target and the DOXYGEN conditional in
the file and adjust accordingly.
In `@rust/md4x-sys/vendor/libyaml/src/api.c`:
- Around line 404-409: The macro call uses a typo: replace the incorrect first
argument `empty` with `emitter` in the STACK_EMPTY invocation that guards
popping tag_directives; specifically update the line containing
STACK_EMPTY(empty, emitter->tag_directives) so it reads STACK_EMPTY(emitter,
emitter->tag_directives) to match the other uses (see POP(emitter,
emitter->tag_directives), yaml_tag_directive_t handling, and STACK_DEL(emitter,
emitter->tag_directives)).
In `@rust/md4x-sys/vendor/libyaml/src/dumper.c`:
- Around line 117-118: The comment on the second assert is incorrect: it refers
to "emitter" while the check is for the document parameter; update the comment
next to assert(document); to mention "document" (e.g., "Non-NULL document object
is required."), keeping the surrounding asserts and semantics in dumper.c intact
so the assertions read consistently for emitters and documents.
In `@rust/md4x-sys/vendor/libyaml/src/emitter.c`:
- Around line 483-485: The default branch in the emitter state switch uses
assert(1) which never fails; update the invalid-state guard in emitter.c (the
switch's default case in the emitter state handling, e.g., within the function
containing that switch) to assert(0) so an unexpected emitter state triggers a
failure (or alternatively call abort()/yaml_emitter_set_error with context if
you prefer explicit error handling), ensuring the invalid-state path actually
fails during debugging.
- Around line 1340-1342: The default branch in the switch handling scalar styles
uses assert(1), which always passes; replace that with assert(0) (or an explicit
failing assertion) so the impossible case actually triggers a failure—locate the
default: assert(1); occurrence in emitter.c (the switch over scalar styles) and
change it to assert(0) to enforce the impossible-case check.
In `@rust/md4x-sys/vendor/libyaml/src/parser.c`:
- Around line 299-301: The default case in the switch inside parser.c currently
uses assert(1) which is a no-op; change that assertion to fail on invalid states
by replacing assert(1) with assert(0) (or an equivalent failing assertion like
assert(!"Invalid state")) in the default: branch so unexpected parser states
trigger a diagnostic; locate the default case in the parser.c switch and update
the assert accordingly.
In `@rust/md4x-sys/vendor/libyaml/src/reader.c`:
- Around line 396-398: The switch default in reader.c currently uses assert(1)
which always succeeds; update the default case's assertion to assert(0) (or an
equivalent failing assertion) so unexpected encoding states are caught during
debugging—locate the default: branch in the switch where assert(1) is used and
replace it with assert(0) to ensure the "Impossible" path actually triggers an
assertion failure.
In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c`:
- Line 77: Fix the typo in the fprintf error message: change the string passed
to fprintf(stderr, ...) that currently reads "Could not inialize the emitter
object\n" to use the correct spelling "initialize" so the message becomes "Could
not initialize the emitter object\n"; update the fprintf call in
example-deconstructor.c (the error logging for the emitter object) accordingly.
In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter.c`:
- Around line 71-73: The default branch in the switch handling event types in
run-emitter.c currently uses assert(1) which always succeeds; change that
assertion to assert(0) (or an equivalent failing assertion such as
assert(!"unexpected event type")) so the test fails on unexpected event types —
locate the default: case in the switch and replace the assert(1) call
accordingly.
In `@rust/md4x/src/lib.rs`:
- Around line 264-277: The current use of
std::ffi::CString::new(...).unwrap_or_default() for title_cstr and css_cstr
(constructed from opts.title and opts.css_url) silently converts strings
containing interior NULs into empty CStrings; instead, explicitly handle the
Result from CString::new for both title_cstr and css_cstr: propagate an error
(update the enclosing function to return Result and return a descriptive error)
or at minimum map the Err into a clear runtime error/log message rather than
unwrap_or_default; update the title_cstr/css_cstr construction and any function
signature that needs to change to surface this validation failure to callers.
- Around line 229-238: The call to sys::md_html casts input.len() to u32 which
can truncate for inputs > u32::MAX; update the code around the sys::md_html
invocation to validate input.len() before casting (e.g., check if input.len() >
u32::MAX) and return or propagate a clear error instead of calling sys::md_html
with a truncated length; reference the input variable and the sys::md_html call
so you add the length check immediately before creating rc and only cast to u32
after confirming the size is within bounds.
---
Nitpick comments:
In @.github/workflows/release.yml:
- Around line 63-65: The "Wait for crates.io index" step currently uses a fixed
20s sleep (if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}), which is often too
short for index propagation and causes downstream publishes of md4x/md4x-sys to
fail; replace the single sleep with a robust polling/retry loop that runs cargo
search md4x-sys (or similar index probe) up to N attempts with short sleeps
between attempts, or at minimum increase the wait time substantially (e.g.,
multiple 10s intervals), and ensure the step still guards on the same token
condition.
In `@rust/md4x-sys/build.rs`:
- Around line 13-25: The libyaml cc build invocation (cc::Build::new() ...
.compile("yaml")) lacks an optimization setting; add the same optimization level
used elsewhere (e.g., -O2) by configuring the cc build (use .opt_level("2") or
add .flag("-O2")) on the cc::Build chain before .compile("yaml") so libyaml is
compiled with consistent optimization as md4x's build.
- Around line 52-53: The two println!("cargo:rerun-if-changed={}",
src.display()) and println!("cargo:rerun-if-changed={}", libyaml.display()) emit
directory paths which only track metadata changes; update build.rs (e.g., in the
main build script where src and libyaml are used) to enumerate the actual source
files (using a glob or walking the directories) and call
println!("cargo:rerun-if-changed={}", path.display()) for each file found so
that Cargo will rerun the build when file contents change rather than only when
directory metadata changes.
In `@rust/md4x-sys/Cargo.toml`:
- Around line 1-10: Add recommended crates.io metadata to Cargo.toml for the
md4x-sys package: include fields such as repository, homepage, documentation,
readme, keywords, categories, and license-file to improve discoverability and
linking; also add an include or exclude directive to control published files
(exclude the vendored libyaml directory or explicitly include only src,
Cargo.toml, README.md, LICENSE) so the crate archive doesn't ship unnecessary
vendored sources. Ensure you update the [package] section in Cargo.toml and keep
names like md4x-sys and the existing links field intact.
In `@rust/md4x-sys/vendor/libyaml/.github/workflows/main.yml`:
- Around line 1-57: The workflow file (name: linux/mac) lives under a nested
vendor directory so GitHub Actions won't discover it; to fix, move or duplicate
this workflow into the repository-root workflows directory (.github/workflows/)
or add an equivalent top-level workflow that contains the same "jobs: build" job
and steps (bootstrap/configure/make/test, matrix strategy, install steps and
compiler matrix) so libyaml tests run in CI; ensure the workflow name, matrix
settings (matrix.compiler, matrix.os), and triggers (on: push/pull_request) are
preserved in the top-level main.yml.
In `@rust/md4x-sys/vendor/libyaml/include/Makefile.am`:
- Around line 1-4: The variable AM_CPPFLAGS is misused to list a header file;
rename it to a clear identifier like LIBYAML_HEADERS and update usages
accordingly: replace the definition AM_CPPFLAGS = yaml.h with LIBYAML_HEADERS =
yaml.h and change nobase_include_HEADERS = $(AM_CPPFLAGS) to
nobase_include_HEADERS = $(LIBYAML_HEADERS), making sure no other rules or files
rely on AM_CPPFLAGS for preprocessor flags (if they do, move actual flags into
AM_CPPFLAGS and keep header list in LIBYAML_HEADERS).
In `@rust/md4x-sys/vendor/libyaml/Makefile.am`:
- Around line 24-33: Replace bare make invocations in the Makefile targets with
$(MAKE) so jobserver and make flags propagate: in the bootstrap target use
$(MAKE) instead of make; in the test target use $(MAKE) -C tests check-TESTS;
and in the test-suite target use $(MAKE) -C $< test. Update any other
occurrences (lines noted around the bootstrap, test, and test-suite targets) to
use $(MAKE) consistently.
In `@rust/md4x-sys/vendor/libyaml/tests/CMakeLists.txt`:
- Around line 7-26: CTest is only registering test-version and test-reader, so
the new harness runners (run-parser-test-suite, run-emitter-test-suite, and any
other run-*-test-suite executables) are not executed by CI; update the
CMakeLists.txt to call add_test for each runner you want executed (e.g.,
add_test(NAME run-parser-test-suite COMMAND run-parser-test-suite) and
add_test(NAME run-emitter-test-suite COMMAND run-emitter-test-suite)), and
ensure any other run-*-test-suite executables produced by add_yaml_executable
are likewise registered so CTest will run them.
In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor.c`:
- Around line 2-5: Add an explicit `#include` <string.h> to
example-deconstructor.c so the declarations for memset and strcmp are available;
place it with the other includes (near `#include` <stdlib.h> and `#include`
<stdio.h>) to ensure uses of memset (lines invoking buffer zeroing) and strcmp
(lines comparing YAML event strings) are properly declared.
In `@rust/md4x-sys/vendor/libyaml/tests/example-reformatter.c`:
- Around line 2-5: The file example-reformatter.c relies on C string functions
but doesn't include <string.h>; add a direct `#include` <string.h> alongside the
existing includes at the top of example-reformatter.c so declarations for memset
and strcmp are guaranteed (these functions are used in the code paths around the
memset and strcmp calls).
In `@rust/md4x/src/lib.rs`:
- Around line 33-43: Change the unit `Error` into a contextual error struct
(e.g., `pub struct Error { context: Option<String> }`) and update the `Display`
impl to include the context when present (so `fmt` prints "md4x rendering
failed" plus the context); add small helpers like `Error::new()` and
`Error::with_context<C: Into<String>>(c: C)` to construct errors with optional
context, and keep `impl std::error::Error for Error {}` as-is; update any call
sites that construct `Error` to use the new constructors.
- Around line 426-430: collect_heal is a duplicate of collect_output; remove the
duplicated unsafe extern "C" fn collect_heal and reuse the existing
collect_output callback instead: delete collect_heal definition and update any
places that passed collect_heal (callback registrations or FFI calls) to pass
collect_output, making sure the signature matches (unsafe extern "C" fn(data:
*const c_char, size: u32, userdata: *mut c_void)) and adjust imports/visibility
if necessary so collect_output is visible where the callback is used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8f4a689f-7daa-4b06-9578-f704c0cc5669
⛔ Files ignored due to path filters (1)
rust/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (74)
.github/workflows/ci.yml.github/workflows/release.yml.gitignore.prettierignoreAGENTS.mdCHANGELOG.mdREADME.mdrust/Cargo.tomlrust/md4x-sys/Cargo.tomlrust/md4x-sys/build.rsrust/md4x-sys/src/lib.rsrust/md4x-sys/vendor/libyaml/.appveyor.ymlrust/md4x-sys/vendor/libyaml/.github/workflows/main.ymlrust/md4x-sys/vendor/libyaml/.gitignorerust/md4x-sys/vendor/libyaml/.indent.prorust/md4x-sys/vendor/libyaml/.makefilerust/md4x-sys/vendor/libyaml/.travis.ymlrust/md4x-sys/vendor/libyaml/CMakeLists.txtrust/md4x-sys/vendor/libyaml/Changesrust/md4x-sys/vendor/libyaml/Licenserust/md4x-sys/vendor/libyaml/Makefile.amrust/md4x-sys/vendor/libyaml/ReadMe.mdrust/md4x-sys/vendor/libyaml/announcement.msgrust/md4x-sys/vendor/libyaml/bootstraprust/md4x-sys/vendor/libyaml/configure.acrust/md4x-sys/vendor/libyaml/doc/doxygen.cfgrust/md4x-sys/vendor/libyaml/docker/README.mkdrust/md4x-sys/vendor/libyaml/docker/alpine-3.7rust/md4x-sys/vendor/libyaml/docker/fedora-25rust/md4x-sys/vendor/libyaml/docker/ubuntu-14.04rust/md4x-sys/vendor/libyaml/docker/ubuntu-16.04rust/md4x-sys/vendor/libyaml/examples/anchors.yamlrust/md4x-sys/vendor/libyaml/examples/array.yamlrust/md4x-sys/vendor/libyaml/examples/global-tag.yamlrust/md4x-sys/vendor/libyaml/examples/json.yamlrust/md4x-sys/vendor/libyaml/examples/mapping.yamlrust/md4x-sys/vendor/libyaml/examples/numbers.yamlrust/md4x-sys/vendor/libyaml/examples/strings.yamlrust/md4x-sys/vendor/libyaml/examples/tags.yamlrust/md4x-sys/vendor/libyaml/examples/yaml-version.yamlrust/md4x-sys/vendor/libyaml/include/Makefile.amrust/md4x-sys/vendor/libyaml/include/yaml.hrust/md4x-sys/vendor/libyaml/regression-inputs/clusterfuzz-testcase-minimized-5607885063061504.ymlrust/md4x-sys/vendor/libyaml/src/Makefile.amrust/md4x-sys/vendor/libyaml/src/api.crust/md4x-sys/vendor/libyaml/src/dumper.crust/md4x-sys/vendor/libyaml/src/emitter.crust/md4x-sys/vendor/libyaml/src/loader.crust/md4x-sys/vendor/libyaml/src/parser.crust/md4x-sys/vendor/libyaml/src/reader.crust/md4x-sys/vendor/libyaml/src/scanner.crust/md4x-sys/vendor/libyaml/src/writer.crust/md4x-sys/vendor/libyaml/src/yaml_private.hrust/md4x-sys/vendor/libyaml/tests/CMakeLists.txtrust/md4x-sys/vendor/libyaml/tests/Makefile.amrust/md4x-sys/vendor/libyaml/tests/ReadMe.mdrust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.crust/md4x-sys/vendor/libyaml/tests/example-deconstructor.crust/md4x-sys/vendor/libyaml/tests/example-reformatter-alt.crust/md4x-sys/vendor/libyaml/tests/example-reformatter.crust/md4x-sys/vendor/libyaml/tests/run-all-tests.shrust/md4x-sys/vendor/libyaml/tests/run-dumper.crust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.crust/md4x-sys/vendor/libyaml/tests/run-emitter.crust/md4x-sys/vendor/libyaml/tests/run-loader.crust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.crust/md4x-sys/vendor/libyaml/tests/run-parser.crust/md4x-sys/vendor/libyaml/tests/run-scanner.crust/md4x-sys/vendor/libyaml/tests/test-reader.crust/md4x-sys/vendor/libyaml/tests/test-version.crust/md4x-sys/vendor/libyaml/yaml-0.1.pc.inrust/md4x-sys/vendor/libyaml/yamlConfig.cmake.inrust/md4x/Cargo.tomlrust/md4x/src/lib.rs
| - name: Publish md4x-sys to crates.io | ||
| if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }} | ||
| run: cargo publish -p md4x-sys --locked --token "${CARGO_REGISTRY_TOKEN}" | ||
| working-directory: rust | ||
| env: | ||
| CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }} | ||
|
|
||
| - name: Wait for crates.io index | ||
| if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }} | ||
| run: sleep 20 | ||
|
|
||
| - name: Publish md4x to crates.io | ||
| if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }} | ||
| run: cargo publish -p md4x --locked --token "${CARGO_REGISTRY_TOKEN}" | ||
| working-directory: rust | ||
| env: | ||
| CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }} |
There was a problem hiding this comment.
Invalid use of secrets context in if expressions.
The secrets context is not available in if conditions. GitHub Actions only allows env, github, inputs, job, matrix, needs, runner, steps, strategy, and vars contexts in if expressions. These conditions will not work as intended.
A common workaround is to set the secret as an environment variable in a prior step and check for its presence, or use env context instead:
🐛 Proposed fix using env context
- name: Publish md4x-sys to crates.io
- if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+ if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
run: cargo publish -p md4x-sys --locked --token "${CARGO_REGISTRY_TOKEN}"
working-directory: rust
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
- name: Wait for crates.io index
- if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+ if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
- run: sleep 20
+ run: sleep 30
+ env:
+ CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
- name: Publish md4x to crates.io
- if: ${{ secrets.CARGO_REGISTRY_TOKEN != '' }}
+ if: ${{ env.CARGO_REGISTRY_TOKEN != '' }}
run: cargo publish -p md4x --locked --token "${CARGO_REGISTRY_TOKEN}"
working-directory: rust
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}Alternatively, you can define the env at the job level to avoid repetition:
jobs:
release:
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}Then use if: ${{ env.CARGO_REGISTRY_TOKEN != '' }} in each step without repeating the env block.
🧰 Tools
🪛 actionlint (1.7.11)
[error] 57-57: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details
(expression)
[error] 64-64: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details
(expression)
[error] 68-68: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details
(expression)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/release.yml around lines 56 - 72, Update the three steps
named "Publish md4x-sys to crates.io", "Wait for crates.io index", and "Publish
md4x to crates.io" to stop using the secrets context in their if conditions;
instead set CARGO_REGISTRY_TOKEN as an env at the job level or in a prior step
and change the if expressions to use env (e.g. if: ${{ env.CARGO_REGISTRY_TOKEN
!= '' }}), and remove the direct use of ${{ secrets.CARGO_REGISTRY_TOKEN }}
inside the if so the conditional works correctly in GitHub Actions.
| while (!STACK_EMPTY(&context, document->nodes)) { | ||
| yaml_node_t node = POP(&context, document->nodes); | ||
| yaml_free(node.tag); | ||
| switch (node.type) { | ||
| case YAML_SCALAR_NODE: | ||
| yaml_free(node.data.scalar.value); | ||
| break; | ||
| case YAML_SEQUENCE_NODE: | ||
| STACK_DEL(&context, node.data.sequence.items); | ||
| break; | ||
| case YAML_MAPPING_NODE: | ||
| STACK_DEL(&context, node.data.mapping.pairs); | ||
| break; | ||
| default: | ||
| assert(0); /* Should not happen. */ | ||
| } | ||
| } | ||
| STACK_DEL(&context, document->nodes); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if yaml_document_delete compiles correctly
rg -n "yaml_document_delete" --type c -A 10Repository: unjs/md4x
Length of output: 10444
🏁 Script executed:
# Get the full yaml_document_delete function
sed -n '1123,1150p' rust/md4x-sys/vendor/libyaml/src/api.cRepository: unjs/md4x
Length of output: 1030
🏁 Script executed:
# Check yaml_document_add_scalar function mentioned in the review
sed -n '1199,1230p' rust/md4x-sys/vendor/libyaml/src/api.cRepository: unjs/md4x
Length of output: 979
🏁 Script executed:
# Search for context variable declarations in api.c to understand the pattern
rg -n "struct.*context|yaml_error_type_t" rust/md4x-sys/vendor/libyaml/src/api.c | head -20Repository: unjs/md4x
Length of output: 317
Missing context variable declaration in yaml_document_delete.
The variable context is used in STACK_EMPTY, POP, and STACK_DEL macro calls but is never declared in this function, causing a compilation error. This pattern is inconsistently applied—other functions in the same file (e.g., yaml_document_add_scalar at line 1199) properly declare context. Add the missing declaration at the beginning of the function.
Add missing context declaration
YAML_DECLARE(void)
yaml_document_delete(yaml_document_t *document)
{
+ struct {
+ yaml_error_type_t error;
+ } context;
yaml_tag_directive_t *tag_directive;
assert(document); /* Non-NULL document object is expected. */🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/src/api.c` around lines 1129 - 1146, In
yaml_document_delete, add the missing local declaration for the variable named
context (use the same declaration used in yaml_document_add_scalar); place that
declaration at the start of the yaml_document_delete function so the
STACK_EMPTY, POP, and STACK_DEL macros can reference it (i.e., copy the context
variable declaration from yaml_document_add_scalar into yaml_document_delete).
| value -= 0x10000; | ||
| emitter->raw_buffer.last[high] = 0xD8 + (value >> 18); | ||
| emitter->raw_buffer.last[low] = (value >> 10) & 0xFF; | ||
| emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF); | ||
| emitter->raw_buffer.last[low+2] = value & 0xFF; |
There was a problem hiding this comment.
Mask the low-surrogate high byte to 2 bits.
Line 119 currently mixes bits 10-15 of value into the second UTF-16 word, so most > U+10000 scalars are emitted as invalid UTF-16. U+10400, for example, should end with 0xDC00, not 0xE000.
Proposed fix
- emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF);
+ emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0x03);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| value -= 0x10000; | |
| emitter->raw_buffer.last[high] = 0xD8 + (value >> 18); | |
| emitter->raw_buffer.last[low] = (value >> 10) & 0xFF; | |
| emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0xFF); | |
| emitter->raw_buffer.last[low+2] = value & 0xFF; | |
| value -= 0x10000; | |
| emitter->raw_buffer.last[high] = 0xD8 + (value >> 18); | |
| emitter->raw_buffer.last[low] = (value >> 10) & 0xFF; | |
| emitter->raw_buffer.last[high+2] = 0xDC + ((value >> 8) & 0x03); | |
| emitter->raw_buffer.last[low+2] = value & 0xFF; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/src/writer.c` around lines 116 - 120, The
low-surrogate high byte is taking the wrong bits (it uses ((value >> 8) & 0xFF))
so most scalars > U+10000 produce invalid UTF-16; update the assignment that
writes the low-surrogate high byte (the emitter->raw_buffer.last[high+2] write
using value) to mask down to the top 2 bits of the 10-bit low-surrogate payload
(use ((value >> 8) & 0x03) instead of & 0xFF) so the low surrogate becomes
0xDC00 + (value & 0x3FF) and U+10400 will end with 0xDC00.
| /* Add 'implicit': {} */ | ||
|
|
||
| key = yaml_document_add_scalar(&output_document, NULL, | ||
| (yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE); | ||
| if (!key) goto document_error; | ||
| map = yaml_document_add_mapping(&output_document, NULL, | ||
| YAML_FLOW_MAPPING_STYLE); | ||
| if (!map) goto document_error; | ||
| if (!yaml_document_append_mapping_pair(&output_document, | ||
| properties, key, map)) goto document_error; |
There was a problem hiding this comment.
Rename this mapping from version to implicit.
This block is building the scalar implicitness map (plain / quoted), but the top-level key is "version". That changes the emitted schema for scalar events and will break consumers expecting an "implicit" section.
🐛 Suggested fix
key = yaml_document_add_scalar(&output_document, NULL,
- (yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE);
+ (yaml_char_t *)"implicit", -1, YAML_PLAIN_SCALAR_STYLE);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /* Add 'implicit': {} */ | |
| key = yaml_document_add_scalar(&output_document, NULL, | |
| (yaml_char_t *)"version", -1, YAML_PLAIN_SCALAR_STYLE); | |
| if (!key) goto document_error; | |
| map = yaml_document_add_mapping(&output_document, NULL, | |
| YAML_FLOW_MAPPING_STYLE); | |
| if (!map) goto document_error; | |
| if (!yaml_document_append_mapping_pair(&output_document, | |
| properties, key, map)) goto document_error; | |
| /* Add 'implicit': {} */ | |
| key = yaml_document_add_scalar(&output_document, NULL, | |
| (yaml_char_t *)"implicit", -1, YAML_PLAIN_SCALAR_STYLE); | |
| if (!key) goto document_error; | |
| map = yaml_document_add_mapping(&output_document, NULL, | |
| YAML_FLOW_MAPPING_STYLE); | |
| if (!map) goto document_error; | |
| if (!yaml_document_append_mapping_pair(&output_document, | |
| properties, key, map)) goto document_error; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/tests/example-deconstructor-alt.c` around lines
426 - 435, The code is creating the scalar implicitness mapping but uses the
wrong top-level key "version"; change the yaml_document_add_scalar call that
sets the key to use "implicit" instead of "version" so the mapping appended with
yaml_document_append_mapping_pair(&output_document, properties, key, map)
correctly names the section; update the scalar creation where key is assigned
(yaml_document_add_scalar) to "implicit" to restore the expected schema for
scalar events.
| else if (!foundfile) { | ||
| input = fopen(argv[i], "rb"); | ||
| foundfile = 1; |
There was a problem hiding this comment.
Handle fopen() failures before marking the file as found.
foundfile is set even when fopen() returns NULL. In release builds the assert(input) check disappears, so the next read path can dereference a null FILE *.
🐛 Suggested fix
else if (!foundfile) {
- input = fopen(argv[i], "rb");
- foundfile = 1;
+ input = fopen(argv[i], "rb");
+ if (!input) {
+ perror(argv[i]);
+ return 1;
+ }
+ foundfile = 1;
}
@@
- assert(input);Also applies to: 72-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 58
- 60, The code sets foundfile immediately after calling fopen(argv[i], "rb")
even when fopen returns NULL; change the logic in run-emitter-test-suite.c so
you first call fopen and check its return (input) for NULL, only then set
foundfile and proceed; update both occurrences where fopen/ foundfile are used
(the block with input = fopen(argv[i], "rb") and the similar block at the other
occurrence around the 72-line area) to avoid marking the file as found when
fopen fails and to handle/propagate the fopen error appropriately.
| char anchor[256]; | ||
| char tag[256]; |
There was a problem hiding this comment.
Bound-check anchor and tag extraction before copying.
Both helpers copy unbounded spans into 256-byte stack buffers. A long token, or a > that appears before <, can turn the computed length into an oversized copy and overrun the stack.
🛡️ Suggested fix
-char *get_anchor(char sigil, char *line, char *anchor);
-char *get_tag(char *line, char *tag);
+char *get_anchor(char sigil, char *line, char *anchor, size_t anchor_cap);
+char *get_tag(char *line, char *tag, size_t tag_cap);
@@
-char *get_anchor(char sigil, char *line, char *anchor)
+char *get_anchor(char sigil, char *line, char *anchor, size_t anchor_cap)
{
char *start;
char *end;
@@
- memcpy(anchor, start, end - start);
- anchor[end - start] = '\0';
+ size_t len = (size_t)(end - start);
+ if (len >= anchor_cap)
+ return NULL;
+ memcpy(anchor, start, len);
+ anchor[len] = '\0';
return anchor;
}
@@
-char *get_tag(char *line, char *tag)
+char *get_tag(char *line, char *tag, size_t tag_cap)
{
char *start;
char *end;
@@
- if ((end = strchr(line, '>')) == NULL)
+ if ((end = strchr(start + 1, '>')) == NULL)
return NULL;
- memcpy(tag, start + 1, end - start - 1);
- tag[end - start - 1] = '\0';
+ size_t len = (size_t)(end - start - 1);
+ if (len >= tag_cap)
+ return NULL;
+ memcpy(tag, start + 1, len);
+ tag[len] = '\0';
return tag;
}Also applies to: 206-230
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 85
- 86, The test harness extracts token spans into fixed-size stack buffers
anchor[256] and tag[256] without bounds checking; ensure any computed length for
these copies is validated to be >=0 and < sizeof(anchor) (and sizeof(tag)),
clamp the copy length to sizeof-1, perform the copy with a bounded routine and
explicitly null-terminate, and handle the case where a '>' appears before '<' by
treating it as an error/empty span; apply the same fixes to the other copy site
around the 206-230 region so both anchor and tag extractions are safe.
| char value[1024]; | ||
| int style; | ||
|
|
||
| get_value(line, value, &style); | ||
| implicit = (get_tag(line, tag) == NULL); | ||
|
|
||
| ok = yaml_scalar_event_initialize(&event, (yaml_char_t *) | ||
| get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, -1, implicit, implicit, style); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '131,138p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
sed -n '264,285p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
rg -n -C2 'yaml_scalar_event_initialize' rust/md4x-sys/vendor/libyaml/include/yaml.hRepository: unjs/md4x
Length of output: 1221
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find get_value declaration
rg -n 'get_value.*\(' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c | head -20
# Count lines in file to understand structure
wc -l rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
# Check around line 268-269 for declaration
sed -n '265,275p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.cRepository: unjs/md4x
Length of output: 635
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# More comprehensive search for get_value calls
rg -n 'get_value\(' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
# Look at context around line 134
sed -n '128,142p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c
# Check complete function definition
sed -n '233,290p' rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.cRepository: unjs/md4x
Length of output: 2529
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify yaml_scalar_event_initialize signature accepts int length
rg -n -A3 'yaml_scalar_event_initialize' rust/md4x-sys/vendor/libyaml/include/yaml.h | head -20Repository: unjs/md4x
Length of output: 272
Pass the scalar length explicitly when \0 escapes are supported.
get_value() can materialize embedded NUL bytes (line 273: else if (*c == '0') value[i++] = '\0';), but yaml_scalar_event_initialize(..., -1, ...) at line 134 makes libyaml treat value as a C string and stop at the first NUL. Any test case containing \0 will be truncated on emit. The yaml_scalar_event_initialize() signature (line 580–582 in yaml.h) exposes an explicit int length parameter to handle this.
Fix by changing get_value() to return the scalar length (line 11 declaration, line 233 definition), then pass this length instead of -1 at the call site (line 134).
Suggested fix
-void get_value(char *line, char *value, int *style);
+size_t get_value(char *line, char *value, int *style);
@@
- get_value(line, value, &style);
+ size_t value_len = get_value(line, value, &style);
implicit = (get_tag(line, tag) == NULL);
ok = yaml_scalar_event_initialize(&event, (yaml_char_t *)
- get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, -1, implicit, implicit, style);
+ get_anchor('&', line, anchor), (yaml_char_t *) get_tag(line, tag), (yaml_char_t *) value, (int)value_len, implicit, implicit, style);
@@
-void get_value(char *line, char *value, int *style)
+size_t get_value(char *line, char *value, int *style)
{
@@
}
value[i] = '\0';
+ return (size_t)i;
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/tests/run-emitter-test-suite.c` around lines 131
- 138, get_value currently may write embedded NULs but the caller in
run-emitter-test-suite.c passes -1 to yaml_scalar_event_initialize, causing
truncation; change get_value to return the actual scalar length (update its
prototype and definition) and capture that length at the call site, then pass
that length instead of -1 to yaml_scalar_event_initialize (keep using
get_anchor, get_tag and the same value buffer); ensure any callers of get_value
use the new return value.
| else if (!foundfile) { | ||
| input = fopen(argv[i], "rb"); | ||
| foundfile = 1; |
There was a problem hiding this comment.
Check fopen() before setting foundfile.
A failed fopen() still marks the file as consumed. With NDEBUG, the later assert(input) disappears and yaml_parser_set_input_file() receives a null FILE *.
🐛 Suggested fix
else if (!foundfile) {
- input = fopen(argv[i], "rb");
- foundfile = 1;
+ input = fopen(argv[i], "rb");
+ if (!input) {
+ perror(argv[i]);
+ return 1;
+ }
+ foundfile = 1;
}
@@
- assert(input);Also applies to: 43-46
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/md4x-sys/vendor/libyaml/tests/run-parser-test-suite.c` around lines 36 -
38, The code sets foundfile before checking fopen's result, so a failed fopen
can leave input NULL and later call yaml_parser_set_input_file(NULL); change the
logic in the main loop (where variables input and foundfile are used and where
yaml_parser_set_input_file is called) to only set foundfile = 1 after verifying
input != NULL, and if fopen fails handle the error (e.g., print a message and
continue or exit) rather than proceeding; ensure both occurrences (the block
around the current fopen and the similar block at lines 43-46) perform the same
NULL-check-before-setting-foundfile and avoid passing a NULL FILE* to
yaml_parser_set_input_file.
pi0
left a comment
There was a problem hiding this comment.
Review: Rust Bindings
The FFI bindings are well done — all types, structs, flags, and function signatures correctly match the C headers. Nice work!
Critical: crates.io publishing is broken
md4x-sys/build.rs references C sources at ../../src/ (relative to crate dir). cargo publish only packages files within the crate root, so anyone installing from crates.io will get a build failure — the C sources won't be in the tarball.
Suggested minimal fix (2 files changed, no Docker needed):
- Update
build.rsto try a localvendor/md4x/first, falling back to../../src/for local dev:
// In build.rs, replace the hardcoded ../../src/ with:
let src_dir = {
let vendored = manifest_dir.join("vendor/md4x");
if vendored.exists() {
vendored
} else {
manifest_dir.join("../../src")
}
};- Add a small publish script (e.g.
rust/publish.sh) that temporarily vendors the C sources before publishing:
#!/bin/bash
set -e
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
SYS_DIR="$SCRIPT_DIR/md4x-sys"
VENDOR="$SYS_DIR/vendor/md4x"
# Copy C sources for crates.io packaging
mkdir -p "$VENDOR" "$VENDOR/renderers"
cp "$SCRIPT_DIR/../src/md4x.c" "$SCRIPT_DIR/../src/md4x.h" \
"$SCRIPT_DIR/../src/entity.c" "$SCRIPT_DIR/../src/entity.h" "$VENDOR/"
cp "$SCRIPT_DIR/../src/renderers/"*.{c,h} "$VENDOR/renderers/"
# Publish
cargo publish -p md4x-sys --locked "$@"
sleep 30
cargo publish -p md4x --locked "$@"
# Cleanup
rm -rf "$VENDOR"Then update .gitignore to add rust/md4x-sys/vendor/md4x/ so the copied sources aren't accidentally committed.
This is the standard -sys crate pattern. The vendored libyaml already follows it — just need the same for the md4x sources.
Minor issues
- Missing crates.io metadata in both
Cargo.toml: consider addingrepository,readme,homepage - The 20s
sleepbetween publishes in CI is fragile — crates.io index can take longer. The publish script above uses 30s, but a retry loop would be more robust input.len() as u32could useu32::try_from()to avoid silent truncation on >4GB inputs (unlikely but good practice)
|
I also would like ot make a full port in a separate repo if thats ok |
- build.rs now checks vendor/md4x/ first (populated at publish time), falling back to ../../src/ for monorepo dev builds - Add rust/publish.sh to vendor C sources, publish both crates with crates.io index polling, then clean up - Gitignore rust/md4x-sys/vendor/md4x/ (transient, not committed) - Add repository/homepage/readme metadata to both Cargo.toml files - Replace input.len() as u32 with u32::try_from() in all render fns Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
.gitignore (1)
73-77: LGTM! Rust ignore patterns are appropriate.The ignore patterns correctly handle Rust build outputs (
rust/target/) and transient vendored sources (rust/md4x-sys/vendor/md4x/). The comment clarifies that vendored C sources are populated only during crates.io publish, which aligns with the build.rs fallback strategy described in the PR objectives.Optional: Consider broader vendor pattern
If additional dependencies might be vendored in the future, you could consider ignoring the entire vendor directory:
-rust/md4x-sys/vendor/md4x/ +rust/md4x-sys/vendor/However, the current specific pattern is fine if you intend to commit other files under
vendor/or want explicit control over what gets ignored.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore around lines 73 - 77, The .gitignore currently excludes Rust build outputs and the specific vendored C sources (patterns rust/target/ and rust/md4x-sys/vendor/md4x/) which is correct; if you want broader coverage for any future vendored dependencies, add a more general vendor ignore (e.g., a pattern covering vendor directories under rust like rust/**/vendor/ or rust/vendor/) to complement or replace the specific md4x entry, otherwise leave the existing specific patterns as-is.rust/md4x/src/lib.rs (2)
426-430: Minor duplication:collect_healis identical tocollect_output.The inner function duplicates the existing
collect_outputcallback. Consider reusing the outer function.♻️ Reuse existing callback
pub fn heal(input: &str) -> Result<String, Error> { let mut buf: Vec<u8> = Vec::new(); - unsafe extern "C" fn collect_heal(data: *const c_char, size: u32, userdata: *mut c_void) { - let buf = &mut *(userdata as *mut Vec<u8>); - let slice = std::slice::from_raw_parts(data as *const u8, size as usize); - buf.extend_from_slice(slice); - } - let rc = unsafe { sys::md_heal( input.as_ptr() as *const c_char, u32::try_from(input.len()).map_err(|_| Error)?, - collect_heal, + collect_output, &mut buf as *mut Vec<u8> as *mut c_void, ) };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 426 - 430, collect_heal is a duplicate of collect_output; remove the redundant unsafe extern "C" fn collect_heal and reuse collect_output where collect_heal is currently passed; ensure the call sites that previously referenced collect_heal now pass collect_output (and that the signature matches: fn(*const c_char, u32, *mut c_void)) so the userdata cast to *mut Vec<u8> and the from_raw_parts/extend_from_slice logic remains unchanged.
264-277: Silent handling of NUL bytes in strings may surprise callers.
CString::new().unwrap_or_default()silently converts strings containing interior NUL bytes to empty strings. If a user passestitle: Some("My\0Title"), it becomes an empty string without any indication.Consider returning an error or documenting this behavior:
Option 1: Return error on invalid input
let c_title = match opts.title { Some(t) => { - title_cstr = std::ffi::CString::new(t).unwrap_or_default(); + title_cstr = std::ffi::CString::new(t).map_err(|_| Error)?; title_cstr.as_ptr() } None => ptr::null(), }; let c_css = match opts.css_url { Some(u) => { - css_cstr = std::ffi::CString::new(u).unwrap_or_default(); + css_cstr = std::ffi::CString::new(u).map_err(|_| Error)?; css_cstr.as_ptr() } None => ptr::null(), };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/md4x/src/lib.rs` around lines 264 - 277, The current code silently swallows interior NUL errors by calling CString::new(...).unwrap_or_default() for opts.title and opts.css_url (producing title_cstr and css_cstr and then c_title/c_css), which turns invalid inputs into empty strings; instead validate and propagate the error: call CString::new on opts.title and opts.css_url, handle the Result (map or match) and return or propagate an Err with a clear message if CString::new fails (include which field failed), ensuring callers are informed of invalid NUL-containing strings rather than silently replacing them with empty values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust/publish.sh`:
- Around line 23-24: The cargo publish calls run from the caller's CWD and must
be anchored to the Rust workspace; update both occurrences of the publish
command (the line with "cargo publish -p md4x-sys --locked \"$@\"" and the other
publish call around lines 36-37) to specify the workspace manifest with
--manifest-path rust/Cargo.toml (or equivalently run the command from the rust
directory) so Cargo resolves the workspace correctly when invoked from the repo
root.
- Around line 26-37: The polling loop that checks crates.io for md4x-sys (using
SYS_DIR and the cargo search/cargo metadata command) currently proceeds even if
indexing never succeeds; modify the script so that after the for-loop finishes
without finding the indexed version you print a clear error (including the
version fetched via cargo metadata) and exit non-zero (e.g., exit 1) to abort
publishing md4x; keep the existing success branch (echo "md4x-sys is indexed."
and break) unchanged and only add the failure check immediately after the loop
to fail fast when md4x-sys isn't indexed.
---
Nitpick comments:
In @.gitignore:
- Around line 73-77: The .gitignore currently excludes Rust build outputs and
the specific vendored C sources (patterns rust/target/ and
rust/md4x-sys/vendor/md4x/) which is correct; if you want broader coverage for
any future vendored dependencies, add a more general vendor ignore (e.g., a
pattern covering vendor directories under rust like rust/**/vendor/ or
rust/vendor/) to complement or replace the specific md4x entry, otherwise leave
the existing specific patterns as-is.
In `@rust/md4x/src/lib.rs`:
- Around line 426-430: collect_heal is a duplicate of collect_output; remove the
redundant unsafe extern "C" fn collect_heal and reuse collect_output where
collect_heal is currently passed; ensure the call sites that previously
referenced collect_heal now pass collect_output (and that the signature matches:
fn(*const c_char, u32, *mut c_void)) so the userdata cast to *mut Vec<u8> and
the from_raw_parts/extend_from_slice logic remains unchanged.
- Around line 264-277: The current code silently swallows interior NUL errors by
calling CString::new(...).unwrap_or_default() for opts.title and opts.css_url
(producing title_cstr and css_cstr and then c_title/c_css), which turns invalid
inputs into empty strings; instead validate and propagate the error: call
CString::new on opts.title and opts.css_url, handle the Result (map or match)
and return or propagate an Err with a clear message if CString::new fails
(include which field failed), ensuring callers are informed of invalid
NUL-containing strings rather than silently replacing them with empty values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dbd1c7a4-6d51-435f-818c-d5c3ea779954
📒 Files selected for processing (6)
.gitignorerust/md4x-sys/Cargo.tomlrust/md4x-sys/build.rsrust/md4x/Cargo.tomlrust/md4x/src/lib.rsrust/publish.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- rust/md4x-sys/build.rs
- rust/md4x-sys/Cargo.toml
- rust/md4x/Cargo.toml
| echo "==> Publishing md4x-sys" | ||
| cargo publish -p md4x-sys --locked "$@" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
printf 'Cargo manifests found in this repo:\n'
fd '^Cargo\.toml$' -p
printf '\nRelevant lines from rust/publish.sh:\n'
nl -ba rust/publish.sh | sed -n '1,40p'Repository: unjs/md4x
Length of output: 186
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
printf 'Cargo manifests found in this repo:\n'
fd '^Cargo\.toml$'
printf '\nContent of rust/publish.sh:\n'
cat -n rust/publish.sh
printf '\nChecking workspace setup:\n'
if [ -f "rust/Cargo.toml" ]; then
echo "✓ rust/Cargo.toml exists"
head -20 rust/Cargo.toml
fiRepository: unjs/md4x
Length of output: 1775
Anchor both cargo publish calls to the Rust workspace.
Line 24 and Line 37 run from the caller's current directory, but Line 3 documents bash rust/publish.sh. If invoked from the repo root (as documented), Cargo will search for Cargo.toml starting from the repo root and find nothing, since the workspace manifest is at rust/Cargo.toml. This will cause the publish step to fail.
Suggested fix
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+WORKSPACE_MANIFEST="$SCRIPT_DIR/Cargo.toml"
SYS_DIR="$SCRIPT_DIR/md4x-sys"
VENDOR="$SYS_DIR/vendor/md4x"
SRC="$SCRIPT_DIR/../src"
@@
echo "==> Publishing md4x-sys"
-cargo publish -p md4x-sys --locked "$@"
+cargo publish --manifest-path "$WORKSPACE_MANIFEST" -p md4x-sys --locked "$@"
@@
echo "==> Publishing md4x"
-cargo publish -p md4x --locked "$@"
+cargo publish --manifest-path "$WORKSPACE_MANIFEST" -p md4x --locked "$@"Also applies to: 36-37
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/publish.sh` around lines 23 - 24, The cargo publish calls run from the
caller's CWD and must be anchored to the Rust workspace; update both occurrences
of the publish command (the line with "cargo publish -p md4x-sys --locked
\"$@\"" and the other publish call around lines 36-37) to specify the workspace
manifest with --manifest-path rust/Cargo.toml (or equivalently run the command
from the rust directory) so Cargo resolves the workspace correctly when invoked
from the repo root.
| echo "==> Waiting for crates.io index to update..." | ||
| for i in $(seq 1 12); do | ||
| sleep 10 | ||
| if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then | ||
| echo " md4x-sys is indexed." | ||
| break | ||
| fi | ||
| echo " Not indexed yet (attempt $i/12)..." | ||
| done | ||
|
|
||
| echo "==> Publishing md4x" | ||
| cargo publish -p md4x --locked "$@" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
printf 'Polling/publish section:\n'
nl -ba rust/publish.sh | sed -n '23,39p'
python3 - <<'PY'
from pathlib import Path
text = Path("rust/publish.sh").read_text()
before_second_publish = text.split('echo "==> Publishing md4x"')[0]
print("\nFound explicit abort before second publish:", "exit 1" in before_second_publish)
PYRepository: unjs/md4x
Length of output: 137
🏁 Script executed:
#!/bin/bash
# Check the polling section and look for exit statements
echo "=== Checking rust/publish.sh lines 20-45 ==="
sed -n '20,45p' rust/publish.sh
echo -e "\n=== Checking for 'exit' statements in entire script ==="
grep -n "exit" rust/publish.sh || echo "No exit statements found"
echo -e "\n=== Checking for abort/error handling after polling loop ==="
grep -n -A 5 "Not indexed yet" rust/publish.shRepository: unjs/md4x
Length of output: 1065
🏁 Script executed:
head -20 rust/publish.shRepository: unjs/md4x
Length of output: 593
Add explicit abort if md4x-sys fails to index.
The polling loop completes normally whether successful or not, allowing the script to proceed directly to publishing md4x without verifying the dependency is indexed. This turns a clear propagation delay into a confusing crates.io dependency resolution failure.
Suggested fix
echo "==> Waiting for crates.io index to update..."
+indexed=false
for i in $(seq 1 12); do
sleep 10
if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then
echo " md4x-sys is indexed."
+ indexed=true
break
fi
echo " Not indexed yet (attempt $i/12)..."
done
+
+if [[ "$indexed" != true ]]; then
+ echo "md4x-sys did not appear in the crates.io index after 120 seconds; aborting md4x publish." >&2
+ exit 1
+fi
echo "==> Publishing md4x"
cargo publish -p md4x --locked "$@"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "==> Waiting for crates.io index to update..." | |
| for i in $(seq 1 12); do | |
| sleep 10 | |
| if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then | |
| echo " md4x-sys is indexed." | |
| break | |
| fi | |
| echo " Not indexed yet (attempt $i/12)..." | |
| done | |
| echo "==> Publishing md4x" | |
| cargo publish -p md4x --locked "$@" | |
| echo "==> Waiting for crates.io index to update..." | |
| indexed=false | |
| for i in $(seq 1 12); do | |
| sleep 10 | |
| if cargo search md4x-sys --limit 1 2>/dev/null | grep -q "$(cargo metadata --no-deps --manifest-path "$SYS_DIR/Cargo.toml" --format-version 1 | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["packages"][0]["version"])')"; then | |
| echo " md4x-sys is indexed." | |
| indexed=true | |
| break | |
| fi | |
| echo " Not indexed yet (attempt $i/12)..." | |
| done | |
| if [[ "$indexed" != true ]]; then | |
| echo "md4x-sys did not appear in the crates.io index after 120 seconds; aborting md4x publish." >&2 | |
| exit 1 | |
| fi | |
| echo "==> Publishing md4x" | |
| cargo publish -p md4x --locked "$@" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/publish.sh` around lines 26 - 37, The polling loop that checks crates.io
for md4x-sys (using SYS_DIR and the cargo search/cargo metadata command)
currently proceeds even if indexing never succeeds; modify the script so that
after the for-loop finishes without finding the indexed version you print a
clear error (including the version fetched via cargo metadata) and exit non-zero
(e.g., exit 1) to abort publishing md4x; keep the existing success branch (echo
"md4x-sys is indexed." and break) unchanged and only add the failure check
immediately after the loop to fail fast when md4x-sys isn't indexed.
Summary by CodeRabbit