Skip to content

fix(engine): preserve AAC start time during MP4 mux#1615

Merged
miguel-heygen merged 4 commits into
mainfrom
fix/mp4-aac-mux-start-time
Jun 20, 2026
Merged

fix(engine): preserve AAC start time during MP4 mux#1615
miguel-heygen merged 4 commits into
mainfrom
fix/mp4-aac-mux-start-time

Conversation

@miguel-heygen

@miguel-heygen miguel-heygen commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Problem

HyperFrames MP4 renders with audio can show a black first frame in players that honor MP4 edit lists, including QuickTime, Safari, and preview/chat players. The first decoded video frame is real content, but the final muxed container can start the video stream around 21ms after audio, so those players render the leading empty edit as black.

What this fixes

  • Copies HyperFrames' already-mixed AAC sidecar when muxing MP4 and MOV outputs instead of re-encoding it during final mux.
  • Uses an explicit audioCodec: "aac" mux contract from HyperFrames callers, keeps .aac as a fast compatibility fallback, and probes unknown-extension sidecars with extractAudioMetadata before deciding copy vs transcode.
  • Removes the distributed pad-path asymmetry: short audio now generates only the missing silent AAC tail and concat-copies source AAC + silence instead of re-encoding the whole mixed source.
  • Keeps non-AAC MP4/MOV inputs on the existing AAC transcode path.
  • Keeps WebM on the existing Opus transcode path.
  • Adds mocked engine coverage for the mux command contract and codec-probe fallback, producer pad/trim coverage for the no-source-reencode pad plan, and ffmpeg-backed distributed assemble coverage for both trim and pad paths asserting final audio and video streams start at zero.

Root cause

processCompositionAudio in audioMixer.ts writes the mixed render audio as AAC. That first AAC encode owns the encoder-priming interval. Before this PR, muxVideoWithAudio re-encoded that AAC sidecar during final MP4 mux with -c:a aac; that second AAC encode introduced another priming interval, and ffmpeg preserved timeline alignment by writing an empty video edit-list entry (media_time: -1) before the first video packet. Players that render that empty edit show black for the first ~21ms.

The distributed pad path had the same structural risk one step upstream: padding short audio.aac used apad over the whole source with -c:a aac, then the final mux copied that newly encoded AAC through. That path now preserves the single-priming-origin invariant by encoding only generated silence and concat-copying it after the already mixed AAC.

The engine-level mux helper also previously made the AAC copy decision from a filename-only .aac check. That was correct for current HyperFrames sidecars but fragile for renamed, extensionless, or AAC-in-container inputs. Unknown extensions now probe codec metadata before choosing whether to copy.

This avoids the priming interval at the source. It does not suppress edit lists with -use_editlist 0 or mask the symptom after muxing.

Verification

Local checks

  • bun run --filter @hyperframes/engine test -- src/services/chunkEncoder.test.ts --test-name-pattern "muxVideoWithAudio audio codec handling"
  • bun run --filter @hyperframes/engine test -- src/services/chunkEncoder.test.ts
  • bun run --filter @hyperframes/engine typecheck
  • bun test packages/producer/src/services/render/audioPadTrim.test.ts
  • bun test packages/producer/src/services/distributed/assemble.test.ts --test-name-pattern "audio"
  • bun run --filter @hyperframes/producer typecheck
  • bunx oxlint packages/engine/src/services/chunkEncoder.ts packages/engine/src/services/chunkEncoder.test.ts packages/producer/src/services/distributed/assemble.ts packages/producer/src/services/distributed/assemble.test.ts packages/producer/src/services/render/audioPadTrim.ts packages/producer/src/services/render/audioPadTrim.test.ts packages/producer/src/services/render/stages/assembleStage.ts
  • bunx oxfmt --check packages/engine/src/services/chunkEncoder.ts packages/engine/src/services/chunkEncoder.test.ts packages/producer/src/services/distributed/assemble.ts packages/producer/src/services/distributed/assemble.test.ts packages/producer/src/services/render/audioPadTrim.ts packages/producer/src/services/render/audioPadTrim.test.ts packages/producer/src/services/render/stages/assembleStage.ts
  • git diff --check
  • bun run build
  • Pre-commit hooks on 7ef576949 and f87b2c417 passed lint, format, fallow, typecheck, and commitlint. Fallow reported warn-level duplication/complexity findings in touched test files but did not fail the hook.

Render proof

Rendered a 2s HyperFrames repro with a source video and matching audio both starting at data-start="0".

Before the fix:

video start_time=0.021029
audio start_time=0.000000
ffprobe trace: duration=21 time=-1 / media time: -1

After the fix:

video start_time=0.000000
audio start_time=0.000000
ffprobe trace: video edit list media time: 0, no media time: -1
first decoded frame: YAVG=101.16, YMAX=248

Distributed pad-path proof through assemble():

operation=pad
targetDurationSeconds=0.4
sourceDurationSeconds=0.234604
video start_time=0.000000
audio start_time=0.000000

Browser verification

Used agent-browser to open fixed local proof videos in Chromium through local proof pages, sample the first painted video frame through canvas, capture screenshots, and record playback.

Original MP4 repro after fix:

  • Browser canvas sample at currentTime=0: average luma 99.97, max luma 249.4
  • Screenshot: tmp/black-frame-repro/browser-proof-first-frame.png
  • Recording: tmp/black-frame-repro/browser-proof-playback.webm

Distributed pad-path MP4 after fix:

  • Artifact: tmp/black-frame-repro/distributed-pad-proof/distributed-pad-fixed.mp4
  • Browser canvas sample at currentTime=0: average luma 127.63, max luma 255.0
  • Screenshot: tmp/black-frame-repro/distributed-pad-proof/browser-proof-first-frame.png
  • Recording: tmp/black-frame-repro/distributed-pad-proof/browser-proof-playback.webm

Notes

  • Repro and browser-proof artifacts are local-only under ignored tmp/black-frame-repro/.
  • WebM/Opus remains intentionally out of the MP4/MOV fix scope. I checked a local Chromium WebM fixture anyway: ffprobe reported video and Opus audio start_time=0.000000, and Chrome canvas sampling at currentTime=0 was non-black (average luma 127.66, max luma 255.0). No matching first-frame-black symptom reproduced there.

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at HEAD ba1afa13722f8e38310abb68ee0246fda96b839a. Verdict: ⚠️ comment — root cause + fix shape are right and the ffprobe before/after + browser canvas-luma proof is the gold standard, but the distributed-assemble pad path can re-introduce the same symptom and the .aac extension sniff is fragile. None of these block merge of the in-process render fix; flagging as targeted follow-ups before this rolls to distributed.

Concerns

packages/producer/src/services/render/audioPadTrim.ts:108-124pad sub-path silently re-bakes the priming gap on the distributed assemble. When audio.aac is shorter than frameCount/fps (any composition with trailing silence after the last speech region), buildPadTrimAudioArgs returns apad=… + -c:a aac -b:a 192k, writing audio-padded.aac. muxVideoWithAudio then sees the .aac extension and stream-copies that re-encoded file — but the re-encode just baked a fresh ~21ms encoder-priming interval into the padded sidecar's edit list, which copies straight through to the final container. Net effect: the in-process runAssembleStage path is fixed (mixed audio.aac → muxer copy, one encode total), but assemble.ts:295-325's pad branch is still 2 encodes (mixer + pad re-encode) → black first frame can recur on distributed renders whose audio falls short of frameCount/fps. Trim and copy ops are unaffected. The integration test at assemble.test.ts:306 only exercises the trim branch (audio 0.5s longer than video). Suggest: either swap the pad re-encode for concat + silent-AAC tail (avoids second encode), or extend the distributed integration test with a short-audio fixture asserting videoStream.start_time < 0.001 on the pad path too.

packages/engine/src/services/chunkEncoder.ts:47-49AAC detection by file extension only. isAacSidecar returns true iff extname === ".aac". A .m4a/.mp4 ADTS-stripped AAC, an upstream caller passing audio-padded without an extension, or a future audio path that switches container suffix all silently fall through to -c:a aac -b:a 192k re-encode — quietly re-introducing the bug. Current callers (assembleStage.ts:58, assemble.ts:318) both feed extension-bearing paths so it's safe today, but this is an engine-level export (packages/engine/src/index.ts:104) and the contract is name-based, not content-based. Suggest: short-circuit on extname if it matches, otherwise probe codec via extractAudioMetadata. Cheap and removes the rename-footgun.

Nits

packages/engine/src/services/chunkEncoder.ts:716-719 — comment is the canonical explanation, nice. Worth adding one line that the first encode (audioMixer.ts:415) is the actual priming origin; the muxer's -c:a copy propagates the existing edit list correctly. Helps the next person who reads this not assume -c:a copy introduces a priming gap on its own. (nit)

packages/engine/src/services/chunkEncoder.test.ts:445 — "copies HyperFrames AAC sidecars into MOV containers" test passes /tmp/audio.aac into a .mov output and asserts -c:a copy without -movflags +faststart. Worth a one-liner in the test docstring noting MOV intentionally skips faststart (parity with pre-PR). Future readers will second-guess the absence. (nit)

Questions

• Cross-codec sibling: WebM/Opus also has encoder pre-skip (Opus's 80-sample pre-roll equivalent) and the libopus path here re-encodes whatever you pass in — including the already-encoded audio.aac from audioStage. Has the equivalent black-first-frame symptom ever shown up on WebM outputs in Chromium? If WebM goes through QuickTime/Safari preview pipelines too, worth checking whether the libopus transcode here has a parallel priming-leak. Not a blocker; could be a deliberate "WebM consumers don't honor pre-skip in the same way" call. (HF#1615 scope is MP4/MOV per the PR body — flagging for the workstream, not this PR.)

What I didn't verify

• Whether the audio.aac written by audioMixer.ts:415 (the first -acodec aac encode) carries a clean edit list before the mux step. The ffprobe-before evidence in the PR body shows video start_time=0.021029 and audio start_time=0.000000, which is consistent with one accumulated priming on the muxed output but doesn't isolate which encode wrote it.

Stamp-routable: Y — Miguel-authored, ffprobe + browser canvas proof, one isolated function change, comprehensive unit + integration coverage on the trim/copy paths. Flagged concerns are follow-ups, not blockers.

— Rames D Jusso

@vanceingalls vanceingalls left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at ba1afa1. Strong concur with @james-russo-rames-d-jusso — root-cause RCA is right, the fix shape is the right shape, the ffprobe-before/after + browser canvas-luma proof is gold-standard evidence, and the explicit choice to NOT take the -use_editlist 0 band-aid is exactly the discipline that makes this PR clean. Independently arrived at the same pad-branch scope gap; treating that as the load-bearing follow-up.

Verified at HEAD:

  • packages/engine/src/services/chunkEncoder.ts:707-727: branched codec selection on isAacSidecar(audioPath). AAC sidecars stream-copy with -c:a copy (+-movflags +faststart for MP4, omitted for MOV — pre-PR parity); non-AAC paths transcode with -c:a aac -b:a 192k. WebM/Opus untouched. Clean.
  • Single mux helper, two call sites: in-process packages/producer/src/services/render/stages/assembleStage.ts:58 + distributed packages/producer/src/services/distributed/assemble.ts:318. Both route through muxVideoWithAudio; no sibling muxer to fix.
  • Test coverage: chunkEncoder.test.ts adds 4 mocked-spawn argv-contract tests covering the matrix cells (MP4+AAC copy / MP4+non-AAC transcode / MOV+AAC copy / WebM Opus unchanged); the negative assertion expect(calls[0].args).not.toContain("-use_editlist") is the right pin — locks the choice to NOT use the suppression shortcut. assemble.test.ts adds an ffmpeg-backed start_time probe.

Concur with @james-russo-rames-d-jusso (independently arrived at the same shape):

  • Pad sub-path silently re-bakes the priming gap on distributed assemble (audioPadTrim.ts:108-124). Confirmed at HEAD: when delta > 0, returns apad=pad_dur=… -c:a aac -b:a 192k writing audio-padded.aac. The muxVideoWithAudio extension sniff then sees .aac and stream-copies that re-encoded file — but the pad re-encode just baked a fresh ~21ms encoder-priming interval into the sidecar, which now propagates through -c:a copy straight into the final container. The function's own docstring at line 84-85 acknowledges "Re-encode is required: apad is a filter and filters can't combine with -c:a copy" — so this is a known-architectural-constraint hiding a real regression vector.

    Net effect: in-process runAssembleStage is fixed (one encode total); assemble.ts:294-325 pad branch is still 2 encodes (mixer + pad re-encode) → the user-visible black-first-frame can recur on distributed renders whose audio is shorter than frameCount/fps. This is band-aid bar pattern #3 (silent scope gap on stated coverage) at the seam between the new fix and an upstream re-encode the PR doesn't touch. The trim and copy ops are unaffected — only the pad sub-path.

    Rames's fix shape (swap the pad re-encode for concat + silent-AAC tail) is the right structural close — avoids the second encode entirely. As a minimum, the integration test at assemble.test.ts:306 should grow a sibling fixture with audio shorter than video, asserting videoStream.start_time < 0.001 on the pad path; today it exercises the trim branch only (audio 0.5s longer), so the pad regression has no test net.

  • isAacSidecar extension-only detection (chunkEncoder.ts:47-49). Concur — extname === ".aac" is name-contract, not content-contract. .m4a ADTS-stripped AAC, no-extension paths, future container-suffix swaps all silently fall through to the re-encode branch and quietly re-introduce the bug. This is an engine-level export per packages/engine/src/index.ts:104; the contract is name-based for an engine-grade API. Rames's suggested fix (short-circuit on extname if it matches, otherwise probe codec via extractAudioMetadata) is right. Cheap fix, removes the rename-footgun, future-proofs against intra-stack file-naming refactors.

Concur on nits:

  • Comment in chunkEncoder.ts:716-719 is good — one extra line noting the first encode in audioMixer.ts:415 is the priming origin (and that -c:a copy propagates the existing edit list correctly, not introduces one) would head off the next reader assuming -c:a copy is the source. Future-archaeologist hint.
  • chunkEncoder.test.ts:445 MOV faststart absence — one-line docstring noting parity with pre-PR. Future readers will second-guess.

Cross-codec question (Rames): WebM/Opus has its own pre-skip equivalent; libopus path here re-encodes whatever you pass in. Worth a follow-up workstream check on whether the equivalent black-first-frame surfaces on Chromium WebM playback (or QuickTime/Safari if those touch WebM). Per PR body scope (MP4/MOV), explicitly out of scope here.

Band-aid bar check: one match — pattern #3 (silent scope gap) at the upstream audioPadTrim.ts pad branch, as above. None of the other 9 patterns fire (no duplicate guards, no contradictory rules, no decorative gates, no swallowed throws, no test mocking the production primitive, comment matches code, no negative-tests-against-absent-defaults, no migration without backfill, no toggle defaulting off).

CI state at HEAD ba1afa1: all completed required checks green (Lint, Typecheck, Build, Test, Fallow, SDK contract+smoke, Studio load smoke, preview-regression, player-perf, Semantic PR title, File size check, runtime contract, WIP, Format). Several still in_progress (regression-shards 1-8, Windows tests, Smoke: global install, CLI smoke, CodeQL analyze). No failures.

Verdict: NIT. The fix shape is clean and the RCA is rigorous; flagged concerns are targeted follow-ups, not merge blockers for the in-process render path. The pad-branch gap is the strongest pickup — worth either a sibling commit closing the upstream re-encode OR explicit body disclosure that distributed pad-branch renders are not yet covered. Stamp-routable as Rames framed it; deferring to the bot-authored protocol on the actual stamp.

Review by Via

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurring with @via on the canonical fix shape for the distributed pad-path follow-up: swap audioPadTrim.ts:108-124's -c:a aac -b:a 192k re-encode for a concat demuxer + silent-AAC tail (or aevalsrc=0:d=<delta>-c:a copy concat), avoiding the second encode entirely. Single-priming-origin invariant holds end-to-end; no edit-list propagation through muxVideoWithAudio's new -c:a copy path.

Minimum-floor alternative if the concat refactor is too much scope for this PR: a sibling integration fixture asserting videoStream.start_time < 0.001 on a pad-path render (audio shorter than frameCount/fps), so the regression net covers both branches.

— Rames D Jusso

@miguel-heygen

Copy link
Copy Markdown
Collaborator Author

Addressed the review follow-ups in 7ef576949:\n\n- Distributed pad path no longer re-encodes the mixed source AAC. It now encodes only the missing silent tail and concat-copies source AAC + silence, keeping one priming origin.\n- Added the missing pad-path ffmpeg-backed integration case asserting video/audio start_time < 0.001.\n- Replaced extension-only internal mux detection with an explicit audioCodec: \"aac\" caller contract, keeping .aac as a compatibility fallback.\n- Clarified the mux comment and MOV test description.\n- Checked the WebM/Opus question locally: ffprobe showed both streams start at 0, and Chromium canvas sampling at currentTime=0 was non-black for the fixture.\n\nVerification details and artifact paths are in the updated PR body.

Comment thread packages/producer/src/services/render/audioPadTrim.ts Fixed
@miguel-heygen

Copy link
Copy Markdown
Collaborator Author

Also addressed Via's remaining engine-export concern in f87b2c417: unknown-extension MP4/MOV sidecars now call extractAudioMetadata before choosing copy vs transcode. Explicit audioCodec: \"aac\" and .aac remain fast paths; known non-AAC extensions stay on the existing transcode path; WebM remains Opus. Added mocked coverage for probed extensionless AAC -> -c:a copy and probed extensionless non-AAC -> -c:a aac.

@vanceingalls vanceingalls left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed at f87b2c41. Every R1 item from both me + @james-russo-rames-d-jusso verified addressed at HEAD. LGTM on this revision.

Verified at HEAD:

  • Pad branch reshape (the convergent load-bearing item). audioPadTrim.ts pad branch now generates ONLY the missing silent tail via anullsrc=channel_layout=… :sample_rate=…, then concat-copies source AAC + silence. The function docstring explicitly states the new invariant: "generate only the missing silence tail, then concat-copy the source AAC plus that tail. This avoids re-encoding the already mixed audio.aac; the pad branch remains the inverse of trim instead of becoming a second full-source AAC encode." Single priming origin preserved end-to-end through the new -c:a copy mux. Exactly the structural close Rames and I converged on in R1.

  • Pad-path integration test (the minimum-floor regression net). assemble.test.ts:334 adds "muxes padded short audio without shifting the first video frame" — audio shorter than video (line 351, exercising the pad branch), asserts both videoStream.start_time < 0.001 AND audioStream.start_time < 0.001 on the muxed output. Pad-branch regression now has a test net.

  • audioCodec: "aac" caller contract + extractAudioMetadata probe (closes the engine-export fragility I flagged in R1). chunkEncoder.ts now has a three-tier resolution at lines 70-84:

    • Fast path 1: options?.audioCodec === "aac" — explicit caller contract.
    • Fast path 2: isAacSidecar(audioPath).aac extension compat fallback (preserves pre-PR behavior for callers that haven't been migrated yet).
    • Slow path: extractAudioMetadata(audioPath) — probes the codec via ffprobe for unknown-extension paths, returns metadata.audioCodec === "aac".

    Engine-level export is now content-aware, not pure name-contract. Rename-footgun closed; intra-stack file-naming refactors can't silently regress this.

  • Comment + MOV test docstring nits addressed per Miguel's disposition.

  • WebM/Opus question verified by Miguel manually (ffprobe both streams start at 0; Chromium canvas sample at currentTime=0 non-black). Manual-only verification is acceptable per the PR's stated scope (MP4/MOV); cross-codec workstream stays as a follow-up if it ever surfaces.

One small flag (not a blocker):

CodeQL bot flagged audioPadTrim.ts:299 "Insecure creation of file in the os temp dir" on the writeFileSync(plan.concatList.path, ...) call. Tracing the path derivation: plan.concatList.path = ${outputPath}.pad-concat.txt, where outputPath is caller-supplied. In production (assemble.ts:294), outputPath = join(planDir, "audio-padded.aac") — the plan-dir is worker-scoped, not os.tmpdir(). In tests, mkdtempSync(os.tmpdir() + "/") IS in tmpdir, which is what CodeQL is matching on conservatively.

This is plausibly a false-positive at the production callsite (plan-dir is process-scoped, short-lived, not user-accessible). But the underlying TOCTOU shape (predictable derived filename → another process could pre-create + race the writer) is real-pattern-flag-worthy. Defer to Miguel on suppress/annotate vs explicit mkdtemp in the helper. Not blocking.

CI state at HEAD f87b2c41: all completed required checks green (Build, Test, Typecheck, Format, Semantic PR title, Fallow, SDK suite, preview-regression, player-perf, Studio load smoke). Regression shards, Windows tests, Typecheck-on-new-head all in_progress; several cancelled entries are stale from the prior commit's superseded runs. No active failures.

No remaining concerns. Stamp-routable as Rames framed it in R1; I'm co-reviewer per the bot-authored protocol.

Review by Via

@miguel-heygen

Copy link
Copy Markdown
Collaborator Author

Follow-up for the CodeQL gate in 2f2bc2ed3: the pad concat step no longer writes a concat-list temp file. It now feeds the ffmpeg concat demuxer through pipe:0 with file URLs and keeps only the generated silent AAC tail as a cleanup artifact. Re-ran the real distributed pad integration locally; video/audio still start at zero.

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed at 2f2bc2ed. R1 findings from me + @vanceingalls verified at HEAD. Layering on top of Via's R2 at f87b2c41 — the additional commit 2f2bc2ed "avoid temp concat file for audio padding" landed ~4min after his stamp and refactors the concat step to stdin-pipe; semantically equivalent, noted below.

R1 verification table:

# Item Verdict Evidence
1 Pad-path inverse-op asymmetry (audioPadTrim.ts:108-124 2x AAC encode) ✅ resolved buildPadTrimAudioPlan delta > 0 branch (audioPadTrim.ts:138-181) now encodes only the silence tail (anullsrc matched to source sampleRate/channels), then concat-demuxer -c:a copy stitches source + tail. Source AAC never re-encoded. Test audioPadTrim.test.ts:54-58 explicitly asserts no step matches (/tmp/in.aac, -c:a aac). Integration test assemble.test.ts:333-372 exercises the pad branch on real ffmpeg + ffprobes BOTH videoStream.start_time < 0.001 AND audioStream.start_time < 0.001 — exactly the min-floor we asked for.
2 Extension-only AAC detection (chunkEncoder.ts:47-49) ✅ resolved shouldCopyAacSidecar (chunkEncoder.ts:73-90) now (a) honors explicit audioCodec: "aac" opt, (b) .aac extension fast-path, (c) KNOWN_NON_AAC_AUDIO_EXTENSIONS short-circuit, (d) extractAudioMetadata codec probe for .m4a/extensionless, (e) probe failure → false (surfaces clear transcode error). Defense in depth: assemble.ts:323 + assembleStage.ts:63 both call-sites now pass { audioCodec: "aac" } explicitly, so probe is the 3rd line. Test coverage at chunkEncoder.test.ts:410+ covers all three paths.
3 Comment quality at chunkEncoder.ts:716-719 (priming origin) ✅ resolved Comment at chunkEncoder.ts:758-762 now names processCompositionAudio (audioMixer.ts) as the priming-interval owner. Reads materially better.
4 MOV docstring at chunkEncoder.test.ts:445 (faststart-absence intentional) ✅ resolved Renamed to "...into MOV containers without MP4 faststart flags" at chunkEncoder.test.ts:525.

New observations from 2f2bc2ed (post-Via):

  • Stdin-pipe refactor. Concat list now piped via stdin instead of temp *.pad-concat.txt. Adds -protocol_whitelist file,pipe,crypto,data. Equivalent behavior, less disk I/O, fewer cleanup paths. Safe — stdin content is internally constructed via concatFileLine (which uses pathToFileURL + single-quote escaping); no user input flows in.
  • Nit (non-blocking): -protocol_whitelist file,pipe,crypto,data is broader than needed — file,pipe is the minimum sufficient set for the current stdin construction. Tightening is cheap insurance if the stdin builder ever changes. First-of-its-kind in the repo, so the precedent matters.

Net verdict: ✅ R2 clean. Pad-path is now a true inverse of trim (single-encode origin on source AAC preserved), codec detection is contract-first + probe-backed, both call-sites updated, integration test pins the actual ffprobe start_time invariant on real ffmpeg. Nice convergence with @vanceingalls.

LGTM — routing stamp to @rames-jusso per protocol.

— Rames D Jusso

@jrusso1020 jrusso1020 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving at 2f2bc2ed — both R2s converged clean.

Verified the load-bearing R1 close: audioPadTrim.ts:138-181 reshapes the pad branch to encode only the silent tail (sampleRate/channels matched to source) and concat-demuxer stream-copies the source AAC + tail. Single priming origin preserved end-to-end. assemble.test.ts:333-372 asserts both videoStream.start_time < 0.001 AND audioStream.start_time < 0.001 on the pad branch via real ffmpeg — exactly the structural close Via and Rames D Jusso converged on, and the integration test net now covers the previously-uncovered pad path.

Also verified the chunkEncoder.ts:73-90 three-tier detection (explicit audioCodec: "aac" contract first, extension fast-path, then extractAudioMetadata probe with safe fallback) plus both call sites passing { audioCodec: "aac" } explicitly. Defense in depth on the engine-export contract.

Two non-blocking notes worth absorbing in follow-up rather than holding the merge:

  • 2f2bc2ed's stdin-pipe refactor uses -protocol_whitelist file,pipe,crypto,data — broader than the minimum file,pipe needed for the current stdin-only callers. First-of-its-kind in the repo; tighten to file,pipe (or document rationale) so the precedent doesn't widen by inertia.
  • WebM/Opus path (isWebm branch) still re-encodes through libopus. Out of PR scope; worth answering across the audio-mux workstream whether Opus pre-skip produces the same Chromium WebM-preview symptom.

Regression shards still in_progress at 2f2bc2ed; all completed required checks are green. Stamping at this SHA per the merge-queue protocol; the shards will gate the actual merge.

— Jerrai

@miguel-heygen miguel-heygen merged commit 0473254 into main Jun 20, 2026
43 checks passed
@miguel-heygen miguel-heygen deleted the fix/mp4-aac-mux-start-time branch June 20, 2026 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants