Replace MoviePy with ffmpeg for 10-100x performance improvement by cyberb · Pull Request #15 · motattack/mtslinker

cyberb · 2026-03-24T12:01:27Z

Summary

Parallel downloads: 4 concurrent downloads instead of sequential, with 1MB chunk size (was 8KB)
ffmpeg instead of MoviePy: Eliminates the full video re-encoding that caused 3-hour videos to take 14+ hours to process. Uses ffmpeg concat demuxer with -c copy for near-instant concatenation
Removed moviepy/numpy dependencies: Only requires httpx and tqdm as Python deps. ffmpeg is the only new system requirement
Proper gap handling: Black segments and silence generated with ffmpeg (ultrafast, stillimage tune) instead of in-memory numpy arrays
Audio-only track merging: Uses ffmpeg amix filter with proper delay offsets
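The stream-copy concatenation described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper names `concat_list_text` and `concat_copy_command` are hypothetical, and the segments are assumed to already share codec, resolution, and timebase so that `-c copy` is valid.

```python
from typing import List


def concat_list_text(segments: List[str]) -> str:
    """Build the text of an ffmpeg concat-demuxer list file.

    Each entry has the form: file '<path>'. A single quote inside a path
    is written as '\\'' (close quote, escaped quote, reopen quote).
    """
    lines = []
    for seg in segments:
        escaped = seg.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_copy_command(list_path: str, output: str) -> List[str]:
    """Return an argv list that concatenates without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # allow absolute paths in the list file
        "-i", list_path,
        "-c", "copy",     # stream copy: packets are remuxed, not re-encoded
        output,
    ]
```

Because packets are only remuxed, the cost scales with file size rather than with decode and encode time, which is presumably where most of the speed-up over a full re-encode comes from.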

What changed

| File | Change |
| --- | --- |
| `downloader.py` | Added `download_chunks_parallel()` using ThreadPoolExecutor, chunk size 8KB → 1MB |
| `processor.py` | Full rewrite: MoviePy → ffmpeg subprocess calls (ffprobe, concat demuxer, filter_complex) |
| `webinar.py` | Updated to use new parallel download + ffmpeg pipeline |
| `requirements.txt` | Removed `moviepy` |
| `setup.py` | Removed `moviepy` dep, bumped version to 2.0.0 |
| `Dockerfile` | Added `ffmpeg` package installation |
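As a rough sketch of the parallel download change (not the actual `download_chunks_parallel()` implementation, whose signature is not shown here), a thread pool with four workers and a 1 MB streaming chunk size might look like this. The names `download_all` and `fetch_with_httpx` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

CHUNK_SIZE = 1024 * 1024  # 1 MB per read, instead of 8 KB
MAX_WORKERS = 4           # concurrent downloads


def fetch_with_httpx(url: str, dest: str) -> str:
    """Stream one URL to disk in CHUNK_SIZE pieces (assumes httpx is installed)."""
    import httpx  # imported lazily so the rest of the sketch has no third-party deps

    with httpx.stream("GET", url, follow_redirects=True) as response:
        response.raise_for_status()
        with open(dest, "wb") as fh:
            for block in response.iter_bytes(chunk_size=CHUNK_SIZE):
                fh.write(block)
    return dest


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    workers: int = MAX_WORKERS,
) -> List[str]:
    """Run fetch(url) for every URL on a small thread pool.

    pool.map returns results in input order, so the caller can rely on
    the output list lining up with the chunk order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```

Threads are a reasonable fit here because the work is I/O-bound; the `fetch` callable is injected so the pool logic can be tested without network access.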

Performance

Tested on a real 3-hour webinar (139 chunks, 106 video + 33 audio-only segments):

| Metric | Old (MoviePy) | New (ffmpeg) |
| --- | --- | --- |
| Total time | 14+ hours (reported in #8) | ~70 minutes |
| Video concat | Full re-encode | `-c copy` (stream copy) |
| Peak RAM | ~1GB | ~300MB |

For recordings without audio-only tracks, the improvement is even larger since the audio mixing step (the slowest remaining part) is skipped entirely.

Requirements

ffmpeg and ffprobe must be installed (apt install ffmpeg / brew install ffmpeg)
The tool checks for ffmpeg at startup and gives a clear error if missing

Fixes #8

Test plan

Download a webinar with multiple video segments — verified output plays correctly
Test with recordings that have 33 audio-only tracks — audio merged successfully
Test with recordings that have gaps between segments — black segments generated correctly
Verify Docker build works with new Dockerfile

- Parallel downloads (4 concurrent) instead of sequential
- Increase download chunk size from 8KB to 1MB
- Replace MoviePy video processing with direct ffmpeg calls
- Use ffmpeg concat demuxer with -c copy (no re-encoding)
- Normalize segments to common resolution for reliable concat
- Handle gaps with lightweight ffmpeg-generated black segments
- Merge audio-only tracks using ffmpeg filter_complex
- Remove moviepy and numpy dependencies
- Add ffmpeg to Dockerfile
- Bump version to 2.0.0

Fixes motattack#8

Some MTS Link recordings store only audio in the direct mp4 files, while the HLS delivery endpoint has both video and audio streams. Check the HLS playlist for a video track and download via ffmpeg when detected.

The old _merge_audio_tracks passed all audio files (up to 63+) as simultaneous inputs to a single ffmpeg amix command, which required ffmpeg to hold all delayed audio streams in memory for the full recording duration — causing OOM kills on long recordings. Now audio tracks are pre-delayed individually, then mixed via tree reduction in batches of 8. Also routes all subprocess calls through _run_ffmpeg which logs stderr on failure instead of silently swallowing it.
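The tree reduction in batches of 8 can be illustrated with a small sketch. The `mix` callable stands in for whatever ffmpeg invocation mixes one batch into a single file; `tree_reduce` is a hypothetical name, not taken from the PR.

```python
from typing import Callable, List


def tree_reduce(
    items: List[str],
    mix: Callable[[List[str]], str],
    batch_size: int = 8,
) -> str:
    """Repeatedly mix groups of at most batch_size items until one remains."""
    if not items:
        raise ValueError("nothing to mix")
    level = list(items)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), batch_size):
            batch = level[i:i + batch_size]
            # A leftover single item needs no mixing; pass it up unchanged.
            next_level.append(batch[0] if len(batch) == 1 else mix(batch))
        level = next_level
    return level[0]
```

With 63 tracks this makes 8 mix calls in the first round and 1 in the second, and no single ffmpeg process ever sees more than 8 inputs.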

Recordings with multiple simultaneous feeds (webcam + screen share) have segments with overlapping timestamps. The old code laid them out sequentially, turning a 3hr recording into 10+ hours of concat video. This also caused the audio merge WAVs to be padded to 10hrs each, requiring ~400GB of disk. Added _deduplicate_overlapping() which keeps only the longest segment per time window (186 -> 7 segments in a real test case). Also pass total_duration from the API to _merge_audio_tracks so WAVs are padded to the correct recording length, not the (potentially inflated) concat file duration.

The previous approach materialized each audio track as a full-duration WAV (~1.8GB each for a 3hr recording). With 63 tracks that's ~113GB, filling the disk and crashing with "No space left on device". Now audio tracks are mixed in batches directly with adelay inside the ffmpeg filter graph, outputting compressed m4a (~15MB each). No intermediate WAVs are created. Batch results are tree-reduced and intermediates are deleted immediately after each round.
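A minimal sketch of mixing one batch with `adelay` inside the filter graph, assuming each input's start offset is known in milliseconds. The function names are hypothetical, and `normalize=0` requires an ffmpeg build whose `amix` supports that option.

```python
from typing import List


def batch_mix_filter(delays_ms: List[int]) -> str:
    """Build a filter_complex string that delays each input and mixes them."""
    if len(delays_ms) < 2:
        raise ValueError("a mix needs at least two inputs")
    parts = []
    labels = []
    for i, delay in enumerate(delays_ms):
        # "d|d" delays the first two channels; extra values are ignored for mono.
        parts.append(f"[{i}:a]adelay={delay}|{delay}[a{i}]")
        labels.append(f"[a{i}]")
    parts.append(f"{''.join(labels)}amix=inputs={len(delays_ms)}:normalize=0[mix]")
    return ";".join(parts)


def batch_mix_command(inputs: List[str], delays_ms: List[int], output: str) -> List[str]:
    """Return an argv list that writes the mixed batch as compressed AAC."""
    if len(inputs) != len(delays_ms):
        raise ValueError("one delay per input is required")
    cmd = ["ffmpeg", "-y"]
    for path in inputs:
        cmd += ["-i", path]
    cmd += ["-filter_complex", batch_mix_filter(delays_ms),
            "-map", "[mix]", "-c:a", "aac", output]
    return cmd
```

Since the delay is applied as a filter, silence is generated on the fly and never written to disk, so only the small compressed output of each batch is stored.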

- Add _validate_downloaded_file() to check files with ffprobe after download
- Re-download corrupt files (missing moov atom) up to 2 retries
- Validate existing cached files on disk, re-download if corrupt
- Add _is_valid_media() in processor to skip corrupt files during classification
- Audio batch mixing catches errors and skips failed batches instead of crashing
- If all audio batches fail, output video without audio overlay
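The validate-then-retry behaviour might be sketched as below. This is an illustration under assumptions: `probe_ok` and `download_with_validation` are hypothetical names, and treating "ffprobe reports a positive duration" as the validity check is a guess at what the real validation does.

```python
import json
import subprocess
from typing import Callable


def probe_ok(path: str) -> bool:
    """Return True if ffprobe can read the file and reports a positive duration."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "json", path],
        capture_output=True, text=True,
    )
    if result.returncode != 0:  # e.g. "moov atom not found"
        return False
    try:
        return float(json.loads(result.stdout)["format"]["duration"]) > 0
    except (KeyError, TypeError, ValueError):
        return False


def download_with_validation(
    download: Callable[[str], None],
    validate: Callable[[str], bool],
    path: str,
    max_retries: int = 2,
) -> bool:
    """Download, validate, and re-download up to max_retries times."""
    for _attempt in range(1 + max_retries):
        download(path)
        if validate(path):
            return True
    return False
```

In real use `validate` would be `probe_ok`; passing it in as a parameter keeps the retry loop testable on machines without ffprobe.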

Extract presentation.update events from the MTS API to get slide images and their timestamps. Download pre-rendered slide JPGs and composite them with the webcam video in a 1280x720 layout:

- Left 960px: presentation slide
- Top-right 320x180: webcam

Slides are pre-encoded as 1fps video segments and concatenated into a single track, then overlaid with the webcam in one pass. Recordings without presentations are unaffected (existing behavior).

Some recordings have tiny thumbnail-sized video segments (192x108) as the first file. The old code used the first segment's resolution for all normalization, resulting in a blurry output. Now scans all segments and picks the largest, with a 640x360 floor.
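One way to read "pick the largest, with a 640x360 floor" is sketched here. The function name is hypothetical, and clamping width and height independently to the floor is an assumption about how the floor is applied.

```python
from typing import Iterable, Tuple

Size = Tuple[int, int]


def pick_target_resolution(sizes: Iterable[Size], floor: Size = (640, 360)) -> Size:
    """Choose the largest segment resolution by pixel area, never below floor."""
    sizes = list(sizes)
    if not sizes:
        return floor
    width, height = max(sizes, key=lambda s: s[0] * s[1])
    return max(width, floor[0]), max(height, floor[1])
```

Scanning every segment avoids letting a 192x108 thumbnail at the start of the recording dictate the output size.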

When multiple webcams overlap at the same timestamp, the old code kept the longest segment (often a random participant). Now tracks conference ID from the API and prefers the user with the most total segments across the recording — typically the presenter/instructor. Falls back to longest segment when conf_id is unavailable.

The -loop 1 -framerate 1 -t approach could produce millions of frames for long-duration slides (e.g., last slide staying up for 3 hours), causing ffmpeg to spin for hours and write gigabytes. Now uses -frames:v to strictly cap frame count to match duration at 1fps.
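A small sketch of capping the frame count explicitly, assuming one slide image is turned into one video segment. The function name and the choice of `libx264` with `yuv420p` are illustrative assumptions, not the PR's exact command line.

```python
import math
from typing import List


def slide_segment_command(image: str, duration_s: float, output: str, fps: int = 1) -> List[str]:
    """Return an argv list that encodes a still image for a bounded number of frames."""
    frames = max(1, math.ceil(duration_s * fps))  # at 1 fps, frames == seconds
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-framerate", str(fps), "-i", image,
        "-frames:v", str(frames),  # hard stop after this many output frames
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        output,
    ]
```

A slide that stays up for three hours then yields exactly 10800 frames at 1 fps.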

Detects h264_nvenc at startup and uses it for all encoding steps if available. Falls back to libx264 CPU encoding if no GPU. Massively reduces CPU load and encoding time on systems with NVIDIA GPUs, while keeping the CPU cool.

The overlay step was CPU-bound (97°C). Now uses hwupload_cuda, scale_cuda, and overlay_cuda to do the compositing entirely on GPU. Falls back to CPU filters if CUDA overlay is not available.

Two changes:

1. Swap inputs in slide compositing so webcam (25fps) drives the output frame clock instead of the slide track (1fps). Fixes choppy webcam playback in presentation videos.
2. For recordings without presentation slides that have multiple concurrent webcams (ПЗ, i.e. practical-class, sessions), composite all active webcams into a grid layout using xstack instead of discarding all but one. Grid size adapts to the number of concurrent webcams (2x1, 2x2, 3x3 etc). Audio from all participants is mixed.
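The adaptive grid can be sketched as follows. The helper names are hypothetical, and the near-square rule (columns = ceil(sqrt(n))) is an assumption that happens to reproduce the 2x1, 2x2, and 3x3 cases mentioned above. Cell sizes are rounded down to even numbers because yuv420p encoders such as libx264 require even dimensions.

```python
import math
from typing import Tuple


def grid_shape(n: int) -> Tuple[int, int]:
    """Return (columns, rows) for n webcams, as close to square as possible."""
    if n < 1:
        raise ValueError("need at least one input")
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    return cols, rows


def cell_size(width: int, height: int, n: int) -> Tuple[int, int]:
    """Split the canvas into cells, rounding each dimension down to an even number."""
    cols, rows = grid_shape(n)
    cell_w, cell_h = width // cols, height // rows
    return cell_w - cell_w % 2, cell_h - cell_h % 2


def xstack_layout(n: int, cell_w: int, cell_h: int) -> str:
    """Build the xstack layout string: one x_y pixel offset per input, joined by '|'."""
    cols, _rows = grid_shape(n)
    return "|".join(
        f"{(i % cols) * cell_w}_{(i // cols) * cell_h}" for i in range(n)
    )
```

For four webcams on a 1280x720 canvas this gives 640x360 cells and the layout `0_0|640_0|0_360|640_360`, which would be passed as `xstack=inputs=4:layout=...` after scaling each input to the cell size.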

…ipeline Webcam inputs may lack audio tracks, causing ffmpeg to fail with 'Stream specifier :a matches no streams'. Since _merge_audio_tracks handles all audio separately, the grid step should output video only.

The old scoring picked the conference with the most segments, which favored participants toggling their cameras (many short segments) over the presenter (few long segments). Also had a window-shrinking bug where replacing a long segment with a short higher-ranked one let subsequent segments leak through. New approach: identify the main conference by total recorded duration, keep its segments, and fill gaps from other conferences.

Extracts ADMIN role from userlist events, maps to conference IDs via conference.add events, and passes is_admin flag through the download pipeline. Dedup now prefers ADMIN conferences (the presenter), falling back to total duration when no admin is found. Also fixes download_chunks_parallel to preserve the is_admin flag.

…ebcam layout

- Dedup gap-fill: clamp "other" conference segments to actual gap boundaries instead of using raw file duration, preventing timeline overflow.
- Compile: skip segments starting before current_time (safety net for overlaps), and truncate segments via -t so they can't overflow into the next segment.
- Slide composite: scale webcam proportionally to 320px wide (was fixed 320x180), so portrait webcams render at a usable size instead of being squished.

Grid fix: cell dimensions from integer division could be odd, causing ffmpeg's scale filter to round up and produce dimensions larger than the pad target ("Padded dimensions cannot be smaller than input"). Now forces even dimensions and uses min() to cap scale output.

Audio fix: amix divides volume by number of inputs at each stage. After 3 levels of mixing (batch->reduce->overlay), audio was attenuated to near-silence (-91 dB). Added volume=N compensation after each amix to restore original loudness.

normalize=0 already prevents amix from dividing by N, so the volume=N multiplier was over-amplifying (~x112 across 3 pipeline stages), turning noise from silent tracks into interference. Also filter out silent audio-only segments (<-80 dB) before mixing so they don't waste processing time or add noise floor.

-80 dB was filtering out participant microphone audio that sits around -80 to -60 dB. Only -91 dB is true digital silence.

With normalize=0 and no volume=N, mixing silent segments with real audio just gives real audio. The filter was incorrectly dropping participant microphone tracks. Removing it simplifies the pipeline and ensures all audio-only segments are included.

When slides + multiple webcams are present, analyzes audio levels per participant to detect who is talking. Switches the right-side webcam to show the active speaker, defaulting to presenter when nobody else talks. Uses 2s analysis windows with 4s minimum hold to prevent flickering.
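The hold logic that prevents flickering can be sketched like this, assuming the audio analysis has already produced the loudest participant (or `None`) for each 2-second window. The function and parameter names are hypothetical.

```python
import math
from typing import List, Optional


def switch_speakers(
    loudest: List[Optional[str]],
    presenter: str,
    window_s: float = 2.0,
    min_hold_s: float = 4.0,
) -> List[str]:
    """Decide which webcam to show in each window, holding each choice for min_hold_s."""
    hold_windows = math.ceil(min_hold_s / window_s)
    shown = []
    current = presenter
    held = hold_windows  # lets the very first switch happen immediately
    for candidate in loudest:
        # Nobody talking means falling back to the presenter.
        target = candidate if candidate is not None else presenter
        if target != current and held >= hold_windows:
            current = target
            held = 0
        shown.append(current)
        held += 1
    return shown
```

For example, with the presenter `"P"` and per-window detections `["A", "B", "B", None, None]`, the displayed sequence is `["A", "A", "B", "B", "P"]`: each switch is delayed until the current speaker has been on screen for two windows.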

NVENC + complex overlay filter on 3+ hour videos consumes ~7GB, triggering OOM killer. libx264 uses ~300MB for the same operation. All other encoding steps still use NVENC.

The 720p cap + fast preset still OOM-kills on 3.5h recordings with many segments (e.g. 1197678196: 125 chunks, 23 participants, 34 segments).

Split compositing into 30-min chunks so ffmpeg never holds the full video in memory. This allows using NVENC again (faster) and restores 720p resolution cap. Each chunk is composited independently then concatenated with stream copy.
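Splitting the timeline into 30-minute chunks is simple arithmetic; a sketch with a hypothetical helper name:

```python
from typing import List, Tuple


def chunk_ranges(total_s: float, chunk_s: float = 1800.0) -> List[Tuple[float, float]]:
    """Return (start, length) pairs, in seconds, covering [0, total_s)."""
    ranges = []
    start = 0.0
    while start < total_s:
        length = min(chunk_s, total_s - start)  # the last chunk may be shorter
        ranges.append((start, length))
        start += chunk_s
    return ranges
```

Each pair maps naturally onto ffmpeg's `-ss` and `-t` options for one compositing run, after which the per-chunk outputs can be joined with stream copy.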

_get_video_encoder_fast() set _NVENC_AVAILABLE directly, bypassing _detect_gpu(). This left _CUDA_OVERLAY_AVAILABLE as None, so compositing always used CPU overlay even with CUDA support available.

Each participant's audio segments are analyzed by independent ffmpeg calls. Running 4 in parallel instead of sequentially speeds up speaker detection ~4x on multi-core systems.

Speaker switching can produce segments whose combined duration exceeds the original recording. Cap the concat at total_duration from the API to ensure the output matches the expected length.

Split monolithic processor.py (1915 lines) into 6 focused classes:

- FFmpegRunner: ffmpeg execution, GPU detection, encoder selection
- MediaProber: file probing, duration, streams, audio levels
- GridCompositor: multi-webcam grid layout
- SlideCompositor: presentation slide overlay
- AudioMerger: batched audio mixing with tree-reduce
- SegmentBuilder: normalize, gaps, dedup, admin detection

VideoProcessor composes all classes via constructor injection. No static methods, no underscore prefixes on public methods. processor.py is now a thin orchestrator with backward-compatible module-level functions. 33 tests across 5 test files, all passing. Added requirements-dev.txt with pytest.

yokidjo · 2026-04-10T08:16:38Z

@cyberb
Testing strategy suggestions (non-blocking)

Awesome work on the refactoring and adding 33 tests! The class decomposition with dependency injection is perfect for testing.

Two suggestions for future iterations:

Integration tests with Testcontainers

If the current tests mock FFmpegRunner and MediaProber (which makes sense for fast unit tests), consider adding a few integration tests with Testcontainers to verify that complex ffmpeg filter graphs work with real binaries.

Benefits:

Catch regressions when ffmpeg CLI behavior changes between versions
Verify that amix tree-reduce, xstack grid, and CUDA overlay detection work end-to-end
Run identically in CI and locally

Trade-off: slower — can be marked @pytest.mark.integration and run separately.

Code coverage reporting

Adding pytest-cov would help track which parts of the pipeline are well-tested vs. untested. Example setup:

pytest --cov=mtslinker --cov-report=term --cov-report=html

This gives visibility into coverage for audio mixing, grid composition, slide overlay, and edge case handling.

Both are just ideas for the roadmap — not blockers for this PR. Thanks again for the massive effort on this!

yokidjo · 2026-04-10T08:19:16Z

GitHub Actions integration suggestion

I see you added requirements-dev.txt — great first step toward CI. Here's a complete setup you could add in a future PR if you want automated testing on every push.

Create .github/workflows/ci.yml with:

name: CI

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"
          
      - name: Install ffmpeg
        run: sudo apt-get update && sudo apt-get install -y ffmpeg
        
      - name: Install dependencies
        run: |
          pip install --upgrade pip
          pip install -r requirements.txt
          pip install -r requirements-dev.txt
          
      - name: Run tests with coverage
        run: pytest --cov=mtslinker --cov-report=xml --cov-report=term
        
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage.xml

Notes:

ubuntu-latest has Docker pre-installed, so Testcontainers would work out of the box if added later
For Codecov, you'll need to add CODECOV_TOKEN to repository secrets (optional — public repos can use tokenless upload)
Integration tests with @pytest.mark.integration can run in the same job since ffmpeg is already installed via apt

Not a blocker — just a template for when you're ready to automate the test suite.

Grid composite already includes audio from input 0 (sorted so audio-bearing webcam is first). Extracting the same webcam audio again for _merge_audio_tracks caused the voice to play twice. Audio-only tracks already cover other participants.

yokidjo · 2026-04-10T11:19:04Z

@cyberb Let's wrap this up — amazing work!

I think we should cap this PR at the current functionality and move any remaining edge cases to follow-up PRs or issues. This will let us merge the massive performance improvements now rather than chasing the last 1% of fringe scenarios indefinitely.

Here's a summary of what's been accomplished:

Performance

14+ hours → 70 minutes for 3-hour webinars
RAM usage: ~1GB → ~300MB
Parallel downloads (4 threads, 1MB chunks)
-c copy for video concat (no re-encoding)

Architecture

Split monolithic processor.py (1915 lines) into 6 focused classes:
- FFmpegRunner: ffmpeg execution, GPU detection, encoder selection
- MediaProber: file probing, duration, streams, audio levels
- GridCompositor: multi-webcam grid layout
- SlideCompositor: presentation slide overlay
- AudioMerger: batched audio mixing with tree-reduce
- SegmentBuilder: normalize, gaps, dedup, admin detection
Two-phase pipeline: analyze → manifest.json → execute
Dependency injection throughout (no static methods, testable)

Edge cases handled

Overlapping segments (grid composition + dedup strategies)
Variable resolutions (auto-detects max, 640x360 floor)
ADMIN detection for presenter prioritization
GPU acceleration (NVENC + CUDA overlay with CPU fallback)
OOM prevention (30-min chunking, tree-reduce audio mixing, no giant WAVs)
Corrupt file recovery (ffprobe validation, up to 2 retries)
Gaps and black segment generation
Audio-only tracks and webcams without audio
Speaker detection and switching (with hysteresis)
Presentation slide compositing (1280x720 layout)

Testing

33 tests across 5 test files, all passing
requirements-dev.txt with pytest
Class structure ready for mocking and Testcontainers in the future

Security

All ffmpeg calls use subprocess.run with List[str] (no shell=True)
Safe against command injection from malicious filenames
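To illustrate why an argument list is safe where `shell=True` is not, the sketch below passes a hostile-looking filename to a child process as a single argv entry. The Python interpreter stands in for ffmpeg so the example runs without ffmpeg installed; `echo_argument` is a hypothetical helper.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass value to a child process as one argument and return what it received."""
    result = subprocess.run(
        # List form: no shell is involved, so ';', '$(...)' and quotes are
        # delivered to the child verbatim instead of being interpreted.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")
```

A filename such as `clip; rm -rf ~ $(whoami).mp4` arrives in the child unchanged, which is the property the ffmpeg calls rely on.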

What's left (can be follow-up issues/PRs)

Docker validation (build image, test on short webinar with slides, verify no ffmpeg errors)
Remaining audio edge cases from your 30-link test batch
GitHub Actions CI workflow
Code coverage reporting (pytest-cov)
Testcontainers integration tests

This PR is already a massive win. Let's merge it and iterate on the rest in smaller, focused PRs. 🚀

Grid input 0 already has audio, but other webcams' audio was lost. Now extracts audio from all webcam files EXCEPT the one used as input 0 in each grid segment. This captures all voices without echo.

Added test_audio_pipeline.py with integration tests:

- Grid audio no echo (same source not duplicated)
- Grid takes audio from first input
- Audio merge preserves timing with adelay
- Segment duration matches plan
- Black segments have silent audio stream

yokidjo · 2026-04-10T13:06:17Z

@cyberb
One question before this gets too deep into edge-case handling:

Have you looked at the actual MTS Link player JavaScript code in the browser?

I'm wondering if we could simplify (or even eliminate) most of the complex reconstruction logic by understanding how the official client does it. The browser player must have a source of truth for:

Segment timeline (what plays when)
Overlapping sources (webcam + screen share simultaneously)
Layout instructions (who goes where in the grid)
Audio mixing priorities

If we can find where the player builds its internal playlist, we could replicate that logic instead of heuristically fixing:

Gap detection and black frame insertion
Overlap deduplication
Manual adelay offset calculation
Grid xstack assembly
Admin/speaker detection

Grid input 0 already has audio — other webcams need extraction. Tracks which path is input 0 per grid segment and excludes only those. Added test_audio_pipeline.py: echo, timing, duration, silence tests.

Replace all guesswork (overlap detection, dedup, speaker switching) with StreamTimeline that builds playback windows from API mediasession events, matching exactly what the MTS-Link web player does.

- Add StreamTimeline class with dataclasses (MediaSession, TimeWindow, GridSource, AudioTrack, DownloadChunk, SlideEvent)
- GridCompositor now mixes all audio streams inline via amix
- Remove dedup strategy, overlap heuristics, webcam audio extraction
- Remove dead code: deduplicate, extract_admin_conf_ids, is_valid, is_silent, analyze_audio_levels, legacy compat wrappers
- Fix .gitignore (was too broad, ignored tests/)
- 49 tests passing
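The idea of deriving playback windows from session start and end events can be sketched with a simple boundary sweep. The class names `MediaSession` and `TimeWindow` come from the commit message, but every field name below and the `build_windows` function are assumptions for illustration; the real StreamTimeline may be structured quite differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MediaSession:
    session_id: str  # hypothetical field names
    start: float     # seconds from recording start
    end: float


@dataclass(frozen=True)
class TimeWindow:
    start: float
    end: float
    active: Tuple[str, ...]  # sessions playing for the whole window


def build_windows(sessions: List[MediaSession]) -> List[TimeWindow]:
    """Cut the timeline at every session boundary and list who is active in each piece."""
    bounds = sorted({t for s in sessions for t in (s.start, s.end)})
    windows = []
    for lo, hi in zip(bounds, bounds[1:]):
        active = tuple(sorted(
            s.session_id for s in sessions if s.start <= lo and s.end >= hi
        ))
        windows.append(TimeWindow(lo, hi, active))
    return windows
```

A window with an empty `active` tuple is a gap (a black segment), one entry is a plain video segment, and several entries call for a composite layout, so overlap and gap handling fall out of the same structure.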

yokidjo

@cyberb StreamTimeline approach looks great. Code is cleaner, logic matches the actual player, tests pass.

@motattack LGTM. Ready for merge.

Audio-only streams were downloaded as raw binary from the storage URL, which returns valid MP4 containers with silent audio (-91 dB). The real audio lives in the HLS playlist variants. Now tries HLS first for all streams (video and audio-only), falling back to direct download only if HLS is unavailable.

The variable was removed in the mediasession rewrite (9a8a657) but the logging line still referenced it. Strategy is now always 'timeline'.

When multiple streams are active:

- Screenshare → main area, admin → PIP overlay
- Admin (no screenshare) → main area, participant → PIP overlay
- No admin/screenshare → fall back to grid

Also fix grid xstack to always output exact target resolution, preventing concat corruption from mismatched segment sizes.

Split GridCompositor into smaller classes:

- GridLayout: xstack grid compositing
- PresenterLayout: main + PIP overlay compositing
- GridCompositor: backward-compatible facade delegating to both
- _build_audio_filter: shared audio mixing helper
- _even: shared utility

Add 12 new tests covering presenter layout (main-only, PIP, extra audio, resolution consistency), audio filter builder, grid resolution matching, and facade backward compat. 61 tests passing.

The final amix step assumed the video always has an audio stream and that it matches the mixed audio's 44100 Hz sample rate. HLS sources come in at 48000 Hz, causing "Invalid argument" in the filter graph. Now checks for audio presence and resamples before mixing.

The old heuristic (has_video && !has_audio = screenshare) was wrong — it matched webcams with muted mics. The API provides explicit stream.screensharing data on mediasession.add events. Now uses that to correctly identify screen share streams for presenter layout.

cyberb · 2026-04-20T12:00:42Z

I think I am done with this, feel free to take it or leave it, thanks!

yokidjo · 2026-04-20T15:14:28Z

@motattack I don't have write access to this repository, so I cannot merge this PR myself.

GitHub API returns 403 Forbidden with the message "Must have push access", confirming that I lack the necessary permissions.

@cyberb has done an enormous amount of work here

Please merge this Pull Request yourself, as only you (or someone else with write/admin access) can do it.

Thanks!

anullsrc is infinite; with -c:v copy the video muxes faster than realtime so -shortest alone lets ffmpeg's interleaving buffer grow until av_interleaved_write_frame fails with Cannot allocate memory. Probe the input duration and pass -t to bound the silent track, falling back to -shortest-only when the probe yields no duration. Strengthen test_ensure_audio_adds_silent to assert the output has an audio stream and that the -t bound does not truncate the video.

The bc9b12a -t output bound did not help: with -c:v copy the muxer receives all video packets at once while anullsrc audio is generated lazily, so it buffers the whole copied segment in RAM to interleave, dying with av_interleaved_write_frame: Cannot allocate memory on long segments. Add -max_interleave_delta 0 so packets are written without buffering, and make anullsrc a finite input (-t before -i) so the fallback path is bounded too. Add regression tests asserting the command keeps both safeguards; the existing real-ffmpeg test uses a short clip and cannot trigger the length-proportional OOM.

Root cause of the av_interleaved_write_frame OOM: when a manifest segment has source_offset > source file duration (planner asks normalize to seek past end-of-source), ffmpeg decodes zero frames, writes a valid 262-byte empty container, and exits 0. ensure_audio then sees duration=0, takes the unbounded anullsrc branch with -c:v copy, and since no video packets ever arrive, -shortest never fires, so the muxer buffers silent audio forever until RAM+swap are exhausted. The previous fix tuned ffmpeg muxer flags but never closed the duration=0 vector that bypassed them.

- normalize(): after ffmpeg.run, probe output and raise CalledProcessError if duration <= 0.
- ensure_audio(): raise CalledProcessError on non-positive-duration input and remove the unbounded anullsrc fallback entirely (it had no safe semantics).
- processor.execute() video branch: wrap normalize+ensure_audio in the same try/except → generate_black fallback already used by the grid and presenter branches; a seek-past-EOF segment becomes a correctly-sized black gap instead of crashing the whole webinar.

Reproduction (segment 73, source duration 67.24s, source_offset 104.17s):

- Before: normalize produces a 262B duration-0 file and exits 0; ensure_audio runs unbounded and hits ENOMEM.
- After: normalize raises with a clear message naming the input and seek offset; processor falls back to black.
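The guard-and-fallback pattern described here might look roughly like the sketch below. All names are hypothetical, and the probing step is reduced to a plain `duration` argument so the control flow can be shown without ffmpeg.

```python
import subprocess
from typing import Callable, List, Optional


def require_positive_duration(
    duration: Optional[float], cmd: List[str], source: str, seek_offset: float
) -> None:
    """Raise if an ffmpeg step 'succeeded' but produced an empty output."""
    if duration is None or duration <= 0:
        raise subprocess.CalledProcessError(
            returncode=1,
            cmd=cmd,
            stderr=(
                f"empty output for {source}: seek offset {seek_offset}s "
                "is probably past the end of the source"
            ),
        )


def build_segment(
    normalize: Callable[[str, float, float], str],
    make_black: Callable[[float], str],
    source: str,
    seek_offset: float,
    length: float,
) -> str:
    """Try the real segment; on failure substitute a black gap of the same length."""
    try:
        return normalize(source, seek_offset, length)
    except subprocess.CalledProcessError:
        return make_black(length)
```

Failing loudly at the first step that can detect the empty file means downstream steps never receive a zero-duration input in the first place.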

The pycache was accidentally added before __pycache__/ was in .gitignore, so the rule never applied (gitignore is bypassed for already-tracked files). Untrack them and add *.pyc / *.egg-info/ so they stay out.

When the kernel OOM killer reaped ffmpeg mid slide-composite chunk (exit -9 / 137), the whole video failed. NVENC's pinned host buffers push long filter_complex graphs over the memory budget on some videos. Catch SIGKILL only (not generic ffmpeg errors), re-run that chunk with libx264, and stay on CPU for the remaining chunks of the same video so the next chunk does not pay the kill cost again. Non-OOM failures still propagate. Adds get_video_encoder_cpu() and a tests/test_slides.py spy for the retry path.
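A sketch of the SIGKILL-only retry, with the chunk runner injected so the logic can be exercised without ffmpeg. The names `was_sigkilled` and `composite_chunks` are hypothetical; only `h264_nvenc`, `libx264`, and the exit codes -9 / 137 come from the description above.

```python
from typing import Callable, List, Sequence


def was_sigkilled(returncode: int) -> bool:
    """True for SIGKILL as reported by subprocess (-9) or by a shell (128 + 9)."""
    return returncode in (-9, 137)


def composite_chunks(
    chunks: Sequence[int],
    run_chunk: Callable[[int, str], int],
    gpu_encoder: str = "h264_nvenc",
    cpu_encoder: str = "libx264",
) -> List[str]:
    """Run each chunk; after one OOM kill, retry on CPU and stay on CPU."""
    encoder = gpu_encoder
    used = []
    for chunk in chunks:
        code = run_chunk(chunk, encoder)
        if was_sigkilled(code) and encoder != cpu_encoder:
            encoder = cpu_encoder  # sticky for the rest of this video
            code = run_chunk(chunk, encoder)
        if code != 0:
            # Non-OOM failures (and repeated kills on CPU) still propagate.
            raise RuntimeError(f"chunk {chunk} failed with exit code {code}")
        used.append(encoder)
    return used
```

Keeping the fallback sticky avoids paying the cost of another kill on every later chunk of the same video.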

cyberb added 23 commits March 24, 2026 10:40

Download video via HLS when storage mp4 is audio-only

cdcb119

Add NVENC GPU encoding support with automatic detection

f95646e

Use CUDA GPU filters for overlay compositing when available

c3fa7ed

fix: skip audio in grid composite - audio handled by separate merge p…

d24229e

Lower silent audio threshold from -80 to -88 dB

24b1bf4

Cap target resolution at 720p to prevent OOM kills on long lectures

e614b83

Tisar2 mentioned this pull request Apr 5, 2026

Incomplete video display #13

Closed

cyberb added 6 commits April 5, 2026 15:53

Use libx264 for slide compositing to avoid OOM on 8GB RAM

dd04fb4

Lower resolution cap to 480p and use ultrafast preset to fix OOM

bd4647b

Fix CUDA overlay detection skipped when encoder_fast called first

8bbfe96

Parallelize speaker detection audio analysis with 4 threads

75b4457

Cap concat output at total_duration to prevent inflated video length

e720279

yokidjo mentioned this pull request Apr 10, 2026

Speed-up #8

Open

cyberb added 2 commits April 10, 2026 07:52

Add __pycache__ to gitignore

2772f7d

yokidjo approved these changes Apr 10, 2026

yokidjo suggested changes Apr 10, 2026

cyberb added 2 commits April 10, 2026 22:37

Extract audio from non-primary grid webcams, add audio pipeline tests

644e782

yokidjo reviewed Apr 11, 2026

yokidjo approved these changes Apr 11, 2026

cyberb added 6 commits April 12, 2026 17:51

Fix NameError: remove stale overlap_strategy reference in log line

b8cc77f

yokidjo approved these changes Apr 19, 2026

cyberb added 5 commits May 18, 2026 19:55

Stop tracking tests/__pycache__; ignore *.pyc and *.egg-info/

1e5e796

cyberb wants to merge 65 commits into motattack:master from cyberb:feature/ffmpeg-performance
