
Add libplacebo GPU module with render and shader filters #1201

Open
D-Ogi wants to merge 6 commits into mltframework:master from D-Ogi:feature/placebo-module

Conversation

@D-Ogi

@D-Ogi D-Ogi commented Feb 1, 2026

Summary

New module placebo providing GPU-accelerated video processing via libplacebo:

  • placebo.render — GPU scaling (ewa_lanczos, lanczos, mitchell, etc.), debanding, dithering (blue noise, ordered LUT), and tonemapping (auto, clip, mobius, reinhard, hable, bt.2390, spline) with quality presets (fast/default/high_quality)
  • placebo.shader — Custom mpv-compatible .hook shader support with hot-reload on file change

Architecture

  • Singleton GPU context (gpu_context.c) with thread-safe initialization and render locking
  • Backend priority: D3D11 (Windows) → Vulkan → OpenGL
  • Vulkan loader dynamically loaded on Windows when libplacebo is built without vk-proc-addr support
  • Shader cache persisted to disk for faster subsequent startups
  • Graceful passthrough when no GPU is available
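
As a rough illustration of the singleton pattern described above, here is a minimal sketch using `pthread_once`. It does not reproduce gpu_context.c: the `gpu_state` struct, `probe_backends()`, and every function name below are hypothetical stand-ins, and the stub probe always fails so that the passthrough path is the one exercised.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared GPU state. A real module would keep
 * its libplacebo objects (pl_gpu, pl_renderer, pl_dispatch) here. */
typedef struct {
    bool available;  /* false means filters should pass frames through */
    int init_count;  /* how many times the backend probe actually ran */
} gpu_state;

static gpu_state g_state;
static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t g_render_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical probe: a real one would try D3D11, then Vulkan, then OpenGL.
 * This stub reports "no GPU" to demonstrate the fallback. */
static bool probe_backends(void)
{
    return false;
}

static void init_once(void)
{
    g_state.init_count++;
    g_state.available = probe_backends();
}

/* Returns the shared state, or NULL when no backend could be initialized.
 * pthread_once guarantees init_once runs exactly once, even under races. */
gpu_state *gpu_acquire(void)
{
    pthread_once(&g_once, init_once);
    return g_state.available ? &g_state : NULL;
}

/* Serializes access to the shared renderer across filter instances. */
void gpu_render_lock(void)
{
    pthread_mutex_lock(&g_render_lock);
}

void gpu_render_unlock(void)
{
    pthread_mutex_unlock(&g_render_lock);
}

/* Test helper: reports how often the probe ran. */
int gpu_init_count(void)
{
    pthread_once(&g_once, init_once);
    return g_state.init_count;
}
```

A filter built on this sketch would call `gpu_acquire()` at the top of its get-image callback and return the input frame unchanged when it gets NULL.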

Build

Controlled by MOD_PLACEBO CMake option (default ON). Requires libplacebo via pkg-config. Optionally links D3D11/DXGI when PL_HAVE_D3D11 is detected at configure time. MSVC builds link PThreads4W.

Files (10 total)

| File | Description |
| --- | --- |
| CMakeLists.txt | +1 line: MOD_PLACEBO option |
| src/modules/CMakeLists.txt | +4 lines: placebo subdirectory |
| src/modules/placebo/CMakeLists.txt | Module build config |
| src/modules/placebo/factory.c | Module registration |
| src/modules/placebo/gpu_context.h | GPU lifecycle API |
| src/modules/placebo/gpu_context.c | Singleton GPU init (D3D11/Vulkan/OpenGL) |
| src/modules/placebo/filter_placebo_render.c | Render filter implementation |
| src/modules/placebo/filter_placebo_render.yml | Render filter metadata |
| src/modules/placebo/filter_placebo_shader.c | Shader filter implementation |
| src/modules/placebo/filter_placebo_shader.yml | Shader filter metadata |

Testing

Tested on Windows with D3D11 backend via Kdenlive. Verified:

  • GPU initialization and fallback chain
  • Render filter with default/fast/high_quality presets
  • Shader filter with Anime4K and FSRCNNX .hook files
  • Shader hot-reload on file modification
  • Graceful passthrough when GPU is unavailable
  • Shader cache persistence across sessions
  • Thread safety under concurrent filter instances

@ddennedy
Member

ddennedy commented Feb 1, 2026

How does this compare with using libplacebo through the existing avfilter?

@D-Ogi
Author

D-Ogi commented Feb 1, 2026

The main difference is the GPU context lifecycle. The avfilter wrapper creates a new AVFilterGraph per filter instance, and vf_libplacebo initializes its own Vulkan device inside that graph, so with multiple filters on a timeline you get multiple GPU contexts. The native module uses a process-wide singleton in gpu_context.c, so one pl_gpu, one pl_renderer, and one pl_dispatch are shared across all instances.

The frame path is also shorter. The avfilter wrapper does two memcpy round-trips between MLT buffers and AVFrames (line-by-line with linesize conversion), on top of whatever vf_libplacebo does internally for GPU transfer. The native module calls pl_tex_upload/pl_tex_download directly on the MLT image pointer, no intermediate AVFrame.

The most interesting part is the shader filter, which has no avfilter equivalent. It loads mpv .hook files at runtime and checks the file's mtime on every frame, so when the file changes on disk it re-parses only the pl_hook object while keeping the GPU context alive. This is useful for iterative shader development in, for instance, an NLE, where you want to edit a .hook file in a text editor and see the result on the timeline without restarting anything. The avfilter path would need a full graph rebuild to pick up a changed shader_path.

The trade-off is a direct build dependency on libplacebo versus getting it through FFmpeg. The module could be optional, so it wouldn't affect builds where libplacebo isn't available. Regarding the tests, I don't currently have solid Linux/macOS environments to test all the dependencies, but judging from the failed tests the root cause seems easy to fix.

@D-Ogi D-Ogi force-pushed the feature/placebo-module branch 2 times, most recently from 518b737 to 67d2fa2 Compare February 1, 2026 21:33
New module 'placebo' providing GPU-accelerated video processing via
libplacebo. Includes two filters:

- placebo.render: GPU scaling, debanding, dithering, and tonemapping
  with quality presets (fast/default/high_quality)
- placebo.shader: Custom mpv-compatible .hook shader support

Backend priority: D3D11 (Windows) -> Vulkan -> OpenGL.
Vulkan loader is dynamically loaded on Windows when libplacebo is
built without vk-proc-addr support.

Features:
- Singleton GPU context with thread-safe access
- Shader cache persistence
- Multiple scaling algorithms (ewa_lanczos, lanczos, mitchell, etc.)
- Tone mapping (auto, clip, mobius, reinhard, hable, bt.2390, spline)
- Graceful fallback to passthrough when no GPU is available

The module is enabled by default but skipped automatically when
libplacebo is not installed.
@D-Ogi D-Ogi force-pushed the feature/placebo-module branch from 67d2fa2 to ebe1cae Compare February 1, 2026 21:48
@D-Ogi
Author

D-Ogi commented Feb 2, 2026

Fixed the MinGW build: %zu is not supported by MSVCRT's printf which is what MinGW uses under the hood. Replaced with %llu + explicit cast.

@ddennedy
Member

ddennedy commented Feb 2, 2026

You need to get at least some build workflows to actually build this (not all). For example,

  • .github/workflows/build-distros.yml: Add libplacebo-dev to Ubuntu and Debian, and add libplacebo-devel to Fedora 42 (I do not think one is available for Fedora 38).
  • .github/workflows/build-linux.yml: Add the libplacebo-dev package for the build-cmake job.
  • .github/workflows/build-msys2-mingw64.yml: Add the mingw-w64-x86_64-libplacebo package.

Replaced with %llu + explicit cast.

Our codebase generally prefers the macros %" PRIu64 " and %" PRId64 " from <inttypes.h>.
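
For reference, a small sketch of the `<inttypes.h>` style for logging a `size_t`. The helper name and message text are made up for illustration; in the module the format string would presumably go to an mlt_log call rather than `snprintf`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a byte count portably. The value is cast to
 * uint64_t so it matches PRIu64 regardless of how wide size_t is, which
 * avoids relying on %zu support in the C runtime. */
int format_cache_size(char *buf, size_t buflen, size_t cache_bytes)
{
    return snprintf(buf, buflen, "shader cache: %" PRIu64 " bytes",
                    (uint64_t) cache_bytes);
}
```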

@D-Ogi D-Ogi force-pushed the feature/placebo-module branch 3 times, most recently from c5189b2 to 8129348 Compare February 2, 2026 23:12
Use PRIu64/PRId64 from <inttypes.h> instead of %zu/%ld for size
logging in the placebo module. Add libplacebo-dev packages to
Ubuntu, Debian, and Fedora 42 CI workflows, and
mingw-w64-x86_64-libplacebo to the MSYS2 MinGW64 workflow.
@D-Ogi D-Ogi force-pushed the feature/placebo-module branch from 8129348 to 35e85eb Compare February 2, 2026 23:25
@D-Ogi
Author

D-Ogi commented Feb 2, 2026

Done. Added libplacebo packages to the three workflows, switched to PRIu64/PRId64 from <inttypes.h>, and added a minimum version requirement (libplacebo>=5.229) in the module's CMakeLists so older distros like Ubuntu 22.04 (ships v4.192) skip the module instead of failing.

Verified on my fork - all green: MSYS2 MinGW64, Ubuntu 24.04, 22.04, Debian stable/testing/unstable, Fedora 42, 38.

Break long mlt_log_info() call into multi-line format to match
the project's clang-format rules (same style as load_cache above).
@D-Ogi
Author

D-Ogi commented Feb 4, 2026

Hi Dan, could I ask if you have an estimated timeline for the next round of review? This PR is a key enabler for my downstream work. The shader filter support lets Kdenlive reproduce After Effects preset pipelines and opens the door for the community to write custom GPU shaders within MLT.

@ddennedy
Member

ddennedy commented Feb 4, 2026

In about 10 days as I’m on vacation

@ddennedy
Member

ddennedy commented Feb 4, 2026

Something for you to comment on or think about until then. I have not looked closely enough. What happens when multiple placebo MLT filters are used on a producer? Does it transfer the image from RAM to GPU and back to RAM for each filter?

@D-Ogi
Author

D-Ogi commented Feb 4, 2026

Currently each filter does a full RAM -> GPU -> RAM roundtrip per frame. The flow for N chained placebo filters looks like this:

RAM (producer image)
-> GPU upload -> GPU render -> GPU download -> RAM (filter 1)
-> ... -> ... -> ... -> RAM (filter 2)
-> ... -> ... -> ... -> RAM (filter 3)

So with 3 filters that's 6 CPU <-> GPU transfers instead of the ideal 2 (one upload at the start, one download at the end).
The singleton gpu_context shares the pl_gpu, pl_renderer, and pl_dispatch across all filter instances, so there's no redundant GPU initialization. However, each filter still creates temporary textures, uploads the RAM buffer, processes it, downloads the result back to RAM, and destroys the textures.

The reason is that MLT's mlt_frame_get_image() contract is fundamentally CPU-buffer-based, and as far as I can tell there is no way for a filter to pass a GPU texture handle to the next filter in the chain. To eliminate the intermediate transfers, the frame would need to carry a "GPU-resident image" flag and a texture reference that downstream filters can reuse, with only the last filter in the chain (or the consumer) performing the final download. That's a non-trivial change to MLT's image passing architecture.

I could attempt to implement this, for example by attaching a pl_tex to the frame via mlt_properties_set_data and having each placebo filter check for an existing GPU texture before uploading from RAM. The last consumer or non-placebo filter would trigger the download. But I'd rather hear your thoughts on the right approach before going down that path, since it touches assumptions about frame ownership and lifetime that you know much better than I do.

When multiple placebo filters are stacked on one clip, each filter
previously did a full RAM→GPU upload and GPU→RAM download. The
intermediate uploads are redundant because the next placebo filter
would re-upload the same pixels immediately.

Each filter now attaches its output texture to the mlt_frame via
placebo_frame_put_tex(). The next placebo filter calls
placebo_frame_take_tex() to grab it directly as source, skipping
the upload. The download to RAM still happens every time (MLT
expects the image buffer to be current for non-GPU filters).

Staleness detection: put_tex records the RAM buffer pointer,
take_tex compares it against the current pointer. If a CPU filter
ran in between and requested a writable buffer (triggering a copy
and new allocation), the pointers differ and take_tex returns NULL,
falling back to a fresh upload.

Also cleans up internal ticket-style comments (C1/W2/etc.) with
descriptions of actual logic and pitfalls.
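
The staleness check described in this commit message can be illustrated with a small, GPU-free sketch. Everything here is a hypothetical stand-in: `fake_tex` replaces pl_tex, `frame_gpu_slot` replaces whatever is attached to the mlt_frame, and `slot_put_tex()`/`slot_take_tex()` only mimic the described behavior of placebo_frame_put_tex()/placebo_frame_take_tex() without claiming their real signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a GPU texture handle. */
typedef struct {
    int id;
} fake_tex;

/* Stand-in for the data a filter would attach to the frame. */
typedef struct {
    fake_tex *tex;             /* texture left by the previous GPU filter */
    const uint8_t *ram_at_put; /* RAM image pointer recorded at put time */
} frame_gpu_slot;

/* Called by a filter after rendering: remember the output texture together
 * with the RAM buffer it was downloaded into. */
void slot_put_tex(frame_gpu_slot *s, fake_tex *tex, const uint8_t *ram)
{
    s->tex = tex;
    s->ram_at_put = ram;
}

/* Called by the next filter before uploading. Returns the texture only if
 * the RAM buffer is still the one recorded at put time. A different pointer
 * means a CPU filter obtained a fresh writable copy in between, so the
 * texture no longer matches the pixels and the caller must upload again.
 * A stale texture stays in the slot; this sketch assumes the frame's
 * cleanup would release it. */
fake_tex *slot_take_tex(frame_gpu_slot *s, const uint8_t *current_ram)
{
    if (s->tex == NULL || s->ram_at_put != current_ram)
        return NULL;
    fake_tex *tex = s->tex;
    s->tex = NULL; /* ownership moves to the caller */
    s->ram_at_put = NULL;
    return tex;
}
```

With this handoff, N stacked filters would perform one upload and N downloads instead of N uploads and N downloads, since the download still happens after every filter.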
@D-Ogi D-Ogi force-pushed the feature/placebo-module branch from 84ee7e4 to 32f8a47 Compare February 4, 2026 23:01
@D-Ogi
Author

D-Ogi commented Feb 4, 2026

[Image: benchmark_4k_results]

D-Ogi added 2 commits February 5, 2026 00:33
Add apply_shader_params() to override pl_hook DYNAMIC parameters from
MLT animated properties (shader_param.* prefix) on every frame.  Uses
mlt_properties_anim_get_double/int to correctly resolve keyframe strings
("0=200;50=100") at the current frame position.

Add base64 decoding for shader_text values prefixed with "base64:" to
support inline shaders with characters that are problematic in MLT
property strings.
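
To make the keyframe string concrete, here is a simplified stand-in that evaluates a `frame=value;frame=value` list with linear interpolation. It is not MLT's implementation: the commit relies on mlt_properties_anim_get_double/int, which handle more syntax and interpolation modes than this sketch, and `keyframes_eval()` is a hypothetical name.

```c
#include <stdio.h>

/* Evaluates a keyframe list such as "0=200;50=100" at the given frame.
 * Assumptions of this sketch: keyframes are sorted by frame, only the plain
 * "frame=value" form is used, and values are interpolated linearly. Frames
 * before the first or after the last keyframe clamp to the nearest value. */
double keyframes_eval(const char *spec, int frame)
{
    int prev_f = 0;
    int have_prev = 0;
    double prev_v = 0.0;
    const char *p = spec;

    while (*p != '\0') {
        int f = 0;
        int used = 0;
        double v = 0.0;
        /* %n records how many characters this keyframe consumed. */
        if (sscanf(p, "%d=%lf%n", &f, &v, &used) < 2)
            break;
        if (frame <= f) {
            if (!have_prev || f == prev_f)
                return v;
            return prev_v
                   + (v - prev_v) * (double) (frame - prev_f) / (double) (f - prev_f);
        }
        prev_f = f;
        prev_v = v;
        have_prev = 1;
        p += used;
        if (*p == ';')
            p++;
    }
    return prev_v;
}
```

For the example string in the commit message, this sketch yields 200 at frame 0, 150 at frame 25, and 100 from frame 50 onward.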
Run clang-format-14 (matching CI) on filter_placebo_shader.c and
gpu_context.c to fix designated initializer spacing, ternary line
breaks, and long argument lists.