Text → PNG via OpenAI Images (gpt-image-1/dall-e-3) or OpenRouter multimodal chat (Gemini Flash Image, Flux, ...).
Updated Apr 23, 2026
Narration script → N-shot storyboard: LLM plans shots, image model renders. Feeds video-slideshow.
Prepend branded intros and append outros to a video with smooth crossfades. Auto-normalises mismatched formats.
Overlay timed text bands (lower-thirds, chapter markers, call-outs) on a video. One ffmpeg pass.
Concatenate N video clips into one MP4, normalising resolution/fps, with optional crossfade.
LLM-readable video brief — metadata + silence gaps + scene changes + thumbnails. Pure ffmpeg, no cloud.
Picture-in-picture composition: overlay an inset video on a main video with configurable position, scale, border.
Images + optional audio → MP4 slideshow with Ken Burns zoom and xfade transitions. Pure ffmpeg.
Add YouTube / QuickTime chapter markers to an MP4 via ffmetadata — no re-encode, just metadata.
Burn SRT/VTT subtitles into the video stream with full styling control. ffmpeg-only.
Remove silences from a video with inaudible crossfades. ffmpeg-only, no cloud.
Mix a music bed under a video with automatic sidechain ducking. ffmpeg-only, no cloud.
Speed up or slow down a video (0.25x-4x) with pitch-preserving atempo chain for natural voice.
Text → narrated WAV + SRT. edge-tts / say / piper. Pure local, cross-platform.
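
The speed-change entry above mentions a pitch-preserving atempo chain covering 0.25x-4x. As a rough illustration of how such a chain can be built (not the skill's actual code), the sketch below splits a speed factor into several `atempo` stages. It assumes each stage is kept within 0.5 to 2.0, the conservative range that older ffmpeg builds require; the function name `atempo_chain` is hypothetical.

```python
def atempo_chain(speed: float) -> str:
    """Return an ffmpeg audio filter string for the given speed factor.

    Assumption: every atempo stage stays within 0.5..2.0, so factors
    outside that range are split into several chained stages.
    """
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    factors = []
    remaining = speed
    # Peel off 2.0x stages while the remaining factor is too large.
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    # Peel off 0.5x stages while the remaining factor is too small.
    while remaining < 0.5:
        factors.append(0.5)
        remaining /= 0.5
    # Whatever is left now fits in a single stage.
    factors.append(remaining)

    return ",".join(f"atempo={f:g}" for f in factors)


if __name__ == "__main__":
    for s in (0.25, 1.5, 3.0, 4.0):
        print(s, "->", atempo_chain(s))
```

The returned string could be passed to ffmpeg as the value of `-filter:a`. The video stream would need to be retimed separately, for example with the `setpts` filter.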