creators · concept · sha:76331c8943f91388 · manual
shorts-from-long-video
Use when slicing a long video into 3–5 highlight short clips with hook detection, safe-area subtitles, and 9:16 reframing — transcript-driven cuts via Whisper plus FFmpeg encoding.
Tutorials · creator-attached
One-line install
curl --create-dirs -fsSL https://skillmake.xyz/i/shorts-from-long-video -o ~/.claude/skills/shorts-from-long-video/SKILL.md
The hash above pins this exact content. The file we serve at /api/marketplace/shorts-from-long-video-76331c89/raw always matches sha:76331c8943f91388.
3,270 chars · ~818 tokens
---
name: shorts-from-long-video
description: "Use when slicing a long video into 3–5 highlight short clips with hook detection, safe-area subtitles, and 9:16 reframing; transcript-driven cuts via Whisper plus FFmpeg encoding."
source: https://ffmpeg.org/documentation.html
generated: 2026-05-07T21:42:56.064Z
category: concept
audience: creators
---

## Tutorials

- https://skillmake.xyz/v/shorts-from-long-video.mp4

## When to use

- Multiplying a single long-form video into Shorts/Reels/TikToks
- Finding the best 30–90s moments without watching the whole video
- Producing 9:16 reframes from a 16:9 master with safe-area captions
- Batch-generating clip variants for A/B testing across platforms

## Key concepts

### hook detection

Run an LLM over the timestamped transcript to score each 30–90s window for hook-worthiness: a clear payoff, an opinion, a number, a story, or a surprise. Take the top N segments, typically 3–5 per hour of source. Avoid windows that depend on prior context the viewer won't have.

### 9:16 reframe

Turning a 1920×1080 master into a 1080×1920 short. There are three options: center-crop (works for talking heads), follow the speaker (face-detected via OpenCV/MediaPipe), or letterbox on top with the caption below (a safe default when there's no clear subject).

### safe-area captions

Mobile UI overlays the top ~250 px and bottom ~300 px of a 9:16 frame, so captions go in the middle band between them, roughly 1300 px tall. Either burn them in with FFmpeg's drawtext, or export an SRT file and rely on the platform's own captions; burned-in is safer for reach.

## API reference

```
ffmpeg crop + scale to 9:16
```

Center-crop a 1920×1080 source to 1080×1920 by scaling it to 1920 px tall first, then cropping the wide axis. The scale step is required: cropping a 1920-px-tall region directly out of a 1080-px-tall frame fails.

```
ffmpeg -i in.mp4 -vf "scale=-2:1920,crop=1080:1920:(in_w-1080)/2:0" -ss 00:01:23 -t 60 -c:v libx264 -crf 18 -c:a aac out.mp4
```

```
ffmpeg drawtext for burned-in captions
```

Burn each caption line at a specific timestamp range, in the safe middle band of a 9:16 frame.
```
ffmpeg -i in.mp4 -vf "drawtext=fontfile=/System/Library/Fonts/Supplemental/Arial.ttf:text='caption':fontcolor=white:fontsize=56:box=1:boxcolor=black@0.5:boxborderw=20:x=(w-text_w)/2:y=h*0.55:enable='between(t,1.5,4.0)'" out.mp4
```

```
hook-detection prompt template
```

Single-shot LLM prompt that takes the timestamped transcript and returns a JSON list of {startSec, endSec, score, hook, reason} candidates ranked by score.

```
Score every 30–90s window of this transcript on these axes: clear payoff (0–3), opinion strength (0–3), surprise (0–3), self-contained (0–3). Return JSON [{startSec, endSec, score, hook, reason}]. Top 5 only.
```

## Gotchas

- Mid-sentence cuts kill watch time; extend the window to the next sentence boundary even if it pushes past 90s.
- Center-crop fails for two-person interviews; fall back to face tracking or letterbox-with-caption.
- Burned-in captions must point at a font file that exists on the renderer (`fontfile=`), not just a font name; a bare name is resolved through fontconfig, which can silently substitute a fallback font.
- TikTok/Shorts uploads via API may reject or cut files longer than 60s; check the current platform maximum and respect it even if your hook is 75s.
- Audio normalisation matters for Reels (target -14 LUFS); FFmpeg's loudnorm filter handles it in two passes.

---

Generated by SkillMake from https://ffmpeg.org/documentation.html on 2026-05-07T21:42:56.064Z. Verify against source before relying on details.
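The two-pass loudnorm step mentioned in the gotchas might look like the sketch below. Only the -14 LUFS target comes from the text; the true-peak and loudness-range values (`TP=-1.5`, `LRA=11`), the 48 kHz output rate, and the helper names `json_field`, `pass2_filter`, and `normalise` are assumptions chosen for illustration. The first pass only measures the audio, and the second pass feeds those measurements back to the filter.

```shell
#!/usr/bin/env bash
# Sketch of two-pass loudnorm. TP and LRA are assumed values, not from the source text.
TARGET="I=-14:TP=-1.5:LRA=11"

# Extract one quoted field from loudnorm's JSON report.
# usage: json_field "$report" input_i
json_field() {
  printf '%s\n' "$1" \
    | sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" \
    | head -n 1
}

# Build the second-pass filter string from the first-pass report.
# usage: pass2_filter "$report"
pass2_filter() {
  local r="$1"
  printf 'loudnorm=%s:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
    "$TARGET" \
    "$(json_field "$r" input_i)" \
    "$(json_field "$r" input_tp)" \
    "$(json_field "$r" input_lra)" \
    "$(json_field "$r" input_thresh)" \
    "$(json_field "$r" target_offset)"
}

# Run both passes. usage: normalise in.mp4 out.mp4
normalise() {
  local report
  # Pass 1: analyse audio only; loudnorm prints its JSON report on stderr.
  report="$(ffmpeg -hide_banner -nostats -i "$1" -vn \
    -af "loudnorm=$TARGET:print_format=json" -f null - 2>&1)"
  # Pass 2: apply the measured values; video is copied untouched.
  # -ar 48000 is an assumption, added because loudnorm resamples internally.
  ffmpeg -hide_banner -i "$1" -af "$(pass2_filter "$report")" \
    -c:v copy -c:a aac -ar 48000 "$2"
}
```

With the functions loaded, `normalise in.mp4 out.mp4` runs both passes. Keeping the filter-string construction in its own function means it can be checked against a saved report without invoking FFmpeg.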
File: ~/.claude/skills/shorts-from-long-video/SKILL.md