Pipeline
The TKK pipeline turns a topic into a finished 28-45 second vertical video. Every step is automated and driven by a single Claude Code session using MCP tools.
Step by Step
Write a Screenplay
A Python file defines 6 scenes following the mystery arc: hook, wrong answer, contradiction, proof, betrayal, punch.
Each scene is a Scene class with a construct() method. The file includes VTT timing cues mapping narration segments to scenes, a TTS_SCRIPT with the full narration text, and DURATION constants controlling time allocation. Target narration pace: 150 words per minute.
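To make the 150 words-per-minute target concrete, here is a minimal sketch of how a narration word count maps to scene time. The scene keys, the dict layout, and the helper names are hypothetical; the real screenplay's TTS_SCRIPT and DURATION layout may differ.

```python
# Sketch only: names and data layout are assumptions, not the project's API.
WORDS_PER_MINUTE = 150  # target narration pace stated above

# Hypothetical scene keys, one per step of the mystery arc.
SCENES = ["hook", "wrong_answer", "contradiction", "proof", "betrayal", "punch"]


def narration_seconds(text, wpm=WORDS_PER_MINUTE):
    """Estimate how long a narration segment takes to speak at the given pace."""
    return len(text.split()) * 60.0 / wpm


def word_budget(seconds, wpm=WORDS_PER_MINUTE):
    """Largest whole number of words that fits in the given duration."""
    return int(seconds * wpm / 60)


def allocate_durations(tts_by_scene):
    """Derive a per-scene time allocation (seconds) from per-scene narration text."""
    return {name: narration_seconds(text) for name, text in tts_by_scene.items()}
```

At this pace, a 28-45 second video leaves room for roughly 70 to 112 words of narration in total (word_budget(28) and word_budget(45)).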
Generate TTS
Narration text is sent to Fish Audio, which returns a natural voice recording using the ELITE voice model.
generate_tts.py parses the screenplay's TTS_SCRIPT and produces per-scene tts_*.mp3 files. edge-tts is available as a fallback.
Render Preview Frames
Quick static PNG snapshots of each scene for visual review before committing to a full render.
Run python3 {topic}_manim.py --preview to generate 6 PNGs in previews/ (~10 seconds). This lets you check layout, text placement, zone coverage, and visual balance without waiting for a full video render.
Quality Assurance
Three automated QA checks verify the video will look and sound right before final render.
Layout QA (qa_layout.py): checks that content fills the vertical frame and verifies zone coverage and balance.
Readability QA (qa_readability.py): contrast ratios, margin clearance, text size minimums.
Sync QA (qa_sync.py): AV drift (>2s = FAIL), dead time (animations finishing >3s early), overflow (animations exceeding allocated time), number sync (visual numbers in correct scene), scene budget (minimum 1.5s per scene).
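The Sync QA thresholds can be sketched as a per-scene check. This is an illustration under stated assumptions, not the contents of qa_sync.py: the SceneTiming record, the check names, and the definition of AV drift as the gap between animation length and narration length are all hypothetical. The number sync check is left out because it depends on scene content rather than timing.

```python
from dataclasses import dataclass

# Thresholds taken from the Sync QA description above.
AV_DRIFT_FAIL_S = 2.0   # drift above this fails
DEAD_TIME_FAIL_S = 3.0  # animations finishing this much early fail
MIN_SCENE_S = 1.5       # minimum budget per scene


@dataclass
class SceneTiming:
    """Hypothetical timing record for one scene (all values in seconds)."""
    name: str
    allocated_s: float  # time allocated to the scene
    animation_s: float  # total length of the scene's animations
    audio_s: float      # length of the scene's narration clip


def check_scene(t):
    """Return the names of the timing checks this scene fails."""
    failures = []
    if t.allocated_s < MIN_SCENE_S:
        failures.append("scene_budget")
    # Assumption: drift is the gap between animation and narration length.
    if abs(t.animation_s - t.audio_s) > AV_DRIFT_FAIL_S:
        failures.append("av_drift")
    if t.allocated_s - t.animation_s > DEAD_TIME_FAIL_S:
        failures.append("dead_time")
    if t.animation_s > t.allocated_s:
        failures.append("overflow")
    return failures
```

A scene passes when check_scene returns an empty list; a full run would apply it to all 6 scenes and fail the video if any list is non-empty.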
Full Render
Manim renders each scene as a video clip. Clips are concatenated and merged with voice audio to produce the final MP4.
The result is saved as {topic}_final.mp4 in the vidgen directory and appears in the clips library.
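The concatenate-and-merge step could look like the sketch below. The text above does not say which tool joins the clips or merges the audio; ffmpeg is an assumption here, and the helper names and file names are hypothetical. The functions only build command lists, so nothing runs until you pass them to subprocess.run.

```python
# Sketch only: assumes ffmpeg is the tool used, which the source does not confirm.


def concat_list(clips):
    """Build the text of an ffmpeg concat-demuxer list: one file line per clip."""
    return "".join(f"file '{clip}'\n" for clip in clips)


def concat_command(list_file, joined):
    """Join the clips named in list_file without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", joined]


def mux_command(video, audio, final):
    """Merge a narration track into the joined video, stopping at the shorter input."""
    return ["ffmpeg", "-y", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0",  # video from input 0, audio from input 1
            "-c:v", "copy", "-c:a", "aac", "-shortest", final]
```

To run it, write concat_list(...) to a text file, then call subprocess.run(cmd, check=True) on each command in order.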
MCP Server
The entire pipeline is exposed as MCP (Model Context Protocol) tools via mcp_server.py. This lets Claude Code drive every step programmatically — reading/writing screenplays, generating TTS, rendering previews, running QA, and producing final videos — all from a single conversation.
Session Model
TKK uses a single Claude Code session to handle the full pipeline. No multi-agent orchestration, no autonomous workers, no message passing between processes. One session writes the screenplay, generates TTS, renders, runs QA, and fixes issues — all in a deterministic, linear flow.
This replaced an earlier multi-agent system (OpenClaw) that used 3 specialized agents. The single-session approach is simpler, more reliable, and easier to debug.