Both pipelines run 100% on the 3090 workstation (192.168.1.143). Your Mac does zero heavy lifting — it only sends the command and collects the finished .mp4.
What it does: Extracts the raw mathematical skeleton from the video, builds a real 3D rig with a weighted mesh, and renders a studio-lit silver mannequin video using Blender's EEVEE engine — all headlessly on the 3090.
Output: A production-quality .mp4 with controllable camera, lighting, and materials. Also produces a .blend file if you need to tweak anything.
When to use it: When you need the cleanest, most professional motion reference for Kling 3.0, or when you want to change camera angles, lighting, or export the skeleton to Unreal Engine.
Trade-off: Takes a bit longer due to full 3D rendering (~8-10 minutes total).
Pipeline steps:
- Extract frames from video → sync to 3090
- SAM 3D Body inference on 3090 (DINOv3 + MHR pose extraction)
- Blender MCP Server builds the scene headlessly (armature, mesh, weights, 300 keyframes)
- EEVEE renders PNG frames → ffmpeg compiles to H.264 MP4
- Final `.mp4` transferred back to Mac
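The first step above (frame extraction and sync) can be sketched as a small Python helper that only *builds* the ffmpeg and rsync command lines. The host IP matches the workstation above, but the paths, fps, and flags are illustrative assumptions, not taken from `run_option_a_mocap.sh`:

```python
import shlex

HOST = "192.168.1.143"  # the 3090 workstation

def frame_extract_cmd(video: str, out_dir: str, fps: int = 30) -> list[str]:
    """ffmpeg command that dumps the clip to numbered PNG frames."""
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", f"{out_dir}/frame_%05d.png"]

def sync_cmd(out_dir: str, remote_dir: str) -> list[str]:
    """rsync command that pushes the extracted frames to the 3090."""
    return ["rsync", "-az", f"{out_dir}/", f"{HOST}:{remote_dir}/"]

# Commands are printed here rather than executed:
print(shlex.join(frame_extract_cmd("input.mp4", "frames")))
print(shlex.join(sync_cmd("frames", "/data/mocap/frames")))
```

Keeping the commands as argument lists (rather than shell strings) avoids quoting bugs when filenames contain spaces.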
Key scripts:
- `run_option_a_mocap.sh` — orchestrator
- `remote_wrapper.py` — runs on 3090, handles SAM3D inference + MHR extraction
- `build_mhr_scene.py` — runs inside 3090's Blender, builds the rigged scene
- `remote_mcp_render_client.py` — sends render commands to 3090's Blender MCP server
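The render-client step can be sketched as a function that serialises a render request for the Blender MCP server. The field names and message shape here are assumptions for illustration; the real schema lives in `remote_mcp_render_client.py`:

```python
import json

def render_command(blend_file: str, start: int = 1, end: int = 300,
                   engine: str = "BLENDER_EEVEE") -> str:
    """Serialise a hypothetical render request for the Blender MCP server.

    Illustrative schema only -- field names are not taken from the real client.
    """
    msg = {
        "action": "render_animation",
        "blend_file": blend_file,
        "frame_start": start,
        "frame_end": end,   # 300 keyframes, per the pipeline above
        "engine": engine,
    }
    return json.dumps(msg)

print(render_command("scene.blend"))
```

A JSON command channel like this is what lets the Mac stay thin: it ships a few hundred bytes of instructions and the 3090 does all the rendering.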
What it does: Runs the video through ComfyUI on the 3090, using AI to paint the grey mannequin directly over the original pixels frame-by-frame. Skips all 3D math entirely.
Output: A single isolated mesh video (grey mannequin on black background).
When to use it: When you need a quick motion reference video to throw into Kling 3.0 immediately and don't need camera control.
Trade-off: It's "flat" 2D — you can't rotate the camera or export the skeleton. But it's blazing fast.
Pipeline steps:
- Upload video to 3090
- ComfyUI API triggers SAM 3D Body node with `render_mode=mesh_only`
- Isolated mesh video rendered directly
- Final `.mp4` transferred back to Mac
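The API-trigger step can be sketched as building the payload that ComfyUI's `/prompt` queue endpoint expects, with the render mode injected into the SAM 3D Body node's inputs. The node id, `class_type` name, and input key are illustrative assumptions; real graphs are exported from the ComfyUI editor:

```python
import json

COMFY_URL = "http://192.168.1.143:8188/prompt"  # ComfyUI's queue endpoint

def build_payload(workflow: dict, node_id: str, render_mode: str) -> dict:
    """Deep-copy the workflow, set the node's render_mode, and wrap it
    the way ComfyUI's /prompt endpoint expects ({"prompt": graph})."""
    wf = json.loads(json.dumps(workflow))  # deep copy via round-trip
    wf[node_id]["inputs"]["render_mode"] = render_mode
    return {"prompt": wf}

# Tiny stand-in graph with a single (hypothetical) node:
demo = {"7": {"class_type": "SAM3DBody", "inputs": {"render_mode": "overlay"}}}
payload = build_payload(demo, "7", "mesh_only")
print(payload["prompt"]["7"]["inputs"]["render_mode"])  # mesh_only
```

POSTing `payload` as JSON to `COMFY_URL` queues the job; the original `demo` dict is untouched because the copy is made before mutation.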
Key scripts:
- `run_kling_mocap.sh` — orchestrator
- `sam3d_comfy_api.py` — sends workflow to ComfyUI API with configurable render mode
Render modes available:
- `mesh_only` — isolated grey mannequin (default, recommended)
- `side_by_side` — 3-way split (original | mask | overlay) for debugging
- `mask_only` — just the silhouette mask
- `overlay` — mannequin overlaid on original video
| | Option A (Blender) | Option B (ComfyUI) |
|---|---|---|
| Speed | ~8-10 min | ~3-5 min |
| Output quality | Studio-lit 3D render | AI pixel paint |
| Camera control | ✅ Full 3D | ❌ Fixed |
| Exportable rig | ✅ .blend / Unreal | ❌ No |
| Compute | 100% 3090 | 100% 3090 |
| Best for | Final production ref | Quick iteration |