What is vpick

vpick is a visual workflow canvas where creators and AI Agents collaborate to make videos.

The Pain of Making Videos

The most time-consuming part of making a great video isn't the creativity itself. It's the execution:

Step | What You Do | Time Spent
Storyboarding | Plan each shot's composition, framing, description | A lot
Visual Generation | Generate key frames for each scene, select and adjust styles | A lot
Animation | Set start/end frames, duration, transitions, wait for generation | A lot
Creative Decisions | Decide style direction, pick favorites, give feedback | A little

Roughly 80% of your time goes to repetitive execution, and only about 20% to the creative decisions that truly matter.

How vpick Solves This

Hand that 80% to an AI Agent, and focus only on the critical 20%.

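What You Handle (20%)

Creative decisions: deciding the style direction, picking favorites, and giving feedback.

What the Agent Handles (80%)

Execution: storyboarding, visual generation, and animation.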

Collaboration Flow

You: "Make a coffee brand ad, 6 scenes, warm tones"
        |
Agent: Plans 6 scene descriptions
Agent: Batch generates 6 key frames
Agent: Creates 6 short videos using start/end frames
        |
You: Browse the results
You: "Change scene 2 to a top-down angle, make scene 4 warmer"
        |
Agent: Adjusts immediately, regenerates those two shots
        |
You: Satisfied, download all videos

The canvas is your shared workspace. You can see every step the Agent takes, and you can stop, modify, or take over at any time.
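The flow above can be sketched as a small loop: plan the scenes once, then regenerate only the scenes named in each round of feedback. This is a minimal illustration, not vpick's API. `Scene`, `plan_scenes`, and `apply_feedback` are hypothetical names, and the generation steps are replaced with placeholder strings.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One shot on the canvas (hypothetical structure, not vpick's data model)."""
    index: int
    description: str
    revision: int = 0  # incremented each time the scene is regenerated


def plan_scenes(brief: str, count: int) -> list[Scene]:
    # Stand-in for the Agent's planning step: one placeholder description per scene.
    return [Scene(index=i, description=f"{brief} (scene {i})") for i in range(1, count + 1)]


def apply_feedback(scenes: list[Scene], feedback: dict[int, str]) -> list[int]:
    """Update only the scenes named in the feedback and return their indices."""
    regenerated = []
    for scene in scenes:
        note = feedback.get(scene.index)
        if note is not None:
            scene.description = f"{scene.description}; {note}"
            scene.revision += 1
            regenerated.append(scene.index)
    return regenerated


scenes = plan_scenes("coffee brand ad, warm tones", 6)
changed = apply_feedback(scenes, {2: "top-down angle", 4: "warmer"})
print(changed)  # [2, 4]
```

The point of the sketch is that feedback is scoped per scene: the four scenes you did not comment on keep their original revision, so only two shots need to be regenerated.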

Supported Generation Types

Type | Models | Description
Image | Nano Banana 2, Grok Imagine, Seedream | Scene key frames, product shots, generated in seconds
Video | Veo 3.1, Kling 3.0, Grok Video, Runway | 3-15 second clips with start/end frame control and sound
Voice | ElevenLabs V3 | Multi-voice, multi-language TTS
Music | Suno V4.5 | AI music generation with vocal and instrumental modes
Lipsync | Kling Avatar | Static portrait + voice = talking video
Vocal Separation | Demucs | Separate audio into vocals and accompaniment
Voice Changer | ElevenLabs STS | Voice style transformation
Text | Gemini | Storyboard scripts, scene descriptions, copy

Start/End Frame Control

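This is the most practical feature for making videos. You connect a start frame and an end frame to a video generator, and it animates the transition between them: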

[Start Frame] -> [Video Generator] <- [End Frame]
                       |
              3-10 second animated video
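
As a rough sketch, the inputs to the video generator in the diagram above can be modeled as a small validated record. This is an assumption for illustration, not vpick's schema: `ClipRequest` and its field names are hypothetical, both frames are assumed to be required, and the 3-10 second range is taken from the diagram.

```python
from dataclasses import dataclass

MIN_SECONDS = 3   # lower bound shown in the diagram
MAX_SECONDS = 10  # upper bound shown in the diagram


@dataclass(frozen=True)
class ClipRequest:
    """Inputs to one video generator node (hypothetical, not vpick's schema)."""
    start_frame: str          # path or ID of the start key frame
    end_frame: str            # path or ID of the end key frame
    duration_seconds: float   # requested clip length

    def __post_init__(self) -> None:
        # Assumption: both frames must be supplied before generation can run.
        if not self.start_frame or not self.end_frame:
            raise ValueError("both a start frame and an end frame are required")
        if not MIN_SECONDS <= self.duration_seconds <= MAX_SECONDS:
            raise ValueError(
                f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
            )


request = ClipRequest("scene2_start.png", "scene2_end.png", duration_seconds=5)
print(request.duration_seconds)  # 5
```

Validating the request before generation starts means a missing frame or an out-of-range duration fails immediately instead of after a long wait for the model.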

Just Talk to Make It Happen

You don't need to learn any editing operations. Just tell the Agent what you want, such as "Change scene 2 to a top-down angle, make scene 4 warmer."

From idea to finished video, all through natural language conversation.