Understanding Nodes

Nodes are the basic building blocks on the vpick canvas. Each node handles one specific task.

Node Types Overview

Node	What It Does
Text	Stores a piece of text
AI Assistant	Calls AI to generate copy
Image Generator	Generates images with AI
Video Generator	Generates short videos with AI
Audio Generator	Generates voice-over with AI (TTS)
Music Generator	Generates music with AI (BGM)
Lipsync	Makes a portrait talk in sync with audio
Vocal Separator	Separates audio into vocals, accompaniment, and original
Voice Changer	Transforms voice style with ElevenLabs
Audio Combine	Mixes multiple audio tracks into one
Combine	Merges multiple video clips into one
List	Stores multiple items for batch processing
Upload	Uploads your own images or files
Group	Visually groups multiple nodes

AI Assistant

Enter a prompt and AI will generate text content for you.

Common uses:

Generate product descriptions
List creative ideas
Write social media copy

You can enable "Export as List" to automatically split the AI response into multiple items, making it easy to feed into an Image Generator for batch processing.
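
As a rough illustration of what "Export as List" does, the sketch below splits a multi-line response into separate items. The helper name `split_into_items` and the splitting rule (one item per non-empty line, with leading bullets or numbers stripped) are assumptions made for this example; vpick's actual parsing may differ.

```python
import re

def split_into_items(response: str) -> list[str]:
    """Split an AI response into list items, one per non-empty line.

    Hypothetical helper: strips leading markers such as "1.", "2)", "-" or "*".
    """
    items = []
    for line in response.splitlines():
        # Remove a leading bullet or number marker, then surrounding whitespace.
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s+", "", line).strip()
        if cleaned:
            items.append(cleaned)
    return items

response = """1. A red sneaker on a beach
2. A red sneaker in the snow
- A red sneaker on a city street"""
items = split_into_items(response)
print(items)
```

Each resulting item can then act as one prompt for a downstream Image Generator.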

Image Generator

Supports multiple models (Nano Banana 2, Grok Imagine, Seedream, etc.) to generate images from text descriptions.

Aspect ratios: 1:1 (square), 16:9 (landscape), 9:16 (portrait), etc.
Supports multiple reference image inputs
Connect a List to generate multiple images at once
See the AI Models Guide for detailed model comparisons

Video Generator

Supports multiple models (Veo 3.1, Kling 3.0, Grok Video, Runway, etc.) to generate short videos from text descriptions.

Duration varies by model, from 3 to 15 seconds
Some models support sound generation (Kling, Grok, Veo)
Start/end frame support: Upload images as the starting or ending frame of the video
See the AI Models Guide for detailed model comparisons

Audio Generator (Voice Over)

Uses the ElevenLabs V3 model to convert text to speech.

9 voices available (Roger, Sarah, Brian, etc.), each with a demo preview
Supports 10 languages
Adjustable Stability: controls how expressive versus how consistent the voice sounds
Output audio can connect to the Combine node to overlay on video, or to the Lipsync node

Music Generator

Uses the Suno V4.5 model to generate complete music from text descriptions.

Simple mode: Enter a description (e.g., "an upbeat jazz piano piece") and AI generates the music automatically
Custom mode: Specify music style and song title
Instrumental: Enable Instrumental mode to generate vocal-free background music
Great for video background music: connect the output to the Combine node's audio-in port

Lipsync

Uses the Kling Avatar model to turn a static portrait photo into a talking video.

Connect a portrait photo (image-in) + an audio clip (audio-in)
AI syncs the person's lip movements to the audio
Two modes: Standard ($0.12/sec), Pro ($0.24/sec)
Best results: Use a clear, front-facing photo with the mouth closed
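
To get a feel for the per-second pricing, the sketch below estimates the cost of a clip in each mode. It uses the rates quoted above and assumes plain per-second billing with no rounding, minimum charge, or other fees; the function name `lipsync_cost` is made up for this example.

```python
from decimal import Decimal

# Per-second rates quoted above, in US dollars.
RATES = {"standard": Decimal("0.12"), "pro": Decimal("0.24")}

def lipsync_cost(seconds: float, mode: str = "standard") -> Decimal:
    """Estimate cost, assuming plain per-second billing (no rounding or minimum)."""
    return RATES[mode] * Decimal(str(seconds))

print(lipsync_cost(30, "standard"))  # 3.60
print(lipsync_cost(30, "pro"))       # 7.20
```

Under these assumptions, a 30-second clip costs about $3.60 in Standard mode and $7.20 in Pro mode.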

Vocal Separator

Uses the Demucs model to separate audio into three independent tracks.

Input: video (video-in) or audio (audio-in)
Output: vocals (vocals-out), accompaniment (accompaniment-out), original audio (origin-out)
Automatically creates 3 Upload nodes to store the results after separation
Perfect for removing background music or extracting vocals

Voice Changer

Uses ElevenLabs Speech-to-Speech to transform voice style.

Input: audio (audio-in)
Output: transformed audio (audio-out)
Uses your own ElevenLabs API Key (set in Settings -> ElevenLabs)
Choose from built-in voices or clone your own voice
Option to remove background noise
No vpick credits consumed (uses your own ElevenLabs quota)

Audio Combine

Mixes multiple audio tracks into one.

Connect multiple audio sources to the audio-in port
Outputs the mixed audio (audio-out)
Perfect for mixing vocals with background music
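
Conceptually, mixing means adding the tracks together sample by sample. The sketch below shows that idea on tiny lists of samples in the range -1.0 to 1.0. It is only an illustration of the general technique (sum, then clip), not a description of how vpick implements the node; `mix_tracks` is a hypothetical name.

```python
def mix_tracks(tracks: list[list[float]]) -> list[float]:
    """Mix tracks by summing aligned samples and clipping to [-1.0, 1.0]."""
    length = max((len(t) for t in tracks), default=0)
    mixed = []
    for i in range(length):
        # Shorter tracks contribute silence once they run out of samples.
        total = sum(t[i] for t in tracks if i < len(t))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

vocals = [0.5, 0.25, -0.5]
music = [0.25, 0.25, -0.75, 0.5]
mixed = mix_tracks([vocals, music])
print(mixed)  # [0.75, 0.5, -1.0, 0.5]
```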

Combine

Merges multiple video clips in order into one complete video.

Connect multiple Video Generator / Lipsync / Upload nodes to the videos-in port
Optionally connect audio (audio-in) as background music
Audio mixing: If the video already has sound, it will be mixed with the background music (not replaced)
Automatically handles videos with different resolutions (re-encodes to a unified format)

List

The key node for batch generation. Store multiple items in a List and connect it to an Image or Video Generator; the generator will then automatically produce one output for each item.

For example, a List with 5 items connected to an Image Generator will produce 5 images.
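
The batch behavior amounts to running the generator once per item. A minimal sketch, assuming a stand-in generator function (`fake_image_generator` is hypothetical and only returns a placeholder file name instead of calling a model):

```python
from typing import Callable

def run_batch(items: list[str], generate: Callable[[str], str]) -> list[str]:
    """Call the generator once per list item and collect the outputs in order."""
    return [generate(item) for item in items]

def fake_image_generator(prompt: str) -> str:
    # Stand-in for a real model call; returns a placeholder file name.
    return f"image_for_{prompt.replace(' ', '_')}.png"

prompts = ["red sneaker", "blue sneaker", "green sneaker",
           "black sneaker", "white sneaker"]
outputs = run_batch(prompts, fake_image_generator)
print(len(outputs))  # 5
```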

Upload

Upload images or other files from your computer to the canvas. Common uses:

As reference images for Image Generator
As start frames or end frames for Video Generator
As portrait photos for Lipsync

Group

Visually group multiple nodes for easier management.

Select multiple nodes and press Ctrl+G to create a group
Dragging a group moves all member nodes together
Customize group color and label

Connections

Nodes connect to each other with lines, and data flows along these connections:

[AI Assistant] -> [List] -> [Image Generator]

This way, the AI-generated text enters the List, and each item in the List generates an image separately.

Advanced Example: Lipsync Video

[Upload (Portrait Photo)] -> image-in -> [Lipsync]
[Audio Generator] -> audio-in -> [Lipsync]
[Lipsync] -> videos-in -> [Combine]
[Music Generator] -> audio-in -> [Combine]

In this workflow:

Audio Generator creates the voice-over
Lipsync makes the portrait talk
Music Generator creates background music
Combine merges the lipsync video + background music into the final video
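
One way to reason about a workflow like this is as a directed graph in which a node can run only after all of its inputs are ready. The sketch below encodes the four connections from the diagram and derives a valid run order with Python's standard `graphlib` module. The edge representation and the idea that nodes run in dependency order are assumptions for illustration; the guide does not describe vpick's internal scheduling.

```python
from graphlib import TopologicalSorter

# Each edge: (source node, target port, target node), mirroring the diagram.
edges = [
    ("Upload", "image-in", "Lipsync"),
    ("Audio Generator", "audio-in", "Lipsync"),
    ("Lipsync", "videos-in", "Combine"),
    ("Music Generator", "audio-in", "Combine"),
]

# Map each node to the set of nodes it depends on.
dependencies: dict[str, set[str]] = {}
for source, _port, target in edges:
    dependencies.setdefault(target, set()).add(source)
    dependencies.setdefault(source, set())

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Nodes with no dependency on each other (for example, Audio Generator and Music Generator) may appear in either order, but Lipsync always comes after its two inputs and Combine always comes last.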