How to Write Seedance 2.0 Prompts: Shot Structure + the 8-Element Formula
2026-06-16

Writing Seedance 2.0 prompts is not about flowery words; it is about structure. The model is best understood as a "multimodal AI director" that splits a shot into a space layer (what is in frame) and a time layer (how it changes over time). A good prompt is therefore an engineering instruction: who, where, doing what, shot how, in what order. This piece lays out how to write the text prompt along the officially recommended structure, and how VideoLens generates that structure from a reference video. (It covers text prompts only; multimodal reference-to-video is out of scope.)

I. The 8-element formula

The recommended formula: precise subject + action detail + scene/environment + light & color + camera move + visual style + image quality + constraints. Lock "who is doing what" first, then "where and what mood," then "how it is shot," and finally tighten the result with style, quality and constraints.

| Element | What to write | Example |
| --- | --- | --- |
| Subject | 2–3 stable static traits (wardrobe/hair/look/class) | a woman in a red dress and straw hat |
| Action | down to limbs + range/speed/force | slowly raises a hand, dips her head |
| Scene | the setting / position / spatial relation | a dorm corridor at dusk |
| Light & color | the light and color tone of the frame | warm sunlight through the window, soft light |
| Camera move | standard terms, one move per shot | steady medium tracking, slow push-in |
| Visual style | art style and overall tone | cinematic documentary / fresh anime / 3D |
| Quality | sharpness, detail, texture | HD, cinematic, soft light |
| Constraints | bound the result, avoid artifacts | no subtitles, no logo/watermark, no face warp |
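As a rough illustration of the formula, the sketch below joins the eight elements in the recommended order. The `PromptElements` class and `build_prompt` function are hypothetical names invented for this example; they are not part of Seedance 2.0 or any official tooling, and the comma-joined output is only one plausible way to lay the elements out.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptElements:
    """The eight elements, declared in formula order (hypothetical helper)."""
    subject: str
    action: str
    scene: str = ""
    light_color: str = ""
    camera_move: str = ""
    visual_style: str = ""
    quality: str = ""
    constraints: str = ""


def build_prompt(elements: PromptElements) -> str:
    """Join the non-empty elements in declaration (formula) order."""
    parts = [getattr(elements, f.name).strip() for f in fields(elements)]
    return ", ".join(p for p in parts if p)


example = PromptElements(
    subject="a woman in a red dress and straw hat",
    action="slowly raises a hand, dips her head",
    scene="a dorm corridor at dusk",
    light_color="warm sunlight through the window, soft light",
    camera_move="slow push-in",
    visual_style="cinematic documentary",
    quality="HD, rich detail",
    constraints="no subtitles, no logo/watermark",
)
print(build_prompt(example))
```

Keeping the fields in formula order means the "who is doing what" part always leads and the constraints always close the prompt, whichever elements are filled in.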

II. Space × time: why shot sequencing

The model decouples space and time internally, so the ideal prompt for a complex video is a timeline of shots: split it into several shots and describe each in event order. A vague "a man runs nervously down the street, very cinematic" is far weaker than the same scene broken down into Shot 1 / Shot 2 / Shot 3.

Organize each shot as: ① camera move or cut → ② subject action & expression → ③ position / spatial change → ④ audio (SFX / voice / BGM).

Per the official guide, the model is unstable with exact timings (e.g. 0–3s). Do not force per-shot durations; order the shots as "Shot 1 / 2 / 3" and let pacing emerge.
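The per-shot ordering above can be sketched as a small data structure. The `Shot` class and `render_shots` function are hypothetical names made up for illustration, and the semicolon-separated layout is an assumption rather than an official format. Note that the sketch numbers the shots but deliberately carries no duration field.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One shot, with fields in the recommended description order."""
    camera: str          # 1. camera move or cut
    action: str          # 2. subject action and expression
    position: str = ""   # 3. position / spatial change
    audio: str = ""      # 4. SFX / voice / BGM


def render_shots(shots: list[Shot]) -> str:
    """Number shots in event order; no timings are emitted."""
    lines = []
    for number, shot in enumerate(shots, start=1):
        parts = [shot.camera, shot.action, shot.position, shot.audio]
        body = "; ".join(p.strip() for p in parts if p.strip())
        lines.append(f"Shot {number}: {body}")
    return "\n".join(lines)


street_scene = [
    Shot("wide shot", "a man walks quickly down the street, glancing back",
         "he moves from the far end toward the camera"),
    Shot("close-up", "his eyes dart left and right, breath quick"),
    Shot("medium shot", "he stops at a doorway and exhales",
         "he steps out of the street light into shadow"),
]
print(render_shots(street_scene))
```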

III. Writing action (four rules)

| Rule | How | Example |
| --- | --- | --- |
| Specific + quantified | name the limb + range/speed/force | slowly raise hand, quick head turn |
| Prefer slow, small moves | avoid sprinting/leaping/violent rolls | walk slowly, sit down naturally |
| Add transitions | state the inertia linking moves | let the arm rise with the momentum of the turn |
| Externalize emotion | use body detail, not "very sad/angry" | see the table below |
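A draft can be checked against two of these rules with a simple keyword pass. This is a minimal sketch: the `RULES` list and `lint_action` function are hypothetical, and the patterns cover only the words named in the table (sprinting, leaping, violent, "very sad/angry"), so treat it as a starting point rather than a complete checker.

```python
import re

# (pattern, advice) pairs; the word lists are illustrative assumptions.
RULES = [
    (re.compile(r"\b(?:sprint|leap)\w*", re.IGNORECASE),
     "prefer a slow, small move"),
    (re.compile(r"\bviolent\w*", re.IGNORECASE),
     "prefer a slow, small move"),
    (re.compile(r"\bvery (?:sad|angry)\b", re.IGNORECASE),
     "externalize the emotion as body detail"),
]


def lint_action(text: str) -> list[str]:
    """Return one warning per risky phrase found in an action description."""
    warnings = []
    for pattern, advice in RULES:
        for match in pattern.finditer(text):
            warnings.append(f"'{match.group(0)}': {advice}")
    return warnings


for warning in lint_action("He is very angry and sprints away"):
    print(warning)
```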

Emotion externalization: translating abstract feeling into filmable detail.

| Feeling | Externalized as action & detail |
| --- | --- |
| Sadness | head down, shoulders trembling, reddened eyes, fingers gripping the hem, tears welling but not falling |
| Joy | an irrepressible smile, relaxed brow, light steps, a little spin |
| Anxiety | checking the watch, drumming fingers, quick breath, darting eyes, nail-biting |
| Anger | clenched fists, tight jaw, heaving chest, knife-like stare, words forced through teeth |
| Relief | a long exhale, shoulders loosening, a faint smile, gaze lifting to the distance |
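The emotion table can double as a lookup when drafting. The sketch below stores the table's entries in a dictionary and returns a few details for a named feeling; the `EXTERNALIZED` mapping and `externalize` function are hypothetical helpers, and taking the first few details is an arbitrary choice made for brevity.

```python
# Details copied from the emotion table; the helper itself is illustrative.
EXTERNALIZED = {
    "sadness": ["head down", "shoulders trembling", "reddened eyes",
                "fingers gripping the hem", "tears welling but not falling"],
    "joy": ["an irrepressible smile", "relaxed brow", "light steps",
            "a little spin"],
    "anxiety": ["checking the watch", "drumming fingers", "quick breath",
                "darting eyes", "nail-biting"],
    "anger": ["clenched fists", "tight jaw", "heaving chest",
              "knife-like stare", "words forced through teeth"],
    "relief": ["a long exhale", "shoulders loosening", "a faint smile",
               "gaze lifting to the distance"],
}


def externalize(feeling: str, max_details: int = 3) -> str:
    """Replace an abstract feeling with up to max_details filmable details."""
    details = EXTERNALIZED[feeling.lower()]
    return ", ".join(details[:max_details])


print(externalize("Anxiety"))
```

A feeling that is not in the table raises `KeyError`, which is a reasonable signal to write the body detail by hand.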

IV. Camera, quality and constraints

Use standard camera terms directly, since the model reads them well: medium shot, close-up, wide shot, slow push-in, steady pan, locked-off. Keep to one move per shot; combining push, pull, pan and tilt in a single shot destabilizes the image.
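The one-move-per-shot rule is easy to check mechanically. The following is a rough keyword heuristic, not an official validator: `MOVE_PATTERNS`, `camera_moves` and `has_single_move` are hypothetical names, and the pattern list (push, pull, pan, tilt, track) is an assumption that will miss other phrasings and can misfire on unrelated words such as "pullover".

```python
import re

# Keyword families for camera moves; an illustrative, incomplete list.
MOVE_PATTERNS = {
    "push": r"\bpush",
    "pull": r"\bpull",
    "pan": r"\bpan(?:s|ning)?\b",
    "tilt": r"\btilt",
    "track": r"\btrack",
}


def camera_moves(shot_text: str) -> list[str]:
    """List the move families mentioned in one shot description."""
    text = shot_text.lower()
    return [name for name, pattern in MOVE_PATTERNS.items()
            if re.search(pattern, text)]


def has_single_move(shot_text: str) -> bool:
    """True when the shot names at most one camera move."""
    return len(camera_moves(shot_text)) <= 1


print(camera_moves("push-in, then pan left and tilt up"))
print(has_single_move("locked-off close-up on her hands"))
```

A shot that fails the check is usually better split into two shots than rewritten with a compound move.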

The closing trio — quality, style and constraints — tightens the output:

| Type | Purpose | Template / example |
| --- | --- | --- |
| Quality | sets sharpness & texture | HD, rich detail, cinematic, soft light |
| Style | unifies the art direction | cyberpunk teal-purple, retro film, fresh anime, 3D |
| Constraints | avoids artifacts & leftover marks | no subtitles / no text / no logo / no watermark |

V. Audio & dialogue symbols

Seedance 2.0 natively co-generates audio and video; fixed symbols mark the type of information so the model parses it correctly:

| Type | Symbol | Example |
| --- | --- | --- |
| Music | () | (upbeat rock plays in the background) |
| SFX | <> | <a dog barks in the distance> |
| Dialogue | {} | {hello world}; for languages other than Chinese or English, name the language, e.g. in Japanese say {こんにちは} |
| Caption | 【】 | 【Chapter 1: Departure】 |

A few dialogue tips: keep one language per line (proper nouns aside); the model misreads rare or polyphonic Chinese characters, so swap in a common homophone (e.g. 螭龙山 → 吃龙山); and add a "no subtitles" constraint if you do not want captions.
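The symbol convention lends itself to small helpers that wrap text in the right delimiters and read them back out. This is a minimal sketch: the function names and the regular expression are assumptions made for the example, not an official parser, and ordinary parentheses in prose would be read as music, so keep bracketed asides out of prompts checked this way.

```python
import re


def music(text: str) -> str:
    return f"({text})"


def sfx(text: str) -> str:
    return f"<{text}>"


def dialogue(text: str, language: str = "") -> str:
    """Wrap a spoken line; name the language when it is not Chinese or English."""
    line = "{" + text + "}"
    return f"in {language} say {line}" if language else line


def caption(text: str) -> str:
    return f"【{text}】"


# One named group per symbol type; exactly one group matches per hit.
TAG_PATTERN = re.compile(
    r"\((?P<music>[^()]*)\)"
    r"|<(?P<sfx>[^<>]*)>"
    r"|\{(?P<dialogue>[^{}]*)\}"
    r"|【(?P<caption>[^【】]*)】"
)


def parse_tags(prompt: str) -> list[tuple[str, str]]:
    """Return (type, text) pairs for every marked span, in order."""
    return [(m.lastgroup, m.group(m.lastgroup))
            for m in TAG_PATTERN.finditer(prompt)]


line = " ".join([music("upbeat rock plays in the background"),
                 sfx("a dog barks in the distance"),
                 dialogue("hello world"),
                 caption("Chapter 1: Departure")])
print(parse_tags(line))
```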

VI. Generate this structure with VideoLens

Writing this by hand means typing out every shot. VideoLens runs the process in reverse: give it a reference video, and its Creation Assistant breaks it down shot by shot and outputs Seedance 2.0 prompts in exactly this structure. It:

· anchors the recurring characters, scenes and props as reusable entities;
· generates a per-shot prompt in shot order (camera move + subject action + scene & light);
· closes with a style tail that unifies quality and tone, defaulting to "no subtitles, no logo/watermark";
· separates dialogue, SFX and BGM and maps each onto its shot.
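To make the idea of reusable entities and a shared style tail concrete, here is a minimal sketch of how such a shot list could be expanded into final prompt text. It does not reflect VideoLens's actual output format or API: the `@name` placeholder syntax, the `expand_shot_list` function and the sample data are all assumptions invented for illustration.

```python
def expand_shot_list(entities: dict[str, str],
                     shots: list[str],
                     style_tail: str) -> str:
    """Substitute entity placeholders in each shot, then append the style tail."""
    # Replace longer keys first so "@heroine" is not clobbered by "@hero".
    keys = sorted(entities, key=len, reverse=True)
    lines = []
    for number, shot in enumerate(shots, start=1):
        text = shot
        for key in keys:
            text = text.replace(key, entities[key])
        lines.append(f"Shot {number}: {text}")
    lines.append(style_tail)
    return "\n".join(lines)


entities = {
    "@heroine": "a woman in a red dress and straw hat",
    "@corridor": "a dorm corridor at dusk",
}
shots = [
    "wide shot, @heroine walks slowly along @corridor",
    "close-up, @heroine dips her head, warm sunlight through the window",
]
tail = "cinematic documentary, HD, soft light, no subtitles, no logo/watermark"
print(expand_shot_list(entities, shots, tail))
```

Defining each character or location once and reusing it in every shot keeps the subject description identical across the sequence, which is the point of anchoring entities.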

In short, you do not start from a blank page: VideoLens hands you a shot list you can tweak directly.

The prompt methodology in this article is compiled from ByteDance’s official Doubao Seedance 2.0 prompt guide; specs, phrasing and terminology follow the official documentation.

A Seedance 2.0 prompt is essentially a shot-level "director instruction." Once you internalize the space + time layers, the 8 elements, shot sequencing and the symbol convention, you can write prompts that produce consistent results. When you want to skip the manual work, hand a reference video to VideoLens for a ready-to-tweak shot list.