How to Write Seedance 2.0 Prompts: Shot Structure + the 8-Element Formula
Writing Seedance 2.0 prompts is not about flowery words — it is about structure. The model is best understood as a "multimodal AI director" that splits a shot into a space layer (what is in frame) and a time layer (how it changes over time). So a good prompt is an engineering instruction — who, where, doing what, shot how, in what order. This piece lays out how to write the text prompt along the officially recommended structure, and how VideoLens generates that structure from a reference video. (Text prompts only — multimodal reference-to-video is out of scope here.)
I. The 8-element formula
The recommended formula: precise subject + action detail + scene/environment + light & color + camera move + visual style + image quality + constraints. Lock "who is doing what" first, then "where and what mood," then "how it is shot," and finally tighten the result with style, quality and constraints.
| Element | What to write | Example |
|---|---|---|
| Subject | 2–3 stable static traits (wardrobe/hair/look/class) | a woman in a red dress and straw hat |
| Action | down to limbs + range/speed/force | slowly raises a hand, dips her head |
| Scene | the setting / position / spatial relation | a dorm corridor at dusk |
| Light & color | the light and color tone of the frame | warm sunlight through the window, soft light |
| Camera move | standard terms, one move per shot | a steady medium tracking shot, or a slow push-in |
| Visual style | art style and overall tone | cinematic documentary / fresh anime / 3D |
| Quality | sharpness, detail, texture | HD, cinematic, soft light |
| Constraints | bound the result, avoid artifacts | no subtitles, no logo/watermark, no face warp |
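
As a rough illustration of the formula, the eight elements can be held in a small structure and joined in the recommended order. This is a minimal sketch, not an official tool: the `PromptElements` class, the `build_prompt` helper and the comma separator are assumptions made for the example, and the field values are taken from the table above.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptElements:
    """The eight elements, declared in the order the formula recommends."""
    subject: str
    action: str
    scene: str
    light_color: str
    camera_move: str
    visual_style: str
    quality: str
    constraints: str


def build_prompt(elements: PromptElements) -> str:
    """Join the non-empty elements in declaration order, comma-separated."""
    parts = [getattr(elements, f.name).strip() for f in fields(elements)]
    return ", ".join(p for p in parts if p)


example = PromptElements(
    subject="a woman in a red dress and straw hat",
    action="slowly raises a hand, dips her head",
    scene="a dorm corridor at dusk",
    light_color="warm sunlight through the window",
    camera_move="slow push-in",
    visual_style="cinematic documentary",
    quality="HD, rich detail",
    constraints="no subtitles, no logo/watermark",
)

print(build_prompt(example))
```

Keeping the fields in a fixed order makes it harder to forget the constraints at the end, which is the part most often left out when typing a prompt by hand.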
II. Space × time: why sequence shots
The model decouples space and time internally, so the ideal prompt for a complex video is a timeline of shots: split it into several shots and describe each in event order. A vague "a man runs nervously down the street, very cinematic" is far weaker than the same scene written out as Shot 1 / Shot 2 / Shot 3.
Organize each shot as: ① camera move or cut → ② subject action & expression → ③ position / spatial change → ④ audio (SFX / voice / BGM).
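The four-part order above can be sketched as a small data structure that renders a shot list. This is only an illustration: the `Shot` class, the `render_shots` helper, the "Shot N:" label and the semicolon separator are assumptions for the example, not a format the model is known to require.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    camera: str         # 1. camera move or cut
    action: str         # 2. subject action and expression
    position: str = ""  # 3. position / spatial change
    audio: str = ""     # 4. audio: SFX / voice / BGM


def render_shots(shots: list[Shot]) -> str:
    """Render shots in event order, one numbered line per shot."""
    lines = []
    for number, shot in enumerate(shots, start=1):
        parts = [shot.camera, shot.action, shot.position, shot.audio]
        body = "; ".join(p.strip() for p in parts if p.strip())
        lines.append(f"Shot {number}: {body}")
    return "\n".join(lines)


shots = [
    Shot(
        camera="wide, locked-off",
        action="a man walks quickly down an empty street, glancing over his shoulder",
        position="he moves from the far end of the street toward the camera",
        audio="<hurried footsteps on wet pavement>",
    ),
    Shot(
        camera="cut to close-up",
        action="his jaw tightens, eyes darting",
        audio="(low, tense strings)",
    ),
]

print(render_shots(shots))
```

Empty parts are skipped, so a shot with no spatial change still renders cleanly.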
III. Writing action (four rules)
| Rule | How | Example |
|---|---|---|
| Specific + quantified | name the limb + range/speed/force | slowly raise hand, quick head turn |
| Prefer slow, small moves | avoid sprinting/leaping/violent rolls | walk slowly, sit down naturally |
| Add transitions | state the momentum that links one move to the next | as she turns, her arm rises with the momentum of the turn |
| Externalize emotion | use body detail, not "very sad/angry" | see the table below |
Emotion externalization — translating abstract feeling into filmable detail:
| Feeling | Externalized as action & detail |
|---|---|
| Sadness | head down, shoulders trembling, reddened eyes, fingers gripping the hem, tears welling but not falling |
| Joy | an irrepressible smile, relaxed brow, light steps, a little spin |
| Anxiety | checking the watch, drumming fingers, quick breath, darting eyes, nail-biting |
| Anger | clenched fists, tight jaw, heaving chest, knife-like stare, words forced through teeth |
| Relief | a long exhale, shoulders loosening, a faint smile, gaze lifting to the distance |
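
One way to apply this table while drafting is a small check that spots abstract emotion words and suggests filmable detail instead. The sketch below is hypothetical: the trigger adjectives and the `find_abstract_emotions` helper are assumptions for illustration, and the suggested details are shortened from the table above.

```python
import re

# Trigger words are an assumed, illustrative list; the details come from the table.
EXTERNALIZED = {
    "sad": "head down, shoulders trembling, reddened eyes",
    "happy": "an irrepressible smile, relaxed brow, light steps",
    "anxious": "checking the watch, drumming fingers, darting eyes",
    "angry": "clenched fists, tight jaw, heaving chest",
    "relieved": "a long exhale, shoulders loosening, a faint smile",
}


def find_abstract_emotions(prompt: str) -> dict[str, str]:
    """Map each abstract emotion word found in the prompt to a filmable alternative."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return {word: detail for word, detail in EXTERNALIZED.items() if word in words}


draft = "She looks very sad and a little angry"
for word, detail in find_abstract_emotions(draft).items():
    print(f"replace '{word}' with: {detail}")
```

The check only flags words; choosing which details fit the character and the shot is still a writing decision.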
IV. Camera, quality and constraints
Use standard camera terms directly — the model reads them well: medium, close-up, wide, slow push-in, steady pan, locked-off. Note: keep to one move per shot; combining push/pull/pan/tilt destabilizes the image.
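The one-move-per-shot rule is easy to check mechanically before submitting a prompt. The following is a minimal sketch under stated assumptions: the `CAMERA_MOVES` list is an illustrative vocabulary rather than an official one, and `camera_moves_in` and `has_single_move` are hypothetical helpers. Framing terms such as medium, close-up and wide are deliberately not counted as moves.

```python
import re

# Illustrative list of movement terms; not an official vocabulary.
CAMERA_MOVES = ["push-in", "pull-out", "pan", "tilt", "tracking", "locked-off"]


def camera_moves_in(shot_text: str) -> list[str]:
    """Return the movement terms that appear as whole words in one shot's text."""
    text = shot_text.lower()
    return [m for m in CAMERA_MOVES if re.search(rf"\b{re.escape(m)}\b", text)]


def has_single_move(shot_text: str) -> bool:
    """True when the shot names at most one camera move."""
    return len(camera_moves_in(shot_text)) <= 1


print(has_single_move("close-up, slow push-in on her face"))
print(camera_moves_in("slow push-in, then pan left and tilt up"))
```

A shot that fails the check is usually better split into two shots, each with its own move.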
The closing trio — quality, style and constraints — tightens the output:
| Type | Purpose | Template / example |
|---|---|---|
| Quality | sets sharpness & texture | HD, rich detail, cinematic, soft light |
| Style | unifies the art direction | cyberpunk teal-purple, retro film, fresh anime, 3D |
| Constraints | avoids artifacts & leftover marks | no subtitles / no text / no logo / no watermark |
V. Audio & dialogue symbols
Seedance 2.0 natively co-generates audio and video; fixed symbols mark the type of information so the model parses it correctly:
| Type | Symbol | Example |
|---|---|---|
| Music | () | (upbeat rock plays in the background) |
| SFX | <> | <a dog barks in the distance> |
| Dialogue | {} | {hello world}; for languages other than Chinese or English, name the language first, e.g. in Japanese say {こんにちは} |
| Caption | 【】 | 【Chapter 1: Departure】 |
A few dialogue tips: keep one language per line (proper nouns aside); the model misreads rare/polyphonic Chinese characters — swap in a common homophone (e.g. 螭龙山 → 吃龙山); and add a "no subtitles" constraint if you do not want captions.
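Because the symbols are fixed, small wrapper functions help keep them consistent across a long shot list. This is a sketch for illustration only: the function names are assumptions, and the "in <language> say" phrasing simply mirrors the example in the table above.

```python
from typing import Optional


def music(text: str) -> str:
    """Wrap a background-music description in parentheses."""
    return f"({text})"


def sfx(text: str) -> str:
    """Wrap a sound effect in angle brackets."""
    return f"<{text}>"


def dialogue(text: str, language: Optional[str] = None) -> str:
    """Wrap spoken lines in braces, naming the language when one is given."""
    line = "{" + text + "}"
    return f"in {language} say {line}" if language else line


def caption(text: str) -> str:
    """Wrap on-screen caption text in lenticular brackets."""
    return f"【{text}】"


print(music("upbeat rock plays in the background"))
print(sfx("a dog barks in the distance"))
print(dialogue("こんにちは", language="Japanese"))
print(caption("Chapter 1: Departure"))
```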
VI. Generate this structure with VideoLens
Writing this by hand means typing every shot. VideoLens runs it in reverse — give it a reference video and its Creation Assistant breaks it down shot by shot and outputs Seedance 2.0 prompts in exactly this structure:
- anchors the recurring characters, scenes and props as reusable entities;
- generates a per-shot prompt in shot order (camera move + subject action + scene & light);
- closes with a style tail that unifies quality and tone, defaulting to "no subtitles, no logo/watermark";
- separates dialogue, SFX and BGM and maps each onto its shot.
In short: you do not start from a blank page — VideoLens hands you a shot list you can tweak directly.
A Seedance 2.0 prompt is essentially a shot-level "director instruction." Once you internalize the space + time layers, the 8 elements, shot sequencing and the symbol convention, you can write prompts that generate reliably — and when you want to skip the manual work, hand a reference video to VideoLens for a ready-to-tweak shot list.
