Next-gen AI video generation by Alibaba Tongyi Lab. Four modes: Text-to-Video, Image-to-Video, Reference-to-Video, and VideoEdit. Up to 1080p and 15 seconds, with thinking mode and audio support
POV shot from inside a lunar lander cockpit, the spacecraft slowly crests over the far side of the Moon. The camera reveals an impossible sight: four green-skinned aliens casually sitting around a campfire on the lunar surface, drinking beer from Earth-brand bottles. No helmets, just Hawaiian shirts. One alien raises his bottle toward the camera in a friendly toast. Earth visible on the horizon. Warm orange campfire glow contrasting with cold blue moonlight. Steven Spielberg style, photorealistic.
Three steps to your first video
Choose T2V, I2V, R2V, or VideoEdit. Upload source images, videos, or audio references depending on the mode.
Describe the scene or editing instruction. Set resolution, duration, aspect ratio, and toggle thinking mode.
Hit generate and receive your 1080p video, typically in 30-90 seconds. Download for any commercial use.
Choose the right mode for your creative task
Generate video from a text prompt. Thinking mode for complex scenes, audio track support, prompt extension via LLM
Animate any image or 9-grid composition. Control first and last frames, add driving audio for motion sync
Use up to 5 references: images, video clips, and audio. Voice cloning, lip sync, and motion replication for character consistency
Edit existing video via text instructions. Style transfer, colorization, object replacement, and local edits without regenerating
What sets Wan 2.7 Video apart from the previous generation
Full HD video by default. Clean detail, no upscaling artifacts. Suitable for social media, ads, and professional projects
Set the start and end points of your animation precisely. Ensures smooth motion arcs and consistent composition across the clip
Upload a 3x3 grid of reference shots. The model reads them as a storyboard and generates coherent multi-element scenes
Provide an image and a voice sample. The model creates a speaking character with consistent appearance and cloned voice across generations
Describe what to change in natural language. Swap styles, recolor, replace objects, or refine specific regions without touching the rest
What Wan 2.7 Video is built for
TikTok, Reels, Shorts in any aspect ratio. Rapid iteration from text or a single reference image
Consistent characters and voice across ad variations. VideoEdit mode for fast creative iteration
Character consistency via R2V references. Voice clone for narration. 9-grid storyboard input for complex scenes
Restyle footage, fix color grading, swap backgrounds, or adjust local details without re-shooting
The advantages of generating through our platform
Start generating immediately. No registration with third-party services, no quota management, no infrastructure to maintain
T2V, I2V, R2V, and VideoEdit available from the same interface with unified credit-based pricing
Switch between Wan 2.7, Kling 3, Seedance 2, Veo 3, and other models without changing platforms or managing separate accounts
Answers to common questions about Wan 2.7 Video
Wan 2.7 Video is the latest video generation model from Alibaba's Tongyi Lab. It supports four modes: Text-to-Video, Image-to-Video, Reference-to-Video, and VideoEdit. Clips run 2-15 seconds at up to 1080p, with thinking mode, audio support, and prompt extension.
Four modes: T2V generates video from text with optional audio overlay. I2V animates images, including 9-grid compositions, with first/last frame control. R2V uses up to 5 references (image, video, audio) for voice cloning and motion replication. VideoEdit edits existing video via text instructions.
Text prompts up to 5,000 characters. Images (single or 9-grid). Video clips for R2V and VideoEdit (2-10 sec). Audio files for driving audio, voice cloning, and audio overlay. Negative prompts up to 500 characters.
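If you prepare requests programmatically, the limits above can be checked before you submit. The following is a minimal sketch, not the platform's API: `GenerationRequest`, `validate`, the field names, and the mode strings are all hypothetical, and it assumes "characters" means Python string length. Only the numeric limits (5,000 and 500 characters, 2-10 second clips, up to 5 references for R2V) come from the text.

```python
from dataclasses import dataclass, field
from typing import List

# Limits quoted in the text; everything else in this sketch is hypothetical.
MAX_PROMPT_CHARS = 5000
MAX_NEGATIVE_PROMPT_CHARS = 500
MAX_REFERENCES = 5        # R2V accepts up to 5 references
MIN_CLIP_SECONDS = 2.0    # source clips for R2V and VideoEdit
MAX_CLIP_SECONDS = 10.0
MODES = {"t2v", "i2v", "r2v", "videoedit"}  # assumed identifiers


@dataclass
class GenerationRequest:
    """Hypothetical container for one generation request."""
    mode: str
    prompt: str
    negative_prompt: str = ""
    reference_count: int = 0
    clip_seconds: List[float] = field(default_factory=list)


def validate(req: GenerationRequest) -> List[str]:
    """Return a list of human-readable problems; empty means the request fits the limits."""
    errors: List[str] = []
    if req.mode not in MODES:
        errors.append(f"unknown mode: {req.mode!r}")
    if len(req.prompt) > MAX_PROMPT_CHARS:
        errors.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if len(req.negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
        errors.append(f"negative prompt exceeds {MAX_NEGATIVE_PROMPT_CHARS} characters")
    if req.mode == "r2v" and req.reference_count > MAX_REFERENCES:
        errors.append(f"R2V accepts at most {MAX_REFERENCES} references")
    if req.mode in ("r2v", "videoedit"):
        # Each source clip must fall inside the 2-10 second window.
        for seconds in req.clip_seconds:
            if not MIN_CLIP_SECONDS <= seconds <= MAX_CLIP_SECONDS:
                errors.append(f"clip of {seconds}s is outside the 2-10 second range")
    return errors
```

Calling `validate(GenerationRequest(mode="videoedit", prompt="make it anime", clip_seconds=[12.0]))` would report the over-length clip, so you can trim the source before uploading.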
VideoEdit takes an existing video and a text instruction, then applies the change: style transfer (anime, oil painting, etc.), colorization, object replacement, background swap, or local region editing. Source video must be 2-10 seconds.
Use T2V when you start from scratch with a text description. Thinking mode helps with complex multi-element scenes. Use I2V when you have a specific starting image and want to animate it with precise control over the first and last frames.
Wan 2.7 adds two new modes (R2V and VideoEdit), increases max duration to 15 seconds, enables thinking mode for complex scenes, supports 5 aspect ratios (added 4:3, 3:4), allows prompts up to 5,000 characters, and defaults to 1080p resolution.
Typical generation time is 30-90 seconds, depending on mode, resolution, and duration. T2V with thinking mode may take slightly longer. VideoEdit, and R2V with multiple references, usually take 60-120 seconds.