© 2026 Clipia.ai. All rights reserved.
Grok Imagine Video 1.5

Grok Imagine Video 1.5 video with audio

Video generation model by xAI on the autoregressive Aurora engine. It turns your image into a clip with built-in audio: speech, effects, and music. Output is up to 720p at 24 fps, up to 15 seconds long, in 7 aspect ratios.

15 sec max length
720p resolution
native audio
Prompt

Portrait of a woman by a cafe window: she smiles warmly and waves, the camera slowly pushes in to a close-up, with quiet city ambience and soft background music


Create Video with Audio on Grok Imagine Video 1.5

Turn an image into a clip with native audio, realistic motion, and scene continuation

Pay only for what you create

#1 debut · Video Arena

Debuted at #1 on the Video Arena

Grok Imagine Video 1.5 by xAI shipped in late May 2026 (build dated May 30, 2026). The Grok Imagine line debuted in first place on the Artificial Analysis Video Arena — a blind pairwise comparison of video generators — ahead of Google Veo 3.1, Kling 2.5 Turbo, and Runway Gen-4.5. The arena is a live leaderboard and changes over time.

#1 Grok Imagine line debut
~21 sec fast clip generation
native audio in one pass

Source: Artificial Analysis Video Arena (artificialanalysis.ai). Rankings update as models and votes are added.

What Grok Imagine Video 1.5 Does

Four standout strengths of the xAI model

Built-in Audio

Grok generates video and audio together: dialogue with intonation and pauses, ambient sound, effects, and background music — synced with the visuals, with no separate audio step.

Image Animation

Upload a photo and the model turns it into a living clip with natural motion, camera movement, and transitions, keeping the composition and character recognizable.

Realistic Motion

The Aurora engine builds video frame by frame: smooth facial expressions, consistent lighting, and plausible motion physics give a lifelike picture with little jitter or scene drift.

Scene Continuation

Take the last frame of a finished clip and continue the scene — motion, character position, and lighting stay seamless for longer stories.
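As a rough illustration of how a longer story could be split into chained generations, the sketch below divides a target length into segments that each fit the 15-second per-generation limit stated on this page. The function name `plan_segments` and its parameters are hypothetical; how Clipia actually chains clips is not specified here.

```python
def plan_segments(total_seconds, max_segment=15):
    """Split a target length (whole seconds) into per-generation segments.

    max_segment=15 mirrors the per-generation limit stated on this page.
    The first segment would start from the uploaded image; each later one
    would start from the last frame of the previous clip (continuation).
    """
    if total_seconds < 1:
        raise ValueError("total_seconds must be at least 1")
    segments = []
    remaining = total_seconds
    while remaining > 0:
        length = min(max_segment, remaining)  # never exceed the limit
        segments.append(length)
        remaining -= length
    return segments


if __name__ == "__main__":
    # A 40-second story needs three generations.
    print(plan_segments(40))
```

A greedy split like this can leave a very short final segment (for example, 31 seconds becomes 15 + 15 + 1), so in practice it may be worth evening out the segment lengths by hand.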

Model Specs

What you need to know about Grok Imagine Video 1.5

720p

Crisp output up to 720p (1280×704), plus a fast 480p draft mode for tests.

24 fps

Smooth cinematic motion at the standard 24 frames-per-second film rate.

1–15 seconds

Pick your clip length; defaults to about 8 seconds, longer via continuation.

7 aspect ratios

1:1, 16:9, 9:16, 4:3, 3:4, 3:2, and 2:3 — for any platform and orientation.

Native Audio

Speech, sound effects, and music are generated together with the video in one pass.

MP4 H.264

The finished clip is universal MP4 — ready for social, ads, and editing.
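The limits above can be written down as a small settings check. This is a minimal sketch that assumes only the numbers listed on this page; `ClipSettings`, `validate`, and `frame_count` are illustrative names, not part of any Clipia or xAI API, and the 8-second default stands in for the "about 8 seconds" mentioned above.

```python
from dataclasses import dataclass

# Limits copied from the spec list on this page.
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"}
RESOLUTIONS = {"480p", "720p"}  # 480p is the draft mode
MIN_SECONDS, MAX_SECONDS = 1, 15
FPS = 24


@dataclass(frozen=True)
class ClipSettings:
    duration_s: int = 8          # assumed default ("about 8 seconds")
    aspect_ratio: str = "16:9"
    resolution: str = "720p"


def validate(settings):
    """Return a list of problems; an empty list means the settings fit the listed limits."""
    problems = []
    if not MIN_SECONDS <= settings.duration_s <= MAX_SECONDS:
        problems.append(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio}")
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {settings.resolution}")
    return problems


def frame_count(settings):
    """Number of frames in the clip at the stated 24 fps."""
    return settings.duration_s * FPS


if __name__ == "__main__":
    vertical = ClipSettings(duration_s=10, aspect_ratio="9:16")
    print(validate(vertical), frame_count(vertical))
```

For example, a 10-second vertical clip passes the check and works out to 240 frames, while a 20-second request or a 21:9 ratio would be flagged.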

Create a Video in 3 Steps

From image to a clip with audio in a couple of minutes

1

Upload an Image

Choose the photo or frame you want to animate. It becomes the first frame of your clip.

2

Describe Motion & Sound

Tell the model what happens in the frame: action, camera, mood, lines, and sound. Set the duration and aspect ratio.

3

Generate & Download

Grok Imagine Video 1.5 builds a clip with built-in audio in seconds. Download in MP4 for any commercial use.

Built For

Four scenarios where Grok Imagine Video 1.5 shines

Social & Shorts

Vertical 9:16 clips for Reels, TikTok, and YouTube Shorts — with motion and audio out of the box.

Ads & Marketing

Product clips, promos, and ad creatives with sound — fast and without a film crew.

Photo Animation

Turn static shots, portraits, and artwork into living scenes with natural motion.

Audio Content

Talking characters, ambient soundscapes, and music clips: anywhere synced sound matters and silent video is not enough.

Why Clipia for Grok Imagine Video 1.5

The convenient way to access the xAI model

Instant Access

No waitlists, no API keys, no setup. Open the editor, upload a photo, and create right away.

Pay Per Result

Credits are charged per generation. There is no mandatory subscription, so you never pay for unused quota.

Commercial License

Every clip you create can be used in ads, client projects, and published content — no royalties.

Frequently Asked

Everything you need to know about Grok Imagine Video 1.5 on Clipia

It's a video generation model by xAI (Elon Musk's team) on the autoregressive Aurora engine. On Clipia it turns an uploaded image into a short clip with built-in audio — dialogue, effects, and background music.

The developer is xAI. The Grok Imagine Video 1.5 Preview build is dated May 30, 2026, and runs on the in-house Aurora engine.

Yes. Video and audio are created together in a single pass: speech, ambient sound, sound effects, and music are synced to the visuals, so no separate audio step is needed.

On Clipia, Grok Imagine Video 1.5 runs in image-to-video mode: you upload an image and the model animates it while preserving the composition and recognizability of the frame.

Up to 720p (1280×704) at 24 frames per second; there is also a fast 480p draft mode for tests.

From 1 to 15 seconds per generation (6–15 is ideal, the default is about 8). Longer scenes are built by continuing the clip.

Seven: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, and 2:3 — for landscape, vertical (Reels, TikTok, Shorts), and square clips.

Yes. The continuation feature takes the last frame of a clip and builds the scene further, preserving motion, character position, and lighting.

Grok Imagine Video 1.5 shipped in late May 2026 (build dated May 30, 2026). The Grok Imagine line debuted at #1 on the Artificial Analysis Video Arena (in both text-to-video and image-to-video), ahead of Veo 3.1, Kling 2.5 Turbo, and Runway Gen-4.5. The arena is a live leaderboard and changes over time.

Fine details — typography, fabric textures, complex patterns — may slightly drift during heavy motion, and very dense scenes are less stable than simple ones with clear action. The model is still in Preview status.
