MCP tool · generate_video (i2v)

Animate an image into video right in the chat

Give the agent a start frame and a motion description: generate_video with image_url turns the picture into a video. The agent waits for the result via wait_generation and shows it in the conversation.

The image-to-video scenario uses the same generate_video tool: as soon as the arguments include an image_url with an HTTPS link to a start frame, generation switches from text-to-video to animating the image. This is handy for product shots, portraits, and covers: upload a picture, describe the camera motion, and the agent assembles a short clip without leaving the chat.

Tool parameters

Parameter	Type	Req.	Description
`image_url`	string	yes	HTTPS link to the start frame
`prompt`	string	yes	Description of motion and scene
`model`	string	—	Slug of a video model with i2v support
`duration`	number (s)	—	Clip duration in seconds
`resolution`	string	—	480p / 720p / 1080p
`generate_audio`	bool	—	Add audio if the model supports it
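
The table above can be turned into a client-side pre-check before the tool is called. The sketch below is only an illustration: the function name `check_i2v_arguments` is hypothetical, and the server's own validation rules are not documented here, so it checks only what the table states (HTTPS start frame, non-empty prompt, the three listed resolutions, a positive duration).

```python
from urllib.parse import urlparse

# Resolution values listed in the parameter table.
ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}


def check_i2v_arguments(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid.

    Hypothetical helper: it mirrors the table only, not the server's real checks.
    """
    problems = []

    image_url = args.get("image_url")
    if not isinstance(image_url, str) or urlparse(image_url).scheme != "https":
        problems.append("image_url must be an HTTPS link")

    prompt = args.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    resolution = args.get("resolution")
    if resolution is not None and resolution not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 480p, 720p or 1080p")

    duration = args.get("duration")
    if duration is not None:
        is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
        if not is_number or duration <= 0:
            problems.append("duration must be a positive number of seconds")

    return problems
```

Running the check before the call lets an agent correct a bad link or an unsupported resolution without starting a generation.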

Request example (MCP tools/call)

{
  "name": "generate_video",
  "arguments": {
    "image_url": "https://example.com/product.jpg",
    "prompt": "gentle parallax, camera slowly pushes in, soft light",
    "resolution": "720p",
    "duration": 5
  }
}
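
MCP clients normally build the wire message for you. For anyone debugging the transport directly, the payload above sits in the `params` of a JSON-RPC 2.0 request whose method is `tools/call`. The following is a minimal sketch; the helper name `build_tools_call` and the `id` value are illustrative, not part of Clipia's documentation.

```python
import json


def build_tools_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Wrap a tool invocation in a JSON-RPC 2.0 'tools/call' request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,  # any client-chosen id; echoed back in the response
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Usage: the same arguments as in the request example above.
body = build_tools_call(
    1,
    "generate_video",
    {
        "image_url": "https://example.com/product.jpg",
        "prompt": "gentle parallax, camera slowly pushes in, soft light",
        "resolution": "720p",
        "duration": 5,
    },
)
```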

Response example

{
  "request_id": "c4d9...1fa",
  "status": "IN_PROGRESS",
  "model": "seedance-2-fast-i2v",
  "cost_credits": 12,
  "next_step": "call wait_generation with this request_id"
}
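
The response's next_step hint can be followed with a small polling loop. The sketch below makes several assumptions that the page does not confirm: `call_tool` is a hypothetical function supplied by your MCP client, wait_generation is assumed to take a `request_id` argument and return an object with a `status` field, and any status other than IN_PROGRESS is treated as final because the terminal status names are not listed here.

```python
from typing import Callable


def wait_for_video(
    call_tool: Callable[[str, dict], dict],
    request_id: str,
    max_calls: int = 20,
) -> dict:
    """Call wait_generation repeatedly until the status leaves IN_PROGRESS.

    call_tool is a hypothetical transport: (tool_name, arguments) -> result dict.
    """
    for _ in range(max_calls):
        result = call_tool("wait_generation", {"request_id": request_id})
        # Assumption: anything other than IN_PROGRESS is a final state.
        if result.get("status") != "IN_PROGRESS":
            return result
    raise TimeoutError(
        f"generation {request_id} still in progress after {max_calls} calls"
    )
```

No sleep is needed between iterations if each wait_generation call long-polls on the server side. At up to 30 s per call, the default of 20 calls caps the total wait at roughly 10 minutes.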

One tool — two modes

generate_video switches to image-to-video automatically when image_url is passed. No separate tool is needed.

Product shots in motion

Animate a product photo or a cover for Reels and Shorts right from the agent chat.

Verify without charges

A clipia_test_* key returns a mock response instantly, so you can debug the scenario before spending credits.

Connect in a minute

claude mcp add --transport http clipia https://mcp.clipia.ai/mcp \
  --header "Authorization: Bearer clipia_live_XXXX"

For claude.ai and ChatGPT, no key is needed: sign in with your account. For other clients, create a key in Clipia settings.
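
For clients that send the key themselves, it can help to check which kind of key is in use before a request goes out. The sketch below relies only on the two prefixes shown on this page (clipia_test_ and clipia_live_) and on the Bearer header form from the CLI command above; the function names are hypothetical, and other key formats may exist.

```python
def key_mode(api_key: str) -> str:
    """Classify a key by its prefix: 'test' (mock responses) or 'live'."""
    if api_key.startswith("clipia_test_"):
        return "test"
    if api_key.startswith("clipia_live_"):
        return "live"
    # Only the two prefixes shown on this page are recognized here.
    raise ValueError("unrecognized key prefix")


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the same form as the CLI command."""
    return {"Authorization": f"Bearer {api_key}"}
```

A test suite could, for example, assert `key_mode(key) == "test"` at startup so that it never runs against a live key by accident.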

Pricing

The credit cost is returned when the generation starts, in the cost_credits field; image-to-video is billed as video generation.

What start frame do I need?

Any public HTTPS link to an image. The agent passes it in image_url.

How does it differ from text-to-video?

Same tool, but with image_url: the video is built from your frame rather than from scratch.

How long to wait?

Same as text-to-video: the agent waits for the status via wait_generation, and each call long-polls for up to 30 s.
