Rate limits

Per-key RPM and concurrency limits, the RateLimit-* response headers, and how to handle a 429 response.

Each API key is capped at 120 requests per minute (RPM) and 10 concurrent generations by default; both values are configurable. When you exceed the RPM cap, the request is rejected with 429 rate_limit_exceeded and a Retry-After header telling you how many seconds to wait before retrying.

Default limits

Requests per minute (RPM): 120 per API key

Concurrent generations: 10 per API key

Response headers

The RateLimit-* headers are returned on every response, so you can track your remaining budget without waiting for a 429.

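RateLimit-Limit: the RPM cap that applies to the key (120 by default).

RateLimit-Remaining: the number of requests left before the cap is reached.

RateLimit-Reset: the number of seconds until the budget resets.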
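As a rough illustration of tracking the remaining budget, the sketch below parses the three RateLimit-* headers from a response's header mapping. The names RateLimitBudget, parse_rate_limit_headers and should_slow_down, and the 10% threshold, are hypothetical and not part of the API; the sketch also assumes the header values are plain integers, as in the 429 example on this page.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitBudget:
    limit: int          # RateLimit-Limit
    remaining: int      # RateLimit-Remaining
    reset_seconds: int  # RateLimit-Reset (assumed to be seconds)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitBudget]:
    """Return the budget described by the RateLimit-* headers, or None if
    any of them is missing or not an integer."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitBudget(
            limit=int(lowered["ratelimit-limit"]),
            remaining=int(lowered["ratelimit-remaining"]),
            reset_seconds=int(lowered["ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None


def should_slow_down(budget: RateLimitBudget, threshold: float = 0.1) -> bool:
    """Hypothetical policy: back off once less than `threshold` of the
    budget is left, instead of waiting for a 429."""
    return budget.remaining < budget.limit * threshold
```

A caller could run parse_rate_limit_headers on each response and pause new requests for reset_seconds whenever should_slow_down returns True.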

The 429 response

HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 120
RateLimit-Remaining: 0
RateLimit-Reset: 37
Retry-After: 37

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please retry later."
  }
}

Honour Retry-After

On a 429, wait at least the number of seconds given in Retry-After (or RateLimit-Reset) before retrying. A retry sent sooner is rejected again, and the server-supplied value is more reliable than a fixed delay.
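One way to follow this advice is sketched below. It is transport-agnostic: send is a hypothetical callable that performs the request and returns a (status, headers, body) tuple, so any HTTP client can be plugged in. The function names, the 5-attempt cap and the 1-second fallback are assumptions for illustration, not documented behaviour.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def retry_delay(headers: Mapping[str, str], fallback: float = 1.0) -> float:
    """Seconds to wait after a 429: Retry-After if present and numeric,
    otherwise RateLimit-Reset, otherwise `fallback`."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("retry-after", "ratelimit-reset"):
        try:
            return max(0.0, float(lowered[name]))
        except (KeyError, ValueError):
            continue  # header missing or not a number of seconds
    return fallback


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns something other than 429 or the
    attempt budget is used up; the last response is returned either way."""
    response = send()
    attempts = 1
    while response[0] == 429 and attempts < max_attempts:
        # Wait as long as the server asked before trying again.
        sleep(retry_delay(response[1]))
        response = send()
        attempts += 1
    return response
```

The sleep parameter exists only so the wait can be replaced in tests; production code would leave it at time.sleep.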

Concurrency is capped separately

The 10 concurrent-generation limit is independent of the RPM cap. If you run many long video jobs, queue them on your side so you don't hit the concurrency ceiling.
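Client-side queueing could look like the sketch below, which uses a thread pool whose size matches the concurrency cap so that extra jobs wait in the pool's queue. run_generations and generate are hypothetical names, and the sketch assumes generate(job) blocks until that generation has finished; if your client submits a job and polls for completion separately, the polling would need to happen inside generate.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Default per-key ceiling stated on this page; adjust if your key differs.
MAX_CONCURRENT_GENERATIONS = 10


def run_generations(
    jobs: Iterable[T],
    generate: Callable[[T], R],
    max_concurrent: int = MAX_CONCURRENT_GENERATIONS,
) -> List[R]:
    """Run generate(job) for every job, with at most `max_concurrent`
    calls in flight at once. Results come back in input order."""
    # Only `max_concurrent` worker threads exist, so the remaining jobs
    # sit in the executor's internal queue until a worker is free.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(generate, jobs))
```

This caps concurrency only within one process; several workers sharing the same key would need a shared limiter instead.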
