Seedance 2.0 produces some of the best AI-generated video available today, and you do not have to use it through a web interface. Several API platforms now offer programmatic access to the model, letting you integrate video generation directly into your applications, automate content pipelines, or build entirely new products on top of it. This guide covers the available API platforms, walks through Python integration, and provides working code patterns you can adapt to your projects.
If you are new to Seedance 2.0, start with our step-by-step tutorial to understand the model's capabilities before diving into the API.
Available API Platforms
Seedance 2.0 is accessible through multiple API providers, each with different strengths depending on your use case.
Volcano Engine (ByteDance Official)
ByteDance's own cloud platform hosts Seedance 2.0 through the Volcano Engine API. This is the most direct path to the model with the lowest latency and full feature support. Access expanded significantly after February 24, 2025, and the API is now available to developers with a Volcano Engine account. The official API supports all generation modes including text-to-video, image-to-video, and multi-reference input.
FAL.ai
FAL.ai offers serverless GPU inference with a straightforward developer experience. You pay per second of compute with no idle costs, making it well-suited for variable workloads. FAL provides Python and JavaScript SDKs, webhook support for async results, and built-in queue management. This is often the fastest way to get started for developers outside China.
Higgsfield
Higgsfield provides a unified API layer across multiple video generation models including Seedance 2.0. Their platform handles queue management, automatic retries, and fallback routing between providers. This is useful if you want to offer users a choice of models or need reliability guarantees across providers.
Atlas Cloud
Atlas Cloud offers GPU-backed API endpoints designed for high-volume production workloads. Their infrastructure is optimized for batch processing with dedicated capacity options, making it a good fit for enterprise pipelines that need consistent throughput rather than pay-per-use flexibility.
For a breakdown of costs across these platforms, see our pricing page.
Getting Started with Python (FAL.ai)
FAL.ai provides the most developer-friendly onboarding experience, so we will use it for the primary examples. The patterns translate easily to other providers.
Installation and Authentication
Install the FAL client library:
pip install fal-client

Set your API key as an environment variable:

export FAL_KEY="your-api-key-here"

Or configure it in your Python code:

import fal_client

fal_client.api_key = "your-api-key-here"

Basic Text-to-Video Request
Here is the simplest possible video generation — a text prompt submitted to the Seedance 2.0 endpoint:
import fal_client
result = fal_client.subscribe(
    "fal-ai/seedance-2.0",
    arguments={
        "prompt": (
            "A woman walks through a neon-lit Tokyo alley at night, "
            "camera follows from behind in a steady tracking shot, "
            "rain reflecting neon signs on wet pavement, cinematic 2K"
        ),
        "duration": 5,
        "aspect_ratio": "16:9",
    },
    with_logs=True,
)

video_url = result["video"]["url"]
print(f"Video ready: {video_url}")

The subscribe method submits the request and polls until completion. For production systems, you will want to use async patterns instead, which we cover below.
For prompt writing techniques that produce the best results with this API, browse our prompt guide.
API Capabilities
Seedance 2.0 exposes several generation modes through the API, each serving different creative and production needs.
Text-to-Video
Generate a video entirely from a text description. This is the most common mode and supports the full prompt syntax including temporal cues like [0-3s] for multi-shot sequences.
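As a quick illustration, a multi-shot prompt built with temporal cues might look like the following. The scene content is invented for illustration, and the bracketed time ranges are part of the prompt string itself, not separate API parameters:

```python
# Hypothetical multi-shot prompt using the [start-end s] temporal-cue syntax.
# The bracketed ranges live inside the prompt text; the API receives one string.
prompt = (
    "[0-3s] Wide establishing shot of a coastal village at dawn, soft mist. "
    "[3-6s] Close-up of a fisherman coiling rope on the dock, shallow focus. "
    "[6-9s] Drone shot pulling back over the harbor as sunlight breaks through."
)
```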
Image-to-Video
Upload a reference image and generate a video that starts from or incorporates that image. Useful for animating still photos, product shots, or character designs.
Video-to-Video
Submit an existing video clip as input and generate a new version with style transfer, motion changes, or scene modifications applied.
Audio-Synced Generation
Include audio references to generate video synchronized to music, voiceover, or sound effects. The model aligns visual motion and lip movements to the audio timeline.
Multi-Reference Input
Attach up to 12 reference files (images, videos, audio) to a single generation request. This enables complex creative briefs with character consistency, style matching, and audio synchronization all in one call.
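A small helper can enforce the 12-file ceiling before a request leaves your application. Note that the field names below ("reference_urls", "audio_url") are illustrative assumptions, not confirmed API parameters; check your provider's request schema for the actual keys:

```python
def build_multi_reference_args(prompt, reference_urls, audio_url=None, max_refs=12):
    """Assemble the arguments dict for a multi-reference request.

    NOTE: "reference_urls" and "audio_url" are assumed field names used
    for illustration; consult your provider's schema for the real keys.
    """
    total = len(reference_urls) + (1 if audio_url is not None else 0)
    if total > max_refs:
        raise ValueError(f"At most {max_refs} reference files per request, got {total}")
    args = {"prompt": prompt, "reference_urls": list(reference_urls)}
    if audio_url is not None:
        args["audio_url"] = audio_url
    return args
```

Counting the audio reference against the same cap keeps the check in one place, whatever mix of files the request carries.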
Batch Processing
Submit multiple generation requests in parallel for high-volume production workflows. Most API providers support concurrent request limits that scale with your plan tier.
Python Code Examples
Image-to-Video with Reference
Upload a reference image and animate it with a text prompt:
import fal_client
image_url = fal_client.upload_file("./character-reference.png")
result = fal_client.subscribe(
    "fal-ai/seedance-2.0",
    arguments={
        "prompt": (
            "The person in the reference image walks along a beach "
            "at sunset, medium shot, warm golden light, gentle "
            "ocean breeze moves their hair, waves lap at the shore"
        ),
        "image_url": image_url,
        "duration": 10,
        "aspect_ratio": "16:9",
    },
)

print(f"Video ready: {result['video']['url']}")

Async Generation with Webhooks
For production systems, avoid blocking on each request. Use the queue-based approach to submit work and receive results asynchronously:
import fal_client
request = fal_client.submit(
    "fal-ai/seedance-2.0",
    arguments={
        "prompt": (
            "A drone shot rising above a misty mountain forest "
            "at dawn, revealing a hidden lake reflecting the "
            "pink and orange sky, cinematic 2K"
        ),
        "duration": 10,
        "aspect_ratio": "16:9",
    },
    webhook_url="https://your-app.com/api/webhooks/fal",
)

request_id = request.request_id
print(f"Submitted: {request_id}")

Check the status of a pending request:
status = fal_client.status(
    "fal-ai/seedance-2.0",
    request_id,
    with_logs=True,
)
print(f"Status: {status}")

Or retrieve the result once complete:
result = fal_client.result(
    "fal-ai/seedance-2.0",
    request_id,
)
print(f"Video URL: {result['video']['url']}")

Error Handling and Retries
Robust production code needs to handle transient failures gracefully:
import time
import fal_client
def generate_video(prompt, max_retries=3, retry_delay=5):
    for attempt in range(max_retries):
        try:
            result = fal_client.subscribe(
                "fal-ai/seedance-2.0",
                arguments={
                    "prompt": prompt,
                    "duration": 5,
                    "aspect_ratio": "16:9",
                },
                with_logs=True,
            )
            return result["video"]["url"]
        except fal_client.FalServerError as e:
            if e.status_code == 429:
                # Exponential backoff: 5s, 10s, 20s, ...
                wait = retry_delay * (2 ** attempt)
                print(f"Rate limited. Retrying in {wait}s...")
                time.sleep(wait)
                continue
            raise
        except fal_client.FalTimeoutError:
            if attempt < max_retries - 1:
                print(f"Timeout on attempt {attempt + 1}. Retrying...")
                continue
            raise
    raise RuntimeError("Max retries exceeded")

Batch Processing Pipeline
Process multiple prompts efficiently with concurrent requests:
import asyncio
import fal_client
prompts = [
"A cat stretches lazily on a sunlit windowsill, dust motes floating in the light beam, close-up shot",
"Aerial drone shot of a winding river through autumn forest, golden and red leaves, morning mist",
"A barista pours latte art in slow motion, creamy milk forming a leaf pattern, macro lens, studio lighting",
]
async def generate_batch(prompts, max_concurrent=3):
    semaphore = asyncio.Semaphore(max_concurrent)

    async def process_one(prompt, index):
        async with semaphore:
            print(f"Starting generation {index + 1}/{len(prompts)}")
            result = await fal_client.subscribe_async(
                "fal-ai/seedance-2.0",
                arguments={
                    "prompt": prompt,
                    "duration": 5,
                    "aspect_ratio": "16:9",
                },
            )
            url = result["video"]["url"]
            print(f"Completed {index + 1}: {url}")
            return {"prompt": prompt, "url": url}

    tasks = [
        process_one(prompt, i)
        for i, prompt in enumerate(prompts)
    ]
    return await asyncio.gather(*tasks, return_exceptions=True)

results = asyncio.run(generate_batch(prompts))
for r in results:
    if isinstance(r, Exception):
        print(f"Failed: {r}")
    else:
        print(f"Success: {r['url']}")

Rate Limits and Best Practices
API rate limits vary by provider and plan tier. Here are the general guidelines:
- FAL.ai: Concurrent request limits scale with your account tier. Free tier supports 1-2 concurrent requests. Paid tiers scale higher.
- Volcano Engine: Rate limits are documented in the official API console and depend on your service agreement.
- General pattern: Most providers return HTTP 429 when you hit the limit. Implement exponential backoff as shown in the error handling example above.
Best Practices for Production
Use webhooks instead of polling. Long-polling ties up connections and wastes resources. Submit requests with a webhook URL and let the provider push results to you when ready.
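On the receiving side, a webhook handler only needs to parse the callback body and hand the video URL to your pipeline. The payload shape below (request_id, status, payload.video.url) is an assumption for illustration; verify the exact fields against your provider's webhook documentation:

```python
import json

def handle_generation_webhook(raw_body: bytes):
    """Parse a completion callback; return (request_id, video_url or None).

    The field names used here are assumptions for illustration -- confirm
    them against your provider's documented webhook payload.
    """
    event = json.loads(raw_body)
    request_id = event.get("request_id")
    if event.get("status") != "OK":
        return request_id, None  # failed or still-pending notification
    video = event.get("payload", {}).get("video", {})
    return request_id, video.get("url")
```

Wire this into whatever HTTP framework serves your webhook endpoint, and return a 2xx response quickly so the provider does not re-deliver the event.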
Cache generated videos. Store the resulting video URLs and associate them with the input parameters. If a user requests the same generation twice, serve the cached result.
Validate prompts client-side. Check prompt length (max 5,000 characters) and file count (max 12 references) before submitting to the API to avoid wasting credits on rejected requests.
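A minimal client-side check along these lines can reject oversized requests before they cost credits; the limits below are the ones quoted above:

```python
MAX_PROMPT_CHARS = 5000  # prompt length limit quoted above
MAX_REFERENCES = 12      # reference-file limit quoted above

def validate_request(prompt, reference_files=()):
    """Return a list of validation errors; an empty list means it looks OK."""
    errors = []
    if not prompt.strip():
        errors.append("Prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        errors.append(f"Prompt is {len(prompt)} chars; limit is {MAX_PROMPT_CHARS}")
    if len(reference_files) > MAX_REFERENCES:
        errors.append(f"{len(reference_files)} reference files; limit is {MAX_REFERENCES}")
    return errors
```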
Set reasonable timeouts. Video generation takes 30 seconds to several minutes depending on duration and complexity. Set your HTTP client timeout accordingly, or use the async queue pattern.
Cost Optimization Tips
Video generation credits add up quickly in production. These strategies help control costs.
Start with shorter durations for iteration. Generate 5-second clips during development and testing. Switch to 10-second clips only for final production output.
Use lower resolution for drafts. If the provider supports resolution selection, use standard resolution for prompt iteration and switch to 2K only for final renders.
Batch similar requests. Group generation requests by type and submit them during off-peak hours if your provider offers time-based pricing.
Cache aggressively. Implement a content-addressable cache keyed on prompt hash + parameters. Many production workflows regenerate identical or near-identical content frequently.
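One way to sketch such a key is to hash the prompt together with the generation parameters, so identical requests always map to the same stored video:

```python
import hashlib
import json

def cache_key(prompt, **params):
    """Deterministic content-addressable key for a generation request.

    Serializing with sort_keys=True makes the key independent of the
    order in which parameters are passed.
    """
    canonical = json.dumps({"prompt": prompt, **params}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Look the key up before submitting; on a hit, serve the stored URL instead of regenerating.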
Monitor usage with alerts. Set up spending alerts and daily budget caps to prevent runaway costs from bugs or unexpected traffic spikes.
For detailed cost breakdowns across providers, visit our pricing page.
FAQ
Is the API publicly available? Yes. FAL.ai and other third-party providers offer public API access. The official Volcano Engine API requires a Volcano Engine account. Access and availability continue to expand.
What video formats does the API return? Most providers return MP4 files via a download URL. The URL is typically valid for 24 to 72 hours depending on the provider.
Can I use the API for commercial projects? Terms vary by provider. Check the specific terms of service for your chosen API platform. Most providers allow commercial use on paid plans.
What is the maximum video duration? The API supports clips of 5 to 15 seconds depending on the generation mode and provider configuration.
How long does generation take? Typical generation time ranges from 30 seconds to 3 minutes depending on duration, resolution, and current queue depth. Use async patterns to avoid blocking your application.
Can I fine-tune the model through the API? Fine-tuning is not currently available through third-party API providers. Custom model training may become available through the official Volcano Engine platform in the future.
For more answers, visit our FAQ page. To see what others are building with the API, explore our use cases gallery.
Next Steps
You now have everything you need to start building with the Seedance 2.0 API. For best results, pair your API integration with effective prompts from our prompt guide, and refer to our tutorial for a deeper understanding of the model's capabilities and the @ reference system.
Start with the basic text-to-video example, verify that your authentication and pipeline work end to end, then expand to image-to-video and batch processing as your application requires.