LipSync API
Overview
Create LipSync Or Talking Photos
How it works
1. Host your assets at a publicly accessible URL
   Upload your video, photo, and audio files so our servers can retrieve them.
2. Send an API request with the appropriate parameters
   Reference your hosted assets and specify your desired mode (Standard or Precision).
3. Wait or query status
   Use our webhook callback or poll the API with your job ID until processing is complete.
4. Download the video output
   Retrieve the finished talking photo or lip‑synced video from the provided URL.
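The polling branch of step 3 can be sketched as below. This is a minimal illustration and not the official client. The response fields `status` and `video_url`, the status values `completed` and `failed`, and the `fetch_status` callable (which would wrap your own HTTP request to the status endpoint) are all assumptions, since this page does not specify the response schema.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    poll_interval_s: float = 30.0,
    timeout_s: float = 3 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it finishes and return the output video URL.

    fetch_status is a caller-supplied function that queries the API for
    the given job ID and returns the decoded response as a dict.
    The keys "status", "video_url" and "error" are hypothetical names.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "completed":  # hypothetical status value
            return job["video_url"]  # hypothetical field name
        if status == "failed":  # hypothetical status value
            raise RuntimeError(f"Job {job_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {timeout_s} s")
        sleep(poll_interval_s)
```

The three-hour default timeout is chosen to sit above the longest documented wait (up to 120 minutes in the queue plus roughly 20 minutes of Precision Mode processing). Adjust it to your own tolerance.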
Usage Limitations:
- You may have up to 5 concurrent jobs (including queued requests).
- Only single‑face videos or photos are supported.
- Estimated queue time: 1–120 minutes, depending on system load.
- Standard Mode processing time: ~10 minutes.
- Precision Mode processing time: ~20 minutes.
If a video or photo contains multiple faces, only the largest detected face will be lip‑synced.
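One way to stay within the limit of 5 concurrent jobs on the client side is to cap the number of in-flight submissions. The sketch below assumes a hypothetical `run_one` callable that submits a single job and blocks until that job has finished, so each worker thread holds exactly one job slot at a time. It is an illustration, not a required pattern.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Taken from the usage limits above: at most 5 concurrent jobs,
# including queued requests.
MAX_CONCURRENT_JOBS = 5


def run_jobs(
    items: Iterable[T],
    run_one: Callable[[T], R],
    max_concurrent: int = MAX_CONCURRENT_JOBS,
) -> List[R]:
    """Run run_one over items with at most max_concurrent calls in flight.

    run_one is a hypothetical caller-supplied function that submits one
    job and returns only once that job is complete.
    Results are returned in the same order as the input items.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(run_one, items))
```

Because the limit counts queued requests too, the cap only holds if `run_one` does not return before its job is done. If other processes share the same API key, they consume the same 5 slots.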
Error Codes
| Code | Description |
|---|---|
| 5 | Invalid request parameters. |
| 104 | Insufficient credits. |
| 1301 | Face recognition failed. Ensure a single identifiable real face is in the image. |
| 1302 | API key has been revoked. |
| 1304 | API key has reached the maximum number of concurrent requests. |
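The error codes above could be handled in client code along these lines. This is a sketch under assumptions: the page does not say where the code appears in a response, so the `code` key is hypothetical, and treating 1304 as retryable is a judgment call (the concurrency limit should clear once running jobs finish) and not documented behavior.

```python
from typing import Any, Dict

# Descriptions copied from the error code table.
ERROR_DESCRIPTIONS: Dict[int, str] = {
    5: "Invalid request parameters.",
    104: "Insufficient credits.",
    1301: "Face recognition failed. Ensure a single identifiable real face is in the image.",
    1302: "API key has been revoked.",
    1304: "API key has reached the maximum number of concurrent requests.",
}

# Assumption: only the concurrency limit is worth retrying after a delay.
RETRYABLE_CODES = frozenset({1304})


class LipSyncError(Exception):
    """Raised when a response carries one of the documented error codes."""

    def __init__(self, code: int) -> None:
        self.code = code
        self.retryable = code in RETRYABLE_CODES
        super().__init__(f"[{code}] {ERROR_DESCRIPTIONS[code]}")


def raise_for_error(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Raise LipSyncError for a documented error code, else return payload.

    The "code" key is a hypothetical field name. Codes not listed in the
    table are passed through unchanged because their meaning is unknown.
    """
    code = payload.get("code")
    if code in ERROR_DESCRIPTIONS:
        raise LipSyncError(code)
    return payload
```

A caller might catch `LipSyncError`, wait and resubmit when `retryable` is true, and surface the message to the user otherwise.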