POST /api/speak
Synthesize voice.
Caution
If you set a callback-url, we'll send data to it using a POST request in JSON format.
Use the embed URL (share.embed_url) from the callback response.
Make sure result.speak_url in the /api/speak response matches the speak_url in the callback data.
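The check described in this caution can be sketched in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and the layout of the callback JSON (a top-level speak_url key and an embed URL nested under share.embed_url) is inferred from the wording above, not shown on this page.

```python
from typing import Optional


def callback_matches_request(speak_response: dict, callback_data: dict) -> bool:
    """Return True when the callback refers to the speak that was requested.

    Compares result.speak_url from the /api/speak response with the
    speak_url field of the callback data (assumed to be a top-level key).
    """
    expected = speak_response.get("result", {}).get("speak_url")
    return expected is not None and callback_data.get("speak_url") == expected


def embed_url_from_callback(callback_data: dict) -> Optional[str]:
    """Read share.embed_url from the callback data, or None if absent."""
    return callback_data.get("share", {}).get("embed_url")
```

A callback handler could call callback_matches_request first and ignore any payload for which it returns False.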
Request
Headers
Parameters
None
Body as JSON Object
Key | Type | Required | Description |
---|---|---|---|
text | string | Yes | The text to synthesize into speech. |
tts_mode | string | Yes | The mode used to generate the speech. One of ‘actor’ or ‘audio_file’. Default is ‘actor’. |
actor_id | string | Yes | The character ID, which is required only when tts_mode is ‘actor’. Retrieve your character from the Actor API. |
speak_resource_id | string | Yes | The speak resource ID, which is required only when tts_mode is ‘audio_file’. Retrieve speak_resource_id from the Resource API. |
lang | string | Yes | Language code of the |
xapi_hd | bool | No | Specify the sample rate. If set to true, you’ll get high-quality audio (44.1 kHz). Default is false (16 kHz). |
xapi_audio_format | string | No | Specify the audio format. If set to “mp3”, you’ll get an MP3 file. Default is “wav”. |
model_version | string | No | Specify a model version name or alias. Refer to the character version API. Use “latest” for the latest model. |
emotion_tone_preset | string | No | Specify an emotion. Retrieve the emotions available for your character from the Actor API using the actor ID. |
emotion_prompt | string | No | Specify a custom emotion in natural language (Korean or English). Use when |
volume | int | No | Specify the audio volume. Set to 50 for half volume (0.5x) or 200 for double volume (2x). Default is 100. |
speed_x | float | No | Control the speaking speed. Values: 1.5 (slow), 1 (normal), or 0.5 (fast). Default is 1. |
tempo | float | No | Control the voice playback speed. Range: 0.5 (0.5x, slow) to 2.0 (2x, fast). Default is 1.0. |
pitch | int | No | Control the voice pitch. Range: -12 to 12. A value of 1 corresponds to one semitone. For example, 4 means 4 semitones higher than the original voice. Default is 0. |
max_seconds | float | No | Limit the maximum length of the synthesized speech (1 to 60 seconds). Default is 30. Refer to the limit length and fixed duration documents. |
 | float | No | Define the length of the synthesized speech (1 to 60 seconds). Use default value for |
 | int | No | Control the pitch at the end of the sentence. Values: -2 (lowest), -1 (low), 0 (normal), 1 (high), or 2 (highest). |
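The conditional requirements and numeric ranges in the table can be checked on the client before sending a request. The following Python sketch is hypothetical (validate_speak_body is not part of the API) and covers only a few rules. It assumes the key names used in the cURL example on this page (pitch, tempo, max_seconds) correspond to the pitch, playback-speed, and maximum-length rows of the table.

```python
def validate_speak_body(body: dict) -> list:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    if not body.get("text"):
        problems.append("text is required")

    # tts_mode defaults to 'actor' according to the table.
    mode = body.get("tts_mode", "actor")
    if mode == "actor":
        if not body.get("actor_id"):
            problems.append("actor_id is required when tts_mode is 'actor'")
    elif mode == "audio_file":
        if not body.get("speak_resource_id"):
            problems.append(
                "speak_resource_id is required when tts_mode is 'audio_file'"
            )
    else:
        problems.append("tts_mode must be 'actor' or 'audio_file'")

    # Documented ranges; the key-to-row mapping is an assumption.
    ranges = {"pitch": (-12, 12), "tempo": (0.5, 2.0), "max_seconds": (1, 60)}
    for key, (low, high) in ranges.items():
        if key in body and not low <= body[key] <= high:
            problems.append(f"{key} must be between {low} and {high}")
    return problems
```

An empty list means the body passed these checks; the server remains the authority on what is actually accepted.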
Example with cURL
curl --request POST \
  --url https://typecast.ai/api/speak \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $API_TOKEN" \
  --data '{
    "text": "My name is Juncheol.",
    "lang": "auto",
    "model_version": "latest",
    "emotion_tone_preset": "${emotion}",
    "actor_id": "${24-letters-your_actor_id}",
    "xapi_hd": true,
    "xapi_audio_format": "mp3",
    "max_seconds": 20,
    "volume": 100,
    "speed_x": 1,
    "tempo": 1,
    "pitch": 0
  }'
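For readers who prefer Python, the same request can be sketched with the standard library alone. The function names build_speak_request and send_speak_request are hypothetical; the URL, headers, and body keys are taken from the cURL example above. The request is only sent when send_speak_request is called.

```python
import json
import urllib.request


def build_speak_request(body: dict, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /api/speak."""
    return urllib.request.Request(
        "https://typecast.ai/api/speak",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


def send_speak_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Serializing the body with json.dumps avoids hand-written JSON mistakes such as a trailing comma after the last key.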
Check out an example of using a custom emotion with emotion_prompt in the following guide:
Advanced speech synthesis: Apply custom emotion in your script
Response
Status Code
Status Code | Description |
---|---|
401 | |
400 | JSON object representing an error. See how the error looks. |
429 | JSON object representing an error indicating that the request limit has been exceeded. See how the error looks. |
200 | JSON object containing the result. |
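A client usually branches on these status codes. The sketch below is one possible mapping, not prescribed behavior: the function name and the returned labels are hypothetical. Because this page gives no description for 401, the sketch groups it with 400 as something the caller must correct, following the general HTTP meaning of 401 (Unauthorized).

```python
def classify_status(status_code: int) -> str:
    """Map an /api/speak status code to a coarse client-side action."""
    if status_code == 200:
        return "ok"            # body contains the result object
    if status_code == 429:
        return "retry_later"   # request limit exceeded
    if status_code in (400, 401):
        return "fix_request"   # bad body or credentials (401 meaning assumed)
    return "unexpected"        # any code not listed on this page
```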
result consists of the following
Key | Description |
---|---|
speak_v2_url | URL to view detailed information about the created speak. |
speak_url | (Deprecated) URL to view details about the speak. |
Example
{
  "result": {
    "speak_v2_url": "https://typecast.ai/api/speak/v2/{your-speak-id}",
    "speak_url": "https://typecast.ai/api/speak/{your-speak-id}"
  }
}
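Picking the URL out of a 200 response could look like the following Python sketch. The helper name is hypothetical, and preferring speak_v2_url rests on the assumption that speak_url is the deprecated key of the two.

```python
def speak_details_url(response_body: dict) -> str:
    """Return the URL for the created speak from a 200 response body.

    Prefers speak_v2_url and falls back to speak_url (assumed deprecated).
    Raises KeyError if the body has no result or neither URL.
    """
    result = response_body["result"]
    return result.get("speak_v2_url") or result["speak_url"]
```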
error_codes in 400 response
Error Code | Description |
---|---|
app/param/not-enough | Some required fields are not included in the request body. |
 | The length of |
 | |
Example
{
  "message": {
    "msg": "need params as actor_id",
    "error_code": "app/param/not-enough"
  }
}
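An error body of this shape can be unpacked as in the sketch below. The helper name parse_error is hypothetical; the message, msg, and error_code keys come from the example above.

```python
def parse_error(response_body: dict) -> tuple:
    """Return (error_code, msg) from an error response body.

    Either element is None when the corresponding key is missing.
    """
    message = response_body.get("message", {})
    return message.get("error_code"), message.get("msg")
```

Branching on error_code, not on the free-text msg, keeps client logic stable if the wording of the message changes.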