# **POST** /api/speak

Synthesize voice.

```{caution}
If you set a `callback-url`, we'll send data to it using a `POST` request in JSON format. Use the embed URL (`share.embed_url`) from the callback response. Make sure `result.speak_url` in the `/api/speak` response matches the `speak_url` in the callback data.
```

## Request

### Headers

* [Required headers](reference/request.md#headers)

### Parameters

* None

### Body as JSON Object

| Key                       | Type   | Required | Description |
|---------------------------|--------|----------|-------------|
| **`text`**                | string | Yes      | The sentence to synthesize. |
| **`tts_mode`**            | string | Yes      | The mode used to generate speech. One of 'actor' or ['audio_file' mode](../advanced/voice_cloning.md). Default is 'actor'. |
| **`actor_id`**            | string | Yes      | The character ID, required only when `tts_mode` is 'actor'. Retrieve your character from the [Actor API](reference/get_api_actor.md#request). |
| **`speak_resource_id`**   | string | Yes      | The speak resource ID, required only when `tts_mode` is 'audio_file'. Retrieve `speak_resource_id` from the [Resource API](reference/post_api_speak_resource.md#request). |
| **`lang`**                | string | Yes      | Language code of the `text`. In 'actor' mode, the available language codes are ['en-us', 'ko-kr', 'ja-jp', 'es-es', 'zh-cn', 'auto']. Use `auto` for automatic language detection from the text. When using `tts_mode: audio_file`, the available language codes are ['en-us', 'ko-kr', 'ja-jp', 'zh-cn', 'fr-fr', 'de-de', 'es-es', 'pt-pt', 'it-it', 'ru-ru', 'ta-in', 'ar-sa', 'bg-bg', 'uk-ua', 'hr-hr', 'cs-cz', 'pl-pl', 'sk-sk', 'fi-fi', 'ro-ro', 'el-gr', 'nl-nl', 'sv-se', 'tl-ph', 'id-id', 'auto']. |
| **`xapi_hd`**             | bool   | No       | Specify sample rate. If set to true, you'll get high-quality audio (44.1 kHz). Default is false (16 kHz). |
| **`xapi_audio_format`**   | string | No       | Specify audio format. If set to "mp3", you'll get an mp3 format file. Default is "wav" format. |
| **`model_version`**       | string | No       | Specify a model version name or alias. Refer to the [character version API](reference/get_api_actor_oid_versions.md#request). Use "latest" for the latest model. |
| **`emotion_tone_preset`** | string | No       | Specify an emotion. Retrieve available emotions for your character from the [Actor API with actor id](reference/get_api_actor_oid_versions.md#response). |
| **`emotion_prompt`**      | string | No       | Specify a custom emotion in natural language (Korean or English). Use when `emotion_tone_preset` is `emotion-prompt`. Ensure `emotion_prompt` is activated for your selected character. |
| **`volume`**              | int    | No       | Specify audio volume. Set to 50 for 0.5x down, 200 for 2x up. Default is 100. |
| **`speed_x`**             | float  | No       | Control speaking speed. Values: 1.5 (slow), 1 (normal), or 0.5 (fast). Default is 1. |
| **`tempo`**               | float  | No       | Control voice playback speed. Range: 0.5 (0.5x slow) to 2.0 (2x fast). Default is 1.0. |
| **`pitch`**               | int    | No       | Control voice pitch. Range: -12 to 12. A value of 1 corresponds to one semitone. For example, 4 means 4 semitones higher than the original voice. Default is 0. |
| **`max_seconds`**         | float  | No       | Limit the maximum length of synthesized speech (1 to 60 seconds). Default is 30. Refer to the [limit length](../advanced/limit_length.md) and [fixed duration](../advanced/fixed_amount.md) documents. |
| **`duration`**            | float  | No       | Define the length of synthesized speech (1 to 60 seconds). Leave `tempo` and `speed_x` at their default values. Refer to [fixed duration](../advanced/fixed_amount.md). |
| **`last_pitch`**          | int    | No       | Control the pitch at the end of the sentence. Values: -2 (lowest), -1 (low), 0 (normal), 1 (high), or 2 (highest). |

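
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validate_speak_body` and `ACTOR_LANGS` are hypothetical names, and the rules are only those stated in the table (a missing `tts_mode` is treated as 'actor', its documented default).

```python
# Hypothetical client-side helper; the server remains the source of truth.
ACTOR_LANGS = {"en-us", "ko-kr", "ja-jp", "es-es", "zh-cn", "auto"}

# Inclusive numeric ranges taken from the parameter table.
RANGES = {
    "tempo": (0.5, 2.0),
    "pitch": (-12, 12),
    "max_seconds": (1, 60),
    "duration": (1, 60),
    "last_pitch": (-2, 2),
}


def validate_speak_body(body: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    mode = body.get("tts_mode", "actor")  # documented default is 'actor'

    for key in ("text", "lang"):
        if not body.get(key):
            problems.append(f"missing required key: {key}")

    if mode == "actor":
        if not body.get("actor_id"):
            problems.append("actor_id is required when tts_mode is 'actor'")
        if body.get("lang") and body["lang"] not in ACTOR_LANGS:
            problems.append(f"lang {body['lang']!r} is not listed for 'actor' mode")
    elif mode == "audio_file":
        if not body.get("speak_resource_id"):
            problems.append("speak_resource_id is required when tts_mode is 'audio_file'")
    else:
        problems.append(f"unknown tts_mode: {mode!r}")

    for key, (low, high) in RANGES.items():
        if key in body and not low <= body[key] <= high:
            problems.append(f"{key} must be between {low} and {high}")

    return problems


print(validate_speak_body({"text": "Hi", "lang": "auto", "actor_id": "abc", "pitch": 13}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.
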
Example with cURL

```bash
# Replace the ${...} placeholders with your own values before running.
curl --request POST \
  --url https://typecast.ai/api/speak \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $API_TOKEN" \
  --data '{
    "text": "My name is Juncheol.",
    "lang": "auto",
    "model_version": "latest",
    "emotion_tone_preset": "${emotion}",
    "actor_id": "${24-letters-your_actor_id}",
    "xapi_hd": true,
    "xapi_audio_format": "mp3",
    "max_seconds": 20,
    "volume": 100,
    "speed_x": 1,
    "tempo": 1,
    "pitch": 0
  }'
```

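
For readers calling the endpoint from code rather than the shell, the same request can be sketched with Python's standard library. This is an illustrative sketch only: `build_speak_request` and the placeholder token and actor ID are assumptions, while the URL and headers are the ones used in the cURL example. The request is built but not sent.

```python
import json
import urllib.request

SPEAK_URL = "https://typecast.ai/api/speak"  # same URL as the cURL example


def build_speak_request(api_token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        SPEAK_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


# Placeholder values; substitute your own token and actor ID.
request = build_speak_request(
    "YOUR_API_TOKEN",
    {"text": "My name is Juncheol.", "lang": "auto", "actor_id": "YOUR_ACTOR_ID"},
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Serializing with `json.dumps` also avoids hand-written JSON mistakes such as a trailing comma.
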
Check out an example of using a custom emotion with `emotion_prompt` in the following guide: [Advanced speech synthesis: Apply custom emotion in your script](../advanced/custom_emotion.md)

## Response

### Status Code

| Status Code | Description |
|-------------|-------------|
| 401         | [Authorization Error](reference/error.md#error-codes) |
| 400         | JSON object representing an error. [See how the error looks](reference/error.md#structure). |
| 429         | JSON object representing an error indicating that the request limit has been exceeded. [See how the error looks](reference/error.md#structure). `error_code` is `app/too-many-requests`. |
| 200         | JSON object containing the `result`. |

### `result` consists of the following

| Key                | Description |
|--------------------|-------------|
| __`speak_v2_url`__ | URL to view detailed information about the created _speak_. |
| __`speak_url`__    | ___(Deprecated)___ URL to view details about the _speak_. |

Example

```json
{
  "result": {
    "speak_v2_url": "https://typecast.ai/api/speak/v2/{your-speak-id}",
    "speak_url": "https://typecast.ai/api/speak/{your-speak-id}"
  }
}
```

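
A client typically reads the URL out of `result` and, when a `callback-url` is set, checks that the callback refers to the same _speak_. The sketch below illustrates both steps. The function names are hypothetical, and the callback payload shape (a top-level `speak_url` key) is an assumption, since this page does not show the callback format.

```python
def extract_speak_url(response: dict) -> str:
    """Prefer speak_v2_url; fall back to the deprecated speak_url."""
    result = response["result"]
    return result.get("speak_v2_url") or result["speak_url"]


def callback_matches(response: dict, callback: dict) -> bool:
    """Check that a callback belongs to this /api/speak response.

    Assumption: the callback JSON carries `speak_url` at the top level.
    """
    expected = response["result"].get("speak_url")
    return expected is not None and expected == callback.get("speak_url")


response = {
    "result": {
        "speak_v2_url": "https://typecast.ai/api/speak/v2/abc",
        "speak_url": "https://typecast.ai/api/speak/abc",
    }
}
# Hypothetical callback payload, for illustration only.
callback = {"speak_url": "https://typecast.ai/api/speak/abc"}

print(extract_speak_url(response))
print(callback_matches(response, callback))
```
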
### `error_code`s in `400` response

| Error Code                 | Description |
|----------------------------|-------------|
| __`app/param/not-enough`__ | Some required fields are not included in the request body. |
| __`app/invalid/text`__     | The length of `text` is over 350, or `text` consists entirely of unpronounceable letters. |
| __`app/invalid/actor_id`__ | `actor_id` is incorrect or disallowed. |

Example

```json
{
  "message": {
    "msg": "need params as actor_id",
    "error_code": "app/param/not-enough"
  }
}
```
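
An error body of this shape can be turned into a readable message by looking up `error_code`. The following is a minimal sketch: `describe_error` and `ERROR_HINTS` are hypothetical names, the descriptions are copied from the tables on this page, and it assumes every error response uses the `message` structure shown in the example.

```python
# Descriptions taken from the status code and error code tables on this page.
ERROR_HINTS = {
    "app/param/not-enough": "Some required fields are not included in the request body.",
    "app/invalid/text": "The length of text is over 350, or text consists entirely of unpronounceable letters.",
    "app/invalid/actor_id": "actor_id is incorrect or disallowed.",
    "app/too-many-requests": "The request limit has been exceeded.",
}


def describe_error(status: int, payload: dict) -> str:
    """Format an error response; fall back to the server's msg if the code is unknown."""
    message = payload.get("message", {})
    code = message.get("error_code", "unknown")
    hint = ERROR_HINTS.get(code, message.get("msg", "no description available"))
    return f"HTTP {status} [{code}]: {hint}"


payload = {"message": {"msg": "need params as actor_id", "error_code": "app/param/not-enough"}}
print(describe_error(400, payload))
```
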