# **POST** /api/speak

Synthesize speech from text.

```{caution}
If you set a `callback-url`, we'll send data to it using a `POST` request in JSON format.
Use the embed URL (`share.embed_url`) from the callback response.
Make sure `result.speak_url` in the `/api/speak` response matches the `speak_url` in the callback data.
```
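
The check described in the caution above could be sketched as follows. This is a minimal illustration, not an official client: the function name is hypothetical, and the callback payload layout (a top-level `speak_url` plus a nested `share.embed_url`) is an assumption inferred from the field names mentioned above.

```python
from typing import Any, Mapping, Optional


def embed_url_if_matching(speak_response: Mapping[str, Any],
                          callback_payload: Mapping[str, Any]) -> Optional[str]:
    """Return share.embed_url only if the callback belongs to this /api/speak response."""
    expected = speak_response.get("result", {}).get("speak_url")
    received = callback_payload.get("speak_url")
    if expected is None or expected != received:
        # The callback is for a different speak, or a field is missing.
        return None
    # Assumed layout: {"speak_url": "...", "share": {"embed_url": "..."}}
    return callback_payload.get("share", {}).get("embed_url")
```

A callback whose `speak_url` does not match yields `None`, so the caller can ignore or log it instead of using the wrong embed URL.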

## Request

### Headers

* [Required headers](reference/request.md#headers)

### Parameters

* None

### Body as JSON Object

| Key                       | Type   | Required | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|---------------------------|--------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **`text`**                | string | Yes      | The text to synthesize into speech. |
| **`tts_mode`**            | string | Yes      | The speech generation mode. One of 'actor' or ['audio_file'](../advanced/voice_cloning.md). Default is 'actor'. |
| **`actor_id`**            | string | Yes      | The character ID, which is required only when tts_mode is 'actor'. Retrieve your character from the [Actor API](reference/get_api_actor.md#request).                                                                                                                                                                                                                                                                                                                                                                           |
| **`speak_resource_id`**   | string | Yes      | The speak resource ID, which is required only when tts_mode is 'audio_file'. Retrieve speak_resource_id from the [Resource API](reference/post_api_speak_resource.md#request).                                                                                                                                                                                                                                                                                                |
| **`lang`**                | string | Yes      | Language code of the `text`. For 'actor' mode, available language codes are ['en-us', 'ko-kr', 'ja-jp', 'es-es', 'zh-cn', 'auto']. Use `auto` for automatic language detection from the text.<br>When using `tts_mode: audio_file`, available language codes are ['en-us', 'ko-kr', 'ja-jp', 'zh-cn', 'fr-fr', 'de-de', 'es-es', 'pt-pt', 'it-it', 'ru-ru', 'ta-in', 'ar-sa', 'bg-bg', 'uk-ua', 'hr-hr', 'cs-cz', 'pl-pl', 'sk-sk', 'fi-fi', 'ro-ro', 'el-gr', 'nl-nl', 'sv-se', 'tl-ph', 'id-id', 'auto']. |
| **`xapi_hd`**             | bool   | No       | Specify sample rate. If set to true, you'll get high-quality audio (44.1 KHz). Default is false (16 KHz).                                                                                                                                                                                                                                                                                                                                                                     |
| **`xapi_audio_format`**   | string | No       | Specify audio format. If set to "mp3", you'll get an mp3 format file. Default is "wav" format.                                                                                                                                                                                                                                                                                                                                                                                |
| **`model_version`**       | string | No       | Specify a model version name or alias. Refer to [character version API](reference/get_api_actor_oid_versions.md#request). Use "latest" for the latest model.                                                                                                                                                                                                                                                                                                                  |
| **`emotion_tone_preset`** | string | No       | Specify an emotion. Retrieve available emotions for your character from [Actor API with actor id](reference/get_api_actor_oid_versions.md#response).                                                                                                                                                                                                                                                                                                                          |
| **`emotion_prompt`**      | string | No       | Specify a custom emotion in natural language (Korean or English). Use when `emotion_tone_preset` is `emotion-prompt`. Ensure `emotion_prompt` is activated for your selected character.                                                                                                                                                                                                                                                                                       |
| **`volume`**              | int    | No       | Specify audio volume as a percentage. Set to 50 for half (0.5x) volume or 200 for double (2x) volume. Default is 100. |
| **`speed_x`**             | float  | No       | Control speaking speed. Values: 1.5 (slow), 1 (normal), or 0.5 (fast). Default is 1.                                                                                                                                                                                                                                                                                                                                                                                          |
| **`tempo`**               | float  | No       | Control voice playing speed. Range: 0.5 (0.5x slow) to 2.0 (2x fast). Default is 1.0.                                                                                                                                                                                                                                                                                                                                                                                         |
| **`pitch`**               | int    | No       | Control voice pitch. Range: -12 to 12. A value of 1 corresponds to one semitone. For example, 4 means 4 semitones higher than the original voice. Default is 0.                                                                                                                                                                                                                                                                                                               |
| **`max_seconds`**         | float  | No       | Limit maximum length of synthesized speech (1 to 60 seconds). Default is 30. Refer to [limit length](../advanced/limit_length.md) and [fixed duration](../advanced/fixed_amount.md) documents.                                                                                                                                                                                                                                                                                |
| **`duration`**            | float  | No       | Define the length of synthesized speech (1 to 60 seconds). Use default value for `tempo` and `speed_x`. Refer to [fixed duration](../advanced/fixed_amount.md).                                                                                                                                                                                                                                                                                                               |
| **`last_pitch`**          | int    | No       | Control the pitch at the end of the sentence. Values: -2 (lowest), -1 (low), 0 (normal), 1 (high), or 2 (highest). |

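The constraints in the table above can be checked on the client before sending a request. The sketch below is one hedged way to do that: `build_speak_payload` and `_RANGES` are hypothetical names, the numeric ranges are copied from the table, and the server's own validation may differ. `volume` and `speed_x` are passed through unchecked because the table gives no explicit range for them.

```python
from typing import Any, Dict, Optional

# Numeric ranges copied from the parameter table (inclusive bounds).
_RANGES = {
    "tempo": (0.5, 2.0),
    "pitch": (-12, 12),
    "max_seconds": (1, 60),
    "duration": (1, 60),
    "last_pitch": (-2, 2),
}


def build_speak_payload(text: str, lang: str = "auto", *,
                        actor_id: Optional[str] = None,
                        speak_resource_id: Optional[str] = None,
                        **options: Any) -> Dict[str, Any]:
    """Build a request body, choosing tts_mode from the ID that was supplied."""
    if not text:
        raise ValueError("text must not be empty")
    if (actor_id is None) == (speak_resource_id is None):
        raise ValueError("pass exactly one of actor_id or speak_resource_id")

    payload: Dict[str, Any] = {"text": text, "lang": lang}
    if actor_id is not None:
        payload.update(tts_mode="actor", actor_id=actor_id)
    else:
        payload.update(tts_mode="audio_file", speak_resource_id=speak_resource_id)

    # Copy optional keys, rejecting values outside the documented ranges.
    for key, value in options.items():
        if key in _RANGES:
            low, high = _RANGES[key]
            if not low <= value <= high:
                raise ValueError(f"{key} must be between {low} and {high}, got {value}")
        payload[key] = value
    return payload
```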
<details open>
  <summary>Example with cURL</summary>

  ```bash
    curl --request POST \
        --url https://typecast.ai/api/speak \
        --header "Content-Type: application/json" \
        --header "Authorization: Bearer $API_TOKEN" \
        --data '{
            "text": "My name is Juncheol.",
            "lang": "auto",
            "model_version": "latest",
            "emotion_tone_preset": "${emotion}",
            "actor_id": "${24-letters-your_actor_id}",
            "xapi_hd": true,
            "xapi_audio_format": "mp3",
            "max_seconds": 20,
            "volume": 100,
            "speed_x": 1,
            "tempo": 1,
            "pitch": 0
        }'
  ```

</details>

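The same request can be issued from Python using only the standard library. This is a minimal sketch mirroring the cURL example above: the URL and headers are taken from that example, while the function names are hypothetical and error handling is left out.

```python
import json
import urllib.request

API_URL = "https://typecast.ai/api/speak"  # same URL as the cURL example


def make_speak_request(payload: dict, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /api/speak."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


def send_speak_request(payload: dict, api_token: str) -> dict:
    """Send the request and return the decoded JSON body of a 200 response."""
    request = make_speak_request(payload, api_token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx status codes.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```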
Check out an example of using a custom emotion with `emotion_prompt` in the following guide:
[Advanced speech synthesis: Apply custom emotion in your script](../advanced/custom_emotion.md)

## Response

### Status Code

| Status Code | Description                                                                                                                                                                              |
|-------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 401         | [Authorization Error](reference/error.md#error-codes)                                                                                                                                    |
| 400         | JSON object representing an error. [See how the error looks](reference/error.md#structure).                                                                                              |
| 429         | JSON object representing an error indicating that the request limit has been exceeded. [See how the error looks](reference/error.md#structure). `error_code` is `app/too-many-requests`. |
| 200         | JSON object containing the `result`.                                                                                                                                                     |

### `result` consists of the following

| Key                | Description                                                 |
|--------------------|-------------------------------------------------------------|
| __`speak_v2_url`__ | URL to view detailed information about the created _speak_. |
| __`speak_url`__    | ___(Deprecated)___ URL to view details about the _speak_.   |

<details open>
  <summary>Example</summary>

```json
{
  "result": {
    "speak_v2_url": "https://typecast.ai/api/speak/v2/{your-speak-id}",
    "speak_url": "https://typecast.ai/api/speak/{your-speak-id}"
  }
}
```

</details>

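Because `speak_url` is marked as deprecated, a client may want to prefer `speak_v2_url` and fall back only when it is absent. A small sketch of that choice, assuming the response body has already been decoded into a dict (the function name is hypothetical):

```python
def speak_status_url(response_body: dict) -> str:
    """Pick the URL for the created speak, preferring the non-deprecated key."""
    result = response_body["result"]
    # Fall back to the deprecated key only if speak_v2_url is missing or empty.
    return result.get("speak_v2_url") or result["speak_url"]
```
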
### `error_code`s in `400` response

| Error Code                 | Description                                                                               |
|----------------------------|-------------------------------------------------------------------------------------------|
| __`app/param/not-enough`__ | Some required fields are not included in the request body.                                |
| __`app/invalid/text`__     | `text` is longer than 350 characters, or `text` consists entirely of unpronounceable letters. |
| __`app/invalid/actor_id`__ | `actor_id` is incorrect or disallowed.                                                    |

<details open>
<summary>Example</summary>

```json
{
  "message": {
    "msg": "need params as actor_id",
    "error_code": "app/param/not-enough"
  }
}
```

</details>
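
A client can turn non-200 responses into exceptions that carry the `error_code`. The sketch below assumes error bodies follow the `message.msg` / `message.error_code` layout shown in the example above; `SpeakApiError` and `raise_for_speak_error` are hypothetical names, and treating only `429` as retryable is a design choice of this sketch rather than documented behavior.

```python
import json
from typing import Optional


class SpeakApiError(Exception):
    """Error raised for a non-200 response from /api/speak."""

    def __init__(self, status: int, error_code: Optional[str], msg: str):
        super().__init__(f"{status} {error_code}: {msg}")
        self.status = status
        self.error_code = error_code
        self.msg = msg

    @property
    def retryable(self) -> bool:
        # Assumption: only rate-limit errors are worth retrying later.
        return self.status == 429


def raise_for_speak_error(status: int, body: str) -> None:
    """Raise SpeakApiError unless the status is 200."""
    if status == 200:
        return
    error_code, msg = None, body
    try:
        message = json.loads(body).get("message", {})
        error_code = message.get("error_code")
        msg = message.get("msg", body)
    except (ValueError, AttributeError):
        pass  # Body is not JSON in the expected shape; keep the raw text.
    raise SpeakApiError(status, error_code, msg)
```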