Text-to-Speech (TTS) Service User Guide

Welcome! This guide will help you quickly get started with the Language Server Text-to-Speech (TTS) API. You'll learn how to convert text into speech using a simple API call, understand the required parameters, and see example requests and responses.

Supported Providers: Azure, Google, Sarvam, and Karya Local (open-source Indic-Parler)

Quickstart

Get your API key from your admin or dashboard.
Choose your provider (Azure, Google, Sarvam, or Karya Local).
Prepare your text and language code (see supported languages below).
Send a POST request to the TTS endpoint with the required headers and form data.
Download your audio from the response URLs.

1. Request Headers

Include these headers in every request:

Accept: application/json
X-API-Key: YOUR_API_KEY      # Replace with your actual API key
Content-Type: multipart/form-data

2. Request Body: Form Data

Send the request body as multipart/form-data with a field named payload_data (see below for structure).

Synchronous Only

Text-to-Speech is only available as a synchronous (real-time) service. Unlike transcription and completion, there is no batch/async TTS endpoint — results are returned directly in the API response.

3. Endpoint

URL: https://languageserver.karya.in/v1/task/text-to-speech
Method: POST

4. Example Request (cURL)

curl -X POST 'https://languageserver.karya.in/v1/task/text-to-speech' \
  -H 'accept: application/json' \
  -H 'X-API-Key: YOUR_API_KEY' \
  -H 'Content-Type: multipart/form-data' \
  -F 'payload_data={
    "user_email": "your_email@example.com",
    "project_name": "Individual",
    "provider": "AZURE",
    "models": ["NEURAL"],
    "voice_gender": "FEMALE",
    "task_name": "my_first_tts_task_unique_001",
    "datapoints": {
      "datapoint 1": {"text": "नमस्ते, यह एक परीक्षण है।", "speed": "1.0x", "lang": "hi-IN"}
    }
  }'
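
For reference, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (`encode_multipart`, `build_request`, `send_tts_request`) and the 120-second timeout are assumptions made for illustration. The multipart body is built by hand so that the `Content-Type` header carries the boundary parameter.

```python
import json
import urllib.request
import uuid

TTS_URL = "https://languageserver.karya.in/v1/task/text-to-speech"


def encode_multipart(fields):
    """Encode a dict of text fields as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_request(payload, api_key):
    """Build (but do not send) the POST request for a TTS task."""
    payload_data = json.dumps(payload, ensure_ascii=False)
    body, content_type = encode_multipart({"payload_data": payload_data})
    return urllib.request.Request(
        TTS_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "X-API-Key": api_key,
            "Content-Type": content_type,
        },
    )


def send_tts_request(payload, api_key, timeout=120):
    """Send the request and return the decoded JSON response.

    The timeout value is an assumption; the guide does not specify one.
    """
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

HTTP libraries that build the multipart body themselves usually add the boundary on their own; with such a library, it is generally safer not to set `Content-Type` by hand.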

5. `payload_data` Structure

This is a JSON string containing your task and datapoints. Example:

{
  "user_email": "your_email@example.com",
  "project_name": "your_project_name",
  "task_name": "my_first_tts_task_001",
  "provider": "AZURE",
  "models": ["NEURAL"],
  "voice_gender": "FEMALE",
  "datapoints": {
    "datapoint 1": {
      "text": "नमस्ते, यह एक परीक्षण है।",
      "speed": "1.0x",
      "lang": "hi-IN"
    },
    "datapoint 2": {
      "text": "This is a test for English audio generation.",
      "speed": "1.2x",
      "lang": "en-IN"
    }
  }
}

Important: The models field is mandatory. It must be a non-empty list of model names (e.g., ["NEURAL"]). Omitting this field or leaving it empty will result in an error.

You can add as many datapoints as you need.
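
When the datapoints come from a list of texts, the structure above can be assembled programmatically. The following is a minimal sketch; `build_payload` and its argument order are hypothetical, and the "datapoint N" keys simply follow the naming used in this guide's examples (any unique key should work).

```python
import json


def build_payload(user_email, project_name, provider, models, items,
                  voice_gender="FEMALE", task_name=None):
    """Assemble the payload_data dictionary.

    `items` is a list of (text, lang, speed) tuples.
    """
    if not models:
        raise ValueError("models must be a non-empty list")
    payload = {
        "user_email": user_email,
        "project_name": project_name,
        "provider": provider,
        "models": list(models),
        "voice_gender": voice_gender,
        "datapoints": {
            f"datapoint {i}": {"text": text, "speed": speed, "lang": lang}
            for i, (text, lang, speed) in enumerate(items, start=1)
        },
    }
    if task_name is not None:
        payload["task_name"] = task_name
    return payload


payload = build_payload(
    "your_email@example.com",
    "your_project_name",
    "AZURE",
    ["NEURAL"],
    [
        ("नमस्ते, यह एक परीक्षण है।", "hi-IN", "1.0x"),
        ("This is a test for English audio generation.", "en-IN", "1.2x"),
    ],
)

# The form field carries the payload as a JSON string.
payload_data = json.dumps(payload, ensure_ascii=False)
```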

Required Fields in `payload_data`

Field	Description	Type
`user_email`	Your registered email address	string
`project_name`	Name of your project	string
`provider`	Backend provider (see table below)	string
`models`	Non-empty list of model names to use (e.g., `["NEURAL"]`)	list of strings
`datapoints`	Dictionary of datapoint objects. Each key (e.g., `"datapoint 1"`) is a unique identifier.	dict

Optional Fields in `payload_data`

Field	Description	Type	Default
`task_name`	Label for the task (does not need to be unique)	string	Auto-generated
`voice_gender`	Voice gender: `"MALE"` or `"FEMALE"`	string	`"FEMALE"`

Provider & Model Mapping

Provider	`provider` value	Required `models` value	Notes
Azure	`"AZURE"`	`["NEURAL"]`	Commercial. Only the model name `NEURAL` is supported.
Google	`"GOOGLE"`	`["WAVENET"]` or `["STANDARD"]`	Commercial. Use `["WAVENET"]` for premium, `["STANDARD"]` for basic.
Sarvam	`"SARVAM"`	`["BULBULV2"]`	Commercial. Only `BULBULV2` is supported.
Karya Local (Indic-Parler)	`"KARYA_LOCAL"`	`["INDIC_PARLER"]`	Open Source. Requires a separately deployed Indic-Parler TTS service. Contact the server administrator to confirm availability.

Tip: For personal use, set project_name to "Individual".

Available Models for Each Provider

Azure

Set models to ["NEURAL"] (the only supported model for Azure TTS).

Google

Set models to either ["WAVENET"] (premium) or ["STANDARD"] (basic).

Google WAVENET Fallback: If WAVENET is not available for a specific language (e.g., Telugu), the server automatically falls back to the STANDARD model.

Sarvam

Set models to ["BULBULV2"] (the only supported model).

Karya Local (Indic-Parler)

Set models to ["INDIC_PARLER"] (the only supported model).

Warning: Leaving models as ["default"] or omitting it will not work. You must specify the correct model for your provider as shown above.
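
Because a wrong provider or model value is rejected, it can help to check the pair on the client before sending. The sketch below encodes the Provider & Model Mapping table from this guide; the `check_models` helper is hypothetical, and the server remains the authority on what it accepts.

```python
# Provider -> allowed model names, copied from the mapping table in this guide.
SUPPORTED_MODELS = {
    "AZURE": {"NEURAL"},
    "GOOGLE": {"WAVENET", "STANDARD"},
    "SARVAM": {"BULBULV2"},
    "KARYA_LOCAL": {"INDIC_PARLER"},
}


def check_models(provider, models):
    """Raise ValueError if the provider/models pair does not match the table."""
    if provider not in SUPPORTED_MODELS:
        raise ValueError(f"Unknown provider: {provider!r}")
    if not isinstance(models, list) or not models:
        raise ValueError("models must be a non-empty list")
    unknown = [m for m in models if m not in SUPPORTED_MODELS[provider]]
    if unknown:
        allowed = sorted(SUPPORTED_MODELS[provider])
        raise ValueError(f"{provider} does not support {unknown}; use one of {allowed}")


check_models("GOOGLE", ["WAVENET"])  # passes silently
```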

Note: If a model exists but is currently unserviceable (e.g., due to downtime or errors), the system falls back to the most basic model for that provider.

Each datapoint must include:

Field	Description	Type
`text`	Text to convert to speech	string
`speed`	Playback speed as a string ending with "x" (e.g., `"1.0x"`, `"1.2x"`). Valid ranges differ by provider — see table below.	string
`lang`	Language code (see below)	string
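
To illustrate the `speed` format, the sketch below parses a speed string and applies the provider rules listed in the Speed Ranges by Provider table. It is a client-side approximation for previewing what a request is likely to produce; the function names are hypothetical, and the server's actual clamping and conversion logic may differ in detail.

```python
# Clamp ranges for providers that silently clamp out-of-range values.
CLAMP_RANGES = {
    "GOOGLE": (0.25, 4.0),
    "SARVAM": (0.5, 2.0),
}


def parse_speed(speed):
    """Convert a string such as "1.2x" to the float 1.2."""
    if not isinstance(speed, str) or not speed.endswith("x"):
        raise ValueError(f"speed must be a string ending with 'x': {speed!r}")
    value = float(speed[:-1])
    if not value > 0:
        raise ValueError(f"speed must be positive: {speed!r}")
    return value


def effective_speed(provider, speed):
    """Return the speed a provider is expected to apply, per the guide's table."""
    value = parse_speed(speed)
    if provider in CLAMP_RANGES:
        low, high = CLAMP_RANGES[provider]
        return min(max(value, low), high)
    if provider == "KARYA_LOCAL":
        # Karya Local maps the number to a coarse category.
        if value < 0.9:
            return "slow"
        if value > 1.1:
            return "fast"
        return "moderate"
    # Azure: any positive value is accepted and passed through.
    return value
```

For example, `effective_speed("GOOGLE", "5.0x")` yields `4.0`, and `effective_speed("KARYA_LOCAL", "1.2x")` yields `"fast"`.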

Speed Ranges by Provider

Provider	Speed Range	Notes
Azure	Any positive value	Interpreted as percentage adjustment
Google	`0.25x` – `4.0x`	Values outside range are clamped silently
Sarvam	`0.5x` – `2.0x`	Values outside range are clamped silently
Karya Local	Any positive value	Converted to "slow" (< 0.9x), "moderate" (0.9x–1.1x), or "fast" (> 1.1x)

Note on Voice Gender: The `voice_gender` field is specified at the task level (not per datapoint) and applies to all datapoints in the request. If not provided, it defaults to `"FEMALE"`.

About the `models` Field

The models field is required for all TTS requests.
It must be a non-empty list of model names that matches your provider (for example, ["NEURAL"] for Azure or ["BULBULV2"] for Sarvam).
If you omit this field or provide an empty list, your request will be rejected.
The available model names may vary by provider. See the Provider & Model Mapping table above for details.

About the `voice_gender` Field

The voice_gender field is optional and can be set to "MALE" or "FEMALE".
If not provided, it defaults to "FEMALE".
This is a task-level setting that applies to all datapoints in the request.
Voice availability varies by provider and language. Some provider/language pairs currently expose only one configured gender.

Example request fragment that selects a male voice:
```
{
  "provider": "AZURE",
  "models": ["NEURAL"],
  "voice_gender": "MALE",
  "datapoints": { ... }
}
```

Supported Languages

Language	Code	Supported By
Assamese	`as-IN`	Azure, Sarvam, Karya Local
Bengali	`bn-IN`	All providers
Bodo	`brx-IN`	Karya Local only
Chhattisgarhi	`hne-IN`	Karya Local only
Dogri	`doi-IN`	Karya Local only
English (India)	`en-IN`	All providers
Gujarati	`gu-IN`	All providers
Hindi	`hi-IN`	All providers
Kannada	`kn-IN`	All providers
Malayalam	`ml-IN`	All providers
Manipuri	`mni-IN`	Karya Local only
Marathi	`mr-IN`	All providers
Nepali	`ne-IN`	Karya Local only
Odia	`or-IN`	Azure, Sarvam, Karya Local
Punjabi	`pa-IN`	All providers
Sanskrit	`sa-IN`	Karya Local only
Tamil	`ta-IN`	All providers
Telugu	`te-IN`	All providers*
Urdu	`ur-IN`	Azure, Google

*For Google, Telugu is available with the STANDARD model only. WAVENET is not available, so WAVENET requests fall back to STANDARD automatically.

Provider Language Support

Language	Azure	Google	Sarvam	Karya Local
Assamese (as-IN)	✅	❌	✅	✅
Bengali (bn-IN)	✅	✅	✅	✅
Bodo (brx-IN)	❌	❌	❌	✅
Chhattisgarhi (hne-IN)	❌	❌	❌	✅
Dogri (doi-IN)	❌	❌	❌	✅
English (en-IN)	✅	✅	✅	✅
Gujarati (gu-IN)	✅	✅	✅	✅
Hindi (hi-IN)	✅	✅	✅	✅
Kannada (kn-IN)	✅	✅	✅	✅
Malayalam (ml-IN)	✅	✅	✅	✅
Manipuri (mni-IN)	❌	❌	❌	✅
Marathi (mr-IN)	✅	✅	✅	✅
Nepali (ne-IN)	❌	❌	❌	✅
Odia (or-IN)	✅	❌	✅	✅
Punjabi (pa-IN)	✅	✅	✅	✅
Sanskrit (sa-IN)	❌	❌	❌	✅
Tamil (ta-IN)	✅	✅	✅	✅
Telugu (te-IN)	✅	✅*	✅	✅
Urdu (ur-IN)	✅	✅	❌	❌

*For Google, Telugu is available with the STANDARD model only. WAVENET is not available, so WAVENET requests fall back to STANDARD automatically.
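
The support matrix above can be captured as a lookup table for choosing a provider or checking a request before it is sent. This is a sketch that transcribes the table as published here; `providers_for` and `unsupported_datapoints` are hypothetical helpers, and the table may drift from what a given deployment actually supports.

```python
# Languages supported by every provider in the table.
_COMMON = {
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
}

SUPPORTED_LANGS = {
    "AZURE": _COMMON | {"as-IN", "or-IN", "ur-IN"},
    "GOOGLE": _COMMON | {"ur-IN"},
    "SARVAM": _COMMON | {"as-IN", "or-IN"},
    "KARYA_LOCAL": _COMMON | {
        "as-IN", "or-IN", "brx-IN", "hne-IN",
        "doi-IN", "mni-IN", "ne-IN", "sa-IN",
    },
}


def providers_for(lang):
    """Return the provider values that list the given language code."""
    return sorted(p for p, langs in SUPPORTED_LANGS.items() if lang in langs)


def unsupported_datapoints(provider, datapoints):
    """Return the keys of datapoints whose language the provider does not list."""
    supported = SUPPORTED_LANGS[provider]
    return [key for key, dp in datapoints.items() if dp["lang"] not in supported]
```

Since a single failing datapoint fails the whole task, running `unsupported_datapoints` before sending is one way to catch a language mismatch early.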

6. Example Response

On success, you'll get a JSON response with audio URLs:

{
  "task_id": "your_task_id",
  "status": "COMPLETED",
  "result": {
    "user_email": "your_email@example.com",
    "project_name": "your_project_name",
    "task_name": "my_first_tts_task_001",
    "provider": "AZURE",
    "models": ["NEURAL"],
    "voice_gender": "FEMALE",
    "datapoints": {
      "datapoint 1": {
        "text": "नमस्ते, यह एक परीक्षण है।",
        "speed": "1.0x",
        "lang": "hi-IN",
        "saas_url": "https://<storage_account_name>.blob.core.windows.net/audio-output1/..."
      }
    }
  }
}

A successful response always includes the outer task_id and status fields, with status set to "COMPLETED". If synthesis fails, the request returns HTTP 500 instead of a response with a different status value.

Field Descriptions:

Field	Description
`task_id`	Unique task identifier
`status`	Task status (`COMPLETED`)
`saas_url`	Temporary download link for the audio file

Warning: Download your audio files within 24 hours. Links expire after that.
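
Since the links are temporary, a client will typically extract and download them right away. The sketch below assumes the response shape shown in the example above; `audio_urls` and `download` are hypothetical helpers, and the destination file name (including its extension) is left to the caller because this guide does not state the audio format.

```python
import shutil
import urllib.request


def audio_urls(response):
    """Map each datapoint key to its saas_url from a parsed JSON response."""
    if response.get("status") != "COMPLETED":
        raise ValueError(f"Unexpected task status: {response.get('status')!r}")
    datapoints = response["result"]["datapoints"]
    return {key: dp["saas_url"] for key, dp in datapoints.items()}


def download(url, destination):
    """Stream one audio file to a local path chosen by the caller."""
    with urllib.request.urlopen(url) as resp, open(destination, "wb") as out:
        shutil.copyfileobj(resp, out)


# Example with a response shaped like the one in this guide
# (the URL is a stand-in, not a real link).
sample = {
    "task_id": "your_task_id",
    "status": "COMPLETED",
    "result": {
        "datapoints": {
            "datapoint 1": {
                "text": "नमस्ते, यह एक परीक्षण है।",
                "speed": "1.0x",
                "lang": "hi-IN",
                "saas_url": "https://example.invalid/audio-1",
            }
        }
    },
}
urls = audio_urls(sample)
```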

All-or-Nothing Processing

If any datapoint fails to synthesize (e.g., unsupported language for the chosen provider), the entire task fails with HTTP 500. There is no partial-success behavior — all datapoints must succeed for the request to return successfully.
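
One possible client-side workaround for the all-or-nothing behavior is to submit each datapoint as its own task, so that a single failure does not discard the other results. The sketch below only splits the payload; `split_per_datapoint` is a hypothetical helper, and whether many small requests are acceptable (for example, with respect to rate limits or quotas) is not covered by this guide.

```python
import copy


def split_per_datapoint(payload):
    """Return one payload per datapoint, each keeping the task-level fields."""
    base = {key: value for key, value in payload.items() if key != "datapoints"}
    return [
        {**copy.deepcopy(base), "datapoints": {key: dict(dp)}}
        for key, dp in payload["datapoints"].items()
    ]


original = {
    "user_email": "your_email@example.com",
    "project_name": "Individual",
    "provider": "GOOGLE",
    "models": ["STANDARD"],
    "datapoints": {
        "datapoint 1": {"text": "Hello", "speed": "1.0x", "lang": "en-IN"},
        "datapoint 2": {"text": "Namaste", "speed": "1.0x", "lang": "hi-IN"},
    },
}
single_tasks = split_per_datapoint(original)
```

The trade-off is more HTTP requests in exchange for isolating failures to individual datapoints.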

7. Response & Error Codes

Code	Description
200	Request successful
400	Invalid or missing parameters (invalid provider, model, speed format, missing required fields)
500	Internal server error (TTS synthesis failure, provider API error)
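
A client can translate these codes into distinct errors so that callers can tell a request problem from a server-side failure. The following is a minimal sketch; the exception classes and `check_status` are hypothetical names, not part of the API.

```python
class TTSRequestError(Exception):
    """HTTP 400: the request itself is invalid and should be corrected."""


class TTSServerError(Exception):
    """HTTP 500: synthesis failed or the provider API returned an error."""


def check_status(status_code, body_text=""):
    """Return None for HTTP 200; raise a descriptive error otherwise."""
    if status_code == 200:
        return
    if status_code == 400:
        raise TTSRequestError(f"Invalid or missing parameters: {body_text}")
    if status_code == 500:
        raise TTSServerError(f"TTS synthesis or provider failure: {body_text}")
    raise RuntimeError(f"Unexpected HTTP status {status_code}: {body_text}")
```

With `urllib`, non-2xx responses surface as `urllib.error.HTTPError`; its `code` attribute can be passed to `check_status`.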