Text-to-Speech (TTS) Service User Guide

Welcome! This guide will help you quickly get started with the Language Server Text-to-Speech (TTS) API. You'll learn how to convert text into speech using a simple API call, understand the required parameters, and see example requests and responses.

Supported Providers: Azure, Google, Sarvam, and Karya Local (open-source Indic-Parler)


Quickstart

  1. Get your API key from your admin or dashboard.
  2. Choose your provider (Azure, Google, Sarvam, or Karya Local).
  3. Prepare your text and language code (see supported languages below).
  4. Send a POST request to the TTS endpoint with the required headers and form data.
  5. Download your audio from the response URLs.


1. Request Headers

Include these headers in every request:

Accept: application/json
X-API-Key: YOUR_API_KEY      # Replace with your actual API key
Content-Type: multipart/form-data

2. Request Body: Form Data

Send the request body as multipart/form-data with a field named payload_data (see below for structure).

Synchronous Only

Text-to-Speech is only available as a synchronous (real-time) service. Unlike transcription and completion, there is no batch/async TTS endpoint — results are returned directly in the API response.


3. Endpoint

  • URL: https://languageserver.karya.in/v1/task/text-to-speech
  • Method: POST

4. Example Request (cURL)

curl -X POST 'https://languageserver.karya.in/v1/task/text-to-speech' \
  -H 'accept: application/json' \
  -H 'X-API-Key: YOUR_API_KEY' \
  -H 'Content-Type: multipart/form-data' \
  -F 'payload_data={
    "user_email": "your_email@example.com",
    "project_name": "Individual",
    "provider": "AZURE",
    "models": ["NEURAL"],
    "voice_gender": "FEMALE",
    "task_name": "my_first_tts_task_unique_001",
    "datapoints": {
      "datapoint 1": {"text": "नमस्ते, यह एक परीक्षण है।", "speed": "1.0x", "lang": "hi-IN"}
    }
  }'
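
For readers who prefer Python, the cURL call above can be sketched with the standard library alone. The helper names build_multipart and synthesize are illustrative and not part of the API, and the 120-second timeout is an assumption; the endpoint, headers, and the payload_data field name come from the sections above. Because the multipart body is assembled by hand here, the boundary has to be appended to the Content-Type header explicitly.

```python
import json
import urllib.request
import uuid

TTS_URL = "https://languageserver.karya.in/v1/task/text-to-speech"


def build_multipart(fields, boundary=None):
    """Encode plain text form fields as multipart/form-data.

    Returns a (body_bytes, content_type_header_value) tuple.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def synthesize(payload, api_key, url=TTS_URL, timeout=120):
    """POST the payload as the payload_data form field and return the parsed JSON.

    The timeout value is an assumption, not a documented limit.
    """
    body, content_type = build_multipart(
        {"payload_data": json.dumps(payload, ensure_ascii=False)}
    )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "X-API-Key": api_key,
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


payload = {
    "user_email": "your_email@example.com",
    "project_name": "Individual",
    "provider": "AZURE",
    "models": ["NEURAL"],
    "voice_gender": "FEMALE",
    "task_name": "my_first_tts_task_unique_001",
    "datapoints": {
        "datapoint 1": {"text": "नमस्ते, यह एक परीक्षण है।", "speed": "1.0x", "lang": "hi-IN"}
    },
}

# Uncomment to send the request with a real key:
# result = synthesize(payload, "YOUR_API_KEY")
```

Passing ensure_ascii=False keeps the Hindi text readable in the request body; the default escaped form is equally valid JSON.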


5. payload_data Structure

This is a JSON string containing your task and datapoints. Example:

{
  "user_email": "your_email@example.com",
  "project_name": "your_project_name",
  "task_name": "my_first_tts_task_001",
  "provider": "AZURE",
  "models": ["NEURAL"],
  "voice_gender": "FEMALE",
  "datapoints": {
    "datapoint 1": {
      "text": "नमस्ते, यह एक परीक्षण है।",
      "speed": "1.0x",
      "lang": "hi-IN"
    },
    "datapoint 2": {
      "text": "This is a test for English audio generation.",
      "speed": "1.2x",
      "lang": "en-IN"
    }
  }
}

Important: The models field is mandatory. It must be a non-empty list of model names (e.g., ["NEURAL"]). Omitting this field or leaving it empty will result in an error.

You can add as many datapoints as you need.
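
When the texts come from a list or a file, the datapoints dictionary can be generated instead of written by hand. The sketch below is one way to do that; make_datapoints is a hypothetical helper, and it simply reuses the "datapoint N" key pattern from the examples.

```python
def make_datapoints(items):
    """Map (text, lang, speed) tuples to the datapoints dictionary.

    Keys follow the "datapoint N" pattern used in the examples; each key
    only needs to be unique within the request.
    """
    return {
        f"datapoint {index}": {"text": text, "speed": speed, "lang": lang}
        for index, (text, lang, speed) in enumerate(items, start=1)
    }


datapoints = make_datapoints(
    [
        ("नमस्ते, यह एक परीक्षण है।", "hi-IN", "1.0x"),
        ("This is a test for English audio generation.", "en-IN", "1.2x"),
    ]
)
```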


Required Fields in payload_data

Field | Description | Type
--- | --- | ---
user_email | Your registered email address | string
project_name | Name of your project | string
provider | Backend provider (see table below) | string
models | List of model names to use (e.g., ["NEURAL"]) | list of strings
datapoints | Dictionary of datapoint objects. Each key (e.g., "datapoint 1") is a unique identifier. | dict

Optional Fields in payload_data

Field | Description | Type | Default
--- | --- | --- | ---
voice_gender | Voice gender: "MALE" or "FEMALE" | string | "FEMALE"
task_name | Label for the task (does not need to be unique) | string | Auto-generated

Provider & Model Mapping

Provider | provider value | Required models value | Notes
--- | --- | --- | ---
Azure | "AZURE" | ["NEURAL"] | Commercial. Only the model name NEURAL is supported.
Google | "GOOGLE" | ["WAVENET"] or ["STANDARD"] | Commercial. Use ["WAVENET"] for premium, ["STANDARD"] for basic.
Sarvam | "SARVAM" | ["BULBULV2"] | Commercial. Only BULBULV2 is supported.
Karya Local (Indic-Parler) | "KARYA_LOCAL" | ["INDIC_PARLER"] | Open Source. Requires a separately deployed Indic-Parler TTS service. Contact the server administrator to confirm availability.
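
Since a wrong provider or model name is rejected by the server, it can help to check the pair locally before sending. The following is a minimal client-side sketch that mirrors the table above; ALLOWED_MODELS and check_models are illustrative names, and the server remains the authority on what is accepted.

```python
# Provider values and model names copied from the mapping table.
ALLOWED_MODELS = {
    "AZURE": {"NEURAL"},
    "GOOGLE": {"WAVENET", "STANDARD"},
    "SARVAM": {"BULBULV2"},
    "KARYA_LOCAL": {"INDIC_PARLER"},
}


def check_models(provider, models):
    """Return a list of problems; an empty list means the pair looks valid."""
    if provider not in ALLOWED_MODELS:
        return [f"unknown provider: {provider!r}"]
    if not isinstance(models, list) or not models:
        return ["models must be a non-empty list"]
    allowed = ALLOWED_MODELS[provider]
    return [
        f"model {model!r} is not listed for {provider}"
        for model in models
        if model not in allowed
    ]
```

For example, check_models("GOOGLE", ["NEURAL"]) reports that NEURAL is not listed for GOOGLE, while check_models("AZURE", ["NEURAL"]) returns an empty list.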

Tip: For personal use, set project_name to "Individual".


Available Models for Each Provider

Azure

Set models to ["NEURAL"] (the only supported model for Azure TTS).

Google

Set models to either ["WAVENET"] (premium) or ["STANDARD"] (basic).

Google WAVENET Fallback: If WAVENET is not available for a specific language (e.g., Telugu), the server automatically falls back to the STANDARD model.

Sarvam

Set models to ["BULBULV2"] (the only supported model).

Karya Local (Indic-Parler)

Set models to ["INDIC_PARLER"] (the only supported model).

Warning: Leaving models as ["default"] or omitting it will not work. You must specify the correct model for your provider as shown above.

Note: If a model exists but is currently unserviceable (e.g., due to downtime or errors), the system falls back to the most basic model for that provider.

Each datapoint must include:

Field | Description | Type
--- | --- | ---
text | Text to convert to speech | string
speed | Playback speed as a string ending with "x" (e.g., "1.0x", "1.2x"). Valid ranges differ by provider; see the table below. | string
lang | Language code (see below) | string

Speed Ranges by Provider

Provider | Speed Range | Notes
--- | --- | ---
Azure | Any positive value | Interpreted as percentage adjustment
Google | 0.25x–4.0x | Values outside range are clamped silently
Sarvam | 0.5x–2.0x | Values outside range are clamped silently
Karya Local | Any positive value | Converted to "slow" (< 0.9x), "moderate" (0.9x–1.1x), or "fast" (> 1.1x)
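
Because out-of-range speeds are clamped silently, a request can succeed with a different speed than the one asked for. The sketch below predicts the effective value locally from the table above. The function names are illustrative, and the regular expression is an assumption about the accepted format (a positive decimal number followed by "x"); the server's own parser may be stricter or looser.

```python
import re

# Assumed format: a decimal number followed by a lowercase "x".
SPEED_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)x$")

# Providers that clamp, with the ranges from the table.
CLAMP_RANGES = {"GOOGLE": (0.25, 4.0), "SARVAM": (0.5, 2.0)}


def parse_speed(speed):
    """Convert a string such as "1.2x" to a float, or raise ValueError."""
    match = SPEED_PATTERN.match(speed)
    if not match or float(match.group(1)) <= 0:
        raise ValueError(f"invalid speed: {speed!r}")
    return float(match.group(1))


def effective_speed(provider, speed):
    """Return the speed the table suggests the provider will actually apply."""
    value = parse_speed(speed)
    if provider in CLAMP_RANGES:
        low, high = CLAMP_RANGES[provider]
        return min(max(value, low), high)
    if provider == "KARYA_LOCAL":
        if value < 0.9:
            return "slow"
        if value > 1.1:
            return "fast"
        return "moderate"
    # Azure accepts any positive value.
    return value
```

Comparing effective_speed(provider, speed) with the requested value before sending makes a silent clamp visible to the caller.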

Note on Voice Gender: The voice_gender field is specified at the task level (not per datapoint) and applies to all datapoints in the request. If not provided, it defaults to "FEMALE".

About the models Field

  • The models field is required for all TTS requests.
  • It must be a non-empty list of model names. For Azure, use ["NEURAL"]; Google, Sarvam, and Karya Local each have their own model names.
  • If you omit this field or provide an empty list, your request will be rejected.
  • The available model names may vary by provider. See the Provider & Model Mapping table above for details.

About the voice_gender Field

  • The voice_gender field is optional and can be set to "MALE" or "FEMALE".
  • If not provided, it defaults to "FEMALE".
  • This is a task-level setting that applies to all datapoints in the request.
  • Voice availability varies by provider and language. Some provider/language pairs currently expose only one configured gender.

Example (fragment showing a male voice):

{
  "provider": "AZURE",
  "models": ["NEURAL"],
  "voice_gender": "MALE",
  "datapoints": { ... }
}

Supported Languages

Language | Code | Supported By
--- | --- | ---
Assamese | as-IN | Azure, Sarvam, Karya Local
Bengali | bn-IN | All providers
Bodo | brx-IN | Karya Local only
Chhattisgarhi | hne-IN | Karya Local only
Dogri | doi-IN | Karya Local only
English (India) | en-IN | All providers
Gujarati | gu-IN | All providers
Hindi | hi-IN | All providers
Kannada | kn-IN | All providers
Malayalam | ml-IN | All providers
Manipuri | mni-IN | Karya Local only
Marathi | mr-IN | All providers
Nepali | ne-IN | Karya Local only
Odia | or-IN | Azure, Sarvam, Karya Local
Punjabi | pa-IN | All providers
Sanskrit | sa-IN | Karya Local only
Tamil | ta-IN | All providers
Telugu | te-IN | All providers*
Urdu | ur-IN | Azure, Google

*Telugu uses STANDARD model only for Google (WAVENET not available; falls back automatically)
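
An unsupported language makes the whole request fail, so a local lookup can catch mismatches early. The sketch below transcribes the table above into a dictionary; SUPPORT, providers_for, and unsupported_datapoints are illustrative names, and the table itself should be treated as the reference if the two ever disagree.

```python
ALL = {"AZURE", "GOOGLE", "SARVAM", "KARYA_LOCAL"}
KARYA_ONLY = {"KARYA_LOCAL"}

# Language code -> provider values, transcribed from the table.
SUPPORT = {
    "as-IN": {"AZURE", "SARVAM", "KARYA_LOCAL"},
    "bn-IN": ALL,
    "brx-IN": KARYA_ONLY,
    "hne-IN": KARYA_ONLY,
    "doi-IN": KARYA_ONLY,
    "en-IN": ALL,
    "gu-IN": ALL,
    "hi-IN": ALL,
    "kn-IN": ALL,
    "ml-IN": ALL,
    "mni-IN": KARYA_ONLY,
    "mr-IN": ALL,
    "ne-IN": KARYA_ONLY,
    "or-IN": {"AZURE", "SARVAM", "KARYA_LOCAL"},
    "pa-IN": ALL,
    "sa-IN": KARYA_ONLY,
    "ta-IN": ALL,
    "te-IN": ALL,  # Google serves Telugu with STANDARD only
    "ur-IN": {"AZURE", "GOOGLE"},
}


def providers_for(lang):
    """Return the sorted provider values that list this language code."""
    return sorted(SUPPORT.get(lang, set()))


def unsupported_datapoints(provider, datapoints):
    """Return the keys of datapoints whose lang is not listed for the provider."""
    return [
        key
        for key, datapoint in datapoints.items()
        if provider not in SUPPORT.get(datapoint["lang"], set())
    ]
```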

Provider Language Support

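Language | Azure | Google | Sarvam | Karya Local
--- | --- | --- | --- | ---
Assamese (as-IN) | ✅ | ❌ | ✅ | ✅
Bengali (bn-IN) | ✅ | ✅ | ✅ | ✅
Bodo (brx-IN) | ❌ | ❌ | ❌ | ✅
Chhattisgarhi (hne-IN) | ❌ | ❌ | ❌ | ✅
Dogri (doi-IN) | ❌ | ❌ | ❌ | ✅
English (en-IN) | ✅ | ✅ | ✅ | ✅
Gujarati (gu-IN) | ✅ | ✅ | ✅ | ✅
Hindi (hi-IN) | ✅ | ✅ | ✅ | ✅
Kannada (kn-IN) | ✅ | ✅ | ✅ | ✅
Malayalam (ml-IN) | ✅ | ✅ | ✅ | ✅
Manipuri (mni-IN) | ❌ | ❌ | ❌ | ✅
Marathi (mr-IN) | ✅ | ✅ | ✅ | ✅
Nepali (ne-IN) | ❌ | ❌ | ❌ | ✅
Odia (or-IN) | ✅ | ❌ | ✅ | ✅
Punjabi (pa-IN) | ✅ | ✅ | ✅ | ✅
Sanskrit (sa-IN) | ❌ | ❌ | ❌ | ✅
Tamil (ta-IN) | ✅ | ✅ | ✅ | ✅
Telugu (te-IN) | ✅ | ✅* | ✅ | ✅
Urdu (ur-IN) | ✅ | ✅ | ❌ | ❌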

*Telugu uses STANDARD model only for Google (WAVENET not available; falls back automatically)


6. Example Response

On success, you'll get a JSON response with audio URLs:

{
  "task_id": "your_task_id",
  "status": "COMPLETED",
  "result": {
    "user_email": "your_email@example.com",
    "project_name": "your_project_name",
    "task_name": "my_first_tts_task_001",
    "provider": "AZURE",
    "models": ["NEURAL"],
    "voice_gender": "FEMALE",
    "datapoints": {
      "datapoint 1": {
        "text": "नमस्ते, यह एक परीक्षण है।",
        "speed": "1.0x",
        "lang": "hi-IN",
        "saas_url": "https://<storage_account_name>.blob.core.windows.net/audio-output1/..."
      }
    }
  }
}

The outer task_id and status fields are always present in a successful response, and status is "COMPLETED". A failed request does not report a different status value; it returns an HTTP error instead (HTTP 500 when synthesis fails).

Field Descriptions:

Field | Description
--- | ---
task_id | Unique task identifier
status | Task status (COMPLETED)
saas_url | Temporary download link for the audio file

Warning: Download your audio files within 24 hours. Links expire after that.
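
Given the expiry window, it is reasonable to download the audio as soon as the response arrives. The sketch below pulls the saas_url of each datapoint out of a parsed response and saves the files. Both helper names are illustrative, and taking the file name from the last path segment of the URL is an assumption about how the links are shaped.

```python
import os
import urllib.parse
import urllib.request


def audio_urls(response):
    """Return {datapoint key: saas_url} for every datapoint that has a URL."""
    datapoints = response.get("result", {}).get("datapoints", {})
    return {
        key: datapoint["saas_url"]
        for key, datapoint in datapoints.items()
        if "saas_url" in datapoint
    }


def download_audio(response, directory="."):
    """Save each audio file locally and return the list of written paths.

    The file name comes from the last path segment of the URL (an
    assumption); the datapoint key is used if that segment is empty.
    """
    written = []
    for key, url in audio_urls(response).items():
        name = os.path.basename(urllib.parse.urlsplit(url).path) or key
        path = os.path.join(directory, name)
        urllib.request.urlretrieve(url, path)
        written.append(path)
    return written
```

Calling download_audio(result) on the parsed JSON response writes the files to the current directory.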

All-or-Nothing Processing

If any datapoint fails to synthesize (e.g., unsupported language for the chosen provider), the entire task fails with HTTP 500. There is no partial-success behavior — all datapoints must succeed for the request to return successfully.
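
One client-side way to limit the impact of this behavior is to send several smaller requests, so that a failing datapoint only takes down its own group. This is a design suggestion rather than a documented feature, and split_datapoints is a hypothetical helper; each group would be sent as its own request with the same task-level fields.

```python
def split_datapoints(datapoints, size=1):
    """Split a datapoints dictionary into smaller dictionaries of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    keys = list(datapoints)
    return [
        {key: datapoints[key] for key in keys[start:start + size]}
        for start in range(0, len(keys), size)
    ]
```

Smaller groups isolate failures better but cost more requests; grouping datapoints by language is another option when only some languages are at risk.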


7. Response & Error Codes

Code | Description
--- | ---
200 | Request successful
400 | Invalid or missing parameters (invalid provider, model, speed format, missing required fields)
500 | Internal server error (TTS synthesis failure, provider API error)
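
In client code, these codes can be turned into actionable messages. The mapping below is a small sketch based only on the table above; explain_status is an illustrative name, and any code not listed is reported as unexpected rather than guessed at.

```python
# Hints derived from the response code table.
ERROR_HINTS = {
    400: "Invalid or missing parameters: check provider, models, speed format, and required fields.",
    500: "Synthesis failed: a provider API error or a datapoint the provider could not synthesize.",
}


def explain_status(code):
    """Return a short, human-readable hint for an HTTP status code."""
    if code == 200:
        return "Request successful."
    return ERROR_HINTS.get(code, f"Unexpected status {code}: not listed in the documentation.")
```

A 400 generally calls for fixing the request before resending it, whereas a 500 may stem from either the provider or a single unsupported datapoint.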