Documentation

Text to speech

Service: https://tts.api.yating.tw/v1
The Speech service lets you convert text into synthesized speech and get a list of supported voices for a region through a REST API. This article covers authorization options, query options, how to structure a request, and how to interpret a response.
The text-to-speech REST API supports neural text-to-speech voices, each of which supports specific languages and dialects identified by locale. Each available endpoint is associated with a region. A subscription key for the endpoint or region that you plan to use is required.

How to make a TTS synthesis HTTP request:
Step 1: Call the /speeches/short endpoint of the API server.
Step 2: After the audio is synthesized, base64-decode `audioContent` from the response and save the result to a file.
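Step 2 can be sketched in Python as follows. The helper name `save_audio` and the output path are illustrative and not part of the API; the sketch assumes the response body has already been parsed into a dict that carries an `audioContent` field, as in the response example in this document.

```python
import base64
from pathlib import Path


def save_audio(response_body: dict, path: str) -> int:
    """Decode the base64 `audioContent` field and write the raw bytes to `path`.

    Returns the number of bytes written. `save_audio` is a hypothetical
    helper, not something the service provides.
    """
    audio_bytes = base64.b64decode(response_body["audioContent"])
    Path(path).write_bytes(audio_bytes)
    return len(audio_bytes)
```

For an MP3 request, a call such as `save_audio(result, "speech.mp3")` would write a file that audio players can open directly.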

TTS RESTful API

Synthesizes speech synchronously: the result is returned after all input text has been processed. The expected response time is under 60 seconds.
Request
URL: https://tts.api.yating.tw/v1/speeches/short
Method: POST
Header
| Name | Type | Info |
| --- | --- | --- |
| *key | String | |
| *Content-Type | String | Only "application/json" |
Body
| Name | Type | Info |
| --- | --- | --- |
| *textConfig | JSON | 1. text: the text to synthesize. Maximum text length is 100 characters. 2. type: only "text". |
| *modelConfig | JSON | See variables in tts modelConfig |
| *audioConfig | JSON | See variables in tts audioConfig |
{
  "input": {
    "text": "這是測試",
    "type": "text"
  },
  "voice": {
    "model": "zh_en_female_2"
  },
  "audioConfig": {
    "encoding": "MP3",
    "samplingRate": "16K"
  }
}
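As an illustration, the request above could be assembled with Python's standard library roughly as follows. This is a sketch under assumptions: `build_request` and `YOUR_KEY` are hypothetical names, and the body keys follow the example body (`input`, `voice`, `samplingRate`) rather than the names in the parameter tables (`textConfig`, `modelConfig`, `sampleRate`), since this page shows both. Check which set the service accepts before relying on it.

```python
import json
import urllib.request

API_URL = "https://tts.api.yating.tw/v1/speeches/short"


def build_request(api_key: str, text: str,
                  model: str = "zh_en_female_2") -> urllib.request.Request:
    """Build (but do not send) the POST request for /speeches/short."""
    # Assumption: key names mirror the example body on this page.
    body = {
        "input": {"text": text, "type": "text"},
        "voice": {"model": model},
        "audioConfig": {"encoding": "MP3", "samplingRate": "16K"},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Sending the request (requires a valid key, so it is left commented out).
# The 60-second timeout matches the expected response time stated above.
# with urllib.request.urlopen(build_request("YOUR_KEY", "這是測試"), timeout=60) as resp:
#     result = json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.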
variables in tts modelConfig
| Variable | Type | Info |
| --- | --- | --- |
| *model | string | See the Model list section |
variables in tts audioConfig
| Variable | Type | Info |
| --- | --- | --- |
| *encoding | string | See the Audio encodings section |
| *sampleRate | string | Only "16K" is accepted. |
Response
Body
{
  "audioContent": "//NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o..."
}
[201] Success: the audio file, base64-encoded, is in `audioContent`.
[400] Invalid request format: wrong body format, voice not found, or sampleRate not supported for the voice.
{
  statusCode: 400,
  message: string[],
  error: "Bad Request"
}
[401] Unauthorized: the key does not exist or exceeds its limit.
[422] Pipeline error: see the pipeline error message table below for details.
[500] Internal server error: token not generated (found).
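A client might map these status codes to outcomes along the following lines. This is a minimal sketch, not the service's own client: `TtsError` and `interpret_response` are hypothetical names, and it assumes that error bodies other than 400 may also carry a `message` field, which this page does not confirm.

```python
import base64

# Short descriptions taken from the status list above.
STATUS_REASONS = {
    400: "invalid request format",
    401: "unauthorized",
    422: "pipeline error",
    500: "internal server error",
}


class TtsError(Exception):
    """Raised for any non-201 response (hypothetical client-side type)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


def interpret_response(status_code: int, body: dict) -> bytes:
    """Return decoded audio bytes for 201, otherwise raise TtsError."""
    if status_code == 201:
        return base64.b64decode(body["audioContent"])
    reason = STATUS_REASONS.get(status_code, "unexpected status")
    message = body.get("message", "")
    # The 400 body documents `message` as a list of strings.
    if isinstance(message, list):
        message = "; ".join(message)
    raise TtsError(status_code, f"{reason}: {message}" if message else reason)
```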
Pipeline error messages
| Error message | Type | Info |
| --- | --- | --- |
| internal pipeline error: unknown payload [for reqId(...) segId(...)] | Internal pipeline error | A pipeline stage received a message it could not process (wrong format). This usually happens when a stage worker runs a different version from the previous stage, so the interface between them does not match. |
| internal pipeline error: inferencer unavailable | Internal pipeline error | The inferencer did not respond. |
| ssml validation error: no text to synthesis | SSML validation error | The SSML input is empty. |
| ssml validation error: parsing error at line: {line}, column: {col}: {msg} | SSML validation error | SSML grammar error. |
| ssml validation error: context error at line: {line}, column: {col}: {msg} | SSML validation error | The SSML conflicts with the request context. For example, if a future voice named Jason supports only 8K and not 16K, a request that selects 16K and specifies Jason in the SSML fails because the voice and the requested sample rate do not match. |
| encoding error: encode error for reqId(...) segId(...) | Encoding error | The encoder received the message, but an error occurred while processing it. |
| vocoding error: vocode error for reqId(...) segId(...) | Vocoding error | The vocoder received the message, but an error occurred while processing it. |
| transcoding error: transcode error for reqId(...) segId(...) | Transcoding error | The transcoder received the message, but an error occurred while processing it. (This error does not occur yet, but it should be added in the future.) |
| unknown error | Unknown error | An unclassified error. It may indicate a program bug; in theory it should not appear. |
| service busy | Service busy | The system is currently busy. Try again later. |
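Because each message in the table starts with a fixed prefix, a client could recover the error type with a simple prefix match. The sketch below assumes the message string from a 422 response has already been extracted; `classify_pipeline_error` is a hypothetical helper, and messages that match no documented prefix fall back to "Unknown error".

```python
# Prefix-to-type pairs copied from the pipeline error table.
PIPELINE_ERROR_PREFIXES = [
    ("internal pipeline error:", "Internal pipeline error"),
    ("ssml validation error:", "SSML validation error"),
    ("encoding error:", "Encoding error"),
    ("vocoding error:", "Vocoding error"),
    ("transcoding error:", "Transcoding error"),
    ("service busy", "Service busy"),
    ("unknown error", "Unknown error"),
]


def classify_pipeline_error(message: str) -> str:
    """Return the error type whose documented prefix starts `message`."""
    normalized = message.strip().lower()
    for prefix, error_type in PIPELINE_ERROR_PREFIXES:
        if normalized.startswith(prefix):
            return error_type
    # Anything undocumented is treated as unknown.
    return "Unknown error"
```

A caller might, for instance, retry later on "Service busy" and surface "SSML validation error" to the user, since the latter points at the input rather than the service.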

Model list

Three voices are currently available: one male and two female. Choose one and set its model code in the request.
| Model code | Info | Language |
| --- | --- | --- |
| zh_en_female_1 | Female1 (Yating) | Mandarin and English |
| zh_en_male_1 | Male1 (Jiahao) | Mandarin and English |
| zh_en_female_2 | Female2 (Yiqing) | Mandarin and English |

Audio encodings

In addition to the voice, you can choose the output audio format. The TTS system currently supports MP3 and LINEAR16.
| Codec | Name | Note |
| --- | --- | --- |
| MP3 | MPEG Audio Layer III | |
| LINEAR16 | Linear PCM | 16-bit linear pulse-code modulation (PCM) encoding. The header must contain the sample rate. |
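The documented limits (text length, model codes, encodings, and sample rate) can be checked on the client before a request is sent, which avoids a round trip that would end in a 400 response. The following is a sketch only: `validate_request` is a hypothetical helper, and it assumes that "100 characters" means 100 Unicode code points, which this page does not specify.

```python
from typing import List

# Values copied from the Model list and Audio encodings sections.
SUPPORTED_MODELS = {"zh_en_female_1", "zh_en_male_1", "zh_en_female_2"}
SUPPORTED_ENCODINGS = {"MP3", "LINEAR16"}
SUPPORTED_SAMPLE_RATES = {"16K"}
MAX_TEXT_LENGTH = 100  # assumption: counted in Unicode code points


def validate_request(text: str, model: str, encoding: str,
                     sample_rate: str) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text has {len(text)} characters, limit is {MAX_TEXT_LENGTH}")
    if model not in SUPPORTED_MODELS:
        problems.append(f"unknown model: {model}")
    if encoding not in SUPPORTED_ENCODINGS:
        problems.append(f"unsupported encoding: {encoding}")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")
    return problems
```

Returning every problem at once, rather than failing on the first, lets a caller report all issues with a request in a single pass.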

Limit

Maximum concurrent requests per key: 1
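A multi-threaded client can respect this limit by serializing calls that share a key. The sketch below uses one lock per key; `lock_for_key` and `call_serialized` are hypothetical helpers, and the approach only covers threads within a single process, not several processes or machines sharing the same key.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")

_key_locks: Dict[str, threading.Lock] = {}
_registry_lock = threading.Lock()


def lock_for_key(api_key: str) -> threading.Lock:
    """Return the lock dedicated to `api_key`, creating it on first use."""
    with _registry_lock:
        return _key_locks.setdefault(api_key, threading.Lock())


def call_serialized(api_key: str, send: Callable[[], T]) -> T:
    """Run `send()` while holding the key's lock.

    At most one request per key is in flight within this process at a time.
    """
    with lock_for_key(api_key):
        return send()
```

Here `send` would be any zero-argument callable that performs the HTTP request, for example a lambda wrapping the client's own request function.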

Samples

Pricing