ASR custom language model
The Yating ASR service offers a customization interface that extends its speech recognition capabilities. Customizing a base model for your domain and audio improves the accuracy of speech recognition requests.
The language model customization interface can improve the accuracy of speech recognition for domains such as medicine, law, information technology, and others. By using language model customization, you can expand and tailor the vocabulary of a base model to include domain-specific terminology.
You create a custom language model and add paragraphs, corpora and words specific to your domain. Once you train the custom language model on your enhanced vocabulary, you can use it for customized speech recognition. The service can typically train any custom model in a matter of minutes. The time and effort that are needed to create a custom model depend on the data that you have available for the model.
After creating a custom language model, you can use it with the streaming or batch ASR services.
Create a customized language model
A corpus is a plain text document that uses terminology from the domain in context. You can generate a customized language model by providing the URL of a .txt corpus file. After a few minutes, you should receive a customized language model ID.
After you create your custom model, you can use it with speech recognition requests. If the audio that is passed for transcription contains domain-specific words that are defined in the custom model's corpora and custom words, the results of the request reflect the model's enhanced vocabulary. You can use only one model at a time with a speech recognition request, and a model can be used only by the user who created it.
There are two kinds of training data: text and words. Put paragraphs and sentences in a text file, and put phrases and words in a word file. Format the data properly and save it in one or more text files. Make sure that the total size of all files is under 5 MB and that each text file:
Is in plain text (it's not a file such as a Microsoft Word document, comma-separated value file, or PDF).
Is encoded in UTF-8.
Doesn't contain any formatting characters, such as HTML tags.
Is less than 1 MB in size.
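The requirements above can be checked locally before you upload anything. The following is a minimal sketch, not part of the Yating service: the function names are made up for illustration, 1 MB is assumed to mean 1024 * 1024 bytes, and the tag check is a rough heuristic rather than a real HTML parser.

```python
import re
from pathlib import Path

# Assumption: 1 MB = 1024 * 1024 bytes. The service may count sizes differently.
MAX_FILE_BYTES = 1 * 1024 * 1024    # per-file limit (less than 1 MB)
MAX_TOTAL_BYTES = 5 * 1024 * 1024   # combined limit (under 5 MB)

# Rough heuristic for markup such as <p> or </div>; not a full HTML parser.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def check_training_bytes(name, data):
    """Return a list of problems found in the raw bytes of one training file."""
    problems = []
    if len(data) >= MAX_FILE_BYTES:
        problems.append(f"{name}: {len(data)} bytes, must be less than 1 MB")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append(f"{name}: not valid UTF-8")
        return problems
    if "\x00" in text:
        problems.append(f"{name}: contains NUL bytes, probably not plain text")
    if TAG_PATTERN.search(text):
        problems.append(f"{name}: contains HTML-like tags")
    return problems


def check_training_files(paths):
    """Check each file and the combined size. An empty list means no problems."""
    problems = []
    total = 0
    for path in map(Path, paths):
        data = path.read_bytes()
        total += len(data)
        problems.extend(check_training_bytes(path.name, data))
    if total >= MAX_TOTAL_BYTES:
        problems.append(f"total size is {total} bytes, must be under 5 MB")
    return problems
```

Calling check_training_files(["corpus.txt", "words.txt"]) with your own file names returns a list of human-readable problems, so an empty list suggests the files meet the stated limits.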
Text file URLs.
Word file URLs.
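As an illustration of how the two URL lists might be assembled before a training request is sent, here is a hedged sketch. The field names text_file_urls and word_file_urls, both function names, and the rule that every URL must end in .txt are assumptions made for this example. They are not taken from the Yating API, so check the API reference for the real request format.

```python
from urllib.parse import urlparse


def check_txt_url(url):
    """Return the URL unchanged if it looks like an http(s) link to a .txt file."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if not parts.path.lower().endswith(".txt"):
        raise ValueError(f"expected a .txt file: {url!r}")
    return url


def build_training_payload(text_file_urls, word_file_urls=()):
    """Group the validated URLs by kind of training data (text or words)."""
    text_urls = [check_txt_url(u) for u in text_file_urls]
    word_urls = [check_txt_url(u) for u in word_file_urls]
    if not text_urls and not word_urls:
        raise ValueError("at least one training file URL is required")
    # Hypothetical field names; the real request body may be shaped differently.
    return {"text_file_urls": text_urls, "word_file_urls": word_urls}
```

Keeping the two lists separate mirrors the split between paragraph text and word lists described earlier.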
Get the status of all models
Get the status of a model
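Because training takes a few minutes, a client would typically check the status of a model repeatedly until training finishes. The sketch below shows one way to structure that loop. It makes no assumptions about the HTTP call itself: fetch_status is a function you supply that returns the current status string for a model ID. The status values "ready" and "failed", the model ID "lm-123", and the default intervals are hypothetical placeholders, not values from the Yating API.

```python
import time


def wait_for_model(fetch_status, model_id, done=("ready",), failed=("failed",),
                   interval_s=30.0, timeout_s=1800.0, sleep=time.sleep):
    """Poll fetch_status(model_id) until it reports a finished or failed state.

    fetch_status is supplied by the caller and should return a status string.
    The default status names are placeholders; replace them with the values
    the service actually returns.
    """
    waited = 0.0
    while True:
        status = fetch_status(model_id)
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"training failed for model {model_id}: {status}")
        if waited >= timeout_s:
            raise TimeoutError(f"model {model_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Example with a stand-in status function instead of a real API call.
fake_statuses = iter(["training", "training", "ready"])
print(wait_for_model(lambda model_id: next(fake_statuses), "lm-123",
                     sleep=lambda seconds: None))
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.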
Delete a custom LM
If you succeed, you’ll see.