Convert spoken audio to text. The API can be directed to turn on the microphone and recognize its audio in real time, to recognize audio from a different real-time audio source, or to recognize audio from a file. In all cases, real-time streaming is available, so partial recognition results are returned while the audio is still being sent to the server. The Speech to Text API enables you to build smart apps that are voice triggered. To see how it works, select your target language, then click the microphone and start speaking. Alternatively, click one of the sample speech phrases to see how speech recognition works. When you use this demo, you consent to providing your voice input data to Microsoft for service improvement purposes.
To try out the demo with your own voice using a microphone, please switch to a browser with WebRTC support, for example a recent version of Microsoft Edge, Firefox, or Chrome.
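The streaming behavior described above, where partial results come back while audio is still being sent, can be illustrated with a toy simulation. This is only a sketch: `stream_recognize` is a hypothetical stand-in rather than the service's API, and each "audio chunk" is modeled as the list of words it contains so the example runs without audio or a network connection.

```python
from typing import Iterable, Iterator, List, Tuple


def stream_recognize(chunks: Iterable[List[str]]) -> Iterator[Tuple[str, str]]:
    """Toy stand-in for a streaming recognizer (hypothetical, not the real API)."""
    heard: List[str] = []
    for chunk in chunks:
        heard.extend(chunk)
        # A partial hypothesis is emitted as soon as each chunk arrives.
        yield ("partial", " ".join(heard))
    # Once the audio ends, the final result is emitted.
    yield ("final", " ".join(heard))


# Each inner list stands for one chunk of audio sent to the server.
audio_chunks = [["what", "is"], ["the", "weather"], ["like", "today"]]
results = list(stream_recognize(audio_chunks))
for kind, text in results:
    print(f"{kind}: {text}")
```

An app would typically show the partial lines as live feedback and act only on the final one.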
Text to Speech
Convert text to spoken audio. When applications need to “talk” back to their users, this API can convert text generated by the app into audio that can be played back to the user. The Text to Speech API enables you to build smart apps that can speak. To test it, choose your target language, add your sentences, then click the play button to see how speech synthesis works. When you use this demo, you consent to providing your voice input data to Microsoft for service improvement purposes.
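Because the pricing table lists a limit of 1000 characters per text to speech call, an app with longer text would need to split it before sending. The helper below is a minimal sketch of one way to do that; `split_for_synthesis` is a hypothetical name, and splitting on whitespace is an assumption (a real app might prefer sentence boundaries).

```python
from typing import List

MAX_CHARS_PER_CALL = 1000  # limit taken from the pricing table


def split_for_synthesis(text: str, limit: int = MAX_CHARS_PER_CALL) -> List[str]:
    """Split text on whitespace into pieces of at most `limit` characters."""
    pieces: List[str] = []
    current = ""
    for word in text.split():
        # Hard-split any single word that is longer than the limit.
        while len(word) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = word
    if current:
        pieces.append(current)
    return pieces


long_text = "hello world " * 200  # 2400 characters
pieces = split_for_synthesis(long_text)
print(len(pieces), [len(p) for p in pieces])
```

Each piece would then be sent as a separate synthesis call and the resulting audio played back in order.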
Speech Intent Recognition
Convert spoken audio to intent. This is similar to Speech Recognition. With Speech Intent Recognition, in addition to returning recognized text from audio input, the server returns structured information about the incoming speech so that apps can easily parse the intent of the speaker and drive further action. Models trained by the Microsoft Language Understanding Intelligent Service (LUIS) are used to generate the intent.
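To show how an app might act on that structured information, here is a small sketch that parses an intent result and dispatches to a handler. The JSON payload and its field names (`text`, `intent`, `score`, `entities`) are assumptions made for illustration and do not reproduce the service's actual response format; the handler names are hypothetical as well.

```python
import json
from typing import Callable, Dict

# Hypothetical response payload; the field names are assumptions for illustration.
response_json = """
{
  "text": "turn on the kitchen lights",
  "intent": {"name": "TurnOn", "score": 0.97},
  "entities": [{"type": "room", "value": "kitchen"}]
}
"""


def turn_on(entities: Dict[str, str]) -> str:
    return f"Turning on lights in: {entities.get('room', 'unknown room')}"


def fallback(entities: Dict[str, str]) -> str:
    return "Sorry, I did not understand that."


# Map intent names to the functions that carry out the action.
HANDLERS: Dict[str, Callable[[Dict[str, str]], str]] = {"TurnOn": turn_on}


def handle(payload: str, min_score: float = 0.5) -> str:
    result = json.loads(payload)
    intent = result.get("intent", {})
    entities = {e["type"]: e["value"] for e in result.get("entities", [])}
    # Fall back when the score is too low or the intent has no handler.
    if intent.get("score", 0.0) < min_score:
        return fallback(entities)
    return HANDLERS.get(intent.get("name"), fallback)(entities)


print(handle(response_json))
```

Keeping the intent-to-handler mapping in a dictionary means new intents can be supported without touching the parsing code.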
|Plan|Limit|Price|
|---|---|---|
|Free|5K calls per month|Free|
|Text to speech|1000 characters per call|$4 per 1000 calls|
|Speech to text|15 sec per call|$4 per 1000 calls|
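The per-call limits and prices in the table can be turned into a rough cost estimate. The sketch below rests on several assumptions not stated in the table: that longer inputs are split into calls at exactly the listed limit, that paid usage is billed pro rata per call, and that the free plan is ignored. The function names are hypothetical.

```python
import math

PRICE_PER_1000_CALLS = 4.00  # USD, from the table
STT_SECONDS_PER_CALL = 15    # speech to text limit per call
TTS_CHARS_PER_CALL = 1000    # text to speech limit per call


def calls_for_audio(total_seconds: float) -> int:
    """Number of speech to text calls needed for the given audio duration."""
    return math.ceil(total_seconds / STT_SECONDS_PER_CALL)


def calls_for_text(total_chars: int) -> int:
    """Number of text to speech calls needed for the given character count."""
    return math.ceil(total_chars / TTS_CHARS_PER_CALL)


def paid_cost(calls: int) -> float:
    """Cost in USD, assuming pro rata billing at $4 per 1000 calls."""
    return calls / 1000 * PRICE_PER_1000_CALLS


audio_calls = calls_for_audio(10 * 60 * 60)  # 10 hours of audio
text_calls = calls_for_text(250_000)         # 250,000 characters of text
print(f"speech to text: {audio_calls} calls, ${paid_cost(audio_calls):.2f}")
print(f"text to speech: {text_calls} calls, ${paid_cost(text_calls):.2f}")
```

Under these assumptions, 10 hours of audio works out to 2400 calls and 250,000 characters to 250 calls.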