Universal Speech Access API (USAI) applications use barge-in detection to interrupt an active voice prompt when a caller begins speaking before the prompt finishes. USAI enables applications to implement barge-in detection in the following ways:
Synchronizing the actions of a recognizer and a synthesizer that run in different sessions. When the recognizer detects speech input, USAI returns SAIEVN_START_OF_SPEECH to the application, which notifies the synthesizer. The synthesizer then stops generating its speech prompt.
Using a vendor-specific link between the synthesizer and recognizer.
This topic describes two scenarios for implementing a barge-in enabled prompt:
Performing barge-in with recognizers and synthesizers in different sessions
Performing barge-in with linked recognizers and synthesizers
When speech resources are not in the same session and the recognizer and synthesizer processes are not linked, the MRCP client can act as a proxy by forwarding start-of-speech messages received from a recognizer to a specific synthesizer. This process works in the following way:
| Step | Action |
| --- | --- |
| 1 | The recognizer detects the beginning of speech input. |
| 2 | USAI returns SAIEVN_START_OF_SPEECH to the client application. |
| 3 | The client application uses saiNotifyBargeInToSynthesizer to send a BARGE-IN-OCCURRED event to the synthesizer that indicates that the recognizer resource has detected speech input. |
| 4 | The synthesizer receives a BARGE-IN-OCCURRED event and stops playing the prompt. |
| 5 | The recognizer begins recognizing the input voice data. |
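The relay in the steps above can be modeled in a few lines. The following is a minimal sketch in Python, not USAI code: the classes `Recognizer`, `ClientApplication`, and `Synthesizer` and all of their methods are hypothetical stand-ins, and only the event names SAIEVN_START_OF_SPEECH and BARGE-IN-OCCURRED come from this topic. It shows the order in which the notification travels from the recognizer session, through the application, to the synthesizer session.

```python
from dataclasses import dataclass, field


@dataclass
class Synthesizer:
    """Hypothetical stand-in for a synthesizer resource in its own session."""
    playing: bool = False
    received: list = field(default_factory=list)

    def speak(self) -> None:
        self.playing = True

    def on_barge_in_occurred(self) -> None:
        # Step 4: the synthesizer receives BARGE-IN-OCCURRED and stops the prompt.
        self.received.append("BARGE-IN-OCCURRED")
        self.playing = False


class ClientApplication:
    """Hypothetical stand-in for the application that proxies between sessions."""

    def __init__(self, synthesizer: Synthesizer) -> None:
        self.synthesizer = synthesizer
        self.received = []

    def on_event(self, event: str) -> None:
        # Step 2: USAI returns the event to the client application.
        self.received.append(event)
        if event == "SAIEVN_START_OF_SPEECH":
            # Step 3: relay the notification to the chosen synthesizer
            # (the role that saiNotifyBargeInToSynthesizer plays in USAI).
            self.synthesizer.on_barge_in_occurred()


class Recognizer:
    """Hypothetical stand-in for a recognizer resource in a different session."""

    def __init__(self, app: ClientApplication) -> None:
        self.app = app
        self.recognizing = False

    def detect_speech(self) -> None:
        # Step 1: the beginning of speech input is detected.
        self.app.on_event("SAIEVN_START_OF_SPEECH")
        # Step 5: recognition of the input voice data begins.
        self.recognizing = True


def run_relay():
    synthesizer = Synthesizer()
    app = ClientApplication(synthesizer)
    recognizer = Recognizer(app)
    synthesizer.speak()          # a prompt is active
    recognizer.detect_speech()   # the caller starts talking
    return synthesizer, app, recognizer


if __name__ == "__main__":
    synthesizer, app, recognizer = run_relay()
    print(app.received, synthesizer.received, synthesizer.playing)
```

The model keeps the synthesizer unaware of the recognizer: the only path between them runs through the application, which is what lets the two resources live in different sessions.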
To implement a barge-in enabled prompt, the application must start and configure the recognizer and synthesizer in the following way:
Create and configure the recognizer so that the recognizer detects speech input before the synthesizer begins to play a voice prompt.
Start the synthesizer with barge-in enabled.
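The ordering requirement above matters because speech that arrives before the recognizer is listening cannot trigger a barge-in. As an illustration only, the following Python sketch checks that condition on a timeline. The function name and its parameters are hypothetical, and the timestamps are arbitrary seconds on a shared clock rather than values that USAI exposes.

```python
def prompt_is_interrupted(recognizer_ready_at: float,
                          prompt_start: float,
                          prompt_end: float,
                          speech_at: float) -> bool:
    """Return True when speech at `speech_at` can interrupt the prompt.

    All arguments are hypothetical timestamps in seconds on a shared clock.
    """
    speech_during_prompt = prompt_start <= speech_at < prompt_end
    recognizer_listening = recognizer_ready_at <= speech_at
    return speech_during_prompt and recognizer_listening


if __name__ == "__main__":
    # Recognizer ready before the prompt starts: early speech is caught.
    print(prompt_is_interrupted(0.0, 1.0, 5.0, 1.2))
    # Recognizer ready only after the prompt starts: the same speech is missed.
    print(prompt_is_interrupted(2.0, 1.0, 5.0, 1.2))
```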
To implement kill-on-barge-in for unlinked synthesizers and recognizers:
| Step | Action |
| --- | --- |
| 1 | The application creates a recognizer resource with saiCreateRecognizer, after making sure (with saiAsrSetRecognitionStartTimer) that all timers are disabled. |
| 2 | The application creates a synthesizer resource with saiCreateSynthesizer. |
| 3 | The application enables barge-in for the synthesizer with saiTtsSetKillOnBargeIn (setting the value to TRUE). |
| 4 | The application plays a voice prompt with saiSpeakSynthesizer. |
| 5 | If the recognizer detects voice input before the synthesizer finishes generating its voice prompt, the recognizer sends a START-OF-SPEECH message to USAI. |
| 6 | USAI returns SAIEVN_START_OF_SPEECH to the application. |
| 7 | The application associates the received SAIEVN_START_OF_SPEECH with a particular synthesizer session and invokes saiNotifyBargeInToSynthesizer to notify the synthesizer that a barge-in has occurred. |
| 8 | The synthesizer automatically stops playing the active voice prompt (the application does not have to call saiStopSynthesizer) and eliminates all voice prompts currently in its queue. |
| 9 | The application invokes saiStartTimerRecognizer to start the recognizer’s no-input-timeout timer. The no-input-timeout timer specifies how long the recognizer waits without detecting voice input before ending the recognizer task. |
Note: saiAsrSetRecognitionStartTimer specifies a TRUE or FALSE value that enables or disables the recognizer timers (both the recognition timeout and the no-input-timeout). saiAsrSetRecognitionTimeout specifies the maximum length of time that the recognizer waits, once speech is detected, before terminating a recognition request.
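The nine steps for unlinked resources can be traced with a small model. This is a hedged sketch in Python, not a USAI program: the `Recognizer` and `Synthesizer` classes, their fields, and `run_unlinked_barge_in` are hypothetical, and the real USAI function signatures are not shown in this topic. Comments name the USAI function whose role each line imitates. The sketch covers only the case in which the caller speaks during the prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Recognizer:
    """Hypothetical model of a recognizer and its timer state."""
    timers_enabled: bool = True          # toggled by saiAsrSetRecognitionStartTimer in USAI
    no_input_timer_running: bool = False

    def start_timer(self) -> None:
        # Role of saiStartTimerRecognizer (step 9).
        self.no_input_timer_running = True


@dataclass
class Synthesizer:
    """Hypothetical model of a synthesizer and its prompt queue."""
    kill_on_barge_in: bool = False       # toggled by saiTtsSetKillOnBargeIn in USAI
    queue: list = field(default_factory=list)

    def speak(self, prompt: str) -> None:
        # Role of saiSpeakSynthesizer: the first queued prompt is the active one.
        self.queue.append(prompt)

    def notify_barge_in(self) -> None:
        # Step 8: stop the active prompt and discard everything still queued.
        if self.kill_on_barge_in:
            self.queue.clear()


def run_unlinked_barge_in(prompts):
    """Walk through steps 1 to 9 for a caller who speaks during the prompt."""
    log = []
    recognizer = Recognizer(timers_enabled=False)
    log.append("1 create recognizer, timers disabled")
    synthesizer = Synthesizer()
    log.append("2 create synthesizer")
    synthesizer.kill_on_barge_in = True
    log.append("3 enable kill-on-barge-in")
    for prompt in prompts:
        synthesizer.speak(prompt)
    log.append("4 play prompt")
    log.append("5-6 SAIEVN_START_OF_SPEECH reaches the application")
    synthesizer.notify_barge_in()
    log.append("7-8 notify synthesizer, prompt and queue dropped")
    recognizer.start_timer()
    log.append("9 start no-input-timeout timer")
    return recognizer, synthesizer, log


if __name__ == "__main__":
    _, _, log = run_unlinked_barge_in(["Welcome.", "Please say your account number."])
    print("\n".join(log))
```

In this model the application never calls a stop function on the synthesizer. The notification alone empties the queue, which mirrors step 8.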
If the synthesizer or recognizer vendor supports linked recognizer and synthesizer tasks, you can interrupt an active synthesizer prompt more quickly than when the interruption is managed by the application and USAI.
To implement kill-on-barge-in for linked synthesizers and recognizers:
| Step | Action |
| --- | --- |
| 1 | The application creates a recognizer resource with saiCreateRecognizer, with its no-input-timeout timer disabled (this is the default configuration). |
| 2 | The application creates a synthesizer resource with saiCreateSynthesizer. |
| 3 | The application enables barge-in for the synthesizer with saiTtsSetKillOnBargeIn (setting the value to TRUE). |
| 4 | The application plays a voice prompt with saiSpeakSynthesizer. |
| 5 | If the recognizer detects voice input before the synthesizer finishes generating its voice prompt, the recognizer sends a START-OF-SPEECH message to USAI. |
| 6 | USAI returns SAIEVN_START_OF_SPEECH to the application. |
| 7 | The recognizer sends a stop command to the synthesizer before USAI returns SAIEVN_START_OF_SPEECH to the client application. Even if the synthesizer is already stopped, the MRCP client application must close the loop: it associates SAIEVN_START_OF_SPEECH with a particular synthesizer session and invokes saiNotifyBargeInToSynthesizer to notify the synthesizer that a barge-in has occurred. A ProxySyncId identifier is embedded in the SAIEVN_START_OF_SPEECH event returned to the application. The application must specify this ProxySyncId when invoking saiNotifyBargeInToSynthesizer. |
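Step 7 for linked resources can be sketched as follows. This is an illustrative Python model under stated assumptions, not USAI code: the classes, the integer form of the identifier, and the way the synthesizer tracks open exchanges are all hypothetical, because this topic says only that a ProxySyncId is embedded in SAIEVN_START_OF_SPEECH and must be passed to saiNotifyBargeInToSynthesizer. The sketch shows that the prompt is already stopped when the event reaches the application, and that the application still echoes the identifier back.

```python
from dataclasses import dataclass, field


@dataclass
class StartOfSpeechEvent:
    """Models SAIEVN_START_OF_SPEECH carrying its embedded ProxySyncId."""
    proxy_sync_id: int   # an integer is an assumption made for this sketch


@dataclass
class LinkedSynthesizer:
    """Hypothetical synthesizer that a vendor link can stop directly."""
    playing: bool = False
    open_sync_ids: set = field(default_factory=set)

    def vendor_stop(self, proxy_sync_id: int) -> None:
        # Vendor link: the recognizer stops the prompt without the application.
        self.playing = False
        self.open_sync_ids.add(proxy_sync_id)

    def notify_barge_in(self, proxy_sync_id: int) -> None:
        # Role of saiNotifyBargeInToSynthesizer: the prompt may already be
        # stopped, but the exchange is only closed by the matching identifier.
        if proxy_sync_id not in self.open_sync_ids:
            raise ValueError(f"unknown ProxySyncId: {proxy_sync_id}")
        self.open_sync_ids.remove(proxy_sync_id)
        self.playing = False


class LinkedRecognizer:
    """Hypothetical recognizer with a vendor-specific link to a synthesizer."""

    def __init__(self, synthesizer: LinkedSynthesizer) -> None:
        self.synthesizer = synthesizer
        self.next_sync_id = 1

    def detect_speech(self) -> StartOfSpeechEvent:
        sync_id = self.next_sync_id
        self.next_sync_id += 1
        # The stop command goes out before the event reaches the application.
        self.synthesizer.vendor_stop(sync_id)
        return StartOfSpeechEvent(proxy_sync_id=sync_id)


if __name__ == "__main__":
    synthesizer = LinkedSynthesizer(playing=True)
    recognizer = LinkedRecognizer(synthesizer)
    event = recognizer.detect_speech()
    print("playing when event arrives:", synthesizer.playing)
    synthesizer.notify_barge_in(event.proxy_sync_id)   # close the loop
    print("open exchanges:", synthesizer.open_sync_ids)
```

Keeping the identifier on the event rather than in application state means that the application can close the loop for the right exchange even when several prompts have been interrupted in quick succession.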