Sends a data array containing raw commands to an actively running recording function. Use adiCommandRecord to enable and configure voice activity detection.
AG
CG
DWORD adiCommandRecord ( CTAHD ctahd, WORD *data [], DWORD nwords)
| Argument | Description |
|---|---|
| ctahd | Context handle returned by ctaCreateContext or ctaAttachContext. |
| data | Pointer to an array of 16-bit data containing the commands. |
| nwords | Number of 16-bit data words. |

| Return value | Description |
|---|---|
| SUCCESS | |
| CTAERR_FUNCTION_NOT_ACTIVE | ADI service recording function was not started before calling adiCommandRecord. |
| CTAERR_INVALID_SEQUENCE | adiStopRecording has already been invoked. |

| Event | Description |
|---|---|
| ADIEVN_RECORD_EVENT | Contains information sent by the custom recording function. The event value field may contain one of the reason codes listed in the following table (also defined in /nms/include/evad.h). |

| Reason code | Description |
|---|---|
| EVAD_EVN_FUNCTION_DISABLED | Voice activity detection disabled. |
| EVAD_EVN_FUNCTION_ENABLED | Voice activity detection enabled. |
| EVAD_EVN_FUNCTION_ERROR | Unknown or invalid parameter. |
| EVAD_EVN_SIGNALLING_DISABLED | Voice activity detection messaging disabled. |
| EVAD_EVN_SIGNALLING_ENABLED | Voice activity detection messaging enabled. |
| EVAD_EVN_SPEECH_BEGIN | Speech started. The event buffer contains the energy of the frame generating the event and the energy of the background noise in dB. |
| EVAD_EVN_SPEECH_END | Speech stopped. The event buffer contains the energy of the frame generating the event and the energy of the background noise in dB. |
| EVAD_EVN_STREAMING_PAUSED | Voice streaming from board to application paused. |
| EVAD_EVN_STREAMING_RESUMED | Voice streaming from board to application resumed. |

Note: The application receives ADIEVN_RECORD_EVENT asynchronously, while the speech buffers arrive every buffersize x framerate / framesize msec, attached to ADIEVN_RECORD_BUFFER_FULL (when speech is detected).
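As a worked instance of the interval in the note, the following sketch evaluates buffersize x framerate / framesize. It assumes that framerate is the duration of one frame in milliseconds and that buffersize and framesize are byte counts. The function name and the sample values are illustrative and are not taken from the ADI headers.

```c
/* Interval, in msec, between ADIEVN_RECORD_BUFFER_FULL events.
 * Assumption: frame_ms is the duration of one frame in milliseconds,
 * and buffer_bytes and frame_bytes are both byte counts. */
unsigned buffer_interval_ms(unsigned buffer_bytes, unsigned frame_ms,
                            unsigned frame_bytes)
{
    if (frame_bytes == 0)
        return 0;                    /* avoid dividing by zero */
    return buffer_bytes * frame_ms / frame_bytes;
}
```

With illustrative values of a 1600-byte buffer and 160-byte, 20 msec frames, buffer_interval_ms(1600, 20, 160) gives 200 msec.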
The following DSP file must be loaded to the board to enable voice activity detection:
| For these boards... | Add this DSP file... |
|---|---|
| AG | rvoice_vad.m54 |
| CG | rvoice_vad.f54 |

To configure CG boards for voice activity detection, specify rvoice_vad in the resource definition. For example:
Resource[0].Definitions = (dtmf.det_all & rvoice_vad.rec_alaw & rvoice_vad.play_alaw...
To enable voice activity detection, call adiCommandRecord on an actively running ADI recording function (such as adiRecordAsync). Automatic gain control and energy detection must be disabled when using voice activity detection. Recording must use ADI_ENCODE_MULAW, ADI_ENCODE_ALAW, or ADI_ENCODE_PCM8M16.
Call adiCommandRecord after receiving ADIEVN_RECORD_STARTED.
The first parameter in the command data must be one of the following voice activity detector commands:
| Command | Description |
|---|---|
| EVAD_CDE_FUNCTION_ENABLE | Enable voice activity detection or update parameters. Default is disabled. |
| EVAD_CDE_FUNCTION_DISABLE | Disable voice activity detection. Silence is no longer suppressed. |
| EVAD_CDE_DEFAULT_ENABLE | Enable voice activity detection with default parameters. |
| EVAD_CDE_STREAMING_PAUSE | Pause sending voice data (silence or speech) to the host application. Useful for keeping the voice activity detection energy thresholds updating while ASR is not active on the host. |
| EVAD_CDE_STREAMING_RESUME | Resume sending voice data to the host application. |
| EVAD_CDE_SIGNALLING_ENABLE | Send voice activity detection events to the host application (even if voice activity detection or record streaming are disabled). Default is disabled. |
| EVAD_CDE_SIGNALLING_DISABLE | Stop sending voice activity detection events (EVAD_EVN_SPEECH_BEGIN and EVAD_EVN_SPEECH_END) to the host application. |

When enabling voice activity detection (EVAD_CDE_FUNCTION_ENABLE), modify the voice activity detector's default behavior with the following parameters (also defined in /nms/include/evad.h):
| Parameter | Type | Default | Units | Description |
|---|---|---|---|---|
| snr | INT16 | 14 | dB | Signal-to-noise ratio. Valid range is 5 to 30 dB. |
| hold_stop | INT16 | 1000 | ms | Speech hangover time. Valid range is 300 to 2000 ms. |
| min_noise | INT16 | -65 | dB | Minimum noise floor. Valid range is -100 to -40 dB. |
| max_noise | INT16 | -40 | dB | Maximum noise floor. Valid range is -65 to -25 dB. |
| signal_attack | INT16 | 30 | ms | Signal attack time constant. Valid range is 10 to 200 ms. |
| signal_release | INT16 | 60 | ms | Signal release time constant. Valid range is 10 to 200 ms. |
| noise_attack | INT16 | 3000 | ms | Noise attack time constant. Valid range is 500 to 5000 ms. |
| noise_release | INT16 | 600 | ms | Noise release time constant. Valid range is 100 to 2000 ms. |

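
Because each parameter has a documented valid range, an application may want to check its values on the host before sending them. The following is a minimal sketch of such a check. The vad_params_t structure, VAD_DEFAULTS, and vad_params_valid are hypothetical names introduced for illustration; the actual parameter layout is whatever /nms/include/evad.h defines.

```c
#include <stdint.h>

/* Hypothetical container for the eight tunable parameters; the real
 * layout is whatever /nms/include/evad.h defines. */
typedef struct {
    int16_t snr;            /* dB */
    int16_t hold_stop;      /* ms */
    int16_t min_noise;      /* dB */
    int16_t max_noise;      /* dB */
    int16_t signal_attack;  /* ms */
    int16_t signal_release; /* ms */
    int16_t noise_attack;   /* ms */
    int16_t noise_release;  /* ms */
} vad_params_t;

/* Documented defaults from the parameter table. */
static const vad_params_t VAD_DEFAULTS = { 14, 1000, -65, -40, 30, 60, 3000, 600 };

static int in_range(int v, int lo, int hi) { return v >= lo && v <= hi; }

/* Returns 1 if every field lies within its documented valid range. */
int vad_params_valid(const vad_params_t *p)
{
    return in_range(p->snr, 5, 30)
        && in_range(p->hold_stop, 300, 2000)
        && in_range(p->min_noise, -100, -40)
        && in_range(p->max_noise, -65, -25)
        && in_range(p->signal_attack, 10, 200)
        && in_range(p->signal_release, 10, 200)
        && in_range(p->noise_attack, 500, 5000)
        && in_range(p->noise_release, 100, 2000);
}
```

Checking on the host is only a convenience in this sketch; the board still reports EVAD_EVN_FUNCTION_ERROR for an unknown or invalid parameter.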
Convert signal_attack, signal_release, noise_attack, and noise_release into DSP format using the following formula:
#include <math.h>

/* The time constant (tc) and the period must use the same units of time. */
int epsilon(float tc, float period)
{
    float eps = (float)(1.0 - exp((double)(-period / tc)));
    return (int)(eps * 32767);   /* return in S.15 format */
}

#define LATENCY 10.0f   /* 10 msec record DPF period */

DSP_value = epsilon(time_constant, LATENCY);
The custom recording DPF sends data to the host by calling the DSPOS function dspkSendEvent. The first parameter sent by the function appears in the value field of the CTA_EVENT structure. All remaining parameters appear in an attached buffer. The application is responsible for freeing the buffer after it processes the data.
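
Because the first parameter arrives in the value field, an application can dispatch on it without inspecting the attached buffer. The following sketch tracks whether speech is in progress from successive event values. The DEMO_EVN_* constants and their numeric values are placeholders invented for this example; a real application would use the EVAD_EVN_* definitions from /nms/include/evad.h.

```c
/* Placeholder reason codes. The numeric values here are invented for the
 * sketch; real applications should use the EVAD_EVN_* definitions from
 * /nms/include/evad.h instead. */
enum demo_reason {
    DEMO_EVN_SPEECH_BEGIN   = 1,
    DEMO_EVN_SPEECH_END     = 2,
    DEMO_EVN_FUNCTION_ERROR = 3
};

/* Tracks whether speech is in progress, given the value field of each
 * ADIEVN_RECORD_EVENT. Returns the updated state (1 = speech, 0 = silence). */
int update_speech_state(int in_speech, unsigned value)
{
    switch (value) {
    case DEMO_EVN_SPEECH_BEGIN:
        return 1;
    case DEMO_EVN_SPEECH_END:
        return 0;
    default:
        return in_speech;   /* other codes leave the state unchanged */
    }
}
```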
For more information, refer to Detecting voice activity.
/* This code sends a command to custom record DPF and prints the events from the DPF. */
myWaitForEvent( ctaqueuehd, event );
switch(event->id)
{
/* etc... */
case ADIEVN_RECORD_STARTED:
{
WORD myParms[3] = { 0x1111, 0x2222, 0x3333 };
adiCommandRecord( ctahd, myParms, 3);
/* At the DSP level, the DPF will see the following
* command packet.
*
* 0x3 -> size of command packet
* 0x1111 -> parm 1
* 0x2222 -> parm 2
* 0x3333 -> parm 3
*/
break;
}
case ADIEVN_RECORD_EVENT:
if (event->buffer != NULL) // event with multiple data
{
WORD i;
WORD *pData = (WORD *) event->buffer;
printf("event->value %x\n", event->value );
printf("event->size %x\n", event->size );
for (i=0; i < event->size / sizeof(WORD); i++)
{
printf("data[%d] %x\n", i, pData[i]);
}
if (event->size & CTA_INTERNAL_BUFFER)
{
ctaFreeBuffer( event->buffer );
printf("Buffer freed\n");
}
}
else
{ // event with only 1 data
printf("event->value %x\n", event->value );
}
break;
}
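
The comment in the example describes the command packet as the DPF sees it: a size word followed by the command words. The sketch below models that layout on the host purely for illustration. In practice the packet is assembled by the ADI service rather than by the application, and build_dpf_packet is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Models the packet layout described in the example's comment: one word
 * holding the count, followed by the command words themselves.
 * Returns the number of words written, or 0 if out is too small. */
size_t build_dpf_packet(const uint16_t *data, uint16_t nwords,
                        uint16_t *out, size_t out_capacity)
{
    size_t i;

    if (out_capacity < (size_t)nwords + 1)
        return 0;
    out[0] = nwords;                 /* size of command packet */
    for (i = 0; i < nwords; i++)
        out[i + 1] = data[i];        /* parm 1 through parm n */
    return (size_t)nwords + 1;
}
```

For the three words 0x1111, 0x2222, and 0x3333 used in the example, the modeled packet is 0x3, 0x1111, 0x2222, 0x3333.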