adiCommandRecord

Sends a data array containing raw commands to an actively running recording function. Use adiCommandRecord to enable and configure voice activity detection.

Supported board types

Prototype

DWORD adiCommandRecord ( CTAHD ctahd, WORD *data, DWORD nwords)

Argument	Description
ctahd	Context handle returned by ctaCreateContext or ctaAttachContext.
data	Pointer to an array of 16-bit data containing the commands.
nwords	Number of 16-bit data words.

Return values

Return value	Description
SUCCESS
CTAERR_FUNCTION_NOT_ACTIVE	ADI service recording function was not started before calling adiCommandRecord.
CTAERR_INVALID_SEQUENCE	adiStopRecording has already been invoked.

Events

Event	Description
ADIEVN_RECORD_EVENT	Contains information sent by the custom recording function. The event value field may contain one of the following reason codes (also defined in /nms/include/evad.h):

Reason code	Description
EVAD_EVN_FUNCTION_DISABLED	Voice activity detection disabled.
EVAD_EVN_FUNCTION_ENABLED	Voice activity detection enabled.
EVAD_EVN_FUNCTION_ERROR	Unknown or invalid parameter.
EVAD_EVN_SIGNALLING_DISABLED	Voice activity detection messaging disabled.
EVAD_EVN_SIGNALLING_ENABLED	Voice activity detection messaging enabled.
EVAD_EVN_SPEECH_BEGIN	Speech started. The event buffer contains the energy of the frame generating the event and the energy of the background noise in dB.
EVAD_EVN_SPEECH_END	Speech stopped. The event buffer contains the energy of the frame generating the event and the energy of the background noise in dB.
EVAD_EVN_STREAMING_PAUSED	Voice streaming from board to application paused.
EVAD_EVN_STREAMING_RESUMED	Voice streaming from board to application resumed.

Note: The application receives ADIEVN_RECORD_EVENT asynchronously. Speech buffers, by contrast, arrive at a fixed interval of buffersize x framerate / framesize msec, attached to ADIEVN_RECORD_BUFFER_FULL (when speech is detected).
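The interval in the note can be worked out with a small helper. This is only a sketch: the function name is hypothetical, and the units are an assumption (buffersize and framesize in bytes, framerate as the duration of one frame in msec), since the note does not state them.

```c
/* Interval between ADIEVN_RECORD_BUFFER_FULL events, in msec.
 * Assumed units (not stated in the note):
 *   buffersize - bytes per record buffer
 *   framerate  - duration of one frame, in msec
 *   framesize  - bytes per frame
 * buffer_interval_msec is a hypothetical helper, not part of the ADI API. */
unsigned long buffer_interval_msec(unsigned long buffersize,
                                   unsigned long framerate,
                                   unsigned long framesize)
{
    if (framesize == 0)
    {
        return 0; /* avoid dividing by zero on bad input */
    }
    return (buffersize * framerate) / framesize;
}
```

With illustrative values of a 1600-byte buffer, 20-msec frames, and 160 bytes per frame, the helper returns 200, so a full speech buffer would arrive every 200 msec under those assumptions.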

Details

The following DSP file must be loaded to the board to enable voice activity detection:

For these boards...	Add this DSP file...
AG	rvoice_vad.m54
CG	rvoice_vad.f54

To configure CG boards for voice activity detection, specify rvoice_vad in the resource definition. For example:

Resource[0].Definitions = (dtmf.det_all & rvoice_vad.rec_alaw & rvoice_vad.play_alaw...

To enable voice activity detection, call adiCommandRecord on an actively running ADI recording function (such as adiRecordAsync). Automatic gain control and energy detection must be disabled when using voice activity detection. Recording must use ADI_ENCODE_MULAW, ADI_ENCODE_ALAW, or ADI_ENCODE_PCM8M16 encoding.

Call adiCommandRecord after receiving ADIEVN_RECORD_STARTED.

The first word of the data array must be one of the following voice activity detector commands:

Command	Description
EVAD_CDE_FUNCTION_ENABLE	Enable voice activity detection or update parameters. Default is disabled.
EVAD_CDE_FUNCTION_DISABLE	Disable voice activity detection. Silence is no longer suppressed.
EVAD_CDE_DEFAULT_ENABLE	Enable voice activity detection with default parameters.
EVAD_CDE_STREAMING_PAUSE	Pause sending voice data (silence or speech) to the host application. Useful for keeping the voice activity detection energy thresholds updating while ASR is not active on the host.
EVAD_CDE_STREAMING_RESUME	Resume sending voice data to the host application.
EVAD_CDE_SIGNALLING_ENABLE	Send voice activity detection events to the host application (even if voice activity detection or record streaming is disabled). Default is disabled.
EVAD_CDE_SIGNALLING_DISABLE	Stop sending voice activity detection events (EVAD_EVN_SPEECH_BEGIN and EVAD_EVN_SPEECH_END) to the host application.

When enabling voice activity detection (EVAD_CDE_FUNCTION_ENABLE), modify the voice activity detector's default behavior with the following parameters (also defined in /nms/include/evad.h):

Parameter	Type	Default	Units	Description
snr	INT16	14	dB	Signal to noise ratio. Valid range is 5 to 30 dB.
hold_stop	INT16	1000	ms	Speech hangover time. Valid range is 300 to 2000 ms.
min_noise	INT16	-65	dB	Minimum noise floor. Valid range is -100 to -40 dB.
max_noise	INT16	-40	dB	Maximum noise floor. Valid range is -65 to -25 dB.
signal_attack	INT16	30	ms	Signal attack time constant. Valid range is 10 to 200 ms.
signal_release	INT16	60	ms	Signal release time constant. Valid range is 10 to 200 ms.
noise_attack	INT16	3000	ms	Noise attack time constant. Valid range is 500 to 5000 ms.
noise_release	INT16	600	ms	Noise release time constant. Valid range is 100 to 2000 ms.
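Because an out-of-range value is reported only after the fact (as EVAD_EVN_FUNCTION_ERROR), it can help to check parameters on the host before sending them. The following is a minimal sketch under stated assumptions: the structure, its field order, and the function names are hypothetical, and the real definitions in /nms/include/evad.h may differ. Only the defaults and valid ranges are taken from the table above.

```c
#include <stddef.h>

/* Hypothetical container for the tunable parameters. The actual layout
 * is defined in /nms/include/evad.h and may differ from this sketch. */
typedef struct
{
    short snr;            /* dB */
    short hold_stop;      /* ms */
    short min_noise;      /* dB */
    short max_noise;      /* dB */
    short signal_attack;  /* ms */
    short signal_release; /* ms */
    short noise_attack;   /* ms */
    short noise_release;  /* ms */
} vad_parms_sketch;

/* Default values as listed in the parameter table. */
static const vad_parms_sketch VAD_DEFAULTS =
    { 14, 1000, -65, -40, 30, 60, 3000, 600 };

static int in_range(short value, short low, short high)
{
    return value >= low && value <= high;
}

/* Returns NULL when every parameter lies within its documented range,
 * otherwise the name of the first parameter that does not. */
const char *vad_first_invalid(const vad_parms_sketch *p)
{
    if (!in_range(p->snr, 5, 30))              return "snr";
    if (!in_range(p->hold_stop, 300, 2000))    return "hold_stop";
    if (!in_range(p->min_noise, -100, -40))    return "min_noise";
    if (!in_range(p->max_noise, -65, -25))     return "max_noise";
    if (!in_range(p->signal_attack, 10, 200))  return "signal_attack";
    if (!in_range(p->signal_release, 10, 200)) return "signal_release";
    if (!in_range(p->noise_attack, 500, 5000)) return "noise_attack";
    if (!in_range(p->noise_release, 100, 2000)) return "noise_release";
    return NULL;
}
```

A caller could copy VAD_DEFAULTS, change only the fields of interest, and send the command only when vad_first_invalid returns NULL.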

Convert signal_attack, signal_release, noise_attack, and noise_release into DSP format using the following formula:

#include <math.h>

/* The time constant (tc) and the period must use the same units of time. */
int epsilon(float tc, float period)
{
    float eps;

    eps = (float)(1.0 - (float) exp((double)((-1.0 * period) / tc)));
    return (int) (eps * 32767); /* return in S.15 format */
}

#define LATENCY 10 /* 10 msec record DPF period */

/* For each time constant (in msec): */
DSP_value = epsilon( time_constant, LATENCY );

The custom recording DPF sends data to the host by calling the DSPOS function dspkSendEvent. The first parameter sent by the function appears in the value field of the CTA_EVENT structure. All remaining parameters appear in an attached buffer. The application is responsible for freeing the buffer after it processes the data.

For more information, refer to Detecting voice activity.

Example

/* This code sends a command to the custom record DPF and prints the events from the DPF. */
myWaitForEvent( ctaqueuehd, event );
switch(event->id)
{
    /* etc... */
    case ADIEVN_RECORD_STARTED:
        {
            WORD myParms[3] = { 0x1111, 0x2222, 0x3333 };
            adiCommandRecord( ctahd, myParms, 3 );
            /* At the DSP level, the DPF will see the following
             * command packet.
             *
             * 0x3        -> size of command packet
             * 0x1111     -> parm 1
             * 0x2222     -> parm 2
             * 0x3333     -> parm 3
             */
        }
        break; /* do not fall through to the ADIEVN_RECORD_EVENT case */
    case ADIEVN_RECORD_EVENT:
        if (event->buffer != NULL) // event with multiple data
        {
            WORD i;
            WORD *pData = (WORD *) event->buffer;
            printf("event->value %x\n", event->value );
            printf("event->size %x\n", event->size );
            for (i=0; i < event->size / sizeof(WORD); i++)
            {
                printf("data[%d] %x\n", i, pData[i]);
            }
            if (event->size & CTA_INTERNAL_BUFFER)
            {
                ctaFreeBuffer( event->buffer );
                printf("Buffer freed\n");
            }
        }
        else
        {                          // event with only 1 data
            printf("event->value %x\n", event->value );
        }
        break;
}