Using DivaSendVoiceEx to send multiple voice prompts together

When an application using the Dialogic® Diva® SDK API needs to send a single voice prompt from a WAV file, it will normally use the function DivaSendVoiceFile.

However, in some cases prompts may be constructed from a combination of fragments from different files (or from memory), and audio data is not always stored in WAV files.  For example, IVR applications may store prompts in mu-law or ADPCM format to reduce the disk space required.  In that case, prompts can be played using DivaSendVoiceEx or DivaAppendVoice, which read their instructions from a descriptor that describes the format and location of the audio data.

DivaSendVoiceEx example

This example plays a prompt made up of four files in different formats (that say "one", "two", "three" and "four") and a beep that is streamed from memory.  In each case the files contain only audio samples with no header.  By convention, Dialogic® audio files often have the extension '.vox', but the content of these files can be G.711 (mu-law or a-law) or 4-bit ADPCM.  The file 'test-pcm.lin' contains only 16-bit PCM samples with no header. 

void PlayPrompts(CallInstance* call)
{
  const int nprompts = 5;
  const int beeplen = 1000;  // number of 1 ms cycles: 1 second of 1 kHz beep
  DivaVoiceDescriptor dv[nprompts];
  // 8 samples per cycle, 2 bytes per sample.  Static so that the buffer stays
  // valid after this function returns, in case the SDK reads it during playback.
  static BYTE beep[beeplen * 8 * sizeof(short)];
  int count = 0;
  DWORD result;

  // Clear all descriptors so that no field is left uninitialized
  ZeroMemory( dv, sizeof( dv ) );

  // File1 is VOX mu-law
  dv[0].Size = sizeof( DivaVoiceDescriptor );
  dv[0].StartOffset = 0;
  dv[0].Duration = 0;
  dv[0].DataFormat = DivaAudioFormat_Raw_uLaw8K8BitMono;
  dv[0].DataSource = DivaVoiceDataSourceFile;
  dv[0].PositionFormat = DivaVoicePositionFormatBytes;
  dv[0].Source.File.pFilename = "test-mu.vox";

  // File2 is VOX ADPCM
  dv[1].Size = sizeof( DivaVoiceDescriptor );
  dv[1].StartOffset = 0;
  dv[1].Duration = 0;
  dv[1].DataFormat = DivaAudioFormat_Raw_ADPCM_8K4BitMono;
  dv[1].DataSource = DivaVoiceDataSourceFile;
  dv[1].PositionFormat = DivaVoicePositionFormatBytes;
  dv[1].Source.File.pFilename = "test-adpcm.vox";

  // File3 is PCM16 (LIN)
  dv[2].Size = sizeof( DivaVoiceDescriptor );
  dv[2].StartOffset = 0;
  dv[2].Duration = 0;
  dv[2].DataFormat = DivaAudioFormat_Raw_PCM_8K16BitMono;
  dv[2].DataSource = DivaVoiceDataSourceFile;
  dv[2].PositionFormat = DivaVoicePositionFormatBytes;
  dv[2].Source.File.pFilename = "test-pcm.lin";

  // File4 is VOX a-law
  dv[3].Size = sizeof( DivaVoiceDescriptor );
  dv[3].StartOffset = 0;
  dv[3].Duration = 0;
  dv[3].DataFormat = DivaAudioFormat_Raw_aLaw8K8BitMono;
  dv[3].DataSource = DivaVoiceDataSourceFile;
  dv[3].PositionFormat = DivaVoicePositionFormatBytes;
  dv[3].Source.File.pFilename = "test-a.vox";

  // Prompt 5 is not a file, but is read from a memory buffer
  GenerateBeep(beeplen, &count, beep);  // 1 second of 1 kHz beep
  dv[4].Size = sizeof( DivaVoiceDescriptor );
  dv[4].StartOffset = 0;
  dv[4].Duration = 0;
  dv[4].DataFormat = DivaAudioFormat_Raw_PCM_8K16BitMono;
  dv[4].DataSource = DivaVoiceDataSourceMemory;
  dv[4].PositionFormat = DivaVoicePositionFormatBytes;
  dv[4].Source.Memory.pBuffer = (char*)beep;
  dv[4].Source.Memory.DataLength = count;

  // Check the result (this assumes DivaSuccess is the SDK's success return code)
  result = DivaSendVoiceEx( call->handle, nprompts, dv, FALSE, 0 );
  if (result != DivaSuccess)
  {
    // The play was not accepted: report or handle the error here
  }
}

As with DivaSendVoiceFile, DivaSendVoiceEx returns immediately. When the prompt finishes playing (perhaps many seconds later), the application receives a DivaEventSendVoiceDone event.

Note: an application should check the value returned by DivaSendVoiceEx to confirm that the request was accepted and that the prompt will in fact be played.

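GenerateBeep is a function that fills a caller-supplied buffer with a 1 kHz beep. The first argument is the number of cycles to generate; each cycle is 8 samples of 16-bit PCM at 8 kHz (16 bytes, 1 ms), so 1000 cycles give a one-second beep:

BYTE* GenerateBeep(int count, int *lengthinbytes, BYTE* bp)
{
    // One cycle of 1 kHz sampled at 8 kHz
    static const short onekhz[] = {1008, 2361, 2361, 1008, -1008, -2361, -2361, -1008};
    const int fragsize = sizeof(onekhz);   // 16 bytes per cycle
    int i;
    BYTE *p;

    // The caller must supply a buffer of at least count * fragsize bytes
    if (bp == NULL || lengthinbytes == NULL || count < 0) return NULL;

    // Copy the cycle into the buffer 'count' times
    for (i = 0, p = bp; i < count; i++) {
        CopyMemory( p, onekhz, fragsize );
        p += fragsize;
    }

    *lengthinbytes = count * fragsize;
    return bp;
}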
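
The descriptors above express positions in bytes, and the beep buffer has to be sized in bytes, so it can help to know how many bytes each raw format uses per millisecond. The following is a minimal sketch of that arithmetic, assuming 8 kHz mono audio as the format names suggest. The RawFormat enum and the helper names are hypothetical and are not part of the Diva SDK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the raw 8 kHz mono formats used in this article. */
typedef enum {
    RAW_ULAW_8BIT,
    RAW_ALAW_8BIT,
    RAW_ADPCM_4BIT,
    RAW_PCM_16BIT
} RawFormat;

/* Bits used to store one sample in each format. */
static unsigned bits_per_sample(RawFormat f)
{
    switch (f) {
    case RAW_ULAW_8BIT:
    case RAW_ALAW_8BIT:  return 8;
    case RAW_ADPCM_4BIT: return 4;
    case RAW_PCM_16BIT:  return 16;
    }
    assert(!"unknown format");
    return 0;
}

/* Bytes needed to hold 'ms' milliseconds of audio (8 samples per ms). */
static size_t bytes_for_ms(RawFormat f, unsigned ms)
{
    return ((size_t)ms * 8u * bits_per_sample(f)) / 8u;
}

/* Playing time in milliseconds of 'nbytes' bytes of audio. */
static unsigned ms_for_bytes(RawFormat f, size_t nbytes)
{
    return (unsigned)((nbytes * 8u) / (8u * bits_per_sample(f)));
}
```

With these helpers, one second of audio takes 8000 bytes in mu-law or a-law, 4000 bytes in 4-bit ADPCM and 16000 bytes in 16-bit PCM. The last figure matches the 1000 cycles of 16 bytes produced by GenerateBeep for a one-second beep.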

Because DivaSendVoiceEx can read audio from memory (with the DivaVoiceDataSourceMemory option), it can be used to play data that other applications stream into a memory buffer.  For example, an application may receive streamed audio from a text-to-speech (TTS) system such as Nuance® or Microsoft® Speech Server.  An application can also receive audio streamed from a sound card, or from a call agent on another server that sends audio over TCP or UDP.

If an application streams music, then it may be necessary to use audio in formats such as MP3, AAC, WMA and OGG.  The Dialogic® Diva® SDK does not support these music codecs, so the application must convert the data (for example, to Linear PCM-16) in a memory buffer, and then instruct DivaSendVoiceEx to play it.  
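
As an illustration of the last conversion step, here is a minimal sketch that turns decoded audio into 16-bit linear PCM in a memory buffer. It assumes that the third-party decoder delivers interleaved stereo samples as floats in the range -1.0 to 1.0, which is common but not universal. The function names are hypothetical. The sketch does not change the sample rate, so music decoded at 44.1 kHz or 48 kHz would still need to be resampled to 8 kHz before it matches the PCM format used in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one float sample to 16-bit PCM, clamping out-of-range input. */
static int16_t float_to_pcm16(float s)
{
    float scaled;

    if (s != s) return 0;          /* NaN: treat as silence */
    if (s > 1.0f) s = 1.0f;
    if (s < -1.0f) s = -1.0f;

    scaled = s * 32767.0f;
    /* Round to the nearest integer, away from zero on ties */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/*
 * Downmix 'frames' interleaved stereo float frames (left, right, left, ...)
 * to mono 16-bit PCM.  'out' must have room for 'frames' samples.
 * Returns the number of bytes written to 'out'.
 */
static size_t stereo_float_to_mono_pcm16(const float *in, size_t frames,
                                         int16_t *out)
{
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < frames; i++) {
        /* Average the two channels to get a mono sample */
        float mono = 0.5f * (in[2 * i] + in[2 * i + 1]);
        out[i] = float_to_pcm16(mono);
    }
    return frames * sizeof(int16_t);
}
```

The byte count returned by stereo_float_to_mono_pcm16 is the kind of value that would go into Source.Memory.DataLength, with the output buffer supplied as Source.Memory.pBuffer.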

There are a number of third party codec libraries available that perform various conversions. Note that many of these libraries may need to be licensed from the respective intellectual property owners, and Dialogic encourages all users of its products to procure all necessary intellectual property licenses required to implement any concepts or applications and does not condone or encourage any intellectual property infringement and disclaims any responsibility related thereto. These intellectual property licenses may differ from country to country and it is the responsibility of those who develop concepts or applications to be aware of and comply with different national license requirements.

See also:
SDK: Convert audio files - DivaConv
VOX and VAP files


First published: 24-Oct-2008