Yesterday, I attended SpeechTEK in New York. With all the hype and bling in our industry revolving around video and data, SpeechTEK makes no bones about expanding beyond voice to include “multimodal” self service. However, Siri has revitalized interest in speech itself, and I found the show fairly energized because of it. This show is definitely different from what it used to be.

The SpeechTEK website explains this change best: “Smartphone and tablet applications provide convenient and intuitive interactions by allowing customers to use different input methods (talk, touch, and type) and see and listen to results. Multimodal applications have raised customer expectations for service across all modalities, and organizations need to understand and deploy self-service technologies wisely to meet the needs of today’s connected customers. That’s why SpeechTEK 2012 is expanding its focus beyond IVR to include smartphone and tablet applications.” The show did deliver on this promise, at least from what I saw.
I spoke on a panel about HD Voice, and I also took the time to attend some sessions. One session in particular, on Advanced Spoken Language Research and presented by a speaker from ICSI (International Computer Science Institute), was interesting.
2012 is the 60th birthday of speech recognition. Speech recognition has come a long way since its beginnings, thanks both to technology improvements (chiefly processing power) and to continued research.
The current “hot” topics in speech are as follows. I list them here because you, as a reader, may think of some cool, money-making application:
- Speech Retrieval – searching the web for spoken words (e.g. looking for words in a YouTube video)
- Speech Synthesis – a computer talking back to you intelligently
- Speaker identification – for passwords, etc.
- Non-linguistic information – for instance, detecting lying (better than humans can)
- Speaker diarization – who is speaking when, in a continuous stream of speakers
So all you innovators, let’s get going!
Posted 08-14-2012 10:38 AM by Jim Machi