Volume 1, Issue 8 - August/September 2001
   
 

Object-Oriented VoiceXML

By John Hicks

Suppose you wanted to bring Automatic Speech Recognition to the mass market; to offices of any size and budget; to agencies, foundations, non-profits, and small businesses. Suppose you wanted development prices to start at less than $10,000. How would you go about that?

Well, two answers, of course: VoiceXML for deployability, and reusable parts.

VoiceXML and Reusable Parts

Reusable parts accumulate in software the best practices and proven expertise of the industry. These objects or components can be invoked as supplied, or further subclassed. The first is simpler, and reuses tested code from other developers and projects. The second is more ambitious, and may require further testing, including usability testing and tuning. Either way, reusable parts make industry expertise available to a new and larger pool of (less expensive) developers.

That's also the effect we want from VoiceXML, right?

But don't these two largely exclude one another? How do you put VoiceXML and reusable parts, compiled from object-oriented languages such as C++ and Java, to work together? And how do we accomplish this without confining ourselves to a particular platform or vendor?

At SpeechBrowser we found three ways.

Reusable Parts from Vendors and Third Parties

First, our Java hierarchy encapsulates reusable parts from SpeechWorks, Nuance, and other vendors. Though written in the object-oriented languages C++ and Java, those parts can be made available to VoiceXML as well.

For our clients, we've written applications in both C++ and Java that pre-date the emergence of VoiceXML. In one case, for example, we built a prescription request line in both a C++ version (SpeechWorks) and a Java version (Nuance) inside of five months. That was only possible using pre-built and reusable Dialog Modules from SpeechWorks and SpeechObjects from Nuance.

By invoking reusable parts from third parties, developers can drop into their VoiceXML such involved data types as Date, Quantity, USCurrency, and AlphaDigitString, and plug in entire reusable dialogues with the caller (Confirm, ImplicitConfirm, ConfirmAndCorrect...).

Two More Ways to Combine VoiceXML and Reusable Parts

But at SpeechBrowser we devised two more ways to combine the power of VoiceXML and the power of reusable parts:

1) A Java class library to generate our VoiceXML
2) A VoiceXML equivalent of an object or class

A Java Class Library Generates Our VoiceXML

We make reusable parts of our own, in a class library or hierarchy that grows over time, as our expertise grows.

You don't want expertise accumulating solely in the memory of your team. Team members don't always stay, no matter how lavishly you've invested your time and money to keep them. One day their replacements must be found and won and taught your business all over again. You want your best expertise to accumulate and become reusable software.

We generate VoiceXML during development, as well as at runtime.

VoiceXML Generation for Developers

First, we generate VoiceXML for developers (internal and external) as they build new applications.

Developers either write in Java and generate their VoiceXML, or revise and extend the generated VoiceXML, or both. Either route gives developers advanced VoiceXML constructs for rapid reuse.

VoiceXML Generation At Runtime

Second, we generate VoiceXML at runtime, from inside a Java servlet.

Instead of invoking a URL that ends in .vxml, we invoke a servlet URL, and the servlet returns the VoiceXML. We decided we couldn't afford the overhead of a general-purpose XML parser at runtime, either to generate or to validate the new VoiceXML.
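
One way to produce VoiceXML without an XML parser is to write the markup directly as text. The following is a minimal sketch under that assumption, not SpeechBrowser's actual code: the class name VxmlText, its methods, the form id, and the use of version "1.0" are all hypothetical choices made for illustration. In a servlet, the returned string would be written to the response rather than printed.

```java
// Hypothetical sketch: build a small VoiceXML page as plain text,
// with no XML parser or DOM involved.
final class VxmlText {

    // Escape the characters that would otherwise break the markup.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Return a complete page that speaks one prompt to the caller.
    // A servlet's request handler could write this string to its response.
    static String renderPrompt(String promptText) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n");
        sb.append("<vxml version=\"1.0\">\n");   // version is an assumption
        sb.append("  <form id=\"main\">\n");     // form id is hypothetical
        sb.append("    <block>\n");
        sb.append("      <prompt>").append(escape(promptText)).append("</prompt>\n");
        sb.append("    </block>\n");
        sb.append("  </form>\n");
        sb.append("</vxml>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderPrompt("Welcome to the prescription request line."));
    }
}
```

Writing text directly keeps the per-request cost low, at the price of giving up the well-formedness checks a parser would provide; the escaping step covers the most common way such output goes wrong.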

Java Classes for VoiceXML

The architecture diagram in Figure 1 shows how the classes in our hierarchy build upon one another in three layers. In the first layer we simply reproduce the standard VoiceXML tags, which are reusable anywhere. A second layer specializes those tags into widely used data types such as Date and Airport. A third layer combines those tags into complex data types we use in turnkey voice products such as our Talking Catalog.
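
The three layers might be sketched in Java roughly as follows. This is an illustration only, not the SpeechBrowser library: every class name here (VxmlElement, Field, DateField, TravelDatesForm), the field names, and the prompts are hypothetical. The only VoiceXML detail relied on is that a field can declare the built-in type "date".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Layer 1: a generic VoiceXML tag that can render itself as text.
// Attribute values and text are assumed to be XML-safe already.
class VxmlElement {
    private final String name;
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private final List<VxmlElement> children = new ArrayList<>();
    private String text = "";

    VxmlElement(String name) { this.name = name; }

    VxmlElement attr(String key, String value) { attributes.put(key, value); return this; }
    VxmlElement add(VxmlElement child) { children.add(child); return this; }
    VxmlElement text(String value) { this.text = value; return this; }

    String toVxml() {
        StringBuilder sb = new StringBuilder("<").append(name);
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        sb.append('>').append(text);
        for (VxmlElement child : children) {
            sb.append(child.toVxml());
        }
        return sb.append("</").append(name).append('>').toString();
    }
}

// Still layer 1: the standard <field> tag with its prompt.
class Field extends VxmlElement {
    Field(String fieldName, String promptText) {
        super("field");
        attr("name", fieldName);
        add(new VxmlElement("prompt").text(promptText));
    }
}

// Layer 2: a field specialized into a widely used data type.
class DateField extends Field {
    DateField(String fieldName, String promptText) {
        super(fieldName, promptText);
        attr("type", "date");
    }
}

// Layer 3: several typed fields combined into one reusable dialogue.
class TravelDatesForm extends VxmlElement {
    TravelDatesForm() {
        super("form");
        attr("id", "travel_dates");
        add(new DateField("departure", "What day are you leaving?"));
        add(new DateField("returning", "What day are you coming back?"));
    }
}
```

For example, new DateField("departure", "What day are you leaving?").toVxml() yields the single line <field name="departure" type="date"><prompt>What day are you leaving?</prompt></field>. Each higher layer adds only what distinguishes it, so a fix in the base class reaches every data type built on it.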

Continued...


Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).