VoiceXML Review - Feature Articles

Volume 3, Issue 5 - September/October 2003

Developing X+V Applications Using the Multimodal Tools


Phase 1: Planning the application

It is vital to plan the application in detail before the actual development begins. During this phase, you make high- and low-level design decisions that can assist developers in creating standardized, well-behaved multimodal applications, as well as reduce development time by taking some of the guesswork out of interface design. Planning will increase the usability of the interface, reduce the end-user’s learning curve, and increase the effectiveness of the application in meeting your business needs. A thorough description of design decisions is included in the VoiceXML Programmer's Guide, packaged with the toolkit and the Voice Server SDK.

To start, you must decide how and when end-users will enter information so that the visual and voice designs match. Keep in mind that every word or phrase users might say must be included in a grammar file so that the speech recognition engine can match the spoken words to usable data. Effective designs use simple instructions, clear action verbs (as in "Select your credit card type"), and consistent phrasing throughout the application.
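The role of a grammar can be illustrated with a small sketch. This is not a grammar file format or the recognition engine's behavior; it is a toy Python lookup, with a hypothetical `CARD_GRAMMAR` table and `match_utterance` helper, that only shows the idea that each phrase a user may say has to be listed and tied to a value the application can use.

```python
from typing import Dict, Optional

# Hypothetical grammar for the credit card field: every accepted phrase
# maps to the value the application would store. Names and values are
# illustrative assumptions, not taken from the toolkit.
CARD_GRAMMAR: Dict[str, str] = {
    "visa": "VISA",
    "master card": "MASTERCARD",
    "mastercard": "MASTERCARD",
    "american express": "AMEX",
    "amex": "AMEX",
}


def match_utterance(utterance: str, grammar: Dict[str, str]) -> Optional[str]:
    """Return the stored value for a listed phrase, or None if it is not listed."""
    # Lowercase and collapse whitespace so "Master  Card" matches "master card".
    normalized = " ".join(utterance.lower().split())
    return grammar.get(normalized)


if __name__ == "__main__":
    print(match_utterance("Master Card", CARD_GRAMMAR))
    print(match_utterance("discover", CARD_GRAMMAR))
```

A phrase that is missing from the table, such as "discover" here, yields no match, which is why the planning phase should enumerate the alternative wordings users are likely to say.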

Designers can also investigate and take advantage of the tools and features that are included in the toolkit. One of the key features is the built-in set of Reusable Dialog Components. When building an X+V application using the toolkit, you can use dialog components, which are ready-to-use sets of VoiceXML source code for common functions. This will enable you to quickly and easily add voice functions to your applications, thus saving time and reducing the amount of code you need to write. Dialog components can also be customized and reused within an application or in other applications.

Phase 2: Creating the multimodal project and file

With the design aspects in mind, you are now ready to start an X+V application. After opening the toolkit, you will use the New Project wizard to create a multimodal project in which you will store all the files needed for developing your application. Then you will use the New File wizard to create your first multimodal file, with file extension .mxml, the file format used by the Multimodal Toolkit for an X+V application. The X+V editor creates a new file in the toolkit with a pre-filled heading and basic tags as shown in the following figure.

Phase 3: Adding the visual component

If you already have visual applications that you want to voice-enable, you can use an import wizard, drag & drop files into the Navigator, or cut & paste existing HTML into the multimodal project. You can also write XHTML code directly in the X+V editor.

All visual markup must comply with XHTML conventions. You can revise existing HTML to meet them before or after you add it to the toolkit. XHTML documents must be well formed, with each element properly closed, and all elements must be nested within the <html> root element. See the XHTML 1.0 specification included in the References section.

For best results, test the XHTML file to make sure it is correctly revised before adding it into the project. A good resource for testing and development is http://validator.w3.org. You can use the Web site to check whether your page is valid XHTML.
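For a quick local check before using the validator, the well-formedness rules above can be sketched with Python's standard XML parser. This is a minimal sketch, not part of the toolkit: the `check_well_formed` helper and the sample markup are assumptions for illustration, and the check covers only well-formedness and the root element, not full validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_well_formed(markup: str) -> Tuple[bool, str]:
    """Return (ok, message) for a string of XHTML markup."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unclosed or badly nested elements are reported by the parser.
        return False, "not well formed: {}".format(err)
    if root.tag != "{%s}html" % XHTML_NS:
        return False, "unexpected root element: {}".format(root.tag)
    return True, "well formed"


# Hypothetical sample page with every element closed and nested in <html>.
GOOD = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Order form</title></head>
  <body><p>City: <input type="text" name="city" /></p></body>
</html>"""

# Common HTML habits that XHTML rejects: <p> and <br> are left unclosed.
BAD = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Order form</title></head>
  <body><p>City:<br></body>
</html>"""

if __name__ == "__main__":
    print(check_well_formed(GOOD))
    print(check_well_formed(BAD))
```

A page that passes this check can still fail validation, for example by using an element the XHTML DTD does not allow, so the validator remains the final test.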

Phase 4: Adding the voice component

We will return to the visual part later; for now, you are ready to begin coding the voice portion in VoiceXML. The VoiceXML code should be nested within the <head> element of the XHTML document, as you will see in the following example. You can write the voice portion in one of two ways: coding VoiceXML from scratch, or using the built-in Reusable Dialog Components to add the voice input.
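To make the nesting concrete, here is a minimal sketch of how such a page could be laid out, wrapped in a short Python check. It is not the article's own example. It assumes the X+V convention of placing VoiceXML forms in the XHTML head and attaching them to visual fields with XML Events attributes; the `voice_city` id, the prompt text, and the `voice_forms_in_head` helper are hypothetical, and the grammar and the code that copies the recognized value back into the visual field are omitted.

```python
import xml.etree.ElementTree as ET
from typing import List

NS = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "vxml": "http://www.w3.org/2001/vxml",
}

# Hypothetical X+V skeleton: the voice form sits in <head>, and the visual
# input in <body> refers to it by id when the field receives focus.
XV_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Order form</title>
    <vxml:form id="voice_city">
      <vxml:field name="city">
        <vxml:prompt>Please say the name of a city.</vxml:prompt>
      </vxml:field>
    </vxml:form>
  </head>
  <body>
    <p>City:
      <input type="text" name="city"
             ev:event="focus" ev:handler="#voice_city" />
    </p>
  </body>
</html>"""


def voice_forms_in_head(markup: str) -> List[str]:
    """Return the ids of VoiceXML forms that are direct children of <head>."""
    root = ET.fromstring(markup)
    head = root.find("xhtml:head", NS)
    if head is None:
        return []
    return [form.get("id") for form in head.findall("vxml:form", NS)]


if __name__ == "__main__":
    print(voice_forms_in_head(XV_PAGE))
```

Keeping the voice forms in the head leaves the body as ordinary XHTML, so the visual markup can still be authored and checked on its own.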

To demonstrate the options, we will use a simple example of each to voice-enable the city field and the credit card field.

Coding the easy way: Using Reusable Dialog Components

In place of, or in addition to, writing your own code, you can use the built-in Reusable Dialog Components to add common functions to your VoiceXML file. Each subdialog includes sample calling code (file extension .mxml) that you can copy and paste into your VoiceXML file.

In the Reusable Dialog Components wizard (shown below), you can select and customize a dialog component and then import it within a VoiceXML

Continued...


Copyright © 2001-2003 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).