Let's Talk Office Supplies
One of the profound changes taking place in society is telecom immersion. For those of us living in industrialized nations, not only have telephones become our constant companions, they are providing voice connectivity to computers as well as to people. In fact, "virtual telephone operators" and "virtual call center agents" are increasingly prevalent as improvements are made in automatic speech recognition for interpreting telephone requests. At this juncture, and at the leading edge of "speech recognition automation" in customer relationship management, is VoiceXML.
VoiceXML looks a lot like HTML. In fact, its designers hope that VoiceXML will mediate the creation of a "voice web" just as HTML mediated the creation of the visual Web. While this may very well happen, the more immediate beneficiaries of VoiceXML are not Internet users, but telephone companies, call center operators, and telephone-based CRM operations. VoiceXML brings the benefits of Web infrastructure and tools to serve the telephone-using public.
VoiceXML benefits from being a member of the XML family that is revolutionizing Internet communication. Microsoft's ".NET" initiative is putting XML squarely in the middle of Internet communication. Even without Microsoft, XML would be revolutionizing business databases, inventory control systems, and CRM systems, making it easier to share data between organizations, inside and outside of firewalls. Volumes have been written about XML, so we won't presume to explain it further here. However, it should be understood that VoiceXML benefits from all the ongoing XML technology development.
VoiceXML itself is a high-level language for authoring voice applications. It allows developers to write voice applications using simple markup tags and scripts rather than in traditional and more complex programming languages. This speeds up the development process enormously. VoiceXML scripts work by orchestrating speech dialog systems accessed on the telephone using TTS (text-to-speech), recorded prompts and ASR (automatic speech recognition). Extensive use is made of Internet and web development technology to develop applications and deploy them, but the telephone calls themselves need not involve the Internet at all.
To develop "voice applications" in VoiceXML, you create XML scripts that specify "audio prompts" to be played to callers and "recognition grammars" that tell the recognizer what words to look for in callers' responses to prompts. VoiceXML scripts themselves may be created using any text editor (although XML validating editors offer advantages). To deploy VoiceXML applications, you host the VoiceXML scripts on a Web server. The Web server is then accessed by "voice browsers," which are computers running VoiceXML interpreters, speech recognizers, and interfaces to the telephone network. When a voice browser answers a telephone call, it retrieves VoiceXML scripts from a Web server. The scripts may be generated dynamically by the Web server. The VoiceXML scripts instruct the voice browser to play prompts and to start the speech recognizer with the appropriate grammar(s). When the caller's utterance is recognized, the voice browser selects and transitions to another dialog, which may be within the same VoiceXML script or may have to be fetched from the Web server.
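To make the pieces above concrete, here is a minimal sketch of a Web server routine that generates a VoiceXML script dynamically. It is an illustration under stated assumptions, not a reference implementation: it assumes VoiceXML 2.0 element names (`form`, `field`, `prompt`, `grammar`, `filled`, `submit`) with an inline XML grammar, and the item list, the function name `build_order_dialog`, and the URL `http://example.com/order` are all hypothetical. Individual voice browsers may expect a different grammar format.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"


def q(tag):
    """Qualify a tag name with the VoiceXML namespace."""
    return f"{{{VXML_NS}}}{tag}"


def build_order_dialog(items, submit_url):
    """Return a VoiceXML script (as a string) asking the caller to pick one item.

    Both arguments are hypothetical inputs; a real server might read the
    item list from an inventory database.
    """
    ET.register_namespace("", VXML_NS)
    vxml = ET.Element(q("vxml"), {"version": "2.0"})
    form = ET.SubElement(vxml, q("form"), {"id": "order"})
    field = ET.SubElement(form, q("field"), {"name": "item"})

    # Audio prompt: spoken to the caller by text-to-speech.
    prompt = ET.SubElement(field, q("prompt"))
    prompt.text = "Which item would you like? You can say " + ", ".join(items) + "."

    # Recognition grammar: the words the recognizer should listen for.
    grammar = ET.SubElement(
        field, q("grammar"), {"version": "1.0", "root": "item", "mode": "voice"}
    )
    rule = ET.SubElement(grammar, q("rule"), {"id": "item"})
    one_of = ET.SubElement(rule, q("one-of"))
    for name in items:
        ET.SubElement(one_of, q("item")).text = name

    # Once the caller's answer is recognized, send it back to the Web
    # server, which can respond with the next VoiceXML script.
    filled = ET.SubElement(field, q("filled"))
    ET.SubElement(filled, q("submit"), {"next": submit_url, "namelist": "item"})

    return ET.tostring(vxml, encoding="unicode")


script = build_order_dialog(["pens", "paper", "staplers"], "http://example.com/order")
print(script)
```

The printed document is what a voice browser would fetch when it answers a call: one prompt, one grammar restricting what the recognizer listens for, and a transition that hands the recognized word back to the server.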
Note that "browsing" is something of a misnomer in this context. People typically do not browse on the telephone. The name was chosen to point out the relationship of VoiceXML to the Web, not to suggest that people follow hypertext links on voice-enabled Web pages.
As we mentioned above, VoiceXML was conceived with the vision of a "Voice Web" in mind, modeled after the visual/graphic Web that HTML helped create. However, there is a huge hurdle to explosive growth of the Voice Web. The "visual Web" was created by downloading HTML interpreters, e.g., Web browsers such as Internet Explorer, that could run on any of the millions of computers connected to the Internet. The "Voice Web," on the other hand, cannot be created by simply downloading VoiceXML interpreters to run on today's "typical computers," because typical computers lack voice interfaces to telephones. This will not always be the case. But for the time being, voice browsers are not being set up by average computer users. Rather, they are being set up and maintained by Voice Portal providers such as Tellme, BeVocal, HeyAnita, and Voxeo. In addition, and more importantly, they are being set up and maintained by Voice Application Service Providers such as NetByTel and General Magic to serve the "automated customer service representative" needs of business and government.
Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).