Volume 2, Issue 7 - November/December 2002
   
 

Enhancing VoiceXML Application Performance By Caching

Continued from page 1

Using HTTP headers
VoiceXML 2.0 explicitly mandates, as a minimum, support for the Cache-Control and Expires header fields. The Expires header gives the date after which the object is no longer considered fresh. The Cache-Control: max-age header indicates that the object becomes stale max-age seconds after it was fetched. For example, Listing 2 above indicates that index.html should be considered stale after 24 hours.
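
As a rough illustration of how a cache might derive a freshness lifetime from these two headers, consider the following sketch. The class and method names are hypothetical and do not come from any VoiceXML platform. It assumes well-formed header values, and it prefers max-age over Expires when both are present.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of any VoiceXML platform API.
final class Freshness {

    /**
     * Returns the freshness lifetime in seconds, or -1 if neither header gives one.
     * The Cache-Control max-age directive is preferred over Expires.
     */
    static long lifetimeSeconds(String cacheControl, String expires, String date) {
        if (cacheControl != null) {
            for (String directive : cacheControl.split(",")) {
                String d = directive.trim().toLowerCase(Locale.ROOT);
                if (d.startsWith("max-age=")) {
                    return Long.parseLong(d.substring("max-age=".length()));
                }
            }
        }
        if (expires != null && date != null) {
            // Lifetime is the gap between the response's Date and Expires headers.
            long seconds = Duration.between(parse(date), parse(expires)).getSeconds();
            return Math.max(0, seconds);
        }
        return -1; // no explicit expiry information
    }

    /** An object is fresh while its age is below its freshness lifetime. */
    static boolean isFresh(long lifetimeSeconds, long ageSeconds) {
        return lifetimeSeconds >= 0 && ageSeconds < lifetimeSeconds;
    }

    private static Instant parse(String httpDate) {
        return ZonedDateTime.parse(httpDate, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
    }

    public static void main(String[] args) {
        long lifetime = lifetimeSeconds("max-age=86400", null, null);
        System.out.println("lifetime=" + lifetime + "s, fresh after one hour: " + isFresh(lifetime, 3600));
    }
}
```

A real cache would also need to tolerate malformed dates and unknown directives, which this sketch leaves out.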

When an object expires, it must not be used until it has been revalidated. There are two main ways of doing this, both of which rely on conditional fetches. In the first, available in both HTTP 1.0 and HTTP 1.1, the request for the object supplies the If-Modified-Since header with an associated date. Assuming no errors, the origin server returns either a 304 status code, indicating that the object has not been modified and the cached version is still the latest, or a 200 status code followed by the new version of the object (similar to the response in Listing 2) if the object has changed. The second method (HTTP 1.1 only) uses an identifier called an Entity Tag, or ETag, to uniquely identify a version of an object. The request carries an If-None-Match header with the ETag obtained in a previous response. Assuming no errors, the origin server returns a 304 (Not Modified) if its current ETag for the requested document matches the supplied ETag, or a 200 response followed by the new document if it does not.
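
The two revalidation decisions described above can be sketched from the origin server's point of view as follows. This is a simplified illustration with hypothetical names. It treats each validator on its own and ignores error handling and the case where both headers arrive in the same request.

```java
import java.time.Instant;

// Hypothetical sketch of the origin server's decision; names are illustrative only.
final class ConditionalGet {
    static final int OK = 200;            // send the new version of the object
    static final int NOT_MODIFIED = 304;  // the cached copy is still the latest

    /** If-Modified-Since: 304 unless the object changed after the supplied date. */
    static int checkModifiedSince(Instant ifModifiedSince, Instant lastModified) {
        return lastModified.isAfter(ifModifiedSince) ? OK : NOT_MODIFIED;
    }

    /** If-None-Match: 304 when the supplied ETag matches the server's current ETag. */
    static int checkNoneMatch(String ifNoneMatch, String currentETag) {
        return currentETag.equals(ifNoneMatch) ? NOT_MODIFIED : OK;
    }

    public static void main(String[] args) {
        Instant cachedAt = Instant.parse("2002-11-01T12:00:00Z");
        Instant changedAt = Instant.parse("2002-11-02T09:30:00Z");
        System.out.println(checkModifiedSince(cachedAt, changedAt));
        System.out.println(checkNoneMatch("\"v1\"", "\"v1\""));
    }
}
```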

If a document specifies Cache-Control: no-cache or Pragma: no-cache, the user agent will not cache the document. This is useful for dynamic content that changes unpredictably.

If none of Expires, Cache-Control, Pragma, or ETag appears in the response, a cache may use the Last-Modified date to calculate an expiration time. This is called heuristic caching, and the formulas usually choose an expiration time equal to a fraction of the interval since the object was last modified. Since heuristic caching might cause problems with dynamic content generation mechanisms such as JSPs or ASPs, these mechanisms typically omit the Last-Modified date, and so their output is not cached.
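
A minimal sketch of such a heuristic is shown below. The one-tenth fraction is an assumption chosen for illustration; actual caches pick their own values. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of heuristic expiry; the divisor is an assumed, illustrative value.
final class HeuristicExpiry {
    // Assumption: treat the object as fresh for one tenth of the time since it last changed.
    static final long DIVISOR = 10;

    /** Returns the heuristic freshness lifetime for a response without explicit expiry. */
    static Duration lifetime(Instant lastModified, Instant responseDate) {
        Duration sinceModified = Duration.between(lastModified, responseDate);
        if (sinceModified.isNegative()) {
            return Duration.ZERO; // Last-Modified lies in the future; do not guess
        }
        return sinceModified.dividedBy(DIVISOR);
    }

    public static void main(String[] args) {
        Instant fetched = Instant.parse("2002-11-11T00:00:00Z");
        Instant modified = fetched.minus(Duration.ofDays(10));
        System.out.println("heuristic lifetime: " + lifetime(modified, fetched));
    }
}
```

Under this assumption, an object last modified ten days before it was fetched would be treated as fresh for one day.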

How HTTP header fields are set in responses is specific to the server-side technology used. Typically, however, setting the headers is straightforward. Listing 3 illustrates setting a header for a JavaServer Page (JSP).

<%
// Mark this response as fresh for 24 hours (86400 seconds).
response.setHeader("Cache-Control", "max-age=86400");
%>

Listing 3: Setting an HTTP header in a JavaServer Page

Using VoiceXML attributes
VoiceXML 1.0 and VoiceXML 2.0 have slightly different mechanisms for controlling caching policy and we will mention both here. In VoiceXML 1.0, an attribute called caching is specified on elements requiring resources to be fetched (e.g.

VoiceXML 2.0 brings more advanced control with the introduction of the maxage and maxstale attributes. When these attributes are not specified, the behaviour is the same as the default for VoiceXML 1.0, i.e. the cached version of the object is used if it has not expired. The maxage attribute allows the developer to effectively specify an earlier expiration date than that associated with the object in the cache. Thus,

Listing 4: Specifying maxage caching of an audio resource in VoiceXML 2.0

indicates that the cached version should be used up to a maximum age of 24 hours. Setting maxage to 0 forces the user agent to ensure it has the latest version of the resource and is thus equivalent to safe in VoiceXML 1.0.

The maxstale attribute indicates that a cached object may be used up to a specified number of seconds after it has expired. It could be used, for example, to ensure that the same audio is used throughout a call.

Listing 5: Specifying maxstale caching of an audio resource in VoiceXML 2.0
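
The combined effect of maxage and maxstale on the decision to use a cached object can be sketched as below. This is an interpretation of the behaviour described in this section, not code from any interpreter. The names are hypothetical, and a negative value is used here to mean that an attribute was not specified.

```java
// Hypothetical sketch of the maxage/maxstale decision; names and conventions are illustrative.
final class VxmlCachePolicy {
    static final long UNSPECIFIED = -1; // assumption: negative means the attribute is absent

    /**
     * Decides whether a cached object may be used without contacting the origin server.
     *
     * @param ageSec      how long the object has been in the cache, in seconds
     * @param lifetimeSec freshness lifetime derived from the HTTP headers, in seconds
     * @param maxage      VoiceXML maxage attribute in seconds, or UNSPECIFIED
     * @param maxstale    VoiceXML maxstale attribute in seconds, or UNSPECIFIED
     */
    static boolean mayUseCached(long ageSec, long lifetimeSec, long maxage, long maxstale) {
        if (maxage >= 0 && ageSec > maxage) {
            return false; // older than the document is willing to accept
        }
        long staleness = ageSec - lifetimeSec;
        if (staleness < 0) {
            return true; // not yet expired
        }
        // Expired: usable only within the staleness allowance, if one was given.
        return maxstale >= 0 && staleness <= maxstale;
    }

    public static void main(String[] args) {
        // An object expired one hour ago, with two hours of staleness allowed.
        System.out.println(mayUseCached(90000, 86400, UNSPECIFIED, 7200));
    }
}
```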

Caching Recommendations
In this section we offer some recommendations to observe when employing caching with VoiceXML.

Meta tags
Although values equivalent to HTTP headers can be specified with the <meta> tag in VoiceXML, this practice is generally to be discouraged. Even though a VoiceXML interpreter might understand their meaning, it is unlikely that a proxy cache will, and so they will be ignored there.

Caching does not mean High Availability
By definition, caching is designed to provide a temporary store of information and thus should not be considered a substitute for proper high availability mechanisms. Use proven high availability mechanisms such as clustering, mirror sites etc.

Consider ETag formulation
Large-scale deployments that employ web server farms with load balancing should ensure that the ETag generation algorithm is identical on each server for identical content. Otherwise a user agent might download content unnecessarily when it revalidates against a different server.
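
One way to obtain identical ETags on every server is to derive the tag purely from the content itself. The sketch below hashes the bytes of the object with SHA-256. This is an assumption made for illustration, not how any particular web server generates its ETags, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: an ETag that depends only on the content, not on the server.
final class ContentETag {

    /** Returns a quoted ETag value built from the SHA-256 hash of the content. */
    static String of(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder tag = new StringBuilder("\"");
            for (byte b : hash) {
                tag.append(String.format("%02x", b)); // two hex digits per byte
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] grammar = "<grammar/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("ETag: " + of(grammar));
    }
}
```

Because the tag depends only on the bytes served, two servers holding the same file produce the same value, whereas schemes based on per-server file metadata may not.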

Separate static content from dynamic content
It is important to determine which content is suitable for caching and for how long. It is typical to consider the largest objects as the most suitable caching candidates and for VoiceXML applications these are typically large audio and grammar files. A useful strategy is to employ an industry grade web server to serve static content and apply the correct caching attributes (HTTP headers) and use the typically more computationally expensive application server for dynamic content.

Remember that there is no 'Reload' button
Whilst developing HTML applications, hitting the 'Reload' button or equivalent will usually refresh an updated file that has been erroneously cached. This is usually not possible in a VoiceXML paradigm, so care should be taken to avoid such scenarios. As a rule of thumb, it is advisable to use no caching whilst developing an application and to introduce it only when performance tuning the application at the end of the development cycle. If a forced update is required, setting the maxage attribute to 0 will achieve this. Alternatively, changing the name of the referenced resource will have the same effect.

POST vs. GET
The HTTP POST method is exceptionally useful for sending large amounts of data to an origin server in a reliable and secure (at least over SSL) manner. However, the response to a POST cannot be cached. The alternative GET method with a query string should be considered if large amounts of data are not being sent to the server and the resultant content is suitable for caching.

Avoid depending on Pragma: no-cache
The HTTP 1.1 specification does not define what Pragma: no-cache means in a response header. Use Cache-Control: no-cache, and perhaps an Expires header with a date in the past, in addition to Pragma: no-cache to be sure that the object is not stored in any cache along the way.
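
The combination of headers suggested above can be collected in one place, as in this sketch. The helper class is hypothetical, and the Expires value is simply one arbitrary date in the past.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the class and method names are illustrative only.
final class NoCacheHeaders {

    /** Returns response headers that together ask caches not to reuse the object. */
    static Map<String, String> build() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Cache-Control", "no-cache");                 // for HTTP 1.1 caches
        headers.put("Expires", "Thu, 01 Jan 1970 00:00:00 GMT");  // a date in the past
        headers.put("Pragma", "no-cache");                        // for older HTTP 1.0 caches
        return headers;
    }

    public static void main(String[] args) {
        build().forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

In a JSP, each entry could then be passed to response.setHeader(name, value) in the same way as in Listing 3.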

Cache-Control: max-age takes precedence over Expires
To avoid confusion, use only one of the two mechanisms.

Heuristic algorithms are meant as a fallback strategy
For consistency, it is recommended that you always specify your caching requirements explicitly so as not to depend on platform-dependent heuristic algorithms.

Conclusion
HTTP caching provides a powerful mechanism for improving performance of applications. A performant VoiceXML application that yields customer satisfaction will promote customer retention and also save money on deployment costs. Caching is often poorly understood and under-utilised on the Internet, yet can be effectively harnessed by observing some simple practices as outlined in this article.

References

[1] "Hypertext Transfer Protocol -- HTTP/1.1", IETF RFC 2616, 1999.
See http://www.ietf.org/rfc/rfc2616.txt

[2] "Voice eXtensible Markup Language 1.0", Boyer et al, W3C Note, May 2000.
See http://www.w3.org/TR/2000/NOTE-voicexml-20000505/

[3] "Voice eXtensible Markup Language 2.0", W3C Working Draft.
See http://www.w3.org/TR/2001/WD-voicexml20-20011023/

 

Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).