
We introduce what you need to know to implement your own vxml browser and describe how VoiceXML::Client expects to talk to your audio device.

The Dummy Telephony Device
Sources of VoiceXML
Bare Bones Example Client
Navigating the Example
The Telephony Device Interface
Device Handle Methods

VoiceXML::Client API

The VoiceXML::Client library's job is to do the dirty work involved in understanding and executing the contents of Voice Extensible Markup Language (VoiceXML) pages. It does this by parsing a VoiceXML page and controlling some type of telephony device according to the instructions in the VoiceXML. This means you, as the programmer of a VoiceXML client, need two things before you start: a telephony device for the library to control, and a source of VoiceXML pages.

Dummy Device

The telephony device controlled by VoiceXML::Client must support certain functions through a specific API in order to be used by the module. This interface is detailed below; for the moment we can use a safe default that is included in the package: VoiceXML::Client::Device::Dummy.

This dummy device doesn't actually do anything, but it does implement the full device API by printing out a message for every call made to the device. It also requests that you enter DTMF data on the command line whenever user input is requested by a VoiceXML form field.

Sources of VoiceXML

The VoiceXML you use may be static pages stored on the same system. Most interesting applications, such as the VOCP voice messaging system for which this module was originally developed, will likely be generating the VoiceXML on the fly, for instance through a CGI.

In any case, the pages are fetched using LWP::UserAgent, which means that any method supported by that module should work.

Whether you plan to write your own VoiceXML or use some existing source, it is probably a good idea to have a look at our VoiceXML introduction and the details of the supported VoiceXML elements.

Bare Bones Example

In its absolute simplest form, all we need is to figure out a starting URL for the VoiceXML page, initialize the device, and hand both of those off to a VoiceXML::Client::UserAgent. Here's the code:

	use VoiceXML::Client;
	use strict;
	# basic info for VoiceXML source
	my $sourceSite = 'voicexml.psychogenic.com';
	my $startURL = '/vocp.cgi';
	# using dummy device here, to get started
	my $telephonyDevice = VoiceXML::Client::Device::Dummy->new();
	# our workhorse: the user agent
	my $vxmlUserAgent = VoiceXML::Client::UserAgent->new($sourceSite);
	# go for it:
	$vxmlUserAgent->runApplication($startURL, $telephonyDevice);
	# done. Insert gleeful hand wringing here..

If you stick the above in a script and run it, it will function quite nicely by fetching an example VoiceXML page from this server.

The above is all you really need to have a functional client to parse and interpret VoiceXML.

Navigating the Example

This example vxml fetched by the script above is a type of vocal bulletin board system. You can listen to the system message, leave your own, listen to messages left by others and even rate those messages. It is a good example of the power available through a single well-crafted VoiceXML page.

The first thing you'll see if you run the program is:

./dummytest.pl [9483]: Audio element could not find file '/path/to/vocpmessages/guestbox500.rmd'
./dummytest.pl [9483]: Audio element could not find file '/path/to/vocpmessages/system/guestinstructions.rmd'

Enter DTMF Selection ([0-9]+): 

Since the audio files requested are most likely not on your system, the audio elements output an error message. This is actually helpful as it lets you know what file would have been played. Since the first form requires user DTMF input, the dummy device requests you enter a selection. Valid selections at this stage are 0, 1 and 2. Enter something else and the system will "play" an error message and reprompt.

The call-flow of this box is as follows:

Under normal circumstances, the points marked as "refetch" would likely change the VoiceXML content returned by the system thanks to the data submitted along with the submission (e.g. the rating you've set or the fact you wish to return to the root box). For this example the same content is returned each time.

Go into the "listen to messages left by others" menu and move around. You'll see the system update the message played and tell you when you've hit the first or last message. The neat thing is that all this intelligence resides not in VoiceXML::Client but in the VoiceXML page itself: it's a little programming language!

Enter invalid or empty content too often, and the system will eventually "hang up" on you. This is a nice way to exit the program.

The Telephony Device Interface

Since getting the User Agent up and running is so simple, most of your work will involve getting whatever type of telephony interface you're interested in implemented. To really get going, you might want to have a look at the internals of the VoiceXML::Client::Device::Dummy module and maybe the VoiceXML::Client::DeviceHandle class from which it is derived.

Here, we'll go over the bare minimum you need to have in order for your device to be useable by the system.

For starters, your module must inherit some basic methods from the DeviceHandle class:

package MyNamespace::MyDevice;
use strict;
use base qw(VoiceXML::Client::DeviceHandle);

Then you need to implement the following methods according to your device specifics. Note that each of these methods receives the $self reference as its first parameter (i.e. it is called as $deviceHandle->method()), so don't forget to shift $self off the argument list before anything else (i.e. my $self = shift;).

Device Handle Methods

connect [$PARAMSHREF]

Do whatever you like here, accepting an optional hash reference of parameters for the connection. This method isn't actually called by the library at this point, so you need to make your connection before handing off control to the UserAgent.
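As a rough illustration of the $self convention and the optional parameters hash reference, here is a minimal sketch of a connect implementation. The port key and the stand-in constructor are assumptions made for the example. A real device would inherit from VoiceXML::Client::DeviceHandle rather than define its own new.

```perl
use strict;
use warnings;

package MyNamespace::MyDevice;

# A real device would inherit from the library at this point:
#   use base qw(VoiceXML::Client::DeviceHandle);
# That line is left out, and a stand-in constructor is defined instead,
# only so that this sketch runs without the library installed.
sub new {
    my $class = shift;
    return bless { connected => 0 }, $class;
}

# connect [$PARAMSHREF]
sub connect {
    my $self   = shift;          # the device handle always comes first
    my $params = shift || {};    # optional hash reference of parameters

    # 'port' is a hypothetical parameter name, used for illustration only.
    $self->{port}      = $params->{port};
    $self->{connected} = 1;
    return 1;
}

package main;

# The library does not call connect itself, so the script does it
# before the device would be handed to the UserAgent.
my $device = MyNamespace::MyDevice->new();
$device->connect({ port => '/dev/ttyS0' }) or die "could not connect\n";
print "connected on $device->{port}\n";
```

In a full client, the call to runApplication() would follow the connect call, using the same device object.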


Called to close the call where appropriate. Do your cleanup, hang up the phone. Return true on success.


Called to play audio. Currently, $PLAYTHIS is always passed as the path of a file to play. The optional $TYPE is not yet implemented, but may be used to play text-to-speech or somesuch in the future. Assume that if $TYPE is false, $PLAYTHIS is a file to play.


Called when we need to play a beep to prompt users to begin recording messages. Frequency and milliseconds indicate the tone and length of beep sound. Return true on success.


Called to record user voice input, which is saved in the file $FILETOSAVEAS. Return true on success.


Do nothing but wait, presumably for user input. Device should accept and queue user input during this interval.

readnum

Fetch user DTMF input. Default $REPEATTIMES to a sane value (e.g. 3) if none is passed. If $PLAYTHIS is undefined, well... don't play anything. Return failure (an undefined value) if no digits are entered after the message has been played $REPEATTIMES times.

The readnum method is a bit involved, owing to the nature of input collection. readnum is called at points where user input is required, such as a <filled> element, but things aren't that simple: users may begin entering input at any moment, for instance while a prompt is being played.

Most of the time, we want to interrupt play at that point. In addition, if the field looks something like this:

	<field name="movetosel" type="digits" numdigits="1">

then we would want to skip the readnum call altogether, since the field only requires a single digit.

Hence some elements, like <field>, use some internal magic to tell the system that they want to hear about user input as soon as possible. This is done by calling the VoiceXML::Client::DeviceHandle registerInputReceivedCallback() method. The internals take care of these callbacks, but it is still your job to notify the system of spontaneous user input.
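The callback idea can be sketched with a small stand-in class. Only the name registerInputReceivedCallback comes from the library. Its body, its code-reference argument, and the digitArrived method are assumptions made for this example, so check the DeviceHandle source for the real signatures.

```perl
use strict;
use warnings;

package Sketch::Handle;

sub new {
    my $class = shift;
    return bless { callbacks => [], pending => '' }, $class;
}

# Same name as the library method, but this body and its code-reference
# argument are assumptions made for the sketch.
sub registerInputReceivedCallback {
    my ($self, $callback) = @_;
    push @{ $self->{callbacks} }, $callback;
    return 1;
}

# Hypothetical method that a device would call whenever a digit arrives.
sub digitArrived {
    my ($self, $digit) = @_;
    $self->{pending} .= $digit;                   # remember multi-digit input
    $_->($digit) for @{ $self->{callbacks} };     # tell interested elements
    return;
}

package main;

my $handle  = Sketch::Handle->new();
my $playing = 1;    # pretend a prompt is currently being played

# A one-digit field wants the prompt stopped on the first keypress.
$handle->registerInputReceivedCallback(sub { $playing = 0 });

$handle->digitArrived('2');
print "prompt interrupted\n" unless $playing;
```

The point of the design is that the element decides what a keypress means, while the device only reports that one arrived.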

Thus, however your device is implemented, when you receive a digit from your caller you need to do two things:

	# got DTMF by whatever means and stashed in $dtmf, then:
	# do whatever you need to do, then add this digit to our stack

That second step keeps track of multi-digit input. Finally, your task when implementing readnum itself is to start by retrieving any pending input:

	my $dtmfInQueue = $self->fetchPendingInput();

Then get any additional input required and return everything from your method. Note that the call to fetchPendingInput() actually clears any pending input, so don't lose it.
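The flow described above might look roughly like the following sketch. It is self-contained, so fetchPendingInput is a stand-in for the inherited method. The queueDigit and nextDigit helpers, the '#' terminator, and the parameter order of readnum are all assumptions made for the example, not the library's documented behaviour.

```perl
use strict;
use warnings;

package Sketch::Device;

sub new {
    my ($class, @keypresses) = @_;
    # 'pending' holds digits that arrived before readnum was called;
    # 'incoming' simulates the keys the caller will press afterwards.
    return bless { pending => '', incoming => [@keypresses] }, $class;
}

# Stand-in for the inherited method: return queued digits, then clear them.
sub fetchPendingInput {
    my $self   = shift;
    my $digits = $self->{pending};
    $self->{pending} = '';
    return $digits;
}

# Hypothetical helper: queue a digit that arrived spontaneously.
sub queueDigit {
    my ($self, $digit) = @_;
    $self->{pending} .= $digit;
    return;
}

# Hypothetical helper: next keypress, or undef once the caller goes quiet.
sub nextDigit {
    my $self = shift;
    return shift @{ $self->{incoming} };
}

sub readnum {
    my $self = shift;
    my ($playThis, $repeatTimes) = @_;    # parameter order is an assumption
    $repeatTimes = 3 unless defined $repeatTimes;

    # Keep whatever was already typed: the queue is empty after this call.
    my $digits = $self->fetchPendingInput();
    $digits = '' unless defined $digits;

    for my $attempt (1 .. $repeatTimes) {
        # Only "play" the prompt while nothing has been entered yet.
        print "playing $playThis\n" if defined($playThis) && !length($digits);
        while (defined(my $digit = $self->nextDigit())) {
            last if $digit eq '#';        # assumed end-of-input key
            $digits .= $digit;
        }
        return $digits if length $digits;
    }
    return;    # undefined: no digits after $repeatTimes attempts
}

package main;

my $device = Sketch::Device->new('2', '#');   # caller will press 2, then #
$device->queueDigit('4');                     # 4 arrived during the prompt
my $entered = $device->readnum('/path/to/prompt.rmd');
print "caller entered: $entered\n";           # prints "caller entered: 42"
```

Storing the result of fetchPendingInput() in $digits before collecting anything else is what keeps the early keypress from being lost.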


Copyright (C) 2008, Pat Deegan Psychogenic.com
All Rights Reserved.