Tuesday, March 22, 2011

twilio4j – Twilio Call Flows in Java

I have been using Twilio for over a year now for a variety of things, and I love those Twilio folks! They have well-thought-out APIs, easy-to-read documentation, responsive employees on their user forums, and a nice pay-as-you-go business model.

This past week I put together a Java library called twilio4j, which makes assembling call flow state machines easy and readable (and kinda fun IMHO!). This library offers some readability and maintainability improvements:

  • It is very easy to identify the states of my state machine, and how they connect.
  • Receiving parameters and maintaining user state variables is uniform in the API, and easy to read.
  • Composing outbound TwiML is pure Java, but in a very declarative and readable style.

I’ve made twilio4j open source under the Apache 2.0 license, and hosted the project at code.google.com. The code is functioning, and I have the bulk of the documentation filled in. So it’s time to put it “out there” and get feedback and find bugs. A jar file can be downloaded here.

So, what does it look like?

(Want the short version? Here’s the Quick Start.)

Here’s an example application: I want to call my Twilio phone number and be prompted to enter a code, then be prompted to record a voice message, have the opportunity to hear it and re-record it if I want, and finally be able to accept the recorded message. The state machine on paper looks like this:

[Figure: Twilio call flow state machine diagram]

First, you represent all the states of the state machine with an enumeration. Enumerated values are easy to spot in code, and their names can capture some meaning of the state, addressing one of my readability concerns. Our enum class looks like this:

public enum VoiceRecordState {
    GATHER_CALL_IN_CODE,
    CHECK_CALL_IN_CODE,
    RECORD_MESSAGE,
    REVIEW_MESSAGE,
    REVIEW_MESSAGE_CHOICE,
    MESSAGE_READY_GOODBYE,
}


And then you extend the (generic) TwilioStateMachineServlet<E> class, using your enumeration class to type the new class. It looks something like this:

public class VoiceRecordServlet extends TwilioStateMachineServlet<VoiceRecordState> {
    public VoiceRecordServlet() {
      handler(GATHER_CALL_IN_CODE).respondsWith(
          gather(
              say("Enter your 6 digit code.").voiceWOMAN()
          )
          .numDigits(6)
          .action(CHECK_CALL_IN_CODE)
      );
      // other state handlers here, etc.
      handler(CHECK_CALL_IN_CODE).respondsWith( ... );
      handler(RECORD_MESSAGE).respondsWith( ... );
      handler(REVIEW_MESSAGE).respondsWith( ... );
      handler(REVIEW_MESSAGE_CHOICE).respondsWith( ... );
      handler(MESSAGE_READY_GOODBYE).respondsWith(new TwilioHandler() {
          @Override
          public TwiML getTwiML(TwilioParameters params) {
            String recordingUrl = params.getUserParams().get("my_param_key");
            doTheThingWithTheRecording(recordingUrl);
            return say("The message is now ready. Goodbye.").voiceWOMAN();
          }
      });
    }
}
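
For the curious, here is a minimal sketch of how an enum-typed base class can register one handler per state and dispatch to it. This is not twilio4j's actual implementation: the names StateMachine, on, dispatch, and the three-value State enum are all made up for illustration, and it uses Java 8 lambdas for brevity.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Function;

public class EnumDispatchSketch {
    // Hypothetical states, much smaller than the real VoiceRecordState.
    enum State { ASK_CODE, CHECK_CODE, GOODBYE }

    // Generic base typed by the enum: holds one handler per state.
    static class StateMachine<E extends Enum<E>> {
        private final Map<E, Function<Map<String, String>, String>> handlers;

        StateMachine(Class<E> type) {
            handlers = new EnumMap<>(type);
        }

        // Register the handler that produces the response for a state.
        void on(E state, Function<Map<String, String>, String> handler) {
            handlers.put(state, handler);
        }

        // Look up the handler for the state and run it with the posted params.
        String dispatch(E state, Map<String, String> params) {
            Function<Map<String, String>, String> h = handlers.get(state);
            if (h == null) {
                throw new IllegalStateException("No handler for " + state);
            }
            return h.apply(params);
        }
    }

    static StateMachine<State> build() {
        StateMachine<State> sm = new StateMachine<>(State.class);
        sm.on(State.ASK_CODE, p -> "<Say>Enter your code.</Say>");
        // "Digits" is the parameter Twilio posts after a Gather; "1234" is a dummy code.
        sm.on(State.CHECK_CODE, p -> "1234".equals(p.get("Digits"))
                ? "<Say>Code accepted.</Say>"
                : "<Say>Wrong code.</Say>");
        return sm;
    }

    public static void main(String[] args) {
        StateMachine<State> sm = build();
        System.out.println(sm.dispatch(State.CHECK_CODE,
                Collections.singletonMap("Digits", "1234")));
    }
}
```

Typing the base class by the enum means the compiler rejects a handler registered for a state that does not exist, which is a large part of why enum states read so cleanly.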


Whenever an ‘action’ (or ‘callback’) attribute is needed, TwiML expects a URL, but with twilio4j you just supply one of your enumerated states. The parent servlet class maps that enum value into the path of the servlet’s URL; e.g., /t/voicerecord/CHECK_CALL_IN_CODE would be the action URL that our ‘gather’ uses in the example code above. It’s nice to know this, but frankly we should not have to worry about the actual URLs. We can just think in terms of our clean enum state names.
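
To make the enum-to-URL idea concrete, here is a rough sketch of one way such a mapping could work in both directions. It is only an illustration under my own assumptions: the helper names toPath and fromPath are hypothetical, and the /t/voicerecord mount point is simply taken from the example URL above.

```java
public class StatePathSketch {
    // Same states as the VoiceRecordState enum shown earlier.
    enum VoiceRecordState {
        GATHER_CALL_IN_CODE, CHECK_CALL_IN_CODE, RECORD_MESSAGE,
        REVIEW_MESSAGE, REVIEW_MESSAGE_CHOICE, MESSAGE_READY_GOODBYE
    }

    // Assumed servlet mount point, copied from the example URL in the text.
    static final String SERVLET_PATH = "/t/voicerecord";

    // Build the action URL path for a state: the enum name becomes the last path segment.
    static String toPath(VoiceRecordState state) {
        return SERVLET_PATH + "/" + state.name();
    }

    // Recover the state from an incoming request path.
    // Throws IllegalArgumentException if the last segment is not a known state.
    static VoiceRecordState fromPath(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return VoiceRecordState.valueOf(name);
    }

    public static void main(String[] args) {
        String path = toPath(VoiceRecordState.CHECK_CALL_IN_CODE);
        System.out.println(path);           // prints /t/voicerecord/CHECK_CALL_IN_CODE
        System.out.println(fromPath(path)); // prints CHECK_CALL_IN_CODE
    }
}
```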

Note how the TwiML is expressed in a declarative style, although it is still just Java code. This resolves another of my readability issues. The trick was to use constructors with a variable number of arguments, so you can write a comma-separated list of verbs in a nested style. One minor nit is that a verb’s attributes are chained on at the end, after the closing parenthesis, rather than appearing above its children as they would in XML. But I still find it much more readable than this:

                TwiMLResponse twiml = new TwiMLResponse();
                Gather gather = new Gather();
                gather.setNumDigits(1);
                Say say = new Say("Review your message.");
                say.setVoice("woman");
                gather.append(say);
                Play play = new Play(recordingUrl);
                gather.append(play);
                Say say2 = new Say("Press 1 to accept this recording. Press 2 to record the message again. Press 3 to quit.");
                say2.setVoice("woman");
                gather.append(say2);
                twiml.append(gather);
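
For comparison, the varargs trick can be sketched in a few dozen lines. This is a toy written for this post, not the twilio4j source: the Verb class and the method names attr, voiceWoman, and toXml are hypothetical, and the sketch does no XML escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VarargsTwimlSketch {
    // A minimal TwiML-like node: tag, optional text, attributes, nested verbs.
    static class Verb {
        final String tag;
        final String text;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Verb> children = new ArrayList<>();

        // The varargs parameter is what allows a comma-separated list of nested verbs.
        Verb(String tag, String text, Verb... nested) {
            this.tag = tag;
            this.text = text;
            for (Verb v : nested) {
                children.add(v);
            }
        }

        // Attribute setters return 'this' so they chain after the closing parenthesis.
        Verb attr(String name, String value) { attrs.put(name, value); return this; }
        Verb voiceWoman() { return attr("voice", "woman"); }
        Verb numDigits(int n) { return attr("numDigits", String.valueOf(n)); }

        // Serialize to XML (no escaping; illustration only).
        String toXml() {
            StringBuilder sb = new StringBuilder("<").append(tag);
            for (Map.Entry<String, String> e : attrs.entrySet()) {
                sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
            }
            sb.append('>');
            if (text != null) {
                sb.append(text);
            }
            for (Verb child : children) {
                sb.append(child.toXml());
            }
            return sb.append("</").append(tag).append('>').toString();
        }
    }

    static Verb say(String text) { return new Verb("Say", text); }
    static Verb play(String url) { return new Verb("Play", url); }
    static Verb gather(Verb... nested) { return new Verb("Gather", null, nested); }

    public static void main(String[] args) {
        Verb twiml = gather(
            say("Review your message.").voiceWoman(),
            play("http://example.com/recording.wav") // placeholder URL
        ).numDigits(1);
        // prints <Gather numDigits="1"><Say voice="woman">Review your message.</Say><Play>http://example.com/recording.wav</Play></Gather>
        System.out.println(twiml.toXml());
    }
}
```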


Another noteworthy item is that you are given access to a userParams object, a HashMap of Strings that is persisted as a cookie on the Twilio client for the duration of the phone call. That gives you a very easy place to keep some user application state, and it is consistently presented to you in the TwilioParameters argument of your handler.
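
If you are wondering how a map of strings can ride along in a cookie, here is one possible encoding, sketched under my own assumptions. The key=value&key=value format and the encode/decode helper names are invented for this example and are not necessarily what twilio4j does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class UserParamsCookieSketch {
    // URL-encode one piece so it contains no '&', '=', ';' or spaces.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize the user params into a single cookie-safe string.
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    // Rebuild the user params from the cookie value sent back on the next request.
    static Map<String, String> decode(String cookieValue) {
        Map<String, String> params = new LinkedHashMap<>();
        if (cookieValue == null || cookieValue.isEmpty()) {
            return params;
        }
        for (String pair : cookieValue.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed pieces
            }
            params.put(dec(pair.substring(0, eq)), dec(pair.substring(eq + 1)));
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> userParams = new LinkedHashMap<>();
        userParams.put("recording_url", "http://example.com/rec?id=1&fmt=wav"); // placeholder
        String cookie = encode(userParams);
        System.out.println(cookie);
        System.out.println(decode(cookie).equals(userParams)); // prints true
    }
}
```

Keep in mind that cookies are size-limited, so a scheme like this only suits small bits of state such as a code or a URL.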

Another thing to highlight is that once you leave a particular state (from a Gather, a Record, a StatusCallback, whatever), the parameters you can expect are nicely organized in the API. You are always passed a TwilioParameters object, which lets you access everything. For example, to access the parameters posted by a Record, use the params.Record() object, perhaps to grab the recordingUrl:

        handler(REVIEW_MESSAGE).respondsWith(new TwilioHandler() {
            @Override
            public TwiML getTwiML(TwilioParameters params) {
                RecordParameters rp = params.Record();
                if ( rp.isHangup() ) {
                    doAbandonedCallCleanup();
                    return hangup();
                } else {
                    String recordingUrl = rp.getRecordingUrl();
                    params.getUserParams().put(U_RECORDING_URL, recordingUrl); // save the recording url for another state.
                    return gather(
                        say("Review your message.").voiceWOMAN(),
                        play(recordingUrl),
                        say("Press 1 to accept this recording. Press 2 to record the message again.").voiceWOMAN()
                    )
                    .numDigits(1)
                    .action(REVIEW_MESSAGE_CHOICE);
                }
            }
        });
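
Under the hood, a typed parameter object like this can be little more than a thin wrapper around the raw posted form fields. The sketch below shows the idea; the RecordParams class is my own stand-in rather than twilio4j's RecordParameters, though RecordingUrl and Digits (with the value "hangup" when the caller hangs up during a recording) are the field names Twilio posts after a Record.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedParamsSketch {
    // Typed view over the raw form parameters posted after a <Record>.
    static class RecordParams {
        private final Map<String, String> raw;

        RecordParams(Map<String, String> raw) {
            this.raw = raw;
        }

        // URL of the recorded audio, or null if none was posted.
        String getRecordingUrl() {
            return raw.get("RecordingUrl");
        }

        // Twilio sends Digits=hangup when the caller hung up while recording.
        boolean isHangup() {
            return "hangup".equals(raw.get("Digits"));
        }
    }

    public static void main(String[] args) {
        Map<String, String> posted = new HashMap<>();
        posted.put("RecordingUrl", "http://example.com/recording.wav"); // placeholder
        posted.put("Digits", "#");

        RecordParams rp = new RecordParams(posted);
        System.out.println(rp.isHangup());        // prints false
        System.out.println(rp.getRecordingUrl()); // prints http://example.com/recording.wav
    }
}
```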


I hope this post shows enough highlights to demonstrate some of the readability improvements that twilio4j offers. I am already finding it more fun to code with this library than to forge state machines from raw URLs, cookies, and posted string params! :-)
