
Building an SMS receptionist with the help of Twilio and a Finite State Machine

Every time you book an appointment with us you receive a text confirmation. Also, we send out a text reminder an hour before your appointment. Our clients really dig it. We get compliments and even "thank you"s at least once a week. 

The Problem

Then people began responding to our text messages. Looking back, it makes a lot of sense that they would. After all, for anyone who booked through their mobile phone (90% of our clients), the only communication they've ever received from us has been via text message. But, out of the gate, Twilio doesn't start you off with a very friendly message.

The Tools

Having an API for sending/receiving text messages is obviously crucial. You need a way to send messages, and Twilio has that handled. But how do we build out a "Digital Receptionist"? How do we manage many conversations with many clients when each can be at a different point in the conversation? For example, Client A could be in the middle of trying to cancel an appointment (which is a two-step process) while Client B is trying to book one.

How do we maintain state for these interactions? Twilio sends us a POST request for every text message we receive. These POSTs have no state, no way of telling us how far through a process the client may be. How can we make sure that when Client A texts "y" to confirm his cancellation, the system knows the context?
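To make the statelessness concrete, here is a rough sketch of what one of those POSTs gives us to work with. `From`, `To`, and `Body` are parameters Twilio includes in its incoming-message webhook; the phone numbers and the `parseIncomingSms` helper are made up for illustration.

```javascript
// Hypothetical helper: pull the fields we care about out of the
// form-encoded body Twilio POSTs to our webhook.
function parseIncomingSms(rawBody) {
    var params = new URLSearchParams(rawBody);

    return {
        from: params.get('From'), // the client's phone number
        to:   params.get('To'),   // our Twilio number
        body: (params.get('Body') || '').trim()
    };
}

// Sample payload with made-up numbers.
var sms = parseIncomingSms('From=%2B15550001111&To=%2B15552223333&Body=y');

console.log(sms);
```

All we get is who sent the message and what it said. A lone "y" means nothing by itself, so the only stable thing to hang context on is the `From` number.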

Finite State Machine

What is a FSM? I was thinking about forwarding you to the Wikipedia definition, but that isn't really helpful for someone like me. I have to see a system in action to really grasp it. So, lemme start with the basics. I'll cover this in the context of Node.js and Altair.

To start, the FSM is a simple instantiable class, just like you're used to in any OO environment, but there are a few things that make it special.

this.fsm = new StateMachine({
    states: ['mainMenu', 'book', 'reschedule', 'cancel', 'randomCompliment'],
    state:  'mainMenu',
    delegate: this
}); 

The Altair implementation of the FSM has 3 important bits: `states`, `state`, and `delegate`. The first is a list of possible states the FSM can be in. Makes sense, yeah? The thing about a FSM is that while it can have many (but a finite number of) possible states, it can only be in one state at a time. The `state` option tells the FSM where to start. Since everyone who first texts our digital receptionist will need to know what their options are, starting in the `mainMenu` state makes the most sense.

The last and arguably most critical part of our FSM implementation is the `delegate` option. In non-tech terms, a delegate is someone who you've chosen to speak and/or take action on your behalf. In tech terms, it's pretty much the same thing. Anytime the FSM transitions between states (someone on the main menu says they want to book), the `delegate` gets notified. The `delegate` is expected then to take some action, do some things, maybe crunch some numbers. The FSM doesn't really care.
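If the delegate idea still feels abstract, here is a stripped-down stand-in written in plain JavaScript. It is not Altair's `StateMachine`, just a toy (`TinyStateMachine` and `receptionist` are made-up names) that shows the shape of it: the machine tracks the current state and tells its delegate when it enters a new one, using the same `onStateMachineDidEnter...` naming used in this post.

```javascript
// Toy stand-in for illustration only; NOT the Altair StateMachine.
function TinyStateMachine(options) {
    this.states   = options.states;
    this.state    = options.state;
    this.delegate = options.delegate;
}

TinyStateMachine.prototype.transitionTo = function (state, data) {
    if (this.states.indexOf(state) === -1) {
        throw new Error('Unknown state: ' + state);
    }

    this.state = state;

    // 'book' becomes 'onStateMachineDidEnterBook'
    var name = 'onStateMachineDidEnter' + state.charAt(0).toUpperCase() + state.slice(1);

    // the machine does not care what the delegate does, or whether it listens at all
    if (typeof this.delegate[name] === 'function') {
        return this.delegate[name](data);
    }
};

// The delegate: a plain object that reacts when the machine enters 'book'.
var receptionist = {
    onStateMachineDidEnterBook: function (data) {
        return 'What day works for you, ' + data.name + '?';
    }
};

var machine = new TinyStateMachine({
    states:   ['mainMenu', 'book'],
    state:    'mainMenu',
    delegate: receptionist
});

var reply = machine.transitionTo('book', { name: 'Sam' });

console.log(machine.state, '->', reply);
```

The machine only knows state names. Everything that actually happens when a client lands in `book` lives in the delegate.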

Ok, neat, but let's see some code, yeah? I've removed a bunch of the source for brevity, but if you are feeling adventurous you can view the full source here.


define(['altair/facades/declare',
        'altair/Lifecycle',
        'altair/mixins/_AssertMixin',
        'altair/StateMachine',
        'altair/facades/__',
        'altair/plugins/node!moment',
        'lodash'
], function (declare,
             Lifecycle,
             _AssertMixin,
             StateMachine,
             __,
             moment,
             _) {

    return declare([Lifecycle, _AssertMixin, StateMachine], {

        fsm:             null,
        complimentIndex: 0, //which compliment to send next (see onStateMachineDidEnterRandomCompliment)
        
        compliments: [
            'Your text messages look fantastic!',
            'Personally, I think you already look great. But, I\'m a computer, so what do I know?',
            'That is a slick phone you have there!',
            'Are you doing this to see how many compliments I have?',
            'I know I\'m just a machine, but I really don\'t feel like giving any more compliments right now.',
            'Gah! Go away, I don\'t have anything else to say to you!',
            'Seriously... just.... seriously...',
            'Ok, for reals... I\'m out of ideas here. I\'ll just start over now.'
        ],

        startup: function (options) {

            //setup our statemachine
            this.fsm = new StateMachine({
                states: ['mainMenu', 'book', 'reschedule', 'cancel', 'randomCompliment'],
                state:  'mainMenu',
                delegate: this
            });

            return this.inherited(arguments);
        },

        execute: function (options) {

            var message = options.Body,
                state   = this.fsm.state;

            return this.fsm.transitionTo(state, options);
        },


        onStateMachineWillEnterMainMenu: function (e) {
            //the 'willEnter' callback is fired as the FSM is starting to transition
            //into its new state. you can cancel, redirect, load dependencies, or anything
            //else you want to do before the FSM enters its new state
            return { foo: 'bar' };
        },

        onStateMachineDidEnterMainMenu: function (e) {
            var foo = e.get('foo'); //i can load anything returned from the `willEnter` callback.
            return ['{{stateForNextTextMessage}}', '{{text returned to twilio}}'];
        },

        onStateMachineDidEnterBook: function (e) {
        },

        onStateMachineDidEnterReschedule: function () {
        },

        onStateMachineDidEnterCancel: function (e) {
        },

        onStateMachineDidEnterRandomCompliment: function (e) {
            //i left this in here so you can see how easy it was to add this feature
            var text = this.compliments[this.complimentIndex];
            this.complimentIndex ++;

            if (this.complimentIndex > this.compliments.length - 1) {
                this.complimentIndex = 0;
            }

            return ['mainMenu', text];

        }

    });

});

Our FSM comes with 3 callbacks per state. So, if your state was called `mainMenu` the callbacks you could implement are:

  1. onStateMachineWillEnterMainMenu
  2. onStateMachineDidEnterMainMenu
  3. onStateMachineDidExitMainMenu

I purposely did not create an `onStateMachineWillExitMainMenu` callback because I've yet to see a case where you need to do setup for teardown. =)
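The naming pattern is regular enough that you can work out the callbacks for any state. Here is a tiny sketch (`callbacksFor` is a made-up helper, not part of Altair) that spells them out:

```javascript
// Hypothetical helper: list the callback names a delegate could implement for a state.
function callbacksFor(state) {
    var suffix = state.charAt(0).toUpperCase() + state.slice(1);

    return ['WillEnter', 'DidEnter', 'DidExit'].map(function (phase) {
        return 'onStateMachine' + phase + suffix;
    });
}

console.log(callbacksFor('randomCompliment'));
```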

Two People at Once?

You may be asking yourself, "I see how you can manage the communication with 1 client this way, but how do you do it with many clients at once?"

The answer, "Create a FSM for each client!"

No joke, every time a client sends us a text message we create a new FSM if one for that client does not already exist. Each FSM hangs around until its client has been inactive for 15 minutes, then it is destroyed. Here is a screen from our Twilio Controller so you can get an idea of how that works:
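That screenshot aside, the same idea fits in a few lines of plain JavaScript. To be clear, this is not our actual controller code. It is a minimal sketch that assumes machines are keyed by phone number; `MachineRegistry`, `machineFor`, and `sweep` are names invented for the example, and plain objects stand in for real FSMs.

```javascript
// Sketch only: one machine per phone number, dropped after 15 idle minutes.
var IDLE_LIMIT_MS = 15 * 60 * 1000;

function MachineRegistry(createMachine, now) {
    this.createMachine = createMachine;   // factory that builds a fresh machine
    this.now           = now || Date.now; // injectable clock, handy for testing
    this.entries       = new Map();       // phone number -> { machine, lastSeen }
}

// Return the caller's machine, creating one if none exists or the old one expired.
MachineRegistry.prototype.machineFor = function (phoneNumber) {
    var time  = this.now(),
        entry = this.entries.get(phoneNumber);

    if (!entry || time - entry.lastSeen >= IDLE_LIMIT_MS) {
        entry = { machine: this.createMachine(phoneNumber), lastSeen: time };
        this.entries.set(phoneNumber, entry);
    }

    entry.lastSeen = time;
    return entry.machine;
};

// Remove every machine that has been idle too long (meant to run on a timer).
MachineRegistry.prototype.sweep = function () {
    var time = this.now(),
        self = this;

    this.entries.forEach(function (entry, phoneNumber) {
        if (time - entry.lastSeen >= IDLE_LIMIT_MS) {
            self.entries.delete(phoneNumber);
        }
    });
};

// Usage with a fake clock.
var clock = 0;
var registry = new MachineRegistry(
    function (phoneNumber) { return { owner: phoneNumber, state: 'mainMenu' }; },
    function () { return clock; }
);

var first = registry.machineFor('+15550001111');
first.state = 'cancel';

clock += 5 * 60 * 1000;  // 5 minutes later: same conversation
var again = registry.machineFor('+15550001111');

clock += 20 * 60 * 1000; // 20 idle minutes: start over
var fresh = registry.machineFor('+15550001111');

console.log(again.state, fresh.state);
```

Because each client gets a separate machine, Client A can sit in `cancel` while Client B is in `book`, and neither knows the other exists.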

In conclusion

If you find yourself in a situation where you are managing an experience and that experience requires your users to be in only a single state at a time, then a FSM is the ticket. By breaking each state in the interaction into a separate method, you keep yourself from having to depend on nested `if` statements, switch statements, or all the other things that could lead you to a heaping plate of spaghetti code!

Plus, an extra bonus you get by decoupling your states is that your code is much more reliable. Now I can spend as much time as I need messing with the main menu without having to worry about accidentally messing up how we give compliments.

I try when I can to take it one more level and break out all my loading/setup into the 'WillEnter' callback. Then when I hit 'DidEnter' I focus solely on doing my work. Of course it's tough to stay disciplined about it, but I'm always happy I did.
