Your chatbot should have a game loop
This article was originally published on Botpublication.com.
Many of the “smart” chatbots with Natural Language Processing and Artificial Intelligence still have a hard time keeping a conversation going. They keep forgetting whatever it was you were talking about even if you can see it plainly in the message right above!
This is a problem if we want chatbots to succeed. While many chatbots are programmed as call-and-respond systems (also known as “websites”), we should be programming them like we program games: using a game loop.
The difference between rendering a website and rendering a game
Websites have it easy: Your server is happily doing nothing at all, up until the point a visitor comes by and requests a page. At that point, your server takes the request, finds the pages and serves it. It gets more complex if the server needs to render a page specifically for you, or needs to retrieve and combine data from a database or other sources, but it’s very much on-demand and tied to your request.
Games don’t have that luxury. If your game just stops rendering images onto the screen and waits until you finally press that button, they wouldn’t be very popular. They have to conjure up a new image at the rate of about 60 images per second, with or without the playing actively providing input. To help with this, Games have something called a “game loop”.
What is a game loop?
In order to keep this 60 fps going, games usually operate on a game loop. A game loop consist of ‘ticks’ that are called 60 times per second (so about every 16ms), and they’re the function(s) that take care of rendering the next increment, the next image, of the game. For this, they need to gather data, such as “has the player pushed any buttons in the last 16ms” and “Where did the enemies move to in the last 16ms”. They take this data and update the game state. From this game state, the game can then render a new image to display on screen.
This means that your game is always visible and always has the latest (or at least very recent) input for a current frame, but is not dependent on that input.
Another way to describe it, is that games are programmed in a way that decouples the various inputs from the eventual output.
On to chatbots
Because most chatbots nowadays are built by web developers, you see web development paradigms being applied to them: messages behave like a stack of web-pages with no state between them. This really shows whenever users want to carry on with information from a previous message and the chatbot no longer has the context. It leads to a very disappointing experience. This is not the fault of web developers, or a knock on people developing chatbots. The API’s of different chat systems and languages we use all encourage us to think in the same way we think about websites or apps: call-response systems.
The game loop in games solves a very simple problem: it decouples the player state (input) from the rendering of game (output). Chatbots have this same problem: Whilst sending and receiving messages is the primary interaction with your user, it’s not the natural place to keep the context of your chat.
Enter player state
Most games create a “game state” at the start. This game state contains all the information a game will need to determine what to render to the screen. Things like the player level, their health, how many coins they have or what items they have with them, but also what game level they’re currently playing, and which baddies are on the screen right now.
With this information, the game can determine every 16ms if it needs to change something: The player buys a new sword, so we deduct a few coins. The player wins a battle so we award them with a new level, The player loses the last of their health, so we render a ‘game over’ screen instead. These changes are all done in the game loop, or specifically in a “tick”.
Your chatbot would benefit from such a game state too. The full state of your conversation should not depend on the message you just received. Instead, Keep track of your conversation state, and update it with each new message you get.
Let me give you a simple example with a weather bot that gives you the current weather or the temperature in a given location:
- Player: Get started
- [Bot’s introduction message gets triggered]
- Bot: Hi, how can I help you? You can ask me questions like “What’s the weather like in San Francisco” or “How many degrees is it in New York?”
- Player: What’s the weather in Amsterdam?
- [Bot does some NLP and ends up with this conversation state: {type:weather, location: Amsterdam, Netherlands}]
- Bot: It’s cloudy today with a 10% change of rain.
- Player: And the temperature?
- [Bot again does some NLP and this time ends up with: {type: temperature, location: unknown} but is smart and attempts to compensate:]
- Bot: Where do you want to know the temperature?
- [Player is now annoyed. They just typed Amsterdam!]
Now here’s how decoupling the player state from the response plays out:
- Player: Get started
- [Bot starts with a created player state of, lets just say: {type: unknown, location:unknown}, and sends the introduction message.]
- Bot: Hi, how can I help you? You can ask me questions like “What’s the weather like in San Francisco” or “How many degrees is it in New York?”
- Player: What’s the weather in Amsterdam?
- [Bot does some NLP, updates the player state to {type:weather, location: Amsterdam, Netherlands} and formulates a response based on this state.]
- Bot: It’s cloudy today with a 10% change of rain.
- Player: And the temperature?
- [Bot does some NLP and only determines that the player state should be updated by switching to temperature. The state is now {type:temperature, location: Amsterdam, Netherlands}]
- Bot: It’s 12 degrees Celcius in Amsterdam, Netherlands.
- [Player is ecstatic!]
The above might be a little contrived, but many chatbots operate on the first model and it leads to sub-par experiences for people that expect a conversation. Chatbot users can see the correct context right above the chatbots’ ignorant response!
I’m convinced! How do I get a game loop?
Games have to draw something on the screen every 16ms. Chatbots don’t. For the most part, only have to reply when a person sends a message first.
If we really run the game loop every 16ms for all our chatbot users, we’d end up with a lot of wasted time (because you’re not going to send a message for the vast majority of those ticks). So let’s cheat a little and only run your tick function whenever we do get a message sent to us. (Don’t tell your gamedev friends, or they might get jealous.)
When you receive a message, you have to do two things:
- Update your user state
- Render a message out of the user state and send it
The immediate effect is that, apart from parsing the message you just received and keeping some trivial user info accessible, you should have another part of your code that, for each user, keeps their conversation state. This conversation state is updated after you parsed the message, and the message you send back is based on that conversation state.
Your conversation state can be as simple as the weather example above, that just takes permutations of “most recently asked type of info” and “most recently asked location” and deals with that, but of course it can be much more complex as well. My Messenger Adventure chatbot for example, keeps track of where in the game world you are, what places you have already visited and which items you have with you.
How do I learn to build a game loop?
If you want to experiment with how to program a game loop before committing to one for your chatbot, I highly recommend checking out Phaser.io, a HTML5 gaming engine that’s very friendly and easy to get started with. It will introduce you to the concepts I describe in this article, and you will immediately see the results in game form, which is very satisfying.
In closing
Game state is easiest to keep track of in a game loop, a concept that games employ to keep their input and output decoupled. You can not depend on each single input of your chatbot to be full-context, so you similarly need to separate the input and output of your chatbot for the best results. To do this, adapt the concepts of game state and game loop to your chatbot.
Your chatbot should have a game loop and a conversation state.