Neural Network Chatbot using TensorFlow (Keras) and NLTK

We humans are social beings. We socialise and rely on one another, but sometimes that just isn't enough. We often feel isolated, and many times we simply want someone to acknowledge us. A neural network chatbot can do exactly that.

Using neural networks, chatbots can communicate with us much like we communicate with each other (sometimes), and, through Natural Language Understanding, understand us in ways other people don't. Whether the chatbot is a waifu bot or just a generic weather bot, with such unlimited power (not really), anything's possible.

But what exactly is a chatbot?

What exactly is a Chatbot?

Bots are the future of software.

– Amir Shevat, Designing Bots

A chatbot is a piece of software that can converse with you using natural language, usually through text-to-text services like Discord, Messenger, and Slack.

Funny enough, most chatbots are actually just a bunch of "if" statements: they take input from the user and reply with something generic. It could look something like this:

def ask_the_best_bot(user_input):
    if 'Hi' in user_input:
        return "Don't talk to me."
    return "What?"

ask_the_best_bot("Hey! How's it going?") # 'What?'

And that could be a 100% legit way to create your chatbot, if that's all you need!

We don't want to over-engineer something sophisticated when the solution could be straightforward. For example, a simple chatbot could be one that tells you the weather every morning, one that reminds you to finish your tasks, or, of course, one that just chats with you.

While most chatbots differ from one another in terms of functionality, there's one thing that's always constant: they love to chat! Hence the chat in chatbot.

In this tutorial, we'll create a simple chatbot friend that only chats: a simple task that we can use Natural Language Processing and neural networks for.

Preparing our data for our chatbot

Before we start building our chatbot, we first have to prepare its vocabulary: basically, a training dataset that will be used to train our neural network.


  1. We have to describe what kinds of user interactions we want to cater for, i.e. the user's intentions (e.g. greeting, saying goodbye, asking questions).
  2. Then, we come up with a bunch of sample queries for those intentions, the kinds of things you'd expect users to say (e.g. "How are you?", "Hello", and "Good morning" could all express an intent to greet the bot).
  3. Finally, we come up with a bunch of responses to each of these intents (e.g. "Good day mate, how are you?", "Heyo!", "Hi domo").

Once we've thought of our users' intents, their queries, and the responses we'll give, we can then create a simple JSON file to store all this information, just like below:

{
    "intents": [
        {
            "intent": "greeting",
            "queries": [
                "How are you?",
                "Hi there",
                "Good day mate",
                "Good afternoon"
            ],
            "responses": [
                "Hi, good to see you!",
                "How can I help mate?",
                "Heyo, what can I do to help?"
            ]
        },
        {
            "intent": "farewell",
            "queries": [
                "I'm gonna go",
                "See you later",
                "I'll see you next time",
                "See ya",
                "Okay, bye"
            ],
            "responses": [
                "Cools! I'll see you when I see you.",
                "See ya mate.",
                "Okie dokes, I'll see you around."
            ]
        },
        {
            "intent": "help",
            "queries": [
                "Could you help me?",
                "What can you do?",
                "Help! I need help!",
                "What can you help me with?"
            ],
            "responses": [
                "Nah, mate. Can't help ya with anything.",
                "Mate, my creator has given me nothing to help you with..."
            ]
        },
        {
            "intent": "default",
            "queries": [],
            "responses": [
                "Mate, do you speak English?",
                "Mate, if I can't help ya, who can?"
            ]
        }
    ]
}
In the above JSON file, you can see that we're catering for users greeting the bot, saying their farewells, and asking it for help. We also described a "default" intent; this will be our catch-all in case our bot doesn't understand the user's intention.

Note. The reason we use the above format for our data is largely personal preference. If you have your own style of formatting your data, you should definitely use it (or the format your company is forcing you to use 😢).

Reading and cleaning our chatbot data using Pandas and NLTK

Once we've got our training data, we can start importing our modules: Pandas, to easily read and manipulate data; NLTK, to tokenize our words and stem them; and TensorFlow & Keras, to build our neural network chatbot.

import nltk
from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import TweetTokenizer

stemmer = SnowballStemmer("english")
tt = TweetTokenizer()

# TensorFlow requirements
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout

import random
import json
import pandas as pd
import string

Once we’ve imported our essential modules, we can start reading the data from our file easily using Pandas:

with open('../data/intents.json') as json_file:
    skills_df = pd.read_json(json_file)
    skills_df = pd.DataFrame(skills_df.intents.values.tolist())

Our intents, queries, and responses data

Great! We’ve imported our data in successfully!

Though, you might have noticed that our queries and responses are embedded in arrays. This isn't ideal, as it'll make the data hard to access later on. So, we can expand/explode the columns so that each query gets its own row. The same can then be done for the responses column.
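Before applying it to our real data, here's what explode() does to a tiny, made-up frame with the same shape (the two intents below are hypothetical, just for illustration):

```python
import pandas as pd

# A tiny, made-up frame mirroring the shape of skills_df.
df = pd.DataFrame({
    "intent": ["greeting", "farewell"],
    "queries": [["Hi there", "Good day"], ["See ya"]],
})

# explode() gives each list element its own row, repeating the intent.
exploded = df.explode("queries").reset_index(drop=True)
print(exploded["queries"].tolist())  # ['Hi there', 'Good day', 'See ya']
print(exploded["intent"].tolist())   # ['greeting', 'greeting', 'farewell']
```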

queries = skills_df.explode('queries').reset_index().drop(['responses', 'index'], axis = 1)

Intents and their queries

Now that looks a lot better!

Every query is now separated into its own row with its respective intent. We dropped the responses column because we don't need it to train our neural network chatbot.

Instead, we can do the same expansion for responses and keep the result in another variable until we actually need to use it.

responses = skills_df.explode('responses').reset_index().drop(['queries', 'index'], axis = 1)

Tokenization and Stemming

Now, we want to tokenize our queries, the process of chopping each sentence into words; and then stem the words/tokens, reducing each word to its stem by cutting off prefixes, suffixes, and anything else that doesn't add value to the word.

That being said, we also want the casing to be consistent between words. This is to ensure that our chatbot doesn't treat “Hi” and “hi” differently just because of the casing. Besides that, we also want to ensure that only alphabetical characters come through, so that characters such as semicolons and quotation marks don't pollute the chatbot.

We'll do all of the above by first creating a helper function to do the tokenization and stemming, then using Pandas' apply() function to apply it to every single query in our data.

def clean_query(query):
    # Tokenize the sentence into words (requires the 'punkt' tokenizer data;
    # run nltk.download('punkt') once if it's missing).
    words = nltk.word_tokenize(query)
    # Keep only alphabetical tokens, lowercase them, then stem them.
    words_token = [stemmer.stem(word.lower()) for word in words if word.isalpha()]
    return words_token

queries['queries'] = queries['queries'].astype('str')
queries['queries_token'] = queries['queries'].apply(clean_query)

Queries, tokenized, and stemmed

As you can see in our data, all our words are now lowercased, split from one another into an array, and some are even stemmed (e.g. “Goodbye” to “goodby”).
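If you want to poke at the stemmer on its own, it works without any extra corpora downloads; for example:

```python
from nltk.stem.snowball import SnowballStemmer

stemmer = SnowballStemmer("english")

# Stems aren't always real words; they just need to be consistent.
print(stemmer.stem("goodbye"))  # 'goodby'
print(stemmer.stem("running"))  # 'run'
print(stemmer.stem("helped"))   # 'help'
```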

Our next step is to vectorise the tokens.

Vocabulary, Intent Classes, and Bag of Words

Because neural networks can’t work with raw text directly, we have to turn our queries into a vector of numbers. This applies to our intent classes as well.

So, we have to create arrays of our vocabulary and intent classes, and then write helper functions to turn our data into vectors. We can first build our vocabulary and intent classes as below:

def create_vocab(queries):
    vocab = []
    for query in queries:
        vocab.extend(query)  # Collect every token from every query
    return list(set(vocab))

intents = list(skills_df['intent'])
vocab = create_vocab(queries['queries_token'])

Once we have those ready, we’re ready to vectorise our training data:

def bag_words(query):
    bow = [0] * len(vocab)
    for word in query:
        if word in vocab:
            bow[vocab.index(word)] += 1
    return bow

def intent_no(intent):
    intent_arr = [0] * len(intents)
    intent_arr[intents.index(intent)] = 1
    return intent_arr

queries['bow'] = queries['queries_token'].apply(bag_words)
queries['class'] = queries['intent'].apply(intent_no)
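To make the counting concrete, here's the same pair of helpers run against a tiny, made-up vocabulary and intent list (not the ones built from our data above):

```python
# Hypothetical vocabulary and intent classes, fixed so the output is predictable.
vocab = ["hi", "help", "mate", "you"]
intents = ["greeting", "farewell", "help", "default"]

def bag_words(query):
    bow = [0] * len(vocab)
    for word in query:
        if word in vocab:
            bow[vocab.index(word)] += 1
    return bow

def intent_no(intent):
    intent_arr = [0] * len(intents)
    intent_arr[intents.index(intent)] = 1
    return intent_arr

# 'please' isn't in the vocabulary, so it's silently ignored;
# 'you' appears twice, so its slot counts to 2.
print(bag_words(["help", "you", "you", "please"]))  # [0, 1, 0, 2]
print(intent_no("help"))                            # [0, 0, 1, 0]
```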

Vectorised data

As we can see from the result, we've managed to vectorise our data, and because it's now in a vectorised format, we can easily feed it to our neural network.

Creating our Neural Network Chatbot Model

Now that we've completed the hard part, all that's left is feeding our data to our neural network to create our chatbot. For simplicity's sake, we're going to create a simple linear stack (Sequential) neural network model.

We'll start off by:

  1. Creating our input layer, which uses our bag of words' shape as the input and ReLU as its activation function.
  2. Adding a dropout layer with rate 0.25, which helps prevent overfitting.
  3. Creating a hidden layer, again with ReLU as the activation function.
  4. Adding another dropout layer, with rate 0.15.
  5. Finishing off with the output layer, which outputs one of the four intent classes (one-hot encoded) using Softmax as its activation function.
  6. Finally, compiling the neural network with Categorical Cross-Entropy as our loss function and RMSProp as our optimiser. The accuracy metric is just there to evaluate the performance of our model.

model = Sequential()
model.add(Dense(256, input_shape=(len(queries['bow'][0]),), activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.15))
model.add(Dense(len(queries['class'][0]), activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
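As a quick aside on that last layer: softmax turns the raw output scores into probabilities that sum to 1, so the network's output can be read as its confidence per intent class. A minimal sketch of the idea:

```python
import numpy as np

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1, -1.0]))
print(probs.sum())     # 1.0 (up to floating-point error)
print(probs.argmax())  # 0, the highest raw score wins
```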

Once we’ve compiled our model, it’s time to start training it with the data we’ve created.

In our training, we'll set our epochs to 200, which means our training dataset is going to pass through the neural network 200 times. We'll also set our batch size to 5, which sets how many training examples go through the neural network at a time.

model.fit(np.array(queries['bow'].tolist()), np.array(queries['class'].tolist()), epochs=200, batch_size=5, verbose=1)

Training our neural network chatbot

The training might take a while depending on the size of your training dataset and whether your machine is a potato or not.

If you're having trouble with how "slow" your model is training, I would recommend creating a smaller neural network, reducing the size of your training dataset, or possibly buying a better PC or using a cloud solution.

Prediction and chatting with our Neural Network Chatbot

Once the training of our model is completed, we’ll create additional helper functions to enable our users to communicate with our chatbot model.

Our model outputs a vectorised confidence/probability array over our intent classes, like below:

query = "Could you help me with something?"
query = clean_query(query)
query_bag = bag_words(query)
predictions = model.predict(np.array([query_bag]))[0]
# [1.6060410e-11 1.2441898e-12 1.0000000e+00 7.1343530e-12]

Because of this, we'll need a helper function to classify the user's query. We want to first get the intent class with the highest confidence, then validate that its confidence is higher than our threshold, which we'll set to 0.6.

Similarly to how we cleaned and processed our training data, queries coming from our users need to be tokenized, stemmed, and turned into a bag of words. And since we've already created helper functions for that, we can just pass the queries through them.

def classify_query(query):
    confidence = 0.6
    query = clean_query(query)
    query_bag = bag_words(query)
    predictions = model.predict(np.array([query_bag]))[0]
    chosen_class = -1  # Defaults to the last intent, our "default" catch-all
    best_guess = np.argmax(predictions)
    if predictions[best_guess] > confidence:
        chosen_class = best_guess

    return intents[chosen_class]

classify_query("Could you help me with something?")
# 'help'

Now that we get an intent class back from our helper function, we can create one last helper function that chooses a random response from the predicted intent class.

In our case, we'll use Pandas' query() function to get the responses allocated to that intent class, then randomly sample one of them.

def ask_bot(query):
    intent = classify_query(query)
    logic = 'intent == "' + intent + '"'
    return responses.query(logic).sample(n=1)

answer = ask_bot("Could you help me with something?")

Our neural network chatbot's cold, cold response

And we've completed our chatbot! You might notice that the response includes the intent class as well. You can omit this by returning just answer['responses'], but I'd like to keep it for debugging purposes, so that I'm aware of which intent the chatbot is perceiving.


In summary, we managed to create a training dataset to suit our chatbot's needs. We cleaned the data by tokenizing and stemming it with NLTK, then prepared it by building a vocabulary and vectorising it. Lastly, we created a neural network model to predict the user's intent, and wrote helper functions to allow the user to easily chat with our chatbot model.

You might have noticed that creating the neural network was the easy part. Preparing the data was, in fact, more time-consuming and complex. And that will almost always be the case, no matter the data quality, as I've seen many a time in my current job.

Also, even though we’re creating a chatbot, it does seem like we’re creating a classifier!

While this isn't completely wrong, as we're essentially using a neural network to classify users' queries into different intent classes, that doesn't mean chatbots are solely classifiers. If we implement additional features that don't involve machine learning, the chatbot can definitely be transformed into something more.

Further Reading