How to Build a Customer Service Chatbot with Python, Flask, and Pinecone | HackerNoon (2022)

Similarity search is an area of machine learning that deals with finding items closely related to a given input. It’s incredibly useful for things like product, music, or movie recommendations. You watched The Office on Netflix, so here are some other shows you may like. You frequently listen to Bayside on Spotify, so go check out these other pop-punk bands.

Similarity search can also be used to automate customer support. What if, when a customer asks a question, you could easily find similar, previously asked questions and answers that could help them?

In this article, we’ll build a Python Flask app that uses Pinecone — a managed similarity search service — to do just that.

Motivation and Real-World Application

Before we jump into the demo app, let’s take a minute to examine the problem we’re trying to solve. Imagine you’re an executive at a large company with thousands or even millions of customers. Your customer support team is repeatedly asked the same questions day after day. To save time and money, you could streamline your support process by having good public-facing documentation and FAQ pages. But how can you ensure that customers find the information they need? After all, creating the documentation is only half the battle.

One approach that many companies take is to use a customer service chatbot. When a customer first initiates a conversation, they’re chatting with a robot. The customer enters their question and the bot tries to help solve their problem. If the bot can respond with accurate, related questions and answers, then the customer may be able to solve their problem on their own. And if that doesn’t work, then the customer can request to speak with an actual human being who can help. Artificial intelligence and machine learning can’t solve all of our problems — at least not yet.

Demo App Overview

Let’s now take a look at our demo app. Below you can see a brief animation of how the app works. The user enters a question and submits the form, and then related questions appear in hopes of answering the user’s original question.

Pretty neat, right? So how does this all work?

In building the app, we first found a dataset of questions from Quora. This dataset contains hundreds of thousands of questions, but we’re just using the first 50,000. We then took those questions and ran them through an embedding model to create what are called vector embeddings. A vector embedding is a list of numbers that represents an input, such as a question, so that inputs with similar meanings end up with numerically similar vectors. This is what lets machine learning algorithms measure how closely two inputs are related. We used the Average Word Embeddings Model. We then inserted these vector embeddings into an index managed by Pinecone.
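To make the idea of comparing embeddings concrete, here is a minimal sketch of cosine similarity, the metric the demo's index is configured with later in the article. The three-dimensional vectors and their names are invented for illustration; they are not output from the real embedding model, which produces much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: real embeddings have hundreds of dimensions).
python_question = [0.9, 0.1, 0.0]
coding_question = [0.8, 0.2, 0.1]
cooking_question = [0.0, 0.2, 0.9]

print(cosine_similarity(python_question, coding_question))   # close to 1: similar
print(cosine_similarity(python_question, cooking_question))  # close to 0: unrelated
```

A score near 1 means the two vectors point in nearly the same direction, which is how "similar questions" are ranked.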

Now, when the user submits their question, a request is made to an API endpoint that uses Pinecone’s SDK to query the index of vector embeddings. The endpoint returns five similar questions, and those results are displayed to the user in the app’s UI.

In other words, Pinecone — as a managed similarity search solution — provides the engine for returning recommendations. You just bring your vector embeddings, which are generated by running data through an embedding model.

If you’d like to try it out for yourself, you can find the code for this app on GitHub. The README contains instructions for how to run the app locally on your own machine.


Demo App Code Walkthrough

Now that we understand the motivation behind the project and have a high-level overview of how the app works, let’s dig into the actual code to see what’s going on under the hood. To keep things simple, all of the backend code is found in a single file, which we’ve reproduced in full below:

from dotenv import load_dotenv
from flask import Flask
from flask import render_template
from flask import request
from flask import url_for
import json
import os
import pandas as pd
import pinecone
import requests
from sentence_transformers import SentenceTransformer

app = Flask(__name__)

pinecone_index_name = "question-answering-chatbot"
DATA_DIR = "tmp"
DATA_FILE = f"{DATA_DIR}/quora_duplicate_questions.tsv"
DATA_URL = ""

def initialize_pinecone():
    load_dotenv()
    PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
    pinecone.init(api_key=PINECONE_API_KEY)

def delete_existing_pinecone_index():
    if pinecone_index_name in pinecone.list_indexes():
        pinecone.delete_index(pinecone_index_name)

def create_pinecone_index():
    pinecone.create_index(name=pinecone_index_name, metric="cosine", shards=1)
    pinecone_index = pinecone.Index(name=pinecone_index_name)

    return pinecone_index

def download_data():
    os.makedirs(DATA_DIR, exist_ok=True)

    if not os.path.exists(DATA_FILE):
        r = requests.get(DATA_URL)
        with open(DATA_FILE, "wb") as f:
            f.write(r.content)

def read_tsv_file():
    df = pd.read_csv(
        f"{DATA_FILE}", sep="\t", usecols=["qid1", "question1"], index_col=False
    )
    df = df.sample(frac=1).reset_index(drop=True)
    df.drop_duplicates(inplace=True)

    return df

def create_and_apply_model():
    model = SentenceTransformer("average_word_embeddings_glove.6B.300d")
    df["question_vector"] = df.question1.apply(lambda x: model.encode(str(x)))
    pinecone_index.upsert(items=zip(df.qid1, df.question_vector))

    return model

def query_pinecone(search_term):
    query_question = str(search_term)
    query_vectors = [model.encode(query_question)]

    query_results = pinecone_index.query(queries=query_vectors, top_k=5)
    res = query_results[0]

    results_list = []

    for idx, _id in enumerate(res.ids):
        results_list.append({
            "id": _id,
            "question": df[df.qid1 == int(_id)].question1.values[0],
            "score": res.scores[idx],
        })

    return json.dumps(results_list)

initialize_pinecone()
delete_existing_pinecone_index()
pinecone_index = create_pinecone_index()
download_data()
df = read_tsv_file()
model = create_and_apply_model()

@app.route("/")
def index():
    return render_template("index.html")

@app.route("/api/search", methods=["POST", "GET"])
def search():
    if request.method == "POST":
        return query_pinecone(request.form.get("question", ""))
    if request.method == "GET":
        return query_pinecone(request.args.get("question", ""))
    return "Only GET and POST methods are allowed for this endpoint"

Let’s break down what’s happening here, method by method, line by line.

On lines 1–11, we import our app’s dependencies. Our app relies on the following:

  • dotenv for reading environment variables from the .env file
  • flask for the web application setup
  • json for working with JSON
  • os for reading environment variables and working with file paths
  • pandas for working with the dataset
  • pinecone for working with the Pinecone SDK
  • requests for making API requests to download our dataset
  • sentence_transformers for our embedding model

On line 13, we provide some boilerplate code to tell Flask the name of our app.

On lines 15–18, we define some constants that will be used in the app. These include the name of our Pinecone index, the directory in which we’ll store our question data, the file name of the dataset, and the URL from which we’ll download the dataset.

On lines 20–23, our initialize_pinecone method gets our API key from the .env file and uses it to initialize Pinecone.
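If the key is missing, os.environ["PINECONE_API_KEY"] fails with a bare KeyError. As an optional hardening, the lookup could be wrapped in a small helper that raises a clearer message. This is only a sketch: get_required_env and the fake_environ mapping are hypothetical names introduced here, not part of the demo app or the Pinecone SDK.

```python
import os


def get_required_env(name, environ=None):
    """Return a required environment variable or raise a descriptive error."""
    # Allow a stand-in mapping to be passed so the helper is easy to test.
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell"
        )
    return value


# Usage with a stand-in mapping instead of the real environment:
fake_environ = {"PINECONE_API_KEY": "example-key"}
print(get_required_env("PINECONE_API_KEY", fake_environ))  # prints "example-key"
```

In the app, the helper would be called after load_dotenv() so that values from the .env file are already present in the environment.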

On lines 25–27, our delete_existing_pinecone_index method searches our Pinecone instance for indexes with the same name as the one we’re using (“question-answering-chatbot”). If an existing index is found, we delete it.

On lines 29–33, our create_pinecone_index method creates a new index using the name we chose (“question-answering-chatbot”), the “cosine” proximity metric, and only one shard.


On lines 35–41, our download_data method downloads the dataset of Quora question pairs if needed. If the file already exists in the tmp directory, then we just use that file.
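The download-once pattern described above can be sketched in isolation. Here ensure_file and fake_fetch are hypothetical names, and the fetch step is passed in as a function so the example runs without network access. In the real app that step is a requests.get call, where checking the response with raise_for_status() before writing would avoid caching an error page.

```python
import os
import tempfile


def ensure_file(path, fetch):
    """Create `path` from fetch() unless it already exists.

    `fetch` is any zero-argument callable that returns bytes. Returns True
    when the file was created and False when the cached copy was reused.
    """
    if os.path.exists(path):
        return False
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "wb") as f:
        f.write(fetch())
    return True


# A fake fetcher that records how many times it was called.
calls = []


def fake_fetch():
    calls.append(1)
    return b"qid1\tquestion1\n"


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "tmp", "questions.tsv")
    first = ensure_file(target, fake_fetch)   # file is missing, so it is fetched
    second = ensure_file(target, fake_fetch)  # file exists, so it is reused

print(first, second, len(calls))  # prints "True False 1"
```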

On lines 43–50, our read_tsv_file method uses the pandas library to read the qid1 and question1 columns of the TSV file into a data frame. We also shuffle the rows and remove any duplicate rows found in the dataset.
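For readers who want to see the column selection and de-duplication without pandas, here is a rough standard-library equivalent. It is a stand-in for illustration, not the app's code. The sample rows are invented, and every column other than qid1 and question1 is an assumption about the file's layout. Unlike the app, this sketch does not shuffle the rows.

```python
import csv
import io

# Hypothetical sample; only qid1 and question1 are confirmed by the app's code.
SAMPLE_TSV = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t1\t3\tHow do I learn Python?\tHow can I get started with Python?\t1\n"
    "2\t5\t6\tWhat is Flask?\tWhat is Django?\t0\n"
)


def read_questions(tsv_text):
    """Return unique (qid1, question1) pairs in first-seen order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = set()
    rows = []
    for record in reader:
        pair = (int(record["qid1"]), record["question1"])
        if pair not in seen:  # drop exact duplicates, like drop_duplicates()
            seen.add(pair)
            rows.append(pair)
    return rows


print(read_questions(SAMPLE_TSV))
```

The same question can appear as question1 in several rows of a pairs dataset, which is why the duplicate removal matters before anything is inserted into the index.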

On lines 52–57, our create_and_apply_model method uses the sentence_transformers library to work with the Average Word Embeddings Model. We then create a vector embedding for each question by encoding it using our model. The vector embeddings are then inserted into the Pinecone index.
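As the name suggests, an average word embedding represents a sentence by averaging the vectors of its words. The sketch below shows that idea with a made-up two-dimensional vocabulary. The WORD_VECTORS table, the whitespace tokenization, and the choice to skip unknown words are all simplifying assumptions, and the real model's tokenizer and 300-dimensional GloVe vectors differ.

```python
# Hypothetical 2-dimensional word vectors (assumption: real GloVe vectors are far longer).
WORD_VECTORS = {
    "learn": [0.2, 0.8],
    "python": [0.9, 0.1],
    "cook": [0.1, 0.3],
}


def average_word_embedding(sentence, word_vectors, dimensions=2):
    """Average the vectors of known words; unknown words are skipped."""
    known = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not known:
        return [0.0] * dimensions  # no known words: fall back to a zero vector
    # zip(*known) groups the first components together, then the second, and so on.
    return [sum(component) / len(known) for component in zip(*known)]


print(average_word_embedding("Learn Python", WORD_VECTORS))
```

Because word order is discarded, this kind of model is fast but coarse: "dog bites man" and "man bites dog" receive the same embedding.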

Each of the methods we’ve described so far is called on lines 77–82 when the backend app is started. This work prepares us for the final step of actually querying the Pinecone index based on user input.

On lines 84–94, we define two routes for our app: one for the home page and one for the API endpoint. The home page serves up the index.html template file along with the JS and CSS assets, and the API endpoint provides the search functionality for querying the Pinecone index.

Finally, on lines 59–75, our query_pinecone method takes the user’s input, converts it into a vector embedding, and then queries the Pinecone index to find similar questions. This method is called when the /api/search endpoint is hit, which occurs any time the user submits a new search query.
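To illustrate what a top-k query returns, here is a brute-force, in-memory stand-in that scores every stored vector against the query and serializes the best matches in the same shape as results_list. It is a sketch of the concept only and not how Pinecone works internally. The query_index function, the toy index, and its vectors are hypothetical.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def query_index(index, query_vector, top_k=5):
    """Score every stored vector against the query and keep the best top_k."""
    scored = [
        {
            "id": item_id,
            "question": question,
            "score": cosine_similarity(query_vector, vector),
        }
        for item_id, (question, vector) in index.items()
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]


# Hypothetical index: id -> (question text, toy embedding).
index = {
    "1": ("How do I learn Python?", [0.9, 0.1]),
    "2": ("What is the best Python tutorial?", [0.8, 0.3]),
    "3": ("How do I bake bread?", [0.1, 0.9]),
}

results = query_index(index, [1.0, 0.0], top_k=2)
print(json.dumps(results, indent=2))
```

Scanning every vector is fine for three items but grows linearly with the dataset, which is the cost a dedicated similarity search index is meant to avoid.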

For the visual learners out there, here’s a diagram outlining how the app works:


Example Scenario

So, putting this all together, what does the user experience look like?

A user could visit our site, enter the question “How to learn Python”, find similar questions that have been asked in the past, and then click on the links to see the questions and answers on Quora.

Following along with our customer service scenario, a user might ask a question about how to use our company’s product, find similar questions, click on a link, and be directed to a helpful support page that answers their question, all without interacting with a support representative.


We’ve now created a simple Python app to solve a real-world problem. To make this app even better, we could add new questions and answers to our index every time a question is asked. We could also collect customer feedback on whether the returned results were relevant and use it to fine-tune the model. After all, feedback is what helps the model get better at providing useful results.


The moral of the story should be clear: Similarity search helps provide better results to your customers. And as a managed service, Pinecone makes it easy to take vector-based recommendation systems to production.








Article information

Author: Gov. Deandrea McKenzie

Last Updated: 11/25/2022
