Build a Chatbot with OpenAI API & Python

DodaTech 12 min read

Build a conversational AI chatbot with Python and the OpenAI API. You will learn streaming responses, conversation memory, system prompts, and deployment.

What You'll Build

You are going to build a chatbot that talks to the OpenAI API. It will remember past messages, respond in a custom personality you design, and run on a web interface built with Streamlit. By the end you will have a working AI assistant you can deploy online. This is the same pattern used in DodaTech's internal support assistant and many customer-facing AI chat products.

Why This Matters

Chatbots powered by large language models are transforming customer support, education, and content creation. Companies like Intercom, Shopify, and Duolingo use a similar architecture to power their AI features. Learning to integrate the OpenAI API gives you a skill you can apply to automation, SaaS products, and internal tools.

Real-World Use

Support teams at DodaTech use this exact architecture to power the help assistant inside Doda Browser and DodaZIP. A user asks a question, the frontend sends it to a Python backend, the backend calls OpenAI, and the response comes back through the same chain. You will build the same pipeline.

Architecture

sequenceDiagram
    participant User
    participant UI as Streamlit Frontend
    participant App as Python Backend
    participant API as OpenAI API

    User->>UI: Types a message
    UI->>App: Sends request
    App->>API: Chat completion call
    API-->>App: Streams response tokens
    App-->>UI: Returns response text
    UI-->>User: Displays reply
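
The backend step in this diagram can be sketched without any network access. This is a minimal illustration, not DodaTech's actual code: the names handle_message and fake_complete are hypothetical, and the complete argument stands in for whatever function wraps the OpenAI call in the later steps.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def handle_message(
    history: List[Message],
    user_text: str,
    complete: Callable[[List[Message]], str],
) -> str:
    """Backend step: record the user turn, call the model, record the reply."""
    history.append({"role": "user", "content": user_text})
    reply = complete(history)  # in the real app this wraps the OpenAI call
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_complete(messages: List[Message]) -> str:
    """Hypothetical stand-in for the API call so the flow runs offline."""
    return f"You said: {messages[-1]['content']}"


history: List[Message] = []
print(handle_message(history, "Hello", fake_complete))  # You said: Hello
print(len(history))  # 2
```

Passing the completion function in as an argument keeps the pipeline testable: the frontend and the history handling can be exercised before an API key exists.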

Prerequisites

Python 3.9+ installed on your machine
An API key from OpenAI (sign up at platform.openai.com)
Basic familiarity with the terminal
Git for version control (optional but recommended)

Step 1: Project Setup

Create a project directory and a virtual environment. A virtual environment keeps your dependencies isolated so different projects do not interfere with each other.

mkdir ai-chatbot
cd ai-chatbot
python -m venv venv
source venv/bin/activate
pip install openai streamlit python-dotenv

Expected output:

Successfully installed openai-1.x.x streamlit-1.x.x python-dotenv-1.x.x ...

The openai package provides the OpenAI API client. streamlit is the web framework for your chat interface. python-dotenv loads environment variables from a .env file.

Step 2: API Key Setup

Create a .env file in your project root. This file stores your API key outside your code so you never commit it to version control.

OPENAI_API_KEY=sk-your-key-here

Replace sk-your-key-here with your actual OpenAI API key. Then create a file called config.py to load it:

# config.py
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY not found in .env file")

print("API key loaded successfully")

Expected output:

API key loaded successfully

If you see the error message, check that your .env file exists in the right directory and contains the correct key name.

Why do we load the key from a file instead of hardcoding it? Hardcoding exposes your key in every copy of your code. If you push to Git by accident, anyone can use your key and you get billed. A .env file stays local and you add it to .gitignore.
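
One way to guard against the accidental push described above is to check that .gitignore lists .env before committing. The helper below is a rough sketch under a simplifying assumption: it only recognizes a line that is exactly ".env" and does not understand wildcard patterns such as "*.env". The function name env_is_ignored is hypothetical.

```python
from pathlib import Path


def env_is_ignored(project_dir: str) -> bool:
    """Return True if .gitignore in project_dir has a line that is exactly '.env'."""
    gitignore = Path(project_dir) / ".gitignore"
    if not gitignore.exists():
        return False
    lines = [line.strip() for line in gitignore.read_text().splitlines()]
    return ".env" in lines


if not env_is_ignored("."):
    print("Warning: add a line containing .env to .gitignore before committing")
```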

Step 3: Basic Chat Loop (CLI)

Now build a command-line chatbot that takes your input, sends it to OpenAI, and prints the response.

# cli_chat.py
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

print("Chatbot ready. Type 'quit' to exit.")
while True:
    user_input = input("\nYou: ")
    if user_input.lower() == "quit":
        break

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": user_input}]
    )

    print(f"Bot: {response.choices[0].message.content}")

Run it:

python cli_chat.py

Expected conversation:

Chatbot ready. Type 'quit' to exit.

You: What is the capital of France?
Bot: The capital of France is Paris.

You: Tell me a fun fact about space.
Bot: A day on Venus is longer than a year on Venus — it takes 243 Earth days to rotate but only 225 to orbit the Sun.

You: quit

This works, but there is a problem. Each message is sent in isolation. The model does not know what you said before. If you ask "What is its atmosphere like?" right after the space fact, it will not know what "its" refers to because the history is not passed along. The next step fixes that.

Step 4: Conversation Memory

To make the chatbot remember the conversation, keep a list of all messages and send the full list with every request. The OpenAI chat completion API accepts a messages array where each entry has a role -- either "user", "assistant", or "system".

# memory_chat.py
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

messages = []

print("Memory chatbot ready. Type 'quit' to exit.")
while True:
    user_input = input("\nYou: ")
    if user_input.lower() == "quit":
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    bot_reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": bot_reply})

    print(f"Bot: {bot_reply}")

Expected conversation:

Memory chatbot ready. Type 'quit' to exit.

You: My name is Alex.
Bot: Nice to meet you, Alex! How can I help you today?

You: What is my name?
Bot: Your name is Alex — you told me a moment ago.

The bot remembers your name because the full conversation history is sent in each API call. This is the foundation of any production chatbot. Without it, every interaction starts from scratch.

A note on token limits: Every model has a maximum context window. For the original gpt-3.5-turbo that is 4,096 tokens (roughly 3,000 words); check the model documentation for the limit of the version you use. Once your conversation exceeds the limit, you need to trim older messages. A common strategy is to keep only the last N exchanges or to summarize the conversation periodically.
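
The "last N exchanges" strategy can be sketched as a small pure function. This is one possible shape, not the only one: the name trim_history is hypothetical, and it assumes an exchange is one user message followed by one assistant message. A user message that has not been answered yet is always kept.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_exchanges: int) -> List[Message]:
    """Keep system messages, the last max_exchanges pairs, and any pending user turn."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    # A trailing user message has no reply yet; keep it outside the pair count.
    pending = turns[-1:] if turns and turns[-1]["role"] == "user" else []
    complete = turns[: len(turns) - len(pending)]
    keep = max_exchanges * 2  # one exchange = one user + one assistant message
    recent = complete[-keep:] if keep > 0 else []
    return system + recent + pending


history = [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "My name is Alex."},
    {"role": "assistant", "content": "Nice to meet you, Alex!"},
    {"role": "user", "content": "I like sailing."},
    {"role": "assistant", "content": "Great hobby."},
    {"role": "user", "content": "What is my name?"},
]
print([m["content"] for m in trim_history(history, 1)])
# ['Be concise.', 'I like sailing.', 'Great hobby.', 'What is my name?']
```

The example also shows the cost of trimming: with only one exchange kept, the message containing the name is gone, so the model can no longer answer the last question. Pick N large enough for your use case, or summarize what you drop.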

Step 5: System Prompt + Personality

A system prompt sets the behavior and personality of the chatbot. It is the first message in the messages array with role: "system". The model treats it as an instruction that guides all subsequent responses.

# personality_chat.py
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

system_prompt = {
    "role": "system",
    "content": (
        "You are a friendly pirate named Captain Codebeard. "
        "You speak in pirate slang, call the user 'matey', "
        "and always relate answers to sailing or the sea. "
        "Keep responses under three sentences."
    )
}

messages = [system_prompt]

print("Captain Codebeard at yer service! Type 'quit' to leave.")
while True:
    user_input = input("\nYou: ")
    if user_input.lower() == "quit":
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    bot_reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": bot_reply})

    print(f"Captain: {bot_reply}")

Expected conversation:

Captain Codebeard at yer service! Type 'quit' to leave.

You: What is Python?
Captain: Arrr, Python be a mighty programming tongue, matey! Used for buildin' all sorts of digital treasure — from web apps to sea-monster-sized data charts.

You: How do I learn it?
Captain: Hoist the sails and start with the basics, matey! Practice yer coding every day, and soon ye'll be navigatin' them waters like a true pirate coder!

The system prompt is powerful. You can use it to set tone, constrain output format, enforce rules, or roleplay any character. Customer support bots use system prompts to define the brand's voice, acceptable responses, and escalation rules.
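
For a support bot, the voice, rules, and escalation policy can be kept as separate pieces and assembled into one system message. The sketch below is illustrative only: build_system_prompt and the example wording are hypothetical, and the exact layout of the prompt text is a design choice rather than anything the API requires.

```python
from typing import Dict, List


def build_system_prompt(voice: str, rules: List[str], escalation: str) -> Dict[str, str]:
    """Assemble a system message from a voice, a rule list, and an escalation rule."""
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    content = (
        f"{voice}\n"
        f"Follow these rules:\n{rule_lines}\n"
        f"Escalation: {escalation}"
    )
    return {"role": "system", "content": content}


system_prompt = build_system_prompt(
    voice="You are a polite support assistant.",
    rules=["Answer in at most three sentences.", "Never ask for passwords."],
    escalation="If you cannot answer, offer to transfer the user to an agent.",
)
messages = [system_prompt]  # same starting point as personality_chat.py
print(system_prompt["content"])
```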

Step 6: Web Interface with Streamlit

A CLI chatbot is functional but not user-friendly. Streamlit lets you turn your Python script into a web app with minimal code. It handles the input box, the chat display, and the rerun loop automatically.

# app.py
import os
from dotenv import load_dotenv
from openai import OpenAI
import streamlit as st

load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

system_prompt = {
    "role": "system",
    "content": "You are a helpful assistant. Answer concisely and clearly."
}

if "messages" not in st.session_state:
    st.session_state.messages = [system_prompt]

st.title("AI Chatbot")

for message in st.session_state.messages:
    if message["role"] != "system":
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

if prompt := st.chat_input("Type your message here..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=st.session_state.messages
    )

    bot_reply = response.choices[0].message.content
    st.session_state.messages.append({"role": "assistant", "content": bot_reply})
    with st.chat_message("assistant"):
        st.markdown(bot_reply)

Run it:

streamlit run app.py

Expected output:

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://192.168.x.x:8501

Open http://localhost:8501 in your browser. You will see a chat interface with a title, a message area, and a text input at the bottom. Every message you send gets added to the chat history and the bot responds inline.

How it works: Streamlit reruns the script from top to bottom every time you interact. The st.session_state dictionary persists data across reruns -- we store the message list there so it does not reset. st.chat_message renders each message as a chat bubble and st.chat_input provides the input box.
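
The app above waits for the complete reply before displaying it, while the architecture diagram shows tokens streaming back. A minimal sketch of the streaming side follows, assuming the openai v1 client: passing stream=True to client.chat.completions.create returns an iterator of chunks, and each chunk's choices[0].delta.content holds the next piece of text or None. The names iter_text and fake_chunk are hypothetical; fake_chunk only imitates the chunk shape so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text pieces from a stream of chat completion chunks."""
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # content can be None, for example on the final chunk
            yield piece


def fake_chunk(text):
    """Build an object shaped like a streamed chunk, for offline testing only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


stream = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo")]
print("".join(iter_text(stream)))  # Hello
```

In app.py, one option is to request the completion with stream=True and call st.write_stream(iter_text(stream)) inside the assistant chat message. In recent Streamlit releases, st.write_stream renders the pieces as they arrive and returns the joined text, which you can then append to st.session_state.messages.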

Step 7: Deployment Options

Your chatbot is ready. Now let people use it. Here are three free or low-cost deployment options.

Hugging Face Spaces

Push your code to a Git repository.
Go to huggingface.co/spaces and create a new Space.
Choose Streamlit as the SDK.
Connect your repository or upload the files directly.
Add your OPENAI_API_KEY as a Space secret (Settings -> Secrets).

Hugging Face Spaces gives you a public URL like https://your-username-ai-chatbot.hf.space.

Railway

Create a requirements.txt file listing your dependencies.
Create a Procfile with web: streamlit run app.py --server.port $PORT.
Push to Git and connect your repository on railway.app.
Add OPENAI_API_KEY to the environment variables in the dashboard.
Railway deploys automatically on every push.

Replit

Create a new Replit and import your files.
Add OPENAI_API_KEY to the Secrets tab.
Set the run command to streamlit run app.py.
Click Deploy.

All three platforms offer a free tier that can handle low to moderate traffic. For production workloads with thousands of users, consider a cloud VM or a containerized deployment with Docker.

Common Errors

1. ModuleNotFoundError: No module named 'openai' You forgot to activate your virtual environment or install the dependencies. Run source venv/bin/activate then pip install openai streamlit python-dotenv.

2. AuthenticationError: Incorrect API key Your .env file has a typo or the key is invalid. Check that OPENAI_API_KEY=sk-... has no extra spaces around the = sign. Regenerate the key from the OpenAI dashboard if needed.

3. RateLimitError: 429 Too Many Requests Free-tier OpenAI accounts have rate limits (typically 3 requests per minute for gpt-3.5-turbo). Space out your tests or upgrade your plan. At 3 requests per minute, calls need to be about 20 seconds apart, so add time.sleep(20) between calls during development.

4. The chatbot does not remember past messages You are not appending the conversation history to the messages array before each API call. Make sure you send the full messages list (including all previous user and assistant messages) with every request.

5. Streamlit app shows blank page Check that app.py actually renders something, such as st.title("AI Chatbot"), and that the script does not fail before reaching it. If the app starts but shows nothing, look at the terminal output for Python errors.

6. Context window exceeded Long conversations hit the model's token limit. Implement sliding window logic: keep only the system prompt and the last 10-20 exchanges. Alternatively, use a model with a larger context window like gpt-4-turbo or gpt-4o.
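
To decide when trimming is needed, you can estimate the size of the history before each call. The sketch below uses the rule of thumb from Step 4 (4,096 tokens is roughly 3,000 words) instead of a real tokenizer, so treat its numbers as approximate. The names estimate_tokens and drop_oldest_until_fits are hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]

WORDS_PER_TOKEN = 3000 / 4096  # rule of thumb from the token-limit note in Step 4


def estimate_tokens(messages: List[Message]) -> int:
    """Rough token estimate from word counts; not a real tokenizer."""
    words = sum(len(m["content"].split()) for m in messages)
    return round(words / WORDS_PER_TOKEN)


def drop_oldest_until_fits(messages: List[Message], budget: int) -> List[Message]:
    """Drop the oldest non-system messages until the estimate is within budget."""
    trimmed = list(messages)  # work on a copy so the caller's list is untouched
    while estimate_tokens(trimmed) > budget:
        for i, m in enumerate(trimmed):
            if m["role"] != "system":
                del trimmed[i]
                break
        else:
            break  # only system messages remain; nothing more to drop
    return trimmed
```

Set the budget below the model's limit to leave room for the reply, since the context window covers both the messages you send and the tokens the model generates.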

Practice Questions

1. What is the purpose of the system prompt in the messages array? The system prompt sets the behavior, tone, and constraints for the model. It is sent as the first message with role: "system" and influences every response the model generates for that conversation.

2. Why do we use python-dotenv instead of hardcoding the API key? Hardcoding the key exposes it to anyone who reads the code. The .env file is never committed to version control (add it to .gitignore), keeping the key secure across development and deployment.

3. How does conversation memory work in the OpenAI chat API? The client sends the full messages array with every request. Each entry has a role (system, user, or assistant) and content. The model reads the entire history to generate context-aware responses.

4. What happens when a conversation exceeds the token limit? The API rejects the request with a 400 error reporting that the context length was exceeded; it does not trim the history for you. To prevent this, trim the conversation history by keeping only the last N exchanges or use a model with a larger context window.

5. Challenge: Add a typing indicator Modify the Streamlit app to show a "Bot is thinking..." message while waiting for the API response. Use st.session_state to track the loading state and st.chat_message("assistant") to display the indicator.

Challenge

Build a multi-personality chatbot selector. Modify the Streamlit app to let the user choose from three system prompts: a friendly pirate, a strict teacher, and a poetic writer. Store the options in a st.selectbox and update the system prompt when the selection changes. The conversation history should reset when switching personalities so the old personality does not bleed into the new one.

Real-World Task

Customer support triage bot. You work at DodaTech and need to build an internal bot that handles common support questions for Doda Browser. The bot should:

Identify if the question is about installation, updates, or troubleshooting
Provide canned answers for known issues
Escalate to a human agent (simulated with a "transfer to agent" message) if it cannot answer

Write a system prompt that classifies the intent and routes the response. Test it with at least five different user queries covering each category.

FAQ

Do I need a paid OpenAI account to follow this tutorial?

You need an API key from platform.openai.com. New accounts get free credits that cover the examples in this tutorial. Once those run out, the pay-as-you-go rate for gpt-3.5-turbo is about $0.002 per 1,000 tokens -- roughly 750 responses for one dollar.
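
The arithmetic behind that estimate can be checked in a few lines. The price and the 750 figure come from the answer above; the implied size of about 667 tokens per response (prompt plus reply) is derived from them, not an official number. The function names are hypothetical.

```python
def tokens_per_dollar(price_per_1k_tokens: float) -> float:
    """How many tokens one dollar buys at a given price per 1,000 tokens."""
    return 1000 / price_per_1k_tokens


def tokens_per_response(price_per_1k_tokens: float, responses_per_dollar: float) -> float:
    """Average tokens per response implied by a responses-per-dollar figure."""
    return tokens_per_dollar(price_per_1k_tokens) / responses_per_dollar


print(round(tokens_per_dollar(0.002)))         # 500000
print(round(tokens_per_response(0.002, 750)))  # 667
```

Longer conversations cost more per response because the full history is resent with every request, so check current pricing and your own token counts before budgeting.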

What is the difference between gpt-3.5-turbo and gpt-4?

gpt-4 handles complex reasoning better, offers larger context windows (up to 128K tokens with gpt-4-turbo), and follows instructions more reliably. gpt-3.5-turbo is faster and cheaper. For this tutorial gpt-3.5-turbo is sufficient. You can swap the model name in the API call to upgrade.

How do I handle rate limiting from the OpenAI API?

Catch the openai.RateLimitError exception and implement exponential backoff. Use the tenacity library or a simple sleep loop. For production, queue requests and process them at a controlled rate.
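
The "simple sleep loop" variant might look like the sketch below. It is written against a generic exception tuple so it runs without the openai package; in the chatbot you would pass (openai.RateLimitError,) as retry_on. The name with_backoff, its parameters, and the jitter amount are illustrative choices.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay * 0.1))  # small random jitter
    raise RuntimeError("max_attempts must be at least 1")
```

A call in the chat loop could then be wrapped as with_backoff(lambda: client.chat.completions.create(model="gpt-3.5-turbo", messages=messages), retry_on=(openai.RateLimitError,)). The sleep parameter exists so tests can record delays instead of actually waiting.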

Can I use a local model instead of OpenAI?

Yes. Libraries like Ollama and llama-cpp-python let you run models locally. The API interface is similar -- you send a messages array and get a completion back. The architecture you build in this tutorial works with any provider that follows the OpenAI-compatible chat format.
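
As a rough sketch of what switching providers involves: the OpenAI client accepts a base_url argument, and Ollama serves an OpenAI-compatible endpoint under /v1 on its default local port. The helper client_settings is hypothetical, and the example assumes Ollama is running locally with default settings and a model already pulled.

```python
import os
from typing import Dict, Optional


def client_settings(provider: str) -> Dict[str, Optional[str]]:
    """Return keyword arguments for the OpenAI client constructor."""
    if provider == "openai":
        return {"api_key": os.getenv("OPENAI_API_KEY")}
    if provider == "ollama":
        # Assumes Ollama's default local address. The client requires a key
        # value, but a local Ollama server does not check it.
        return {"base_url": "http://localhost:11434/v1", "api_key": "ollama"}
    raise ValueError(f"Unknown provider: {provider}")


print(client_settings("ollama")["base_url"])  # http://localhost:11434/v1
```

You would then create the client with OpenAI(**client_settings("ollama")) and replace the model name in the API call with the name of a model you have pulled locally. The rest of the chat loop stays the same.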

Next Steps

Build an Image Classifier
RAG Systems
LangChain Guide

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
