How to Implement ChatGPT with OpenAI API in Python Synchronously and Asynchronously

Since its advent, ChatGPT has had a tremendous impact on society, and developers' lives in particular have been reshaped by it. ChatGPT can answer all kinds of technical and non-technical questions accurately and efficiently.
However, ChatGPT can do more than answer our questions interactively. We can also chat with it programmatically from our applications and use it to answer customer questions or improve the efficiency of our business in general.
A typical use case is category prediction in the product search service of online shops. We used to build machine learning or deep learning models based on the product category data we could get. However, these models are limited by the training data available to us, no matter how sophisticated the training is. In comparison, the models behind ChatGPT are built on far more data than we could ever access and are trained with more advanced algorithms. Therefore, their predictions are usually more accurate, even for products we have never indexed before.
In this post, we will introduce how to make chats programmatically using the OpenAI API in Python. Fundamental concepts will be explained in simple language so you can get started quickly.
Preparation
Let's create a virtual environment so we can try out the latest versions of Python and the libraries:
conda create -n openai python=3.12
conda activate openai
pip install openai httpx
- openai – A library provided by OpenAI which makes working with the OpenAI API in Python simple and efficient.
- httpx – A modern and fully featured HTTP client library that supports both HTTP/1.1 and HTTP/2 and provides both sync and async APIs.
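To confirm the installation, you can print the installed versions (a quick sanity check; the exact version numbers you see will differ):
import openai
import httpx

print(openai.__version__, httpx.__version__)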
Authentication
After installing the libraries, we need an API key to call the OpenAI API. Note that the OpenAI API and ChatGPT are billed separately. Therefore, even if you are a paid ChatGPT user, you still need to pay for the API.
You can create a new API key on the API Keys page. After you have created an API key, or got one from your organization, you need to set a special environment variable called OPENAI_API_KEY, which the openai library picks up automatically for authentication:
export OPENAI_API_KEY=<YOUR-OPENAI-API-KEY>
Start a chat with OpenAI API
To start a chat with the openai library, we first need to create a client. If you have set the OPENAI_API_KEY environment variable as above, you don't need to set the API key for the client explicitly:
from openai import OpenAI
client = OpenAI()
However, if for some reason the environment variable is not called OPENAI_API_KEY, you need to specify the API key for the client explicitly:
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["CUSTOM_OPENAI_API_KEY_ENV"])
Then we need to create a system message and a user message respectively. The messages for the Chat API have three roles:
- assistant – The AI model, such as GPT-3.5 or GPT-4, that the API uses to process user input and generate responses.
- system – The system message is used to set the assistant's behavior. For example, we can provide specific instructions about how the assistant should behave throughout the conversation.
- user – The end user who asks the questions.
In our simple example, we will write a system message that instructs the assistant to answer questions about American history concisely, and then ask a question with a user message:
system_message = (
    "You are a helpful assistant that is good at American history "
    "and gives only concise answers to questions."
)
user_message = "Who is the first president of USA?"
We don't need to provide an assistant message here because that role represents the answers given by the AI model.
Before we start a chat, we need to choose a model; common choices are gpt-3.5-turbo, gpt-4, and gpt-4-turbo-preview, which are the latest model versions at the time of writing. The word "turbo" here refers to a model variant optimized for faster responses and lower cost while still producing high-quality output, which makes it a recommended option for integrating AI-driven text generation into our applications cost-effectively.
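If you are unsure which models your API key has access to, the openai library can list them. A minimal sketch (the exact model names returned depend on your account):
from openai import OpenAI

client = OpenAI()
# Print the IDs of all models available to this API key
for model in client.models.list():
    print(model.id)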
Now that everything is ready, we can start a chat using the model, system message, and user message specified above:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
)
answer = response.choices[0].message.content
print(f"Answer = {answer}")
If the code is written correctly and the API key is working properly, we will see the answer provided by the assistant:
Answer = George Washington.
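The assistant role comes into play when you continue a conversation: append the model's previous reply as an assistant message, followed by the next user message, so the model has the full context. A minimal sketch, reusing the variables above (the follow-up question is just an example):
# Keep the conversation going by including the assistant's previous answer
messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_message},
    {"role": "assistant", "content": answer},
    {"role": "user", "content": "When did he take office?"},
]
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)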
Deal with error code 429
If you see error code 429, it normally means you have exceeded your quota and need to pay for the OpenAI API:
RateLimitError: Error code: 429 - You exceeded your current quota, please check your plan and billing details
In this case, you have to pay before you can continue using the API. If you work in an organization/company, this is probably not something you need to worry about yourself :).
After the payment is set up properly, you will be able to use the API key normally.
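Note that a 429 can also occur under normal, paid usage when you simply send too many requests per minute. For that case, retrying with exponential backoff usually helps. A minimal sketch (the retry count and delays are arbitrary choices):
import time
from openai import OpenAI, RateLimitError

client = OpenAI()

def create_with_retry(messages, retries=3):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="gpt-3.5-turbo", messages=messages
            )
        except RateLimitError:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...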
Make chats asynchronously
In the example above, we used the openai library to make a chat synchronously. Making chats asynchronously is critical if your application serves multiple users concurrently, or when you want to split a big chat into smaller ones for data processing or data analysis. Note that recent versions of the openai library ship an asynchronous client, AsyncOpenAI, so async support is built in.
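For reference, a minimal sketch of the built-in async client (assuming openai>=1.0; like its sync counterpart, it reads OPENAI_API_KEY from the environment):
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Who is the first president of USA?"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())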
Alternatively, we can skip the openai library entirely and use an asynchronous HTTP client like httpx or aiohttp to make concurrent requests against the API endpoint directly.
In the code snippet below, asyncio and httpx are used to make multiple chats asynchronously.
import asyncio
import os

import httpx

SYSTEM_MESSAGE = (
    "You are a helpful assistant that is good at American history "
    "and gives only concise answers to questions."
)
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]


async def ask_question(client, question):
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": question},
        ],
    }
    response = await client.post(
        "https://api.openai.com/v1/chat/completions", json=payload
    )
    response.raise_for_status()  # fail early on 4xx/5xx responses
    data = response.json()
    answer = data["choices"][0]["message"]["content"]
    return {"question": question, "answer": answer}


async def main():
    questions = [
        "Who is the first president of USA?",
        "When did Trump become president?",
    ]
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {OPENAI_API_KEY}",
    }
    # A single client is shared by all requests; the timeout is raised
    # because chat completions can take longer than httpx's 5 s default.
    async with httpx.AsyncClient(headers=headers, timeout=30.0) as client:
        tasks = [ask_question(client, question) for question in questions]
        results = await asyncio.gather(*tasks)
    print(results)


asyncio.run(main())
As we can see, the code changes are minimal when switching from the openai library to calling the API endpoint directly. However, note that the data returned is now a plain dict rather than a [ChatCompletion](https://platform.openai.com/docs/api-reference/chat/object) object.
For more details about how to use httpx or aiohttp to make asynchronous requests in Python, please check this post.
You can save the above code in a script file and run it directly. If everything is set up correctly, you will see the answers given by the OpenAI API (output formatted here for readability):
[
{
"question": "Who is the first president of USA?",
"answer": "George Washington.",
},
{
"question": "When did Trump become president?",
"answer": "January 20, 2017.",
},
]
In this post, we have introduced how to use the openai library to call the OpenAI API in Python. Fundamental concepts were explained in simple language so you can get started quickly. We also showed how to make asynchronous chats using the httpx library, which can dramatically improve your application's efficiency.
Even though we have just scratched the surface of the OpenAI API, you can already start to implement it in your applications, and it should be sufficient for many common use cases.
To apply the code templates demonstrated in this post to your specific use cases, you just need to adapt the system and user messages fed into the models, and they will then solve different problems accordingly. As mentioned in the introduction, we managed to replace our legacy machine learning models for category prediction with ChatGPT, which improved both the accuracy and the efficiency of our service dramatically. Try to find places in your own project that could also be improved by ChatGPT; if you do, this post should help you implement them in a couple of hours.
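For example, a hypothetical category-prediction prompt could look like the sketch below (the category list and product title are placeholders for your own data):
system_message = (
    "You are a product classifier for an online shop. Given a product title, "
    "answer with exactly one category from this list: "
    "Electronics, Clothing, Home & Garden, Toys, Books."
)
user_message = "Logitech MX Master 3S Wireless Mouse"
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
)
print(response.choices[0].message.content)  # e.g. "Electronics"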