Building Your Own AI Group Chat: A Journey into Custom Universes and Characters


Welcome to this article, where I'll show you how I created an immersive AI Group Chat. As an LLM Engineer, I wasn't satisfied with the typical one-on-one AI chats found everywhere – so, in my spare time, I set out to build an AI Group Chat where users can interact with multiple characters simultaneously, in any universe imaginable, without spending any money.


In the first part, I'll walk you through the idea of interacting with multiple characters simultaneously. Then, we'll explore the tools I've used – Python, Ollama, React, FastAPI, and open-source Large Language Models (LLMs). Finally, you'll have the tips and insights you need to build your own custom universes and characters using my code as a starting point, plus some ideas on how to improve it (since this is just a fun side project).

Why Build an AI Group Chat?

Image by author

Most chatbots provide one-on-one interaction, which can feel restrictive and no longer offers much originality. By creating an AI group chat, the conversation becomes more exciting and engaging, with multiple points of view. Multiple characters can interact not just with the user but also with each other, simulating a real group discussion where everybody wants to express their own "feelings" on any topic. I wanted to make the app adaptable so that anyone could try their own universes and characters without starting from scratch. Whether you're into fantasy realms, sci-fi galaxies, or entirely original worlds, this project is a playground for your imagination (as it is for me!).


The Tech Stack

  • Frontend: React and Chakra UI for a responsive and intuitive user interface.
  • Backend: FastAPI and Python to handle API requests and manage conversation logic.
  • LLMs: Open-source models powered by Ollama for generating authentic character responses 100% locally.

Getting Started: Cloning the Repository and Running the App

To make things easier, you can clone my GitHub repository, which contains all the code you need, then set up the dependencies.

Step 1: Clone the Repository

git clone https://github.com/MaximeJabarian/ai-group-chat.git
cd ai-group-chat

Here's the structure of the main components of the project:

ai-group-chat/
│
├── backend/
│   ├── backend.py          # Main FastAPI backend code
│   └── bots_config.json    # Configuration for AI characters
│
├── frontend/
│   ├── public/
│   │   ├── index.html      # Main HTML file for React app
│   │   └── images/         # Directory for static images
│   │
│   ├── src/
│   │   ├── App.js          # Main React component
│   │   └── ChatBubbles.css # Styles for chat bubbles
│   │
│   └── package.json        # Node.js dependencies
│
├── venv/                   # Python virtual environment (not tracked in git)
│
├── run_app.py              # Script to run the entire application
└── requirements.txt        # Python dependencies for backend

Step 2: Install Node.js and npm for the Frontend

Before proceeding, ensure that Node.js and npm (Node Package Manager) are installed on your system. You can download them from https://nodejs.org, then check the installation:

node -v 
npm -v

Next, go to the frontend/ directory and install the necessary Node.js dependencies:

cd frontend
npm install

Step 3: Set Up the Python Libraries for the Backend

To avoid any conflicts on your computer, I suggest creating a new environment for the app in the root folder ai-group-chat/:

python -m venv venv # Create a virtual environment and activate it
source venv/bin/activate  # On Windows, use 'venv\Scripts\activate'
pip install -r requirements.txt # Install the required Python packages
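
For reference, the imports in backend/backend.py (shown later in this article) imply that requirements.txt contains at least packages along these lines – the exact entries and pinned versions in the repository may differ:

fastapi
uvicorn
pydantic
llama-index-core
llama-index-llms-ollama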

Step 4: Running the Web App

Before running the app, ensure the Ollama app is installed and running locally. You'll need to download the LLM models you plan to use, such as llama3.1 or any other model. If you're new to Ollama, check out this article before continuing. Then, run the following command:

ollama pull llama3.1

Once the model is set up, you can run the entire application – frontend and backend – with a single command:

python run_app.py 
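
If you're curious about what run_app.py does, it essentially just launches the backend and frontend servers together. Here's a minimal sketch of such a launcher – the actual script in the repository may differ, and the backend port is an assumption:

import os
import subprocess
import sys

ROOT = os.path.dirname(os.path.abspath(__file__))

# Start the FastAPI backend with uvicorn (assumes backend/backend.py exposes `app`, served on port 8000)
backend = subprocess.Popen(
    [sys.executable, "-m", "uvicorn", "backend:app", "--port", "8000"],
    cwd=os.path.join(ROOT, "backend"),
)

# Start the React development server (defaults to port 3000)
frontend = subprocess.Popen(["npm", "start"], cwd=os.path.join(ROOT, "frontend"))

try:
    # Keep the script alive until both processes exit
    backend.wait()
    frontend.wait()
except KeyboardInterrupt:
    # Stop both servers on Ctrl+C
    backend.terminate()
    frontend.terminate()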

Step 5: Updating Characters and Avatars

  1. Editing Character Personas: Open the bots_config.json file located in the backend directory, then update the characters array with your own information, or even the model:
{
  "characters": [
    {
      "name": "Maul",
      "model": "llama3.1",
      "personality": "Calm, calculating, driven by a desire for power and revenge."
    },
    {
      "name": "Yoda",
      "model": "llama3.1",
      "personality": "Wise, cryptic, speaks in reversed syntax."
    }
  ]
}
  2. Updating Avatars: Place your image files in the public/images directory of the frontend. Ensure the images are named consistently with your character names. For example, for "Yoda":
public/images/yoda.png
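
To catch naming mismatches early, you can run a small optional check from the project root. This helper is not part of the repository, and it assumes avatars are lowercase PNG files named after each character:

import json
from pathlib import Path

# Load the character configuration used by the backend
config = json.loads(Path("backend/bots_config.json").read_text())
images_dir = Path("frontend/public/images")

for character in config["characters"]:
    # Avatars are assumed to be named after the character, lowercased
    avatar = images_dir / f"{character['name'].lower()}.png"
    status = "OK" if avatar.exists() else "MISSING"
    print(f"{character['name']:<12} -> {avatar} [{status}]")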

Crafting the User Interface with React (App.js file)

1. Character Selection Panel

I opted for React in this project instead of Streamlit to create a more user-friendly and interactive interface. However, feel free to integrate the backend with Streamlit if that suits your needs better – it's the easier route if you want to learn the full logic of the app with zero knowledge of React.

Users can choose up to four characters to chat with. Here's a simplified version of how it's set up in frontend/src/App.js, using Chakra UI components:


{/* Left panel with a width of 200px, dark background, and white text */}
<Box width="200px" bg="gray.800" color="white" p={4}>

  {/* Title for the character selection section with extra-large font and bold weight */}
  <Text fontSize="xl" fontWeight="bold">Select Characters:</Text>

  {/* Display a checkbox and avatar for each available character */}
  {botList.map((bot, index) => (
    <HStack key={index} spacing={3} mt={2}>
      <Checkbox
        isChecked={activeBots.includes(bot.name)}
        onChange={() => toggleBot(bot.name)}
        colorScheme="teal"
      />
      <Avatar size="sm" src={`/images/${bot.name.toLowerCase()}.png`} />
      <Text>{bot.name}</Text>
    </HStack>
  ))}
</Box>

2. Handling User Messages

Managing state effectively is crucial. React hooks are used to manage messages and active bots:

const [messages, setMessages] = useState([]);        // Stores chat messages
const [userMessage, setUserMessage] = useState("");  // Holds the current user input
const [activeBots, setActiveBots] = useState([]);    // Tracks the selected active bots

const handleSendMessage = () => {
  if (userMessage.trim() === "") return;

  const newMessage = { speaker: "User", text: userMessage };
  setMessages([...messages, newMessage]);
  setUserMessage("");
  // Send message to backend
};

3. Displaying Messages Clearly

The chat window differentiates between user messages and bot responses. Here's a simplified version of the rendering logic (the exact props and class names in the repository may differ):


{messages.map((msg, index) => (
  <Flex key={index} justify={msg.speaker === "User" ? "flex-end" : "flex-start"} mb={2}>
    {msg.speaker !== "User" && (
      <Avatar size="sm" src={`/images/${msg.speaker.toLowerCase()}.png`} mr={2} />
    )}
    <Box className={msg.speaker === "User" ? "chat-bubble user" : "chat-bubble bot"}>
      {msg.text}
    </Box>
  </Flex>
))}

Building the Backend with FastAPI (backend.py file)

1. Defining Characters and Personalities

Characters are defined in the backend/bots_config.json file. Don't give them too long a personality description – depending on which LLM you're using, it can slow down the response time:

{
  "characters": [
    {
      "name": "Maul",
      "model": "llama3.1",
      "personality": "Calm, calculating, driven by a desire for power and revenge."
    },
    {
      "name": "Chewbacca",
      "model": "llama3.1",
      "personality": "Gruff, loyal, communicates through growls and roars."
    },
    {
      "name": "Yoda",
      "model": "llama3.1",
      "personality": "Wise, cryptic, speaks in reversed syntax."
    }
  ]
}

2. Generating Responses

The backend/backend.py file is responsible for loading character configurations from a JSON file, generating authentic responses for each character, and processing chat requests using FastAPI. FastAPI is used for its speed and simplicity.

  1. Loading Configurations: The backend reads the bots_config.json file to load character details like name, model, and personality.
  2. Generating Responses: A build_prompt function creates a custom prompt for each character based on their personality and the conversation history, ensuring the responses align with the character's traits. This also enables the characters to talk to each other, and not necessarily to the user every time.
  3. Handling Chat Requests: The /chat endpoint processes user messages and conversation history, dynamically generating responses from the active characters using the appropriate LLM models.
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import List, Dict, Any, Optional
from llama_index.core.llms import ChatMessage
from llama_index.llms.ollama import Ollama
import logging
import json
import os

# Load the character configurations
config_path = os.path.join(os.path.dirname(__file__), 'bots_config.json')
with open(config_path, 'r') as f:
    bot_config = json.load(f)
    characters = bot_config["characters"]

# Build the list of bots from the configuration
bots = [
    {
        "name": character["name"],
        "model": character["model"],
        "personality": character["personality"],
    }
    for character in characters
]

logging.basicConfig(level=logging.INFO)

app = FastAPI()

# Set allowed origins for CORS
origins = [
    "http://localhost:3000",  
    "http://127.0.0.1:3000", 
]

# Add CORS middleware to enable React interface requests
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins, 
    allow_credentials=True,
    allow_methods=["*"],  
    allow_headers=["*"],  
)

# Returns the list of configured bots so the frontend can display them
@app.get("/bots")
def get_bots():
    return bots

async def generate_response(model_name: str, prompt: str) -> str:
    """
    Generate a response from an Ollama model by streaming its output and concatenating the tokens.
    """
    try:
        llm = Ollama(model=model_name, request_timeout=120.0)
        response = ""

        for r in llm.stream_chat([ChatMessage(role="user", content=prompt)]):
            response += r.delta
        return response.strip()

    except Exception as e:
        logging.error(f"Error with the model {model_name}: {str(e)}")
        raise e

# Generating Authentic Responses
def build_prompt(bot_name: str, personality: str, conversation_history: List[Dict[str, Any]]) -> str:
    """
    Build a prompt for the bot using the full conversation history and its personality.
    """
    prompt = f"You are {bot_name}, {personality}.n"

    for message in conversation_history:
        if 'speaker' in message and 'text' in message:
            speaker = message['speaker']
            text = message['text']
            prompt += f"{speaker} says: '{text}'n"

    prompt += "Respond without using asterisks to indicate actions or non-verbal cues, and in a way that stays true to your personality and reflects a realistic, dynamic conversation as if you were participating in a casual group chat."
    return prompt

class Message(BaseModel):
    text: str
    conversation_history: List[Dict[str, Any]]
    active_bots: Optional[List[str]] = None  # New field for active bots

# Handling Chat Requests
@app.post("/chat")
async def chat(message: Message):
    try:
        user_message = message.text
        conversation_history = message.conversation_history or []
        active_bots = message.active_bots or [bot['name'] for bot in bots]  # Default to all bots if none specified

        # Log the received conversation history
        logging.info(f"Conversation history received: {conversation_history}")

        # Ensure we add the user's message only once
        valid_conversation_history = [
            msg for msg in conversation_history
            if isinstance(msg, dict) and "speaker" in msg and "text" in msg
        ]

        # Append the user's message to the conversation history only once
        if not valid_conversation_history or valid_conversation_history[-1]["speaker"] != "User":
            valid_conversation_history.append({"speaker": "User", "text": user_message})

        responses = []
        bot_responses = []

        for bot in bots:
            if bot['name'] not in active_bots:
                continue

            bot_name = bot['name']
            model_name = bot['model']
            personality = bot['personality']

            prompt = build_prompt(bot_name, personality, valid_conversation_history)
            logging.info(f"Prompt generated for {bot_name}: {prompt}")
            bot_response = await generate_response(model_name, prompt)

            bot_responses.append({'speaker': bot_name, 'text': bot_response})
            responses.append({
                "bot": bot_name,
                "response": bot_response
            })

        # Update conversation history with bot responses
        valid_conversation_history.extend(bot_responses)

        # Return responses and updated conversation history if needed
        return {
            "responses": responses,
        }

    except Exception as e:
        logging.error(f"Error processing the request: {str(e)}")
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
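
Once the backend is running, you can also test the /chat endpoint directly, without the React frontend. Here's a minimal example using the requests library – the port is an assumption, so adjust it to wherever your backend runs – with field names matching the Message model above:

import requests

payload = {
    "text": "What do you think of the Jedi Council?",
    "conversation_history": [],            # previous {"speaker", "text"} messages, if any
    "active_bots": ["Yoda", "Maul"],       # subset of the characters defined in bots_config.json
}

# Assumes the FastAPI backend is listening locally on port 8000
resp = requests.post("http://127.0.0.1:8000/chat", json=payload, timeout=300)
resp.raise_for_status()

for item in resp.json()["responses"]:
    print(f"{item['bot']}: {item['response']}")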

General Tips on Using LLMs Efficiently

1. Understand the Model's Capabilities

Before integrating an LLM into your project, familiarize yourself with its strengths and limitations. This helps in setting realistic expectations and designing effective prompts (Llama 3.1, Mistral 7B, Phi-3, Gemini 2…). To better understand how the context window length impacts quality and latency for each model, check this article.

2. Craft Effective Prompts

The quality of the LLM's output heavily depends on the prompts provided. Spend time refining prompts to elicit the desired behaviours. Consider the following (a short example follows the list):

  • Clarity: Be specific about the role and behavior you expect from the model.
  • Context: Provide sufficient context to guide the model's response.
  • Instructions: Use clear instructions to shape the output.
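
Applied to this project, a refined character prompt could make those three elements explicit. Here's a minimal sketch – the wording and variable names are illustrative, not taken from the repository:

# Clarity: state the role and expected behaviour up front
role = "You are Yoda, wise, cryptic, and speaking in reversed syntax."

# Context: give the model the recent conversation it should react to
context = (
    "You are in a casual group chat with the User and other characters.\n"
    "User says: 'Should I trust Maul?'"
)

# Instructions: shape the output explicitly
instructions = (
    "Reply in one or two short sentences, stay in character, "
    "and do not use asterisks for actions or non-verbal cues."
)

prompt = f"{role}\n{context}\n{instructions}"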

3. Optimize Performance and Reduce Latency

LLMs can be resource-intensive, so optimizing performance is key:

a. Latency Challenges: As the conversation progresses with multiple characters, latency tends to increase due to the growing chat history. To address this, the conversation history sent with each request can be reduced or summarized, helping to maintain responsiveness (see the sketch below).
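
For instance, a low-effort option is to keep only the most recent turns when building each prompt. A small sketch of what this could look like in backend.py – the MAX_TURNS value is an arbitrary choice, and a smarter variant could summarize older messages instead of dropping them:

MAX_TURNS = 8  # keep only the most recent messages to bound prompt length

def trim_history(conversation_history, max_turns=MAX_TURNS):
    """Return only the last `max_turns` messages from the history."""
    return conversation_history[-max_turns:]

# Inside the /chat handler, the trimmed history would then be passed to build_prompt:
# prompt = build_prompt(bot_name, personality, trim_history(valid_conversation_history))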

b. Bot Messages: Currently, all bot messages are shown only after every character has generated its response, resulting in delays. A more efficient approach would be to modify the frontend to display each character's message as soon as it's ready. Implementing a streaming mechanism between the frontend and backend, along with asynchronous processing, would further enhance real-time interactions (see the sketch below).
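
As a rough illustration of the streaming idea, the backend could expose an additional endpoint that forwards tokens as Ollama produces them, using FastAPI's StreamingResponse on top of the same stream_chat call. This is not part of the repository, it streams a single character's reply for brevity, and the frontend would also need to be adapted to consume the stream:

from fastapi.responses import StreamingResponse

@app.post("/chat/stream")
async def chat_stream(message: Message):
    bot = bots[0]  # stream one character's reply; extend as needed for multiple bots
    prompt = build_prompt(bot["name"], bot["personality"], message.conversation_history)
    llm = Ollama(model=bot["model"], request_timeout=120.0)

    def token_generator():
        # Yield each token as soon as Ollama produces it, instead of waiting for the full reply
        for chunk in llm.stream_chat([ChatMessage(role="user", content=prompt)]):
            yield chunk.delta

    return StreamingResponse(token_generator(), media_type="text/plain")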

Conclusion

You've successfully run your first AI Group Chat application with your custom characters and universe, using Ollama, React, and FastAPI, allowing users to interact with multiple AI-driven characters. I hope this project inspires you to explore even further and reimagine new universes! I'm truly curious to see what you've created and how you've expanded on this foundation – let me know in the comments. I'm open to any collaboration to improve it; if you're interested, reach out to me!

Thank you for reading!

Got questions or need a hand? Just shoot me a message – I'm here to help!

