Frontend Script for Vision Chatbot LLM Server

After activating the virtual environment in a new terminal, you can use the following frontend.py script to upload an image, enter a text prompt, and interact with the backend. The script uses the Streamlit framework to create a web interface for the vision chatbot LLM server.

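If the environment is not already active in this terminal, activate it first. The command below assumes the virtual environment was created in a directory named venv in your current working directory; adjust the path to match your setup:

source venv/bin/activate
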
Create a frontend.py script with the following content:

import base64
import json

import requests
import streamlit as st

st.title("LLM Vision Chatbot on Arm")
st.write("Upload an image and enter a prompt. The model will generate a response based on the image as context.")

# File uploader for image and text input for prompt
uploaded_image = st.file_uploader("**Upload an image**", type=["png", "jpg", "jpeg"])
user_prompt = st.text_area("**Enter your prompt or question about the image**", "")

# Placeholder for the generated answer and metrics
output_area = st.empty()
metrics_area = st.empty()

if st.button("Generate Response"):
    if uploaded_image is None or user_prompt.strip() == "":
        st.warning("Please provide both the image and prompt before submitting.")
    else:
        # Prepare the request (OpenAI-compatible format with image in base64)
        image_bytes = uploaded_image.read()
        b64_image = base64.b64encode(image_bytes).decode('utf-8')
        # Construct request payload similar to OpenAI ChatCompletion
        payload = {
            "messages": [
                {"role": "user", "content": user_prompt}
            ],
            "image": b64_image,       # custom field for image
            "stream": True,           # token streaming
        }

        # Initialize streaming request to backend
        backend_url = "http://localhost:5000/v1/chat/completions"
        generated_text = ""
        # Make POST request with streaming response
        try:
            with requests.post(backend_url, json=payload, stream=True) as resp:
                resp.raise_for_status()  # fail fast on HTTP errors instead of parsing an error body
                # Iterate over the streamed lines from the response
                for line in resp.iter_lines(decode_unicode=True):
                    if line is None or line.strip() == "":
                        continue  # skip empty keep-alive lines
                    # OpenAI SSE format lines begin with "data: "
                    if line.startswith("data: "):
                        data = line[len("data: "):]
                        if data.strip() == "[DONE]":
                            break  # stream finished
                        # Parse the JSON chunk
                        chunk = json.loads(data)
                        # The first chunk contains the role; subsequent chunks contain content
                        delta = chunk["choices"][0]["delta"]
                        if "role" in delta:
                            # Initial role announcement (assistant) – skip it
                            continue
                        if "content" in delta:
                            token = delta["content"]
                            # Append token to the output text
                            generated_text += token
                            # Update the output area with the new partial text
                            output_area.markdown(f"**Assistant:** {generated_text}")

        except requests.exceptions.RequestException as e:
            st.error(f"Error connecting to backend: {e}")

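To understand what the parsing loop above expects: an OpenAI-compatible server streams Server-Sent Events, one data: line per chunk. The lines below are an illustrative sketch, not actual server output (the content values will differ):

data: {"choices": [{"delta": {"role": "assistant"}}]}
data: {"choices": [{"delta": {"content": "The image shows"}}]}
data: {"choices": [{"delta": {"content": " a cat"}}]}
data: [DONE]

The script skips the initial role-only chunk, appends each content fragment to the running answer, and stops at the [DONE] sentinel.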
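
Before launching the frontend, you can optionally verify the backend by calling the endpoint directly. This sketch assumes the backend from the previous section is listening on localhost:5000 and accepts the same payload shape as the script above; <BASE64_IMAGE> is a placeholder for a base64-encoded image:

curl -N http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Describe the image"}], "image": "<BASE64_IMAGE>", "stream": true}'

If tokens stream back, the backend is ready and the frontend will be able to stream as well.
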
Run the Frontend Server

You are now ready to run the frontend for the Vision Chatbot. In a new terminal, start the Streamlit server with the following command:

python3 -m streamlit run frontend.py

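By default, Streamlit serves on port 8501. If that port is already in use on your machine, you can pass a different one; for example:

python3 -m streamlit run frontend.py --server.port 8502
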
When the frontend server starts successfully, you should see output similar to the following:

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://10.0.0.10:8501
  External URL: http://35.223.133.103:8501

In the next section, you will view the running application in your local browser.
