Set up an LLM based-Vision Chatbot

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Log an issue

Fork and edit

Discuss on Discord

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Overview

This Learning Path demonstrates how to build and deploy a vision-enabled chatbot using open-source large language models (LLMs) optimized for Arm-based servers.

Vision chatbots accept both images and text as input, then generate text responses using the image as context. The instructions in this Learning Path are tailored for Arm servers running Ubuntu 24.04 LTS and have been tested on a Google Cloud c4a-standard-64 instance. You will need an Arm server instance with at least 32 CPU cores to run this example.

You’ll learn how to run a vision chatbot LLM inference using PyTorch and Hugging Face Transformers, efficiently leveraging Arm CPUs to process image-and-text input and produce relevant, contextual replies.

Install dependencies

Install the following packages on your Arm-based server instance:

    

        
        
sudo apt update
sudo apt install python3-pip python3-venv -y

Create a file with your Python dependencies

Using your preferred text editor, add the following Python dependencies to your requirements.txt file:

    

        
        
streamlit
numpy
sentencepiece
pillow
flask
transformers
huggingface_hub

Install Python dependencies

Now create a Python virtual environment and install the dependencies.

Create a virtual environment:

    

        
        
python3 -m venv llama-vision

Activate the virtual environment:

    

        
        
source llama-vision/bin/activate

Install the required libraries using pip:

    

        
        
pip install -r requirements.txt

Install PyTorch

Install the PyTorch package, which includes the Kleidi AI optimizations from the nightly build, using the following command:

    

        
        
pip install torch==2.7.0.dev20250307 --extra-index-url https://download.pytorch.org/whl/nightly/cpu/

Note

If the specified PyTorch version fails to install, try installing any PyTorch nightly build from PyTorch Nightly Builds released after version 2.7.0.dev20250307.

Install Torch AO

Clone the Torch AO repository:

    

        
        
    git clone https://github.com/pytorch/ao.git
    cd ao

Checkout the required branch:

    

        
        
    git checkout 24c966cb8931507ef389715bc5f19a11c28b0484

Install Torch AO:

    

        
        
    python setup.py install
    cd ..

To use the Llama-3.2-11B-Vision-Instruct model from Hugging Face, you need to request access or accept the terms. You need to log in to Hugging Face using a token.

    

        
        
    huggingface-cli login

Enter your Hugging Face token. You can generate a token from Hugging Face Hub by clicking your profile on the top right corner and selecting Access Tokens.

You must also visit the Hugging Face link printed in the login output and accept the terms by clicking the Agree and access repository button or filling out the request-for-access form, depending on the model.

Back

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Introduction

Set up an LLM based-Vision Chatbot

Deploy Vision Chatbot LLM backend server

Deploy Vision Chatbot LLM frontend server

Inference with Vision Chatbot

Next Steps

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Overview

Install dependencies

Create a file with your Python dependencies

Install Python dependencies

Install PyTorch

Install Torch AO

Set up an LLM based-Vision Chatbot

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Introduction

Set up an LLM based-Vision Chatbot

Deploy Vision Chatbot LLM backend server

Deploy Vision Chatbot LLM frontend server

Inference with Vision Chatbot

Next Steps

Deploy a LLM-based Vision Chatbot with PyTorch and Hugging Face Transformers on Google Axion processors

Overview

Install dependencies

Create a file with your Python dependencies

Install Python dependencies

Install PyTorch

Install Torch AO

Hugging Face Cli Login