What is a Vector Database?

A vector database is a specialized database designed to store and query vector representations of data. It is a crucial component of many AI applications. But what exactly is it, and how does it work?

Traditional databases store data in tables or objects with explicitly defined attributes. However, they struggle to recognize similarities between data points unless those relationships are explicitly defined.

Vector databases, on the other hand, are designed to store large numbers of vectors - which are arrays of numbers - and provide algorithms for searching through those stored vectors. This makes it much easier to identify similarities by comparing vector locations in N-dimensional space, typically using distance metrics such as cosine similarity or Euclidean distance.
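
To make these metrics concrete, here is a minimal Python sketch that compares two small vectors using NumPy. The vectors and values are illustrative only; real embeddings typically have hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: values near 1.0 mean the vectors point in nearly the same direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Euclidean distance: values near 0.0 mean the vectors are nearly identical.
    return float(np.linalg.norm(a - b))

# Two toy 4-dimensional vectors standing in for real embeddings.
v1 = np.array([0.1, 0.3, 0.5, 0.7])
v2 = np.array([0.2, 0.3, 0.4, 0.6])

print(cosine_similarity(v1, v2))   # close to 1.0 -> very similar
print(euclidean_distance(v1, v2))  # small value  -> very similar
```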

How can you convert complex ideas, such as the semantic meaning of a sequence of words, into numeric vectors? You can do so using a process called embedding.

What are Embeddings, and what can I do with them?

Embeddings are numerical vectors generated by an AI model to capture the semantic meanings of text. They convert collections of tokens (such as word fragments) into points in an N-dimensional space.

By comparing these vectors, you can query a vector database - using, for example, the embedding of a user’s question - to retrieve the most similar pieces of embedded data.
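
As an illustrative sketch (not the companion repository's code), this is roughly what generating an embedding looks like with the openai Python package (version 1.x) against an Azure OpenAI text-embedding-ada-002 deployment, which you set up later in this Learning Path. The api_version value shown is an assumption; use the version your resource supports.

```python
import os
from openai import AzureOpenAI

# Assumes AZURE_OPENAI_KEY and AZURE_OPENAI_ENDPOINT are set in your environment,
# as described later in this Learning Path.
client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-02-01",  # assumed API version
)

def get_embedding(text: str) -> list[float]:
    # Convert a piece of text into a numeric vector using the embedding model.
    response = client.embeddings.create(
        model="text-embedding-ada-002",  # name of your embedding deployment
        input=text,
    )
    return response.data[0].embedding

vector = get_embedding("How do I deploy a Flask application on Arm servers?")
print(len(vector))  # text-embedding-ada-002 produces 1536-dimensional vectors
```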

In the scenario you’ll work on in this Learning Path, this process helps determine which Arm Learning Path best answers the user’s query.

To get started, you have to convert the raw data, which is the Arm Learning Path content, into smaller, more consumable chunks.

In this case, the chunks are small YAML files. You then run each chunk through an embedding model to create embeddings and store them in a FAISS vector database.

Facebook AI Similarity Search (FAISS) is a library developed by Facebook AI Research that is designed to efficiently search for similar vectors in large datasets. FAISS is highly optimized for both memory usage and speed, making it one of the fastest similarity search libraries available.

One of the key reasons FAISS is so fast is its implementation of efficient Approximate Nearest Neighbor (ANN) search algorithms. ANN algorithms allow FAISS to quickly find vectors that are close to a given query vector without having to compare it to every single vector in the database. This significantly reduces the search time, especially in large datasets.
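
To make this concrete, here is a minimal sketch of building and searching a FAISS index, with random vectors standing in for real embeddings. It uses a simple exact-search index; the companion repository's scripts handle the actual chunk data.

```python
import faiss
import numpy as np

dimension = 1536       # text-embedding-ada-002 vectors have 1536 dimensions
num_vectors = 10_000

# Random vectors stand in for real chunk embeddings in this sketch.
database_vectors = np.random.random((num_vectors, dimension)).astype("float32")

# IndexFlatL2 performs exact search; FAISS also provides ANN index types such as
# IndexIVFFlat and IndexHNSWFlat that trade a little accuracy for speed.
index = faiss.IndexFlatL2(dimension)
index.add(database_vectors)

# Search for the 5 nearest neighbors of a query vector.
query = np.random.random((1, dimension)).astype("float32")
distances, indices = index.search(query, 5)
print(indices[0])    # positions of the closest stored vectors
print(distances[0])  # their L2 distances from the query
```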

Additionally, FAISS performs all searches in-memory, which means that it can leverage the full speed of the system’s RAM. This in-memory search capability ensures that the search operations are extremely fast, as they avoid the latency associated with disk I/O operations.

In this application, you will take the input from the user and embed it using the same model you used for your database. You will then use FAISS nearest-neighbor search to compare the user input to the stored vectors and find the closest matches. Next, you will map those matches back to the original chunk files. Using the data from the chunk.yaml files, you can retrieve the Arm resource(s) most relevant to the user's question.
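
Put together, the retrieval step might look roughly like the sketch below. It assumes the faiss_index.bin and metadata.json files generated later in this Learning Path, assumes that metadata.json maps vector positions back to chunk details (the exact layout is defined by the companion repository's scripts), and reuses the get_embedding() helper sketched earlier.

```python
import json
import faiss
import numpy as np

# Load the vector database files generated later in this Learning Path.
index = faiss.read_index("faiss_index.bin")

# Assumption: metadata.json maps each vector's position back to the chunk it came from.
with open("metadata.json") as f:
    metadata = json.load(f)

def retrieve(question: str, k: int = 3) -> list:
    # Embed the user's question with the same model used to build the database.
    # get_embedding() is the Azure OpenAI helper sketched in the embeddings section.
    query_vector = np.array([get_embedding(question)], dtype="float32")
    # Find the k nearest stored vectors.
    distances, indices = index.search(query_vector, k)
    # Map the matching vector positions back to the original chunk data.
    return [metadata[int(i)] for i in indices[0]]
```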

You can then use the retrieved resources to augment the context for the LLM, which generates a final response that is both contextually relevant and accurate.
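
For example, using the client and retrieve() helpers sketched above, the augmentation step could look like this. The deployment name gpt-4o and the prompt format are assumptions for illustration; use whatever chat model you have deployed.

```python
def answer(question: str) -> str:
    # Retrieve the most relevant Learning Path chunks for this question.
    chunks = retrieve(question)
    context = "\n\n".join(str(chunk) for chunk in chunks)

    # Fold the retrieved context into the prompt before calling the chat model.
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed name of your chat deployment
        messages=[
            {
                "role": "system",
                "content": "Answer using the provided Arm Learning Path context.\n\n"
                           f"Context:\n{context}",
            },
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```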

In-Memory Deployment

To ensure that your application scales efficiently, copy the FAISS database into every deployment instance. By deploying a static in-memory vector store in each instance, you eliminate the need for a centralized database, which can become a bottleneck as the number of requests increases.

When each instance has its own copy of the FAISS database, it can perform vector searches locally, leveraging the full speed of the system’s RAM. This approach ensures that the search operations are extremely fast and reduces the latency associated with network calls to a centralized database.

Moreover, this method enhances the reliability and fault tolerance of the application. If one instance fails, others can continue to operate independently without being affected by the failure. This decentralized approach also simplifies the deployment process, as each instance is self-contained and does not rely on external resources for vector searches.

By copying the FAISS database into every deployment, you can achieve a scalable, high-performance solution that can handle a large number of requests.
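
In practice, this means each Flask instance loads its local copy of the index and metadata once at startup and serves every search from RAM. The following is a minimal sketch; the /search endpoint is hypothetical, the file names mirror the files generated later in this Learning Path, and the companion repository is the reference implementation.

```python
import json
import faiss
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the local copy of the vector store once, at startup. All subsequent
# searches run entirely in this instance's memory.
index = faiss.read_index("faiss_index.bin")
with open("metadata.json") as f:
    metadata = json.load(f)

@app.route("/search", methods=["POST"])  # hypothetical endpoint, for illustration
def search():
    question = request.json["question"]
    results = retrieve(question)  # the retrieval helper sketched earlier
    return jsonify(results)
```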

Collecting Data into Chunks

There is a companion GitHub repo for this Learning Path that serves as a Python-based Copilot RAG Extension example.

In this repo, you can find scripts to convert an Arm Learning Path into a series of chunk.yaml files for use in the RAG application.

Clone the GitHub repository

To clone the repo, run:

```bash
git clone https://github.com/ArmDeveloperEcosystem/python-rag-extension.git
```

Chunk Creation Script Setup

  1. Navigate to the vectorstore folder in the python-rag-extension GitHub repository that you just cloned:

```bash
cd python-rag-extension/vectorstore
```

  2. It is recommended that you use a virtual environment to manage dependencies.

Ensure you have conda set up in your development environment. If you need guidance, follow the [Installation Guide](https://docs.anaconda.com/miniconda/install/).

  3. To create a new conda environment, use the following command:

```bash
conda create --name vectorstore python=3.11
```

Once setup is complete, activate the new environment:

```bash
conda activate vectorstore
```

Install the required packages:

```bash
conda install --file vectorstore-requirements.txt
```

Generate Chunk Files

To generate chunks, use the following command:

```bash
python chunk_a_learning_path.py --url <LEARNING_PATH_URL>
```

Replace <LEARNING_PATH_URL> with the URL of the Learning Path that you want to process.

If no URL is provided, the script defaults to a known Learning Path URL.

The script processes the specified Learning Path and saves the chunks as YAML files in a ./chunks/ directory.

Combine Chunks into FAISS Index

Once you have a ./chunks/ directory full of YAML files, use FAISS to create your vector database.

OpenAI Key and Endpoint

Ensure your local environment has your AZURE_OPENAI_KEY and AZURE_OPENAI_ENDPOINT set.

If required, you can follow the instructions below that explain how to generate and deploy Azure OpenAI keys.

Generate and Deploy Azure OpenAI keys

  1. Create an OpenAI Resource:
  • Go to the Azure Portal.
  • Click on Create a resource.
  • Search for OpenAI, and select Azure OpenAI Service.
  • Click Create.
  2. Configure the OpenAI Resource:
  • Fill in the required details such as Subscription, Resource Group, Region, and Name.
  • Click Review + create and then Create to deploy the resource.
  3. Generate the API Key and Endpoint:
  • Once the resource is created, navigate to the resource page.
  • Under the Resource Management -> Keys and Endpoint section, you will find the key and endpoint values.
  • Copy these values and set them in your local environment:

```bash
export AZURE_OPENAI_KEY="<your_openai_key>"
export AZURE_OPENAI_ENDPOINT="https://<your_openai_endpoint>.openai.azure.com/"
```

You now have the necessary keys to use Azure OpenAI in your application.

  4. Deploy the text-embedding-ada-002 model:
  • Open Azure AI Foundry for your new resource.
  • Under Deployments, ensure you have a deployment for text-embedding-ada-002.

Generate Vector Database Files

Run the Python script to create the FAISS index .bin and .json files.

Note

This assumes the chunk files are located in the chunks subfolder, which is where the chunking script places them by default.

```bash
python local_vectorstore_creation.py
```

Copy the generated .bin and .json files to the root directory of your Flask application.

They are in the vectorstore/chunks folder. Since you are likely still in the vectorstore folder, run these commands to copy them:

```bash
cp chunks/faiss_index.bin ../
cp chunks/metadata.json ../
```

Your vector database is now ready for your Flask application.
