About this Learning Path

Skill level:	Advanced
Reading time:	1 hr 30 min
Last updated:	17 Jun 2026

Author:	Odin Shen, Arm
Arm IP:	Cortex-A
Tags:	ML, Linux, Python, Docker, Ollama

Who is this for?

This is an advanced topic for developers building persistent local AI agent systems on NVIDIA DGX Spark who want to use Arm Grace CPUs for orchestration and Blackwell GPUs for local LLM inference and embeddings.

What will you learn?

Upon completion of this Learning Path, you will be able to:

Describe how persistent AI runtimes combine orchestration, semantic memory, and local inference
Build a continuously running local AI agent using Hermes Agent, Ollama, and Qdrant
Use Arm Grace CPUs to orchestrate event-driven AI workflows on NVIDIA DGX Spark
Deploy semantic memory and contextual retrieval pipelines using vector embeddings and Qdrant

Prerequisites

Before starting, you will need the following:

An NVIDIA DGX Spark system with at least 15 GB of available disk space
Familiarity with running Python scripts and basic Docker container workflows
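
As a quick way to check the disk space prerequisite, here is a minimal Python sketch using only the standard library. The 15 GB threshold comes from the list above. The helper names (`free_gb`, `has_enough_space`) and the choice of the root filesystem as the default path are illustrative assumptions, not part of the Learning Path; point the check at whichever filesystem will hold your models and containers.

```python
import shutil

# Threshold taken from the prerequisites; adjust if your setup differs.
REQUIRED_GB = 15


def free_gb(path="/"):
    """Return free space, in GB (10^9 bytes), on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path="/", required_gb=REQUIRED_GB):
    """Return True if `path` has at least `required_gb` GB free (hypothetical helper)."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    available = free_gb("/")
    status = "OK" if has_enough_space("/") else "insufficient"
    print(f"Free space on /: {available:.1f} GB ({status}, need {REQUIRED_GB} GB)")
```

Run the script directly, or import `has_enough_space` and pass the directory where Docker volumes or model files are stored, since that location may sit on a different filesystem than `/`.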

Orchestrate a persistent local AI agent with Hermes on NVIDIA DGX Spark

Introduction

Explore persistent AI runtime architecture on NVIDIA DGX Spark

Build the DGX Spark AI runtime foundation

Deploy Hermes Agent as an orchestration runtime

Add local LLM inference to Hermes Agent

Build persistent semantic memory for Hermes Agent

Add semantic retrieval and contextual reasoning to Hermes Agent

Add autonomous workspace cognition to Hermes Agent

Next Steps
