ClickHouse benchmarking on Axion processors

In this section, you'll benchmark ClickHouse query latency on Google Axion (Arm64). The goal is to produce repeatable measurements, with a focus on p95 latency, using the data ingested by the real-time Dataflow pipeline.

Prepare ClickHouse for accurate latency measurement

Disable the query cache

ClickHouse can serve repeated queries from its query cache, which artificially lowers latency numbers. To ensure that every query is fully executed, disable the query cache.

Run this inside the ClickHouse client:

    SET use_query_cache = 0;

Note that SET applies only to the current client session; a new session starts with the default value again.
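
To confirm the setting took effect in your session, you can optionally query the system.settings table:

    SELECT name, value, changed
    FROM system.settings
    WHERE name = 'use_query_cache';

A value of 0 with changed = 1 confirms the cache is disabled for this session.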

Validate dataset size

Ensure enough data is present to produce meaningful latency results by checking the current row count.

    SELECT count(*) FROM realtime.logs;

The output is similar to:

       ┌─count()─┐
    1. │ 5000013 │ -- 5.00 million
       └─────────┘

If the row count is low, optionally generate additional rows:

    INSERT INTO realtime.logs
    SELECT
        now() - number,
        concat('service-', toString(number % 10)),
        'INFO',
        'benchmark message'
    FROM numbers(1000000);

The output is similar to:

    Query id: 8fcbefab-fa40-4124-8f23-516fca2b8fdd

    Ok.

    1000000 rows in set. Elapsed: 0.058 sec. Processed 1.00 million rows, 8.00 MB (17.15 million rows/s., 137.20 MB/s.)
    Peak memory usage: 106.54 MiB.

Define benchmark queries

These queries represent common real-time analytics patterns:

  • Filtered count: service-level analytics
  • Time-windowed count: recent activity
  • Aggregation by service: grouped analytics

Each query scans and processes millions of rows to stress the execution engine.
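
Before timing them, you can optionally inspect how ClickHouse plans one of these scans. A minimal sketch using EXPLAIN with index analysis enabled; the exact output depends on your table's engine and ordering key:

    EXPLAIN indexes = 1
    SELECT count(*)
    FROM realtime.logs
    WHERE service = 'service-5';

If service is part of the table's ORDER BY key, the plan shows how many granules the primary index prunes; otherwise the query is a full scan, which is what the throughput numbers below reflect.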

Query 1 – Filtered count (service-level analytics)

    SELECT count(*)
    FROM realtime.logs
    WHERE service = 'service-5';

The output is similar to:

    Query id: cfbab386-7168-42ce-a752-2d5146f68b48

       ┌─count()─┐
    1. │  350000 │
       └─────────┘

    1 row in set. Elapsed: 0.013 sec. Processed 6.00 million rows, 74.50 MB (466.81 million rows/s., 5.80 GB/s.)
    Peak memory usage: 3.25 MiB.

Query 2 – Time-windowed count (recent activity)

    SELECT count(*)
    FROM realtime.logs
    WHERE event_time >= now() - INTERVAL 10 MINUTE;

The output is similar to:

    Query id: 7654746b-3068-4663-a5c6-6944d9c2d2b9

       ┌─count()─┐
    1. │     572 │
       └─────────┘

    1 row in set. Elapsed: 0.003 sec.

Query 3 – Aggregation by service

    SELECT
        service,
        count(*) AS total
    FROM realtime.logs
    GROUP BY service
    ORDER BY total DESC;

The output is similar to:

    Query id: c48c0d30-0ef6-4fb9-bbb9-815a509a5f91

        ┌─service────┬──total─┐
     1. │ service-6  │ 350000 │
     2. │ service-1  │ 350000 │
     3. │ service-0  │ 350000 │
     4. │ service-7  │ 350000 │
     5. │ service-3  │ 350000 │
     6. │ service-4  │ 350000 │
     7. │ service-5  │ 350000 │
     8. │ service-2  │ 350000 │
     9. │ service-9  │ 350000 │
    10. │ service-8  │ 350000 │
    11. │ service-10 │ 250000 │
    12. │ service-15 │ 250000 │
    13. │ service-16 │ 250000 │
    14. │ service-13 │ 250000 │
    15. │ service-18 │ 250000 │
    16. │ service-17 │ 250000 │
    17. │ service-19 │ 250000 │
    18. │ service-12 │ 250000 │
    19. │ service-11 │ 250000 │
    20. │ service-14 │ 250000 │
    21. │ api        │     12 │
    22. │ local      │      1 │
        └────────────┴────────┘

    22 rows in set. Elapsed: 0.011 sec. Processed 6.00 million rows, 74.50 MB (527.10 million rows/s., 6.54 GB/s.)
    Peak memory usage: 7.18 MiB.

Run repeatable latency measurements

To calculate reliable latency metrics, execute the same query 10 times. The clickhouse-client --time flag prints each query result to stdout and the elapsed time in seconds to stderr, so run the query in a loop and capture both streams in a results file:

    for i in $(seq 1 10); do
      clickhouse-client --time --query "
        SELECT count(*)
        FROM realtime.logs
        WHERE service = 'service-5';
      "
    done 2>&1 | tee latency-results.txt

You should see an output similar to:

    350000
    0.009
    350000
    0.009
    350000
    0.009
    350000
    0.011
    350000
    0.010
    350000
    0.011
    350000
    0.009
    350000
    0.009
    350000
    0.009
    350000
    0.011

Each run prints two lines:

  • The query result (row count) on stdout
  • The execution time in seconds on stderr

Only the time values are needed for the percentile calculation, so edit the results file to remove the row counts:

    vi latency-results.txt

Only the latency values are required for the statistical analysis. After removing the row counts, the file should contain:

    0.009
    0.009
    0.009
    0.011
    0.010
    0.011
    0.009
    0.009
    0.009
    0.011
This gives clean input for sorting and percentile calculation; delete any remaining 350000 result lines. If you prefer not to edit by hand, see the sketch below.
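
A minimal non-interactive alternative, assuming the row counts are plain integers and the latency values always contain a decimal point:

    # Keep only the lines containing a decimal point (the latency values)
    grep '\.' latency-results.txt > latency-times.txt
    mv latency-times.txt latency-results.txt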

Sort the latency values in ascending order to compute percentiles:

    sort -n latency-results.txt

The output is similar to:

    0.009
    0.009
    0.009
    0.009
    0.009
    0.009
    0.010
    0.011
    0.011
    0.011

Calculate p95 latency manually

The p95 latency represents the value under which 95% of query executions complete.

Formula:

    p95 index = ceil(0.95 × N)

For 10 samples:

    ceil(0.95 × 10) = ceil(9.5) = 10

The 10th value in the sorted list is your p95 latency.
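
If you would rather let ClickHouse do the arithmetic, a minimal sketch using clickhouse-local to read the cleaned results file; it assumes latency-results.txt is in the current directory with one value per line:

    clickhouse-local --query "
    SELECT quantileExact(0.95)(latency) AS p95_seconds
    FROM file('latency-results.txt', 'TSV', 'latency Float64');
    "

quantileExact sorts the values internally; for these 10 samples it returns 0.011, matching the manual calculation.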

p95 result

    p95 latency = 0.011 seconds ≈ 11 ms

After executing the ClickHouse query 10 times on the Google Axion (Arm64) VM, the observed p95 query latency was ~11 ms, demonstrating consistently low analytical latency on Arm-based infrastructure.
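
As a server-side cross-check, ClickHouse also records per-query durations in system.query_log when query logging is enabled (the default). A sketch; the LIKE pattern here is an assumption about how to isolate the benchmark query, so adjust it to match yours, and note that log entries are flushed periodically rather than instantly:

    -- Approximate the same p95 from the server's own query log
    SELECT quantileExact(0.95)(query_duration_ms) AS p95_ms
    FROM system.query_log
    WHERE type = 'QueryFinish'
      AND query LIKE '%service-5%'
      AND event_time >= now() - INTERVAL 1 HOUR;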

What you’ve accomplished

You’ve successfully completed a comprehensive ClickHouse benchmarking exercise on Google Axion (Arm64) processors. Key results from the c4a-standard-4 (4 vCPU, 16 GB memory) Arm64 VM running SUSE:

  • ClickHouse on Google Axion (Arm64) delivered consistently low query latency, even while scanning ~6 million rows per query.
  • Across 10 repeat executions, the p95 latency was ~11 ms, indicating stable and predictable performance.
  • Disabling the query cache ensured true execution latency, not cache-assisted results.
  • Analytical queries sustained roughly 500 million rows/s scan throughput with minimal memory usage.

Throughout this Learning Path, you provisioned an Arm-based VM on Google Cloud, deployed ClickHouse, configured a real-time streaming pipeline with Pub/Sub and Dataflow, and validated end-to-end analytical performance. You can now deploy, optimize, and benchmark ClickHouse workloads on Google Cloud Arm infrastructure.
