Introduction
Understand Azure Cobalt 100 VMs and Elasticsearch benchmarking with ESRally
Create an Azure Cobalt 100 Arm virtual machine for Elasticsearch deployment
Install Elasticsearch and ESRally on the Cobalt 100 virtual machine instance
Benchmark Elasticsearch using ESRally on the Cobalt 100 virtual machine instance
Next Steps
ESRally is designed to benchmark Elasticsearch instances. It uses a race model where a selected track exercises specific indexing and query patterns. In this section, you’ll run the geonames track to benchmark your Elasticsearch deployment.
The geonames track is a general-purpose Elasticsearch workload based on the GeoNames dataset. ESRally indexes millions of location records and runs representative search operations so you can measure indexing throughput and query latency under mixed ingest and search conditions.
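If you want to see which other workloads ESRally ships with before committing to geonames, you can list the available tracks. This assumes ESRally is already installed from the previous section:

```shell
# List the benchmark tracks bundled with ESRally, including geonames
esrally list tracks
```

Each track description includes the dataset size and the challenges it supports, which is useful when choosing a workload that matches your production query mix.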
Run the benchmark command from the virtual machine. Replace 9.4.0 with the Elasticsearch version you confirmed in the previous section:
esrally race --distribution-version=9.4.0 --track=geonames --kill-running-processes
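If you need to re-confirm the version before starting the race, you can query the local instance directly. This assumes Elasticsearch is still running under systemd and listening on the default port 9200:

```shell
# Query the running Elasticsearch instance and extract the reported version
curl -s http://localhost:9200 | grep '"number"'
```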
By default, ESRally downloads and manages its own Elasticsearch process for the duration of the benchmark. It does not use the Elasticsearch instance you started with systemctl. The --distribution-version flag tells ESRally which Elasticsearch version to download and run. The --kill-running-processes flag stops any existing Elasticsearch processes before starting, including the one started by systemctl, so there is no port conflict on 9200.
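If you would rather benchmark the Elasticsearch instance you already started with systemctl, ESRally's benchmark-only pipeline targets an existing cluster instead of provisioning one. The sketch below assumes the instance is reachable on localhost:9200 with security disabled; adjust the target host for your setup:

```shell
# Benchmark the already-running instance instead of letting ESRally
# download and manage its own Elasticsearch process
esrally race --track=geonames --pipeline=benchmark-only --target-hosts=localhost:9200
```

Note that benchmark-only results reflect your instance's configuration (heap size, disk, plugins), so they are not directly comparable to runs where ESRally manages its own stock distribution.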
This benchmark takes approximately 85-90 minutes to complete on an E4pds_v6 instance. To avoid skewing the results, don't interrupt or pause the run.
The following sample output shows a baseline geonames run on an Azure Cobalt 100 E4pds_v6 virtual machine. Your results will be similar but might vary slightly depending on VM load, Elasticsearch version, and available memory.
Indexing sustained a mean of about 61,054 docs/s with 0% errors. Common read-path workloads such as default search, term search, phrase search, and cached aggregation stayed in an excellent 3-4 ms p50 range.
The system completed the run without any old-generation garbage collections, suggesting healthy JVM behavior under this benchmark. The main latency costs appeared in heavier workloads such as uncached aggregation, scroll, expression queries, and script-based scoring. This is consistent with Elasticsearch performance expectations for compute-intensive query patterns.
Performance summary for the E4pds_v6 Arm64 Cobalt 100 VM
Performance details on the E4pds_v6 Arm64 Cobalt 100 VM
field_value_script_score was about 1.39x slower than field_value_function_score at p50 latency (197.8 ms vs 142.1 ms). painless_static was the slowest task in the run at 410.9 ms p50, slightly ahead of painless_dynamic at 384.9 ms p50.

This benchmark result supports the view that the Azure Cobalt 100 E4pds_v6 VM delivers strong baseline Elasticsearch performance on the geonames track, especially for ordinary search, sort, and cached aggregation paths. Indexing sustained over 61,000 docs/s on average with zero errors, and the JVM remained stable throughout, with no old-generation collections.
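The 1.39x figure follows directly from the two reported p50 values; you can reproduce the arithmetic with a one-liner:

```shell
# Ratio of field_value_script_score p50 to field_value_function_score p50
awk 'BEGIN { printf "%.2fx\n", 197.8 / 142.1 }'
# prints 1.39x
```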
The main practical limitation is query complexity. Scripting, score computation, scroll, and uncached aggregation create a clear latency step-up relative to fast-path queries. These workloads should be isolated, cached, or minimized when low latency matters.
You’ve now deployed Elasticsearch on an Arm-based Azure Cobalt 100 virtual machine, installed ESRally, and run a baseline geonames benchmark. You’ve interpreted the main throughput and latency indicators to assess initial performance on Arm.
You can now use this baseline to continue tuning or evaluating Arm-based deployments.