Java on Arm Neoverse processors

Java is a general-purpose, high-level programming language used in a wide variety of applications.

Java code is typically compiled to bytecode that runs on a Java virtual machine (JVM) making it portable across computer architectures.

There are many ways to install Java on Arm Linux distributions. Refer to the Java install guide for information about how to install Java.

How can I increase Java performance on Arm?

Java runs well on Arm, but there are some things to investigate to make sure you are getting the best performance.

Which JVM flags impact performance?

The JVM includes a number of flags which are available to tune performance and aid in debugging. Some of the flags are general purpose and some are Arm architecture specific.

To print the final values of the flags after the JVM has been initialized run:

    

        
        
            java -XX:+PrintFlagsFinal -version
        
    

Use the output from the above command to see the current values before making any changes.

For example, to check the value of CICompilerCount run:

    

        
        
            java -XX:+PrintFlagsFinal -version | grep CICompilerCount
        
    

The value of the flag is printed:

    

        
          intx CICompilerCount = 3     {product} {ergonomic}

        
    

Flags to review

The -XX:-TieredCompilation -XX:ReservedCodeCacheSize -XX:InitialCodeCacheSize flags provide significant performance improvement for some applications.

The -XX:-TieredCompilation flag turns off intermediate compilation tiers. This may help long-running, predictable applications especially when the warmup period doesn’t significantly impact overall performance.

You can increase the code cache sizes using the -XX:ReservedCodeCacheSize and -XX:InitialCodeCacheSize flags. Increasing them may improve performance. You can also decrease them in a memory constrained environment.

The -XX:CICompilerCount flag specifies the number of compiler threads for the just-in-time (JIT) compiler. This can be useful to tune performance on multi-core systems.

The -XX:CompilationMode flag instructs the JIT compiler to to use highest optimization level (high-only) or the quick optimization level (quick-only).

The best way to determine optimal values for your application is to test your application with different values.

Architecture flags

There are flags that are specific to the Arm architecture and indicate if a particular hardware feature is used. You can see all the flags by running:

    

        
        
            java -XX:+PrintFlagsFinal -version | grep ARCH
        
    

The output is similar to:

    

        
        bool AvoidUnalignedAccesses                   = false                                {ARCH product} {default}
intx BlockZeroingLowLimit                     = 256                                  {ARCH product} {default}
bool NearCpool                                = true                                 {ARCH product} {default}
 int SoftwarePrefetchHintDistance             = 192                                  {ARCH product} {default}
bool TraceTraps                               = false                                {ARCH product} {default}
bool UseBlockZeroing                          = true                                 {ARCH product} {default}
cstr UseBranchProtection                      = none                                 {ARCH product} {default}
bool UseCRC32                                 = true                                 {ARCH product} {default}
bool UseCryptoPmullForCRC32                   = false                                {ARCH product} {default}
bool UseLSE                                   = true                                 {ARCH product} {default}
bool UseNeon                                  = false                                {ARCH product} {default}
bool UseSIMDForArrayEquals                    = true                                 {ARCH product} {default}
bool UseSIMDForBigIntegerShiftIntrinsics      = true                                 {ARCH product} {default}
bool UseSIMDForMemoryOps                      = false                                {ARCH product} {default}
uint UseSVE                                   = 0                                    {ARCH product} {default}
bool UseSimpleArrayEquals                     = false                                {ARCH product} {default}

        
    

Depending on your application, you may want to investigate the vector processing flags for SIMD, NEON, SVE, and CRC.

You can try Process Watch to monitor the usage of SIMD and CRC instructions.

Refer to the Java documentation for more information about the flags.

Crypto

AES encryption/decryption has been optimized for Arm, and the optimizations are enabled by default in newer versions of Java.

You can also look at Amazon Corretto Crypto Provider (ACCP) 2 to get the best performance on Arm.

Stack size

The default stack size for Java threads (ThreadStackSize) is 2 MB on aarch64 compared to 1 MB on x86_64. You can check the default with:

    

        
        
            java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
        
    

The output is:

    

        
             intx CompilerThreadStackSize = 2048  {pd product} {default}
     intx ThreadStackSize         = 2048  {pd product} {default}
     intx VMThreadStackSize       = 2048  {pd product} {default}

        
    

The default can be changed on the command line with either -XX:ThreadStackSize=<kbytes> or -Xss<bytes>. Note that the first option is in kilobytes and the second is in bytes.

Usually, there’s no need to change the default stack size, because the thread stack will be committed as it grows.

Transparent Huge Pages

If Transparent Huge Pages (THP) are set to always, the page size matches the default stack size. In this case, the full stack size is committed to memory. If you have a very high number of threads the memory usage will be large.

To mitigate this issue, you can either manually change the stack size using the flags or change the THP setting to madvise.

To view the THP value run:

    

        
        
            cat /sys/kernel/mm/transparent_hugepage/enabled
        
    

If you see the output shows the current value in brackets:

    

        
        [always] madvise never

        
    

You can change to madvise by running:

    

        
        
            echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
        
    

Even if you change from always to madvise, the JVM can still use THP for the Java heap and code cache if you use the -XX:+UseTransparentHugePages flag.

Native code in shared objects

A Java JAR (Java Archive) file is a package file format used to aggregate multiple Java class files, along with associated metadata and resources (like text, images, etc.), into a single file for distribution.

JAR files can include compiled native code. The Java Native Interface (JNI) is used to interact with native code. For Linux, this means shared libraries (.so files) are included in the JAR.

Java JARs can include shared-objects that are architecture specific. Some Java libraries check if these shared objects are found and if they are they use a JNI to call to the native library instead of relying on a generic Java implementation of the function. While the code might work, without the JNI the performance can suffer.

A quick way to check if a JAR contains such shared objects is to simply unzip it and check if any of the resulting files are shared-objects and if an aarch64 (arm64) shared-object is missing:

Unzip and search for share objects in the JAR:

    

        
        
            unzip foo.jar
find . -name "*.so" -exec file {} \;
        
    

For each x86-64 ELF file, check there is a corresponding aarch64 ELF file in the binaries. With some common packages (e.g. commons-crypto) we’ve seen that even though a JAR can be built supporting Arm manually, artifact repositories such as Maven don’t have updated versions. To see if a certain artifact version may have Arm support, consult the Common JARs with native code Table .

Are there other Java on Arm blogs?

Here are a few resources:

Back
Next