Tuning via NIC queue count

To further optimize your settings, set the NIC queue count and observe the performance uplift.

Network cards in bare-metal environments typically expose a large number of transmit/receive queues, reaching 63 on Arm Neoverse. Each transmit/receive queue corresponds to one interrupt number. Before CPU cores are taken offline, there are enough cores to handle these interrupts. With only 8 cores retained, however, a single core has to handle multiple interrupts, which triggers more context switches.
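
The arithmetic behind this can be sketched in shell. The helper name `irqs_per_core` is hypothetical, and the figures (63 queues, 8 cores) are the ones quoted above. It assumes the interrupts are spread as evenly as possible across cores, which on a real system depends on how IRQ affinity is configured.

```shell
# Hypothetical helper: worst-case number of queue interrupts a single core
# services when QUEUES interrupts are spread evenly over CORES cores.
irqs_per_core() {
  local queues=$1 cores=$2
  # Integer ceiling division: ceil(queues / cores)
  echo $(( (queues + cores - 1) / cores ))
}

# Figures taken from the text: 63 queues, 8 retained cores.
echo "63 queues on 8 cores: up to $(irqs_per_core 63 8) interrupts per core"
# With the queue count reduced to 8, each core services a single interrupt.
echo "8 queues on 8 cores: up to $(irqs_per_core 8 8) interrupt per core"
```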

Setting NIC queue count

  1. Use the following command to find the NIC name corresponding to the IP address.
    

        
        
ip addr

    

From the output you can see that the NIC name enp1s0f0np0 corresponds to the IP address 10.169.226.181.

    

        
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enP11p4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 0e:cc:0b:ff:f6:57 brd ff:ff:ff:ff:ff:ff
    inet 172.31.46.193/20 metric 100 brd 172.31.47.255 scope global dynamic enP11p4s0
       valid_lft 1938sec preferred_lft 1938sec
    inet6 fe80::ccc:bff:feff:f657/64 scope link
       valid_lft forever preferred_lft forever

        
    
  1. Set a variable for the network interface name:
    

        
        
net=enp1s0f0np0

    
  1. Use the following command to check the current number of transmit/receive queues for the ${net} network interface:
    

        
        
sudo ethtool -l ${net}

    

It can be observed that the number of transmit/receive queues for the ${net} network interface is currently 63.

    

        
        
Channel parameters for enP11p4s0:
Pre-set maximums:
RX:		n/a
TX:		n/a
Other:		n/a
Combined:	32
Current hardware settings:
RX:		n/a
TX:		n/a
Other:		n/a
Combined:	32

    
  1. Use the following command to set the number of transmit/receive queues for the ${net} network interface to 8, matching the number of CPUs:
    

        
        
sudo ethtool -L ${net} combined 8

    
  1. Verify that the setting has been applied:
    

        
        
sudo ethtool -l ${net}

    

You should see that the number of combined Rx/Tx queues has been updated to 8.

    

        
Channel parameters for enP11p4s0:
Pre-set maximums:
RX:		n/a
TX:		n/a
Other:		n/a
Combined:	32
Current hardware settings:
RX:		n/a
TX:		n/a
Other:		n/a
Combined:	8
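
The check-and-set steps above can be combined into a small script. This is a sketch rather than part of the original procedure: the helper name `current_combined` is hypothetical, and it assumes `ethtool -l` prints a `Combined:` line under `Current hardware settings:` as in the sample output. The commands that touch the real NIC are left as comments because they need root privileges and a live interface.

```shell
# Hypothetical helper: read `ethtool -l` output on stdin and print the
# Combined value from the "Current hardware settings" section.
current_combined() {
  awk '
    /^Current hardware settings:/ { in_current = 1; next }
    in_current && $1 == "Combined:" { print $2; exit }
  '
}

# Demonstration using an abridged copy of the sample output shown above.
sample='Channel parameters for enP11p4s0:
Pre-set maximums:
Combined:	32
Current hardware settings:
Combined:	8'
printf '%s\n' "$sample" | current_combined

# On a live system (assumes ${net} is set as in the earlier step):
#   want=8
#   have=$(sudo ethtool -l "${net}" | current_combined)
#   [ "$have" = "$want" ] || sudo ethtool -L "${net}" combined "$want"
```

Parsing only the section after `Current hardware settings:` avoids picking up the `Combined:` line under `Pre-set maximums:`, which reports the hardware limit rather than the active value.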

        
    

The performance uplift after tuning NIC queue count

  1. Shut down and restart Tomcat on your Arm Neoverse bare-metal instance as shown:
    

        
        
~/apache-tomcat-11.0.10/bin/shutdown.sh 2>/dev/null
ulimit -n 65535 && ~/apache-tomcat-11.0.10/bin/startup.sh

    
  1. Run wrk2 on your x86_64 bare-metal instance:
    

        
        
ulimit -n 65535 && wrk -c1280 -t128 -R500000 -d60 http://${tomcat_ip}:8080/examples/servlets/servlet/HelloWorldExample

    

Notice the performance uplift after tuning the NIC queue count:

    

        
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.35s     4.14s   16.33s    61.16%
    Req/Sec     2.96k    73.02     3.24k    89.16%
  22712999 requests in 1.00m, 11.78GB read
Requests/sec: 378782.37
Transfer/sec:    201.21MB
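
To put the result in context, the achieved throughput can be compared with the rate requested through `-R`. The following is a minimal sketch: the helper names `wrk_rps` and `achieved_pct` are hypothetical, and the numbers are copied from the output above.

```shell
# Hypothetical helper: extract the Requests/sec value from wrk output on stdin.
wrk_rps() {
  awk '$1 == "Requests/sec:" { print $2 }'
}

# Hypothetical helper: achieved rate as a percentage of the target rate.
achieved_pct() {
  awk -v got="$1" -v target="$2" 'BEGIN { printf "%.1f\n", 100 * got / target }'
}

# Values copied from the wrk output above; 500000 is the -R target.
rps=$(printf 'Requests/sec: 378782.37\nTransfer/sec:    201.21MB\n' | wrk_rps)
echo "Achieved ${rps} req/s, $(achieved_pct "$rps" 500000)% of the 500000 req/s target"
```

A value below 100% means the server did not sustain the requested rate during this run.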

        
    