How to use diagnostic tools to run performance tests on a cluster in ThinkAgile HX
Description
This article describes the procedure for running performance tests on a cluster with the diagnostics utility. The tool is useful for pre-sales demonstrations of a cluster and for identifying the source of performance issues in a production cluster. Diagnostics should also be run as part of the setup process to verify that a cluster is running properly before the customer takes ownership of it.
The diagnostic utility deploys a VM on each node in the cluster. Controller VMs (CVMs) control the diagnostic VM on their hosts and report back to a single system.
The diagnostics test covers the following data:
- Sequential write bandwidth
- Sequential read bandwidth
- Random read IOPS
- Random write IOPS
Applicable Systems
ThinkAgile HX
Procedure
1. Use SSH to log in to any CVM in the cluster.
2. Prepare for the diagnostics test by cleaning up any artifacts from previous runs.
nutanix@CVM:~$ ~/diagnostics/diagnostics.py cleanup
Cleaning up node 10.10.3.13 ... done.
Cleaning up node 10.10.3.14 ... done.
Cleaning up node 10.10.3.15 ... done.
Cleaning up the container and the storage pool ... done.
3. Run the diagnostics test.
nutanix@CVM:~$ ~/diagnostics/diagnostics.py run
The script performs the following tasks:
- Installs a diagnostic VM on each node in the cluster
- Creates cluster entities to support the test, if necessary
- Uses the Linux fio utility to run four performance tests
- Reports the results back to a single system
If the command fails with the message ERROR:root:Zookeeper host port list is not set, refresh the environment by running either source /etc/profile or bash -l, and then run the command again.
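The script generates its own fio job definitions inside the diagnostic VMs, so the exact parameters are internal to the utility. As a rough illustration only, a sequential-write job of the general kind it runs might look like the following fio job file; every parameter value here is an assumption for illustration, not the script's actual configuration.

```ini
; Illustrative fio job only -- the diagnostics script builds its own jobs
; inside the UVMs. All values below are assumptions for illustration.
[seq-write]
rw=write          ; sequential writes
bs=1m             ; large block size, typical for bandwidth tests
size=4g           ; total data written per job
direct=1          ; bypass the page cache
ioengine=libaio   ; asynchronous I/O
iodepth=32        ; outstanding I/Os per job
```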
4. The test might take up to 15 minutes to complete on a four-node cluster. Allow more time for larger clusters.
5. When the test completes, review the results. You can also review the archived results in the /home/nutanix/diagnostics/results/<timestamp> directory.
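Each run is archived under its own timestamped directory, so the most recent run is the newest directory. A minimal sketch of locating it is shown below; a mock path and made-up timestamp names are used here so the commands run anywhere, whereas on a CVM you would point at /home/nutanix/diagnostics/results instead.

```shell
# Sketch: pick the newest timestamped results directory. The path and the
# directory names below are mock values for illustration.
results=/tmp/mock_results
mkdir -p "$results/2024-01-10_10-00-00"
sleep 1   # ensure the second directory gets a newer mtime
mkdir -p "$results/2024-01-10_11-30-00"

# ls -t sorts by modification time, newest first
latest=$(ls -t "$results" | head -n 1)
echo "Most recent run: $latest"
```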
6. Because the test creates new cluster entities, it is necessary to run a cleanup script when you are finished.
nutanix@CVM:~$ ~/diagnostics/diagnostics.py cleanup
Diagnostics output
System output similar to the following indicates a successful test.
Checking if an existing storage pool can be used ...
Using storage pool sp1 for the tests.
Checking if the diagnostics container exists ... does not exist.
Creating a new container NTNX-diagnostics-ctr for the runs ... done.
Mounting NFS datastore 'NTNX-diagnostics-ctr' on each host ... done.
Deploying the diagnostics UVM on host 172.16.8.170 ... done.
Preparing the UVM on host 172.16.8.170 ... done.
Deploying the diagnostics UVM on host 172.16.8.171 ... done.
Preparing the UVM on host 172.16.8.171 ... done.
Deploying the diagnostics UVM on host 172.16.8.172 ... done.
Preparing the UVM on host 172.16.8.172 ... done.
Deploying the diagnostics UVM on host 172.16.8.173 ... done.
Preparing the UVM on host 172.16.8.173 ... done.
VM on host 172.16.8.170 has booted. 3 remaining.
VM on host 172.16.8.171 has booted. 2 remaining.
VM on host 172.16.8.172 has booted. 1 remaining.
VM on host 172.16.8.173 has booted. 0 remaining.
Waiting for the hot cache to flush ... done.
Running test 'Prepare disks' ... done.
Waiting for the hot cache to flush ... done.
Running test 'Sequential write bandwidth (using fio)' ... bandwidth MBps
Waiting for the hot cache to flush ... done.
Running test 'Sequential read bandwidth (using fio)' ... bandwidth MBps
Waiting for the hot cache to flush ... done.
Running test 'Random read IOPS (using fio)' ... operations IOPS
Waiting for the hot cache to flush ... done.
Running test 'Random write IOPS (using fio)' ... operations IOPS
Tests done.
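When reviewing an archived run, the four headline result lines can be pulled out of the log with grep. A minimal sketch follows, using a stand-in file built from the sample output above; on a CVM you would grep the archived log under the results directory from step 5 instead.

```shell
# Stand-in log built from the sample output in this article.
log=/tmp/diag_sample.log
cat > "$log" <<'EOF'
Running test 'Sequential write bandwidth (using fio)' ... bandwidth MBps
Running test 'Sequential read bandwidth (using fio)' ... bandwidth MBps
Running test 'Random read IOPS (using fio)' ... operations IOPS
Running test 'Random write IOPS (using fio)' ... operations IOPS
EOF

# Print just the per-test result lines
grep "using fio" "$log"
```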
Note:
- Expected results vary based on the specific AOS version and hardware model used.
- The IOPS values reported by the diagnostics script are higher than the values reported by the Nutanix management interfaces. This is because the diagnostics script reports physical disk I/O, and the management interfaces show IOPS reported by the hypervisor.
Additional Information
Related Articles
- Lenovo ThinkAgile HX Series knowledge base article landing page
- How to run the NCC health check and collect the output using Nutanix Prism
- How to run the NCC health check and collect the output using the Nutanix CVM CLI
- How to collect hypervisor logs using SSH to connect to a Controller VM in ThinkAgile HX systems