NCC Health Check: cvm_memory_usage_check
Description
The NCC health check cvm_memory_usage_check verifies that the Controller VM (CVM) on each node has enough available memory.
The check uses the MemAvailable metric reported in /proc/meminfo on each CVM. By default, the check fails if MemAvailable on any CVM is less than:
- 768000 KB (750 MB) with NCC versions lower than 3.10
- 589824 KB (576 MB) with NCC 3.10 and higher
Note: If the value of MemAvailable on any CVM is less than the configured threshold for 20 minutes, a critical alert of "CVM or PC VM RAM Usage High" is triggered with ID A1056.
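The threshold comparison described above can be sketched in Python. This is a minimal illustration, not the NCC implementation: the names parse_meminfo and memory_check_fails and the sample text are hypothetical, and only the two threshold values come from this article.

```python
# Default thresholds in KB, as listed in this article.
THRESHOLD_KB_576_MB = 589824  # NCC 3.10 and higher
THRESHOLD_KB_750_MB = 768000  # older NCC versions


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (KB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_check_fails(meminfo_text, threshold_kb=THRESHOLD_KB_576_MB):
    """Return True when MemAvailable is below the threshold."""
    return parse_meminfo(meminfo_text)["MemAvailable"] < threshold_kb


# Made-up sample in the /proc/meminfo layout (hypothetical values).
SAMPLE = """MemTotal:       32880000 kB
MemFree:          300000 kB
MemAvailable:     758000 kB
"""

print(memory_check_fails(SAMPLE, THRESHOLD_KB_750_MB))
print(memory_check_fails(SAMPLE, THRESHOLD_KB_576_MB))
```

On a CVM, the same functions could be fed the contents of /proc/meminfo instead of the sample string.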
If you receive the following alert while the available memory is greater than 750 MB, upgrade NCC to the latest version:
Main memory usage in Controller VM or Prism Central VM {ip_address} is high. {available_memory_kb} KB of memory is free
Running the NCC Check
Run the NCC check as part of the complete NCC Health Checks:
nutanix@cvm$ ncc health_checks run_all
Or run this check separately:
nutanix@cvm$ ncc health_checks system_checks cvm_memory_usage_check
You can also run the checks from the Prism web console Health page: select Actions > Run Checks. Select All checks and click Run.
- This check runs on Controller VMs and Prism Central VMs.
- This check is scheduled to run every 5 minutes, by default.
- This check will generate an alert after 5 consecutive failures across scheduled intervals.
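The relationship between the 5-minute schedule and the 5-failure alert rule can be modeled with a short Python sketch. It is only an illustration of the counting logic under the defaults listed above; the function names are hypothetical and the actual NCC alerting code is not shown in this article.

```python
CHECK_INTERVAL_MIN = 5      # default schedule from this article
FAILURES_BEFORE_ALERT = 5   # consecutive failures needed for an alert


def first_alert_index(results):
    """Return the index of the run that would raise the alert, or None.

    `results` is a sequence of booleans, one per scheduled run,
    where True means the check failed on that run.
    """
    streak = 0
    for index, failed in enumerate(results):
        streak = streak + 1 if failed else 0  # a pass resets the streak
        if streak >= FAILURES_BEFORE_ALERT:
            return index
    return None


def minutes_from_first_failure_to_alert():
    """Time between the first and the last failure of the streak."""
    return (FAILURES_BEFORE_ALERT - 1) * CHECK_INTERVAL_MIN


print(first_alert_index([True] * 5))
print(minutes_from_first_failure_to_alert())
```

With the defaults, five consecutive failures span 20 minutes from the first failing run to the fifth, which matches the 20-minute window mentioned in the Description.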
Note: /proc/meminfo also reports a MemFree value alongside MemAvailable. However, MemFree does not include buffer and cache memory, which can be reclaimed. Do not use MemFree to judge how much memory is available.
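A small Python sketch illustrates why MemFree is misleading. The sample values below are made up for illustration, and field_kb is a hypothetical helper; the point is only that a host with a large reclaimable cache can show a low MemFree while MemAvailable remains well above the threshold.

```python
# Hypothetical /proc/meminfo excerpt: low MemFree, large reclaimable cache.
SAMPLE_MEMINFO = """MemTotal:       32880000 kB
MemFree:          410000 kB
MemAvailable:    6200000 kB
Buffers:          350000 kB
Cached:          5900000 kB
"""

THRESHOLD_KB = 589824  # default threshold with NCC 3.10 and higher


def field_kb(text, name):
    """Return the value in KB of one /proc/meminfo field."""
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key.strip() == name:
            return int(rest.split()[0])
    raise KeyError(name)


mem_free = field_kb(SAMPLE_MEMINFO, "MemFree")
mem_available = field_kb(SAMPLE_MEMINFO, "MemAvailable")

# Comparing MemFree would flag this host; comparing MemAvailable would not.
free_below_threshold = mem_free < THRESHOLD_KB
available_below_threshold = mem_available < THRESHOLD_KB
print(free_below_threshold, available_below_threshold)
```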
Sample output
For Status: WARN
------------------------------------------------------------------------+
Detailed information for cvm_memory_usage_check:
Node x.x.x.x:
WARN: Unable to parse response from x.x.x.x:
Refer to KB 2473 (http://portal.nutanix.com/kb/2473) for details on cvm_memory_usage_check or Recheck with: ncc health_checks system_checks cvm_memory_usage_check --cvm_list=x.x.x.x
For Status: FAIL (the threshold is 589824 KB for NCC 3.10 and higher and 768000 KB for NCC lower than 3.10)
Node x.x.x.x:
Main memory usage in Controller VM 10.x.x.x is high. 758000 KB of memory is free.
Refer to KB 2473 (http://portal.nutanix.com/kb/2473) for details on cvm_memory_usage_check or Recheck with: ncc health_checks system_checks cvm_memory_usage_check --cvm_list=x.x.x.x
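When reviewing saved check output, the available-memory figure can be pulled out of the FAIL message with a regular expression. The sketch below assumes the message keeps the wording shown in the sample output; the pattern and the function name parse_fail_message are hypothetical and not part of NCC.

```python
import re

# Pattern modeled on the FAIL message shown in the sample output.
FAIL_PATTERN = re.compile(
    r"Main memory usage in (?P<vm_type>.+?) (?P<ip>\S+) is high\. "
    r"(?P<kb>\d+) KB of memory is free"
)


def parse_fail_message(line):
    """Return (vm_type, ip, available_kb), or None if the line does not match."""
    match = FAIL_PATTERN.search(line)
    if match is None:
        return None
    return match.group("vm_type"), match.group("ip"), int(match.group("kb"))


sample = ("Main memory usage in Controller VM 10.x.x.x is high. "
          "758000 KB of memory is free.")
vm_type, ip, available_kb = parse_fail_message(sample)
available_mb = available_kb / 1024  # convert KB to MB
print(vm_type, ip, available_kb, round(available_mb, 1))
```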
Output messaging
Check ID | 3023 |
---|---|
Description | Check that CVM or Prism Central VM memory usage is not high. |
Causes of failure | The RAM usage on the Controller VM or Prism Central VM has been high. |
Resolutions | Check the memory utilization of Prism Central VM or Controller VM. If abnormal behavior is seen, please collect logs and contact Nutanix Support. |
Impact | Cluster performance may be significantly degraded. |
Alert ID | A1056 |
Alert Title | CVM or Prism Central VM RAM Usage High |
Alert Smart Title | RAM Usage high on vm_type : ip_address |
Alert Message | Main memory usage in vm_type ip_address is high, available_memory_kb KB is free. |
Note: To check whether this alert was generated on a CVM or a PCVM (Prism Central VM), select the alert and check the Source Entity shown in the UI. It shows the name of the virtual machine that generated the alert.
Solution
If the check reports a warning, or a "CVM or PC VM RAM Usage High" alert is triggered, ensure that the CVMs are configured with the amount of memory recommended for the features used on the cluster. For more information, see Controller VM Memory configuration (for CVMs) and Prism Central Instance Configurations (for Prism Central VMs).
If the check reports a warning on a PC VM, do the following:
- Ensure that the PC VMs are configured within the threshold amount of memory depending on the features used on the PC cluster.
Prism Central versions lower than 5.17.1, for each Prism Central VM:
PC Size | vCPU | Memory (GB) | VMs Supported (across all clusters) |
---|---|---|---|
Small | 4 | 16 | 2500 (scaleout: 5000) |
Large | 8 | 32 | 12500 (scaleout: 25000) |
Prism Central 5.17.1 or higher, for each Prism Central VM:
PC Size | vCPU | Memory (GB) | VMs Supported (across all clusters) |
---|---|---|---|
Small | 6 | 26 | 2500 (scaleout: 5000) |
Large | 10 | 44 | 12500 (scaleout: 25000) |
Additional memory requirements if the following services are enabled:
PC Size | Calm, Leap, or both enabled (GB) | Microsegmentation (GB) |
---|---|---|
Small | 4 | 1 |
Large | 8 | 1 |
- Upgrade NCC to the latest on the Prism Central environment. See Upgrading NCC on Prism Central.
- Upgrade Prism Central to 5.17.1 or higher versions. See Prism Central Upgrade and Installation.
- Configure Prism Central VMs only with the standard resource size that matches the number of VMs managed by Prism Central. Custom resource additions are not recommended. If the PC VMs are configured with the standard amount of memory but the check still reports a failure, contact Lenovo Premier Support (if you have coverage) or Nutanix Support to verify that the PC VM services are behaving as expected.
- Use the Behavioral Learning Tools on the Prism Central Dashboard to check whether any VMs are over-provisioned with memory.
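The Prism Central sizing tables above can be turned into a quick lookup for the memory a PC VM is expected to have. This Python sketch copies the figures from the tables; the function name, the release labels, and the assumption that the Calm/Leap and Microsegmentation additions stack on top of the base size are illustrative and should be confirmed against the Prism Central documentation.

```python
# Base memory per Prism Central VM in GB, from the tables in this article.
BASE_MEMORY_GB = {
    "pre-5.17.1": {"small": 16, "large": 32},
    "5.17.1+": {"small": 26, "large": 44},
}
# Additional memory in GB when optional services are enabled.
CALM_OR_LEAP_GB = {"small": 4, "large": 8}
MICROSEGMENTATION_GB = {"small": 1, "large": 1}


def expected_pc_memory_gb(size, pc_release,
                          calm_or_leap=False, microsegmentation=False):
    """Return the expected memory (GB) for one Prism Central VM.

    Assumes the service additions are summed with the base size.
    """
    total = BASE_MEMORY_GB[pc_release][size]
    if calm_or_leap:
        total += CALM_OR_LEAP_GB[size]
    if microsegmentation:
        total += MICROSEGMENTATION_GB[size]
    return total


print(expected_pc_memory_gb("small", "5.17.1+",
                            calm_or_leap=True, microsegmentation=True))
```

Comparing this figure with the memory actually assigned to the PC VM shows whether the VM is below the standard size for its configuration.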
When logging a support case with Lenovo Premier Support (if you have coverage) or Nutanix Support for this issue, include the output of the following commands in the case:
- Collect current memory usage information (CVM or PCVM):
nutanix@cvm$ allssh "cat /proc/meminfo | grep Mem"
- Collect a log bundle using Logbay and upload it to the case directly via FTP/SFTP, or manually via the Support Portal. For more information on how to use Logbay, see Nutanix KB 6691. To automatically upload the log bundle using Logbay, use the --dst (FTP/SFTP destination) and -c (support case number) options.
Additional Information
- Nutanix KB 2473 - Original document in Nutanix Portal
- Lenovo ThinkAgile HX Series knowledge base article landing page