The logical design provides a high-level overview of the Health Reporting and Monitoring for VMware Cloud Foundation validated solution.
Logical Design
The design consists of a host virtual machine, deployed in the management domain of your VMware Cloud Foundation instance, that hosts the PowerShell Module for VMware Cloud Foundation Reporting and the Python Module for VMware Cloud Foundation Health Monitoring in VMware Aria Operations. The host virtual machine uses the two modules to periodically connect to the VMware Cloud Foundation Supportability and Serviceability (SoS) utility and the SDDC component APIs to collect health metrics, generate HTML reports, and send the data to VMware Aria Operations. Custom VMware Aria Operations dashboards then present this data to provide active health monitoring of your VMware Cloud Foundation instance.
PowerShell Module for VMware Cloud Foundation Reporting
The PowerShell Module for VMware Cloud Foundation Reporting is an open-source PowerShell module that ships with a library of cmdlets that connect to SDDC management components, collect health data, and publish that data in different formats. The cmdlet library contains combined operation, health check, system alert, configuration, and system overview functions. These functions provide insight into the operational state of your VMware Cloud Foundation instance.
The PowerShell module uses the VMware Cloud Foundation Supportability and Serviceability (SoS) utility and the SDDC component APIs to collect and publish health data for SDDC Manager, vCenter Server, vSAN, NSX, and VMware Aria Suite Lifecycle. The PowerShell module collects storage, networking, configuration, and security data. You install and configure the PowerShell module on the host virtual machine.
The PowerShell module can generate the following reports:
- System overview report
- Health report
- Alert report
- Configuration report
- Upgrade precheck report
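To illustrate what publishing collected health data as an HTML report involves, the following is a minimal Python sketch. It is not taken from the PowerShell module. The function name `render_health_report`, the row layout (component, check, status), the host names, and the status values are all hypothetical and chosen only for illustration.

```python
from html import escape


def render_health_report(title, rows):
    """Render (component, check, status) tuples as a minimal HTML page.

    Hypothetical helper: the real module's report layout is not shown here.
    """
    # Escape every value so collected text cannot break the markup.
    body = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(component), escape(check), escape(status)
        )
        for component, check, status in rows
    )
    return (
        "<html><head><title>{t}</title></head><body>"
        "<h1>{t}</h1><table>\n"
        "<tr><th>Component</th><th>Check</th><th>Status</th></tr>\n"
        "{b}\n</table></body></html>"
    ).format(t=escape(title), b=body)


# Assumed sample data; real results come from the SoS utility and component APIs.
sample_rows = [
    ("sddc-manager.example.com", "Backup", "GREEN"),
    ("vcenter-01.example.com", "Certificate", "YELLOW"),
]
report_html = render_health_report("Health report", sample_rows)
```

Writing `report_html` to a file on the host virtual machine would yield a report that opens in any browser.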
Python Module for VMware Cloud Foundation Health Monitoring in VMware Aria Operations
The Python Module for VMware Cloud Foundation Health Monitoring in VMware Aria Operations is an open-source collection of Python scripts and VMware Aria Operations artifacts. It uses the VMware Cloud Foundation Supportability and Serviceability (SoS) utility and the supporting PowerShell modules to collect health data for a VMware Cloud Foundation instance. It then sends this data to objects in VMware Aria Operations as custom metrics, which dashboards use to monitor the platform's health. This enables the creation and configuration of custom dashboards, alerts, notifications, and remediation in VMware Aria Operations. You install and configure the Python module on the host virtual machine.
The Python module includes predefined custom VMware Aria Operations dashboards in the VCF Health dashboard group. These dashboards cover individual component health metrics and include an aggregated, single-pane-of-glass rollup dashboard.
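As a rough illustration of turning collected health data into custom metrics, the sketch below flattens a nested health result into flat metric keys with numeric values. It is not the Python module's implementation. The input structure, the `VCF Health` prefix, the pipe-separated key format, and the `STATUS_SCORE` mapping are assumptions made for this example.

```python
def flatten_health(data, prefix="VCF Health"):
    """Flatten nested health data into 'A|B|C' -> value pairs (assumed key format)."""
    metrics = {}
    for key, value in data.items():
        path = f"{prefix}|{key}"
        if isinstance(value, dict):
            # Recurse into nested sections such as per-host results.
            metrics.update(flatten_health(value, path))
        else:
            metrics[path] = value
    return metrics


# Hypothetical mapping from status strings to numbers a dashboard could threshold on.
STATUS_SCORE = {"GREEN": 0, "YELLOW": 1, "RED": 2}

# Assumed sample input; the real collected JSON may be shaped differently.
sample = {
    "NTP": {"esxi-01.example.com": "GREEN"},
    "DNS": {"Forward": "RED"},
}

metrics = {
    key: STATUS_SCORE.get(value, value) if isinstance(value, str) else value
    for key, value in flatten_health(sample).items()
}
```

Numeric values are convenient because alert definitions and dashboard widgets can compare them against thresholds. How the metrics are actually pushed to VMware Aria Operations is outside the scope of this sketch.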
Dashboard | Description |
---|---|
VCF Health Rollup | Rollup of all individual VCF Health dashboards. |
VCF Backup Health | Displays the backup status for SDDC Manager, vCenter Servers, and NSX Local Managers. |
VCF Certificate Health | Displays whether the component certificates are valid (within the expiry date). |
VCF Compute Health | Displays ESXi health, including host licenses, disk storage, disk partitions, core dumps, free pool, and overall health status. Shows the overall health of vCenter Server instances. |
VCF Connectivity Health | Displays connectivity health, which verifies the connection between SDDC Manager and the underlying components of VMware Cloud Foundation. Includes ping, SSH connectivity, and API connectivity health checks for SDDC components. |
VCF DNS Health | Displays the forward and reverse DNS health summary. |
VCF Hardware Compatibility | Displays the data from the hardware compatibility check, which validates ESXi hosts and vSAN devices. |
VCF Networking Health | Displays the health of NSX Local Managers, Edge Clusters, Edge Nodes, Transport Nodes, Transport Node Tunnels, and Tier-0 Gateway BGP connections. |
VCF NTP Health | Displays the NTP health, which verifies that components have their time synchronized with the NTP server used by SDDC Manager. It also verifies that the hardware and software time stamps of ESXi hosts are within 5 minutes of the SDDC Manager appliance. |
VCF Password Health | Displays password health, checking for expiry across the VMware Cloud Foundation instance. |
VCF SDDC Manager and vCenter Services Health | Displays the health of services running within SDDC Manager and vCenter Server. |
VCF Snapshot Health | Displays the snapshot status for SDDC Manager, vCenter Servers, and NSX Local Managers. |
VCF Storage Health | Displays disk capacity health for SDDC Manager, vCenter Servers, ESXi hosts, and datastores. Also displays VMs with connected CD-ROMs. |
VCF vSAN Health | Displays vSAN health across ESXi hosts and vSphere clusters. |
VCF Version Health | Displays the component version and compares it with the SDDC Manager inventory, the actual installed Bill of Materials (BoM) component version, and the BoM component versions to detect any drift. |
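Several of the dashboard descriptions above reduce to simple comparisons: a 5-minute time tolerance for NTP, an expiry date for certificates, and a version comparison for drift. The sketch below shows those comparisons in Python. The function names and sample values are hypothetical, and treating a difference of exactly 5 minutes as healthy is an assumption; the actual checks are performed by the SoS utility and the modules, not by this code.

```python
from datetime import datetime, timedelta, timezone

# Tolerance taken from the NTP dashboard description (5 minutes).
MAX_DRIFT = timedelta(minutes=5)


def within_ntp_tolerance(host_time, reference_time, max_drift=MAX_DRIFT):
    """True if the host clock is within max_drift of the reference clock."""
    return abs(host_time - reference_time) <= max_drift


def certificate_is_valid(not_after, now):
    """True if the certificate has not passed its expiry date."""
    return now <= not_after


def has_version_drift(inventory_version, installed_version, bom_version):
    """True if the three recorded versions are not all identical."""
    return not (inventory_version == installed_version == bom_version)


# Assumed sample values for illustration only.
reference = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
esxi_ok = within_ntp_tolerance(reference + timedelta(minutes=4), reference)
esxi_bad = within_ntp_tolerance(reference - timedelta(minutes=6), reference)
cert_ok = certificate_is_valid(reference + timedelta(days=30), reference)
drift = has_version_drift("8.0.2", "8.0.2", "8.0.1")
```

Results like these could be mapped to status values and sent to VMware Aria Operations as custom metrics.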
Single VMware Cloud Foundation Instance with a Single Availability Zone | Single VMware Cloud Foundation Instance with Multiple Availability Zones | Multiple VMware Cloud Foundation Instances |
---|---|---|
A host virtual machine is deployed on the management VLAN in the management domain. | | In the first VMware Cloud Foundation instance, a host virtual machine is deployed on the management VLAN in the management domain. |