You can view IP addresses, host names, GPU allocation, memory use, and other data of your vSphere Bitfusion cluster, servers, and clients in the vSphere Bitfusion Plug-in.
Monitoring vSphere Bitfusion Cluster
You can use the vSphere Bitfusion Plug-in to view the following data for your cluster.
- The IP address of the primary vSphere Bitfusion server. The vSphere Bitfusion Plug-in uses this IP address for communication.
- The allocation history of GPUs, shown in the Cluster GPU Allocation chart. The chart can cover a time range from the last 5 minutes to the last 30 days. It shows the number of GPUs in the cluster and the number of GPUs allocated across all vSphere Bitfusion servers.
- All vSphere Bitfusion servers in the vSphere Bitfusion cluster, including servers that have been disabled or powered off, shown in the Servers table. Each entry displays a host name, IP address, and the number of allocated GPUs.
- All vSphere Bitfusion clients that have run applications on the vSphere Bitfusion servers, shown in the Clients table. Each entry lists a host name, ID, and the number of GPUs currently allocated to the client.
Monitoring vSphere Bitfusion Servers
You can use the vSphere Bitfusion Plug-in to view the following data for your servers.
- All vSphere Bitfusion servers in the vSphere Bitfusion cluster, shown in the Servers table. The table displays each server’s host name, IP address, current GPU allocation, and current health state. You can select any server to display the server details.
- A heat map with an entry for each GPU on the server, shown in the Allocation chart. The color intensity of each cell indicates how engaged the GPU is during the selected time interval. The level of engagement is a weighted sum of memory allocation and CUDA core use.
- Memory and core use charts, one pair for each GPU. The Memory charts also show the memory capacity.
- The outgoing and incoming traffic for each network interface.
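The engagement level behind the Allocation chart heat map is described above only as a weighted sum of memory allocation and CUDA use. The following Python sketch illustrates that kind of calculation. The function name, the equal default weights, and the normalization to a 0 to 1 range are assumptions made for illustration. The actual weights used by the plug-in are not documented here.

```python
def engagement(memory_fraction: float, core_fraction: float,
               memory_weight: float = 0.5, core_weight: float = 0.5) -> float:
    """Return a 0..1 engagement level as a weighted sum of two utilization fractions.

    The default weights are hypothetical; they are not the plug-in's real values.
    """
    for name, value in (("memory_fraction", memory_fraction),
                        ("core_fraction", core_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    total_weight = memory_weight + core_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted sum, divided by the total weight so the result stays in 0..1.
    weighted = memory_weight * memory_fraction + core_weight * core_fraction
    return weighted / total_weight


# Example: a GPU with 80% of its memory allocated and 20% core use.
level = engagement(0.8, 0.2)
```

A higher result would correspond to a more intensely colored cell in the heat map.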
Monitoring vSphere Bitfusion Clients
You can use the vSphere Bitfusion Plug-in to view the following data for your clients.
- All vSphere Bitfusion clients in the vSphere Bitfusion cluster, shown in the Clients table. A new entry appears in the table the first time a new client runs a vSphere Bitfusion command that requires a server connection. The table displays each client’s host name, ID, current GPU allocation, and version. You can select a client to display the client details.
- The GPUs that are allocated to a client, shown in the GPU Assignment chart. A client can run multiple applications, each allocating separate GPUs, but the allocations are displayed together as a single sum. An allocation of a partial GPU adds its fractional value to the sum.
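The way whole and partial GPU allocations combine into one per-client figure can be illustrated with a short Python sketch. The function name and the application names are hypothetical, and the code only mirrors the arithmetic described above. It does not query the plug-in or any vSphere Bitfusion API.

```python
import math


def total_client_allocation(allocations_by_app: dict[str, float]) -> float:
    """Sum the GPU allocations of all applications run by one client.

    Each value is the number of GPUs an application allocates; a partial
    GPU is expressed as a fraction, for example 0.5 for half a GPU.
    """
    for app, gpus in allocations_by_app.items():
        if gpus < 0:
            raise ValueError(f"allocation for {app!r} must not be negative")
    return math.fsum(allocations_by_app.values())


# Hypothetical client running three applications at the same time.
client_apps = {"training": 2, "inference": 1, "notebook": 0.5}
total = total_client_allocation(client_apps)  # 3.5 GPUs in total
```

In this example the client would appear with a combined allocation of 3.5 GPUs, even though the GPUs belong to three separate applications.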