The Management Pack for VMware Aria Operations for Applications creates objects, metrics, and relationships in Aria Operations based on metrics in Aria Operations for Applications.
Object Types
Object Type |
---|
VMware Aria Operations for Applications Adapter Instance |
K8s Cluster |
K8s Node |
K8s Namespace |
K8s Deployment |
K8s ReplicaSet |
K8s Pod |
Relationships
Parent | Child |
---|---|
Cluster | Node |
Cluster | Namespace |
Namespace | Deployment |
Deployment | ReplicaSet |
Node | Pod |
Namespace | Pod |
ReplicaSet | Pod |
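
The parent/child pairs above form a small directed graph. As a minimal sketch of how that graph could be traversed, the Python below lists every object type that sits beneath a given type. The names `RELATIONSHIPS`, `build_children`, and `descendants` are hypothetical and chosen for this example; they are not part of the management pack.

```python
from collections import defaultdict

# Parent -> child pairs copied from the Relationships table.
RELATIONSHIPS = [
    ("Cluster", "Node"),
    ("Cluster", "Namespace"),
    ("Namespace", "Deployment"),
    ("Deployment", "ReplicaSet"),
    ("Node", "Pod"),
    ("Namespace", "Pod"),
    ("ReplicaSet", "Pod"),
]


def build_children(pairs):
    """Map each parent object type to the list of its child types."""
    children = defaultdict(list)
    for parent, child in pairs:
        children[parent].append(child)
    return dict(children)


def descendants(object_type, children):
    """Return every object type reachable below object_type."""
    found = set()
    stack = [object_type]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


CHILDREN = build_children(RELATIONSHIPS)
print(sorted(descendants("Namespace", CHILDREN)))
# prints ['Deployment', 'Pod', 'ReplicaSet']
```

Note that a Pod has three parents (Node, Namespace, and ReplicaSet), so the structure is a graph rather than a strict tree.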
Alerts
Alert Name |
---|
K8s pod CPU usage too high |
K8s pod memory usage too high |
K8s too many pods crashing |
K8s node CPU usage too high |
K8s node storage usage too high |
K8s node memory usage too high |
K8s too many containers not running |
K8s node unhealthy |
K8s pod storage usage too high |
Metrics
Deployments, Pods, and ReplicaSets do not have metrics.
The WQL queries used to gather Kubernetes metrics query data points from the last 5 minutes. When servicing these queries, VMware Aria Operations for Applications interpolates data to provide reasonable values where there are gaps.
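
Many of the queries in this section wrap `ts()` in `align(5m, ...)`, which summarizes raw points into one value per 5-minute window. As a rough illustration of that idea only, the Python sketch below buckets samples by window start and averages each bucket. It does not reproduce the service's implementation or its interpolation step, and the function name `align_mean` and the sample data are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SECONDS = 5 * 60  # the 5m window used throughout the queries


def align_mean(points, window=WINDOW_SECONDS):
    """Bucket (epoch_seconds, value) points into fixed windows and average each bucket.

    Illustrative only: this mimics the idea behind align(5m, mean, ...),
    not the service's actual implementation.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % window)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical samples: three in the first window, one in the second.
samples = [(0, 10.0), (60, 20.0), (120, 30.0), (300, 40.0)]
print(align_mean(samples))
# prints {0: 20.0, 300: 40.0}
```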
Cluster Metrics
Metric Name | Description | WQL Query |
---|---|---|
Active Nodes | Count of active nodes | at(5m,align(5m,count(ts({metric_prefix}.node.memory.working_set), cluster))) |
Inactive Nodes | Count of inactive nodes | at(5m,align(5m, count(lowpass(1, ts({metric_prefix}.node.status.condition,condition='Ready'),cluster)))) |
Namespaces | Count of namespaces | at(5m,count(align(5m, mean, count(ts({metric_prefix}.ns.memory.usage), cluster, namespace_name)), cluster)) |
Pods | Count of pods | at(5m,sum(align(5m, mean, ts({metric_prefix}.cluster.pod.count)),cluster)) |
CPU Usage | Average CPU usage in millicores | at(5m,align(5m,sum(ts({metric_prefix}.cluster.cpu.usage_rate), cluster))) |
Memory Usage | Memory usage in bytes | at(5m,align(5m,sum(ts({metric_prefix}.cluster.memory.usage),cluster))) |
Average Memory Utilization | Total node working set memory divided by total node allocatable memory, across the cluster, as a percentage | at(5m,avg(align(5m,ts({metric_prefix}.node.memory.working_set)) / align(5m,ts({metric_prefix}.node.memory.node_allocatable)) * 100, cluster)) |
Average CPU Utilization | Average node CPU utilization across the cluster, as a percentage | at(5m,avg(align(5m,mean,ts({metric_prefix}.node.cpu.node_utilization)), cluster) * 100) |
Average Storage Utilization | Total node filesystem usage divided by total node filesystem limit, across the cluster, as a percentage | at(5m,avg(align(5m,ts({metric_prefix}.node.filesystem.usage)) / align(5m,ts({metric_prefix}.node.filesystem.limit)),cluster) * 100) |
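
The cluster queries contain a `{metric_prefix}` placeholder. A minimal sketch of how such a template could be expanded is shown below. The helper `render_query` is hypothetical, and the default value `kubernetes` is an assumption based on the literal prefix that appears in the node and namespace queries.

```python
DEFAULT_PREFIX = "kubernetes"  # assumption: matches the prefix in the node and namespace queries


def render_query(template, metric_prefix=DEFAULT_PREFIX):
    """Replace the {metric_prefix} placeholder in a WQL query template."""
    # str.replace is used instead of str.format so that any other braces
    # in a query are left untouched.
    return template.replace("{metric_prefix}", metric_prefix)


ACTIVE_NODES = "at(5m,align(5m,count(ts({metric_prefix}.node.memory.working_set), cluster)))"
print(render_query(ACTIVE_NODES))
# prints at(5m,align(5m,count(ts(kubernetes.node.memory.working_set), cluster)))
```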
Node Metrics
Metric Name | Description | WQL Query |
---|---|---|
Status | The overall status of the node. Either Ready or Not Ready. | at(5m,ts(kubernetes.node.status.condition, condition=Ready)) |
CPU Allocatable | CPU allocatable in millicores. | at(5m,align(5m, ts(kubernetes.node.cpu.node_allocatable, nodename and cluster))) |
CPU Capacity | CPU capacity in millicores. | at(5m,align(5m,ts(kubernetes.node.cpu.node_capacity, nodename and cluster))) |
CPU Reserved | Share of the node's allocatable CPU that is reserved, in millicores. | at(5m,align(5m,ts(kubernetes.node.cpu.node_reservation, nodename and cluster))) |
CPU Utilization | CPU utilization as a share of the node's allocatable CPU, as a percentage. | at(5m,align(5m,avg(ts(kubernetes.node.cpu.node_utilization, nodename and cluster), cluster, nodename) * 100)) |
Memory Allocatable | Memory allocatable in bytes. | at(5m,align(5m,ts(kubernetes.node.memory.node_allocatable, nodename and cluster))) |
Memory Capacity | Memory capacity in bytes. | at(5m,align(5m,ts(kubernetes.node.memory.node_capacity, nodename and cluster))) |
Memory Reserved | Share of the node's allocatable memory that is reserved, in bytes. | at(5m,align(5m,ts(kubernetes.node.memory.node_reservation, nodename and cluster))) |
Memory Utilization | Memory utilization as a share of the node's allocatable memory, as a percentage. | at(5m,align(5m, avg(ts(kubernetes.node.memory.working_set, nodename and cluster) / ts(kubernetes.node.memory.node_allocatable, nodename and cluster), cluster, nodename) * 100)) |
Filesystem Available | The number of available bytes remaining in the filesystem. | at(5m,align(5m,ts(kubernetes.node.filesystem.available, nodename and cluster))) |
Filesystem Limit | The total size of the filesystem in bytes. | at(5m,align(5m,ts(kubernetes.node.filesystem.limit, nodename and cluster))) |
Filesystem Usage | Total bytes consumed by the filesystem as a percentage of the Filesystem Limit. | at(5m,align(5m,(limit(250, ts(kubernetes.node.filesystem.usage, nodename and cluster)/ts(kubernetes.node.filesystem.limit)) * 100))) |
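
The utilization metrics for nodes divide a usage series by a capacity-style series and multiply by 100. The arithmetic can be sketched as follows, assuming both inputs use the same unit. The function name `utilization_percent` and the sample numbers are hypothetical.

```python
def utilization_percent(used, allocatable):
    """Return used / allocatable as a percentage, or None if allocatable is not positive."""
    if allocatable <= 0:
        return None
    return used / allocatable * 100


# Hypothetical node: 6 GiB working set out of 8 GiB allocatable.
GIB = 1024 ** 3
print(utilization_percent(6 * GIB, 8 * GIB))
# prints 75.0
```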
Namespace Metrics
Metric Name | Description | WQL Query |
---|---|---|
Total Number of Pods | Count of Pods in the Namespace | at(5m,sum(align(60s, ts(kubernetes.ns.pod.count)), namespace_name, cluster)) |
Number of pods with status Running | Pods in the Namespace in phase Running | at(5m, count(align(60s,ts(kubernetes.pod.status.phase, phase=Running)), namespace_name, cluster)) |
Number of pods with status Pending | Pods in the Namespace in phase Pending | at(5m, count(align(60s, ts(kubernetes.pod.status.phase, phase=Pending)), namespace_name, cluster)) |
Number of pods with status Succeeded | Pods in the Namespace in phase Succeeded | at(5m, count(align(60s, ts(kubernetes.pod.status.phase, phase=Succeeded)), namespace_name, cluster)) |
Number of pods with status Failed | Pods in the Namespace in phase Failed | at(5m, count(align(60s, ts(kubernetes.pod.status.phase, phase=Failed)), namespace_name, cluster)) |
Number of pods with status Unknown | Pods in the Namespace in phase Unknown | at(5m, count(align(60s, ts(kubernetes.pod.status.phase, phase=Unknown)), namespace_name, cluster)) |
CPU Utilization | Mean CPU usage over the last 5 minutes. | at(5m,avg(align(5m, mean, ts(kubernetes.ns.cpu.usage_rate)), cluster, namespace_name)) |
Memory Usage | Total memory usage of the namespace across all nodes | at(5m,sum(align(5m,ts(kubernetes.ns.memory.usage)),cluster, namespace_name)) |
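
The per-phase pod counts group matching series by `namespace_name` and `cluster`. A minimal sketch of that grouping, using hypothetical pod records and a hypothetical helper `count_pods_in_phase`, might look like this:

```python
from collections import Counter

# Hypothetical pod records: (cluster, namespace_name, phase).
PODS = [
    ("prod", "web", "Running"),
    ("prod", "web", "Running"),
    ("prod", "web", "Pending"),
    ("prod", "batch", "Succeeded"),
    ("dev", "web", "Running"),
]


def count_pods_in_phase(pods, phase):
    """Count pods in the given phase for each (cluster, namespace_name) pair."""
    return Counter(
        (cluster, namespace)
        for cluster, namespace, pod_phase in pods
        if pod_phase == phase
    )


running = count_pods_in_phase(PODS, "Running")
print(running[("prod", "web")])
# prints 2
```

A namespace with the same name in two clusters is counted separately because the cluster is part of the grouping key.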