KPI Definition Example
Network Reliability KPI
- kpi name: packet-drop-rate
- input metrics: packetCount, packetDropCount
- time window: 60 seconds, tumbling window
- filter expression: metadata['dataSource'] matches '10\\.118\\.7[23]\\.[1-9]0?'
- calculation expression: percent(sum(packetDropCount), sum(packetCount))
- groupBy expression: metadata['dataSource']
This KPI calculates the packet drop rate, as a percentage, for devices whose IP address matches the given regular expression. Metrics that the streaming system observes for devices outside the filter range are discarded. Assuming the dataSource attribute is present in the metric event, the groupBy expression ensures that the packet-drop-rate KPI is calculated separately for each unique value of that attribute. Within each time window, all packetCount values and all packetDropCount values are summed separately (the sum() operations), and the percent operation is then applied to the two sums.
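The calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the streaming system's implementation: the event layout (`timestamp`, `name`, `value`, `metadata`), the `percent` helper, and the choice to treat `matches` as a full-string match are all hypothetical.

```python
import re
from collections import defaultdict

# Pattern taken from the filter expression. The doubled backslashes in the
# definition are assumed to be string escaping for a literal dot.
FILTER = re.compile(r"10\.118\.7[23]\.[1-9]0?")


def percent(part, whole):
    """Hypothetical percent(): part as a percentage of whole (0.0 if whole is 0)."""
    return 100.0 * part / whole if whole else 0.0


def packet_drop_rate(events, window_seconds=60):
    """Return {(window_start, dataSource): packet drop rate in percent}."""
    sums = defaultdict(lambda: {"packetCount": 0, "packetDropCount": 0})
    for ev in events:
        source = ev["metadata"].get("dataSource")
        # Filter expression: skip devices outside the configured range.
        if source is None or not FILTER.fullmatch(source):
            continue
        # Input metrics: only these two metrics feed the KPI.
        if ev["name"] not in ("packetCount", "packetDropCount"):
            continue
        # Tumbling window: each event belongs to exactly one 60-second bucket.
        window_start = ev["timestamp"] - ev["timestamp"] % window_seconds
        # groupBy expression: one running sum per window and dataSource.
        sums[(window_start, source)][ev["name"]] += ev["value"]
    return {
        key: percent(s["packetDropCount"], s["packetCount"])
        for key, s in sums.items()
    }
```

With this sketch, a device that reports 200 packets and 30 dropped packets within one window gets a packet-drop-rate of 15.0 for that window.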
Threshold Definition Example
Network Reliability Threshold
We would like to set a threshold that flags unacceptable values of our network reliability KPI, packet-drop-rate. The packet drop rate should remain under 10%, since that is the understood tolerance of the devices and network; anything at or above this level is unacceptable. A packet drop rate of 20% or more is a sign that something more critical is occurring, and is certainly severe and undesirable. This threshold piggybacks on the packet-drop-rate KPI and extends its definition with a set of threshold ranges that capture the behavior described above.
- threshold name: packet-drop-rate
Label | Value >= | Value < |
---|---|---|
normal | 0 | 10 |
warning | 10 | 20 |
critical | 20 | max |
This threshold configuration augments the calculated packet-drop-rate KPI by adding a tag to the output event that identifies the threshold-crossing condition specified in the config.
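As a rough illustration of how the ranges in the table could be applied, the sketch below maps a KPI value to its label and copies that label onto the output event. The `threshold` tag name and the event layout are assumptions made for this example, and `max` is modelled as positive infinity.

```python
import math

# (label, inclusive lower bound, exclusive upper bound), as in the table.
RANGES = [
    ("normal", 0, 10),
    ("warning", 10, 20),
    ("critical", 20, math.inf),  # "max" is assumed to mean no upper bound
]


def threshold_label(value, ranges=RANGES):
    """Return the label of the first range where low <= value < high."""
    for label, low, high in ranges:
        if low <= value < high:
            return label
    return None  # value falls outside every configured range


def tag_event(kpi_event):
    """Return a copy of the KPI event with a hypothetical 'threshold' tag added."""
    tagged = dict(kpi_event)
    tagged["threshold"] = threshold_label(kpi_event["value"])
    return tagged
```

Because the lower bound is inclusive and the upper bound exclusive, a value of exactly 10 is labelled warning and a value of exactly 20 is labelled critical.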
Non-windowed Threshold Definition Example
Device Availability Threshold
- threshold name: device-status-threshold
- filter expression: metadata['dataSource'] matches '10\\.118\\.7[23]\\.[1-9]0?'
- catalog: device catalog
- catalog item: device status
Label | Value >= | Value < |
---|---|---|
device.down | 0 | 1 |
device.up | 1 | max |
This threshold configuration generates an event for a device that matches the filter criteria when a received availability event indicates that the device is not responding or is offline.
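Since this threshold is not windowed, each availability event can be evaluated on its own as it arrives. The sketch below shows one way to do that. It assumes a hypothetical event layout in which `value` carries the device status from the catalog item, and it returns a labelled result for both ranges, leaving it to the caller to forward only `device.down` results if that is the desired behavior.

```python
import math
import re

# Same filter as the definition; "matches" is assumed to be a full-string match.
FILTER = re.compile(r"10\.118\.7[23]\.[1-9]0?")

# (label, inclusive lower bound, exclusive upper bound), as in the table.
STATUS_RANGES = [
    ("device.down", 0, 1),
    ("device.up", 1, math.inf),
]


def evaluate_status(event):
    """Evaluate one availability event; return a threshold event or None."""
    source = event["metadata"].get("dataSource")
    if source is None or not FILTER.fullmatch(source):
        return None  # device is outside the filter criteria
    for label, low, high in STATUS_RANGES:
        if low <= event["value"] < high:
            return {
                "threshold": "device-status-threshold",
                "label": label,
                "dataSource": source,
                "timestamp": event["timestamp"],
            }
    return None  # status value is outside every configured range
```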
Composite KPI Example
Device Availability KPI
- kpi name: device-down-status-count
- input threshold: device-status-threshold:device.down
- time window: 60 seconds, tumbling window
- filter expression: metadata['dataSource'] matches '10\\.118\\.7[23]\\.[1-9]0?'
- calculation expression: count(device-status-threshold:device.down)
- groupBy expression: metadata['dataSource']
Label | Value >= | Value < |
---|---|---|
device.down | 5 | max |
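The composite KPI above counts device.down threshold events per device within each 60-second tumbling window, and the table labels a window device.down once the count reaches 5. A minimal sketch of that logic follows. The input event layout (`threshold`, `label`, `dataSource`, `timestamp`) is hypothetical, and the input events are assumed to have already passed the filter expression.

```python
from collections import Counter


def device_down_status_count(threshold_events, window_seconds=60, min_count=5):
    """Return {(window_start, dataSource): {"count": n, "label": label or None}}."""
    counts = Counter()
    for ev in threshold_events:
        # Input threshold: only device-status-threshold:device.down is counted.
        if ev["threshold"] != "device-status-threshold" or ev["label"] != "device.down":
            continue
        # Tumbling window: each event belongs to exactly one 60-second bucket.
        window_start = ev["timestamp"] - ev["timestamp"] % window_seconds
        # groupBy expression: one counter per window and dataSource.
        counts[(window_start, ev["dataSource"])] += 1
    return {
        key: {"count": n, "label": "device.down" if n >= min_count else None}
        for key, n in counts.items()
    }
```

Counting threshold crossings over a window, rather than reacting to a single device.down event, means a device is only reported as down after repeated failures within the same minute.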