The Aria Operations for Logs design enables real-time logging for all components in the telco cloud. The Aria Operations for Logs cluster consists of one primary node and two or more worker nodes behind a load balancer.

Aria Operations for Logs - Logical Design

Aria Operations for Logs is deployed as a single multi-node logging cluster in the management cluster. A highly available deployment requires an initial cluster of three nodes. Additional nodes can be added to scale the deployment.

Figure 1. Logical Design Components for Aria Operations for Logs

The Aria Operations for Logs deployment contains the nodes that analyze and store data from the logging clients. The number and size of the nodes in the cluster must be chosen to meet the required log ingestion rate.

Aria Operations for Logs integrates with other platforms in the monitoring and observability framework of the Telco Cloud and exchanges data with them. For example, the in-place capabilities of Aria Operations for Logs allow syslog messages about a specific event to be viewed from within Aria Operations.

The Integrated Load Balancer (ILB) must be used on the Aria Operations for Logs cluster so that all log sources can address the cluster through a single load-balanced address. When a scale-out or node failure occurs, log sources do not need to be reconfigured with a new destination address. The ILB also ensures that the cluster continues to accept incoming ingestion traffic.

The ILB address is required for users to connect to Aria Operations for Logs using either the Web UI or API and for clients to ingest logs using syslog or the Ingestion API.
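As an illustration of a client addressing the cluster through the ILB, the following minimal sketch builds an Ingestion API request without sending it. The FQDN, agent ID, and field names are hypothetical. The endpoint path and port 9543 are assumptions based on the commonly documented ingestion API and should be verified against the API reference for your version.

```python
import json
import time

# Assumption: hypothetical load-balanced (ILB) FQDN of the cluster.
ILB_FQDN = "logs.telco.example.com"
# Assumption: hypothetical agent identifier chosen by the sending client.
AGENT_ID = "11111111-2222-3333-4444-555555555555"


def ingestion_url(ilb_fqdn, agent_id, port=9543):
    """Build the ingestion endpoint URL on the load-balanced address.

    The path and default port are assumptions; check your API reference.
    """
    return f"https://{ilb_fqdn}:{port}/api/v1/events/ingest/{agent_id}"


def build_payload(text, fields):
    """Serialize one event and its static fields into a JSON request body."""
    event = {
        "text": text,
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "fields": [{"name": k, "content": v} for k, v in fields.items()],
    }
    return json.dumps({"events": [event]})


url = ingestion_url(ILB_FQDN, AGENT_ID)
body = build_payload("link down on uplink 2", {"site": "ran-east"})
```

Because clients target only the ILB address, the same URL remains valid after a scale-out or a node failure. The body could then be sent as an HTTP POST with a Content-Type of application/json.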

Note:

Multiple ingress IP addresses can be allocated to the Aria Operations for Logs cluster. Each address can apply its own ingress tags to all log messages that it receives. Ingress tagging provides a high-level distinction among different elements of the Telco Cloud, such as RAN, 5G Core, and so on. Up to 60 ingress IP addresses can be created per cluster.

Aria Operations for Logs - Distributed Design

Aria Operations for Logs does not support the same distributed design as Aria Operations. The concept of cloud proxies does not exist for Aria Operations for Logs.

The distributed model of Aria Operations for Logs is to create separate instances and use them as forwarders to a centralized Aria Operations for Logs cluster.

In the multi-site design, separate smaller instances of Aria Operations for Logs can be deployed in the multi-site management domain if desired. This allows logs to be collected in a distributed manner across the Telco Cloud while retaining a centralized management view.

The distributed model also increases the number of available ingress IP addresses, because multiple ingress addresses can be created per deployment. Ingress tagging can then attach metadata such as a site or region to the logs, so that they can be navigated easily from the centralized management domain.

Aria Operations for Logs - Scaling

Aria Operations for Logs can be scaled to support up to 18 nodes (1 primary and 17 workers). Each node can be deployed in a Small, Medium, or Large form factor. The CPU, memory, and disk requirements increase with the size of the node.

Aria Operations for Logs also supports scaling up the nodes from Medium to Large and scaling out the cluster from three nodes to four or more.

Storage can also be added independently of scale-out or scale-up operations. When storage is added, add the same amount to each node in the cluster. A maximum of 4 TB of storage can be added to each node, for example as 2 x 2 TB disks or 4 x 1 TB disks. A single disk cannot be larger than 2 TB.
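The storage rules above can be checked before a change is made. The following is a minimal sketch, not a product tool: the function name and the node names are hypothetical, and the limits are taken directly from the text.

```python
MAX_ADDED_TB_PER_NODE = 4  # from the text: at most 4 TB added per node
MAX_DISK_TB = 2            # from the text: no single disk larger than 2 TB


def validate_added_storage(disks_per_node):
    """Return a list of rule violations for a proposed storage addition.

    disks_per_node maps a node name to the list of new disk sizes in TB.
    An empty list means the proposal satisfies all three rules.
    """
    problems = []
    # Rule 1: every node must receive the same amount of additional storage.
    totals = {sum(disks) for disks in disks_per_node.values()}
    if len(totals) > 1:
        problems.append("nodes do not receive the same additional storage")
    for node, disks in disks_per_node.items():
        # Rule 2: the total added to one node must not exceed 4 TB.
        if sum(disks) > MAX_ADDED_TB_PER_NODE:
            problems.append(f"{node}: more than {MAX_ADDED_TB_PER_NODE} TB added")
        # Rule 3: each individual disk must not exceed 2 TB.
        for size in disks:
            if size > MAX_DISK_TB:
                problems.append(f"{node}: disk of {size} TB exceeds {MAX_DISK_TB} TB")
    return problems


# Hypothetical three-node cluster mixing the two layouts named in the text.
plan = {"node-1": [2, 2], "node-2": [1, 1, 1, 1], "node-3": [2, 2]}
violations = validate_added_storage(plan)
```

The sketch compares total added capacity per node rather than the exact disk layout, which is one reading of "the same additional storage". A stricter check would require identical disk lists.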

The maximum supported ingestion rate depends on the node size. A Large node collects up to approximately 150,000 events per second from up to 750 syslog sources.

Aria Operations for Logs provides the following alert types, which trigger notifications about its own health and about the monitored solutions:

  • System Alerts: Aria Operations for Logs generates notifications when an important system event occurs, for example, when disk space is almost exhausted and Aria Operations for Logs must start deleting or archiving old log files.

  • Content Pack Alerts: Content packs contain default alerts that can be configured to send notifications. These alerts are specific to the content pack and are deactivated by default.

  • User-Defined Alerts: Administrators and users can define alerts based on the data ingested by Aria Operations for Logs.

Note:

Each ESXi host sends up to 10 messages per second with an average message size of 170 bytes/message, which is equivalent to 150 MB per day for each host.
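The per-host figure in this note can be reproduced with simple arithmetic. The sketch below uses only the two values stated in the note. The host count in the usage line is a hypothetical example.

```python
MESSAGES_PER_SECOND = 10      # from the note: up to 10 messages per second
BYTES_PER_MESSAGE = 170       # from the note: average message size
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_mb(hosts=1):
    """Estimate the daily log volume in MB (10^6 bytes) for a number of hosts."""
    bytes_per_day = (
        MESSAGES_PER_SECOND * BYTES_PER_MESSAGE * SECONDS_PER_DAY * hosts
    )
    return bytes_per_day / 1_000_000


per_host_mb = daily_volume_mb()        # about 147 MB, rounded to 150 MB in the note
fleet_mb = daily_volume_mb(hosts=100)  # hypothetical fleet of 100 ESXi hosts
```

Multiplying the per-host estimate by the number of hosts gives a first-order input for disk dimensioning. Other log sources must be added separately.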

Aria Operations for Logs - Kubernetes, Fluent Bit, and Integrations

In addition to the built-in log collection facilities provided by the vSphere components and VM appliances such as TCA and TCA-CP, logs need to be collected from Kubernetes-based components that run as containers in the worker nodes.

Fluent Bit is a commonly used Kubernetes logging component that captures logs from a Kubernetes cluster. Fluent Bit collects the logs from both the Kubernetes pods and the VM processes on each worker node within the cluster, adds the required metadata, and routes the logs to the desired Aria Operations for Logs endpoint.

The stdout/stderr stream of messages from all containers is also captured by kubelet and stored in files on the worker node. You can use Fluent Bit to collect and forward these logs to one or more endpoints, such as the regional Aria Operations for Logs cluster. Additionally, application logs can be forwarded to an application-specific logging stack provided by the application vendor. This allows centralized access to application logs without compromising sensitive infrastructure component data.

When using Telco Cloud Automation to deploy the Fluent Bit add-on to the Tanzu Kubernetes cluster, a reference configuration can be used. The reference configuration includes modified filters, inputs, and outputs for consumption by Aria Operations for Logs. The reference configuration ensures that the cluster name is added to the logging messages, making it easier to search for logs from a specific cluster.
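To make the idea of adding a cluster name and routing to a syslog endpoint concrete, the following sketch renders a small Fluent Bit classic-mode fragment. It is not the Telco Cloud Automation reference configuration. The cluster name, host, port, and message key are hypothetical values, and the fragment uses only the standard record_modifier filter and syslog output plugins.

```python
def render_fluent_bit_config(cluster_name, host, port=514, mode="tcp"):
    """Render an illustrative Fluent Bit classic-mode config fragment.

    A record_modifier filter adds a cluster_name key to every record, and a
    syslog output forwards records to the given host (assumed here to be the
    load-balanced address of the logging cluster).
    """
    if mode not in ("udp", "tcp", "tls"):
        raise ValueError(f"unsupported syslog mode: {mode}")
    lines = [
        "[FILTER]",
        "    Name    record_modifier",
        "    Match   *",
        f"    Record  cluster_name {cluster_name}",
        "",
        "[OUTPUT]",
        "    Name                syslog",
        "    Match               *",
        f"    Host                {host}",
        f"    Port                {port}",
        f"    Mode                {mode}",
        "    Syslog_Format       rfc5424",
        # Assumption: the log line is stored under the "log" key.
        "    Syslog_Message_Key  log",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical cluster name and ILB address.
config = render_fluent_bit_config("tkg-ran-01", "logs.telco.example.com")
```

Generating the fragment from parameters keeps the cluster name consistent across clusters. In practice, the reference configuration deployed through Telco Cloud Automation should be preferred.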

Aria Operations for Logs can act as a log forwarder. By combining tagging, Aria Operations for Logs forwarding, and Fluent Bit output targets, specific logs can be sent to external logging platforms (such as a SIEM). This capability enables the separation and isolation of infrastructure data, application data, and security events.

Aria Operations for Logs Recommendations

Table 1. Recommended Sizing for Aria Operations for Logs

Attribute          Specification
Appliance Size     Medium (75 GB/day of logs, 400 events/second)
Number of vCPUs    8
Memory             16 GB
Disk Space         As required based on dimensioning

Design Recommendation: Deploy Aria Operations for Logs in a cluster configuration of three nodes with an integrated load balancer:
  • one primary node
  • two worker nodes
Design Justification: Provides high availability. The integrated load balancer:
  • Prevents a single point of failure
  • Simplifies the Aria Operations for Logs deployment and subsequent integration
  • Simplifies the Aria Operations for Logs scale-out operations, reducing the need to reconfigure existing logging sources
Design Implication:
  • Deploy a minimum of three medium nodes.
  • Size each node identically.
  • If the capacity of your Aria Operations for Logs cluster must expand, add identical capacity to each node.

Design Recommendation: Deploy Aria Operations for Logs with nodes of at least medium size.
Design Justification: Accommodates the number of expected syslog connections from the following sources:
  • Management and Compute vCenter Servers
  • Management and Compute ESXi hosts
  • NSX Components
  • Aria Operations Components
  • Telco Cloud Automation and Tanzu Kubernetes Grid clusters
Design Implication: If you configure Aria Operations for Logs to monitor additional syslog sources, increase the size of the nodes.

Design Recommendation: Enable alerting over SMTP.
Design Justification: Administrators and operators can receive email alerts from Aria Operations for Logs.
Design Implication: Requires access to an external SMTP server.

Design Recommendation: Forward alerts to Aria Operations.
Design Justification: Provides monitoring and alerting information that is pushed from Aria Operations for Logs to Aria Operations for centralized administration.
Design Implication: None

Design Recommendation: Use Fluent Bit on the Tanzu Kubernetes clusters to forward syslog information to Aria Operations for Logs.
Design Justification: Provides a central logging infrastructure for all the core Tanzu Kubernetes clusters.
Design Implication: None

Design Recommendation: Integrate Aria Operations for Logs with Active Directory users and groups.
Design Justification: Provides fine-grained role and privilege-based access for varying user roles across the organization.
Design Implication: Requires access to Active Directory.

Design Recommendation: Create service accounts for use with third-party integrations. Align permissions with customer security policies.
Design Justification: Restricts access to the environment and allows Aria Operations for Logs to function with a minimal set of features.
Design Implication: May require custom configuration on Aria Operations for Logs or the integration points.

Design Recommendation: Replace all the default certificates with CA-signed certificates.
Design Justification: Ensures that all communication is properly encrypted and that proper certificate management processes are followed.
Design Implication: A single certificate is required for all the nodes, including the load balancer. As the Aria Operations for Logs deployment scales out, a new certificate is required to accommodate the new node.

Design Recommendation: Deploy and configure connectivity through the following management packs or integrations:
  • vCenter
  • NSX
  • Cloud Director
  • Aria Automation Orchestrator
Design Justification: Ensures that metrics are collected from all endpoints of the Telco Cloud.
Design Implication: Requires the configuration and allocation to collector groups or cloud proxies for all integration points.

Design Recommendation: Wherever possible, use logging over TCP or TLS for reliable and secure transmissions.
Design Justification: Ensures reliable and secure transmission.
Design Implication: May require additional configuration on the appliances and additional firewall port openings to support TCP connections.

Design Recommendation: When monitoring Tanzu Kubernetes Grid clusters, ensure that Fluent Bit is deployed through Telco Cloud Automation.
Design Justification: Allows the reference configuration to be applied to ensure proper log collection, including logging using TLS.
Design Implication: Requires deployment of add-ons through VMware Telco Cloud Automation. Requires the deployment of v2 Clusters.

Design Recommendation: When using multiple Aria Operations for Logs clusters in a multi-site environment, ensure that CFAPI is used between clusters.
Design Justification: Maintains the original syslog message to ensure that the correct source location is correlated.
Design Implication: None