The release notes cover the following topics:

About VMware Telco Cloud Operations
What's New in this Release
Known Issues
Resolved Issues

About VMware Telco Cloud Operations

VMware Telco Cloud Operations is a real-time automated service assurance solution designed to bridge the gap between the virtual and physical worlds. It provides holistic monitoring and network management across all layers for rapid insights, lower costs, and improved customer experience. Powered by machine learning (ML) capabilities, VMware Telco Cloud Operations automatically establishes dynamic performance baselines, identifies anomalies, and alerts operators when abnormal behavior is detected.

VMware Telco Cloud Operations simplifies the approach to data extraction, enrichment, and analysis of network data across multi-vendor environments into actionable notifications and alerts to manage the growing business needs of Telco in an SDN environment.

For information about setting up and using VMware Telco Cloud Operations, see the VMware Telco Cloud Operations Documentation.

What's New in this Release

VMware Telco Cloud Operations v1.2 introduces the following enhancements:

Multi-Tenant Operations Support
- Multi-Tenant support is now available for Operations and Reporting Portal.
- Authentication and authorization support to control and separate access to notifications and metrics for multi-tenancy.
Multi-SAM Integration Support
- Ability to configure and fetch events and topology from one or more underlying Service Assurance Manager (SAM) instances into VMware Telco Cloud Operations.
- Operational UI enhancements to support multiple SAM instances.
Reporting Enhancements
- Drill-down capability on various charts: Bar Chart, Trending Chart, and Polystat Chart. For example, users can now:
  - Drill down from the SDN Dashboard Polystat chart to the associated SDN HealthScore Trend.
  - Drill down from the Traffic Flows Trending chart to a detailed NetFlow data chart for a particular application flow.
- Complex KPIs that perform spatial aggregation across network resources can now be computed and reported by selecting metrics on the Report Panel. For details on how to configure KPIs in the Reporting Panel, refer to the VMware Telco Cloud Operations Configuration Guide – Working with Dashboard and Reports section.
Notification Enrichment Support
- In addition to metrics, VMware Telco Cloud Operations now supports event record enrichment with user-defined metadata tag names (for example, tenant name, location, and so on) based on event key expression lookup. For more details, refer to the VMware Telco Cloud Operations Configuration Guide - Enrichment example section.
SD-WAN Enhancements
- Viptela bulk query pagination: The Viptela bulk REST API returns a maximum of 10,000 records in a response. If the actual result has more than 10,000 records, the suggested option is to use pagination to retrieve the complete response. This release implements pagination for all Viptela collectors.
Deployment Enhancements
- VM-Level Diagnostic Tool introduced to collect all the logs related to the VMware Telco Cloud Operations cluster in a single bundle file. Includes logs from the deployment phase as well as operations.
Documentation Enhancements
- Detailed documentation on various metrics available for reporting.

For information about system requirements, hardware requirements, patch installation, and sizing guidelines, see the VMware Telco Cloud Operations Deployment Guide.
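
The Viptela bulk query pagination listed under SD-WAN Enhancements can be pictured with a minimal sketch. The fetch_page callable, its offset and limit parameters, and the record shape are hypothetical stand-ins, not the Viptela REST API or the collectors' actual implementation. Only the 10,000-record cap comes from these notes.

```python
from typing import Callable, Dict, Iterator, List

MAX_RECORDS_PER_RESPONSE = 10_000  # per-response cap stated in these release notes


def fetch_all(fetch_page: Callable[[int, int], List[Dict]],
              page_size: int = MAX_RECORDS_PER_RESPONSE) -> Iterator[Dict]:
    """Yield every record by requesting successive pages until one comes back short.

    fetch_page(offset, limit) is a hypothetical callable that returns at most
    `limit` records starting at `offset`; it stands in for the real bulk query.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:  # a short (or empty) page means no more data
            break
        offset += len(page)
```

A caller would wrap the real query in fetch_page and iterate over fetch_all(fetch_page). Stopping on a short page costs one extra request when the total is an exact multiple of the page size.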

Resolved Issues

TCO-3065
The Kafka adapter-based monitoring subsystem stops working when processing a Viptela-specific Kafka message raises an exception. Once the exception is encountered, none of the subsequent Kafka messages are processed.
The deployment will fail when a node uses an IPv6 address rather than an IPv4 address.
An error message appears when a deployment fails.
Any notifications in VMware Telco Cloud Operations have the serial number column with one empty row under the Audit Log tab.
In an Elasticsearch database, a zero value is not saved and displays an empty row.

VeloCloud notifications are not appearing in the SDWAN Notification panel of SD-WAN Dashboard without enrichment.

Workaround:

In the SDWAN Notification panel, select Edit from the drop-down.
On the next page, edit the query, changing AND to OR.

UserDefined20.keyword:$Tenant AND ((ElementClassName.keyword:VEdge) OR (ClassDisplayName.keyword:"VGateway" OR ClassDisplayName.keyword:"Tenant" OR ClassDisplayName.keyword:"vEdge" OR ClassDisplayName.keyword:"Orchestrator" OR ClassDisplayName.keyword:"VEdge" OR ClassDisplayName.keyword:"Tunnel") OR (InstanceName.keyword:vSmart OR InstanceName.keyword:vBond OR InstanceName.keyword:vManage))

In a VMware Telco Cloud Operations HA environment, import-ldap-cert.sh fails to import LDAP certificates if the Keycloak server switches to a different node (from CPN to kafkaworker, or from kafkaworker to CPN) while the script runs for the first time.

Workaround: If the pod switches to a different node, re-import the LDAP certificates to Keycloak using import-ldap-cert.sh:

1. Check which node Keycloak is running on before and after importing the LDAP certificates with import-ldap-cert.sh:

kubectl get pods -o wide | grep keycloak

2. If the node is different, verify whether the LDAP certificate was actually imported:

./import-ldap-cert.sh -l -s <store password>

3. If it does not display the truststore information, re-import the certificates.
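
The before-and-after comparison in steps 1 and 2 could be scripted along these lines. This is a sketch only: it assumes the full output of kubectl get pods -o wide is captured without the grep, so that the header row with its NODE column is available. The function names are illustrative and not part of the product.

```python
from typing import Optional


def keycloak_node(pods_wide_output: str) -> Optional[str]:
    """Return the NODE value of the first pod whose name contains 'keycloak'.

    Expects column-aligned text as printed by `kubectl get pods -o wide`,
    including the header row. Returns None if no such pod is listed.
    """
    lines = [ln for ln in pods_wide_output.splitlines() if ln.strip()]
    if not lines:
        return None
    node_col = lines[0].find("NODE")  # character offset of the NODE column
    if node_col < 0:
        return None
    for line in lines[1:]:
        fields = line.split()
        if fields and "keycloak" in fields[0]:
            rest = line[node_col:].split()
            return rest[0] if rest else None
    return None


def node_changed(before_output: str, after_output: str) -> bool:
    """True when Keycloak ran on different nodes before and after the import."""
    return keycloak_node(before_output) != keycloak_node(after_output)
```

If node_changed returns True, continue with step 2 and list the truststore contents.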
When Multitenancy is configured, currently Data Enrichment also has to be configured in order for the Admin dashboards to display any data.
Regardless of whether Enrichment is configured, the default behavior is that all data is shown on the Admin dashboard views. When a customer desires multi-tenancy, the configuration can be changed to facilitate the dashboards relevant to multi-tenancy tags.

Known Issues

The deployment may fail when you use the automated deployment tool.
When you deploy VMware Telco Cloud Operations using the automated deployment tool, the deployment of the worker node may fail with the error: Failed to send data.

Workaround: Modify the VCENTER_IP configuration parameter in the deploy.settings file to use the fully qualified domain name (FQDN). For more information about modifying the deploy.settings file, see the VMware Telco Cloud Operations Deployment Guide.
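
Before editing deploy.settings, a quick check can tell whether the current VCENTER_IP value is an IP literal that should be replaced by an FQDN. The sketch below uses only Python's standard ipaddress module. The helper name and the sample values in the usage note are illustrative assumptions.

```python
import ipaddress


def is_ip_literal(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address, False otherwise."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False
```

For example, is_ip_literal("192.0.2.10") is True, which suggests switching to the FQDN, while is_ip_literal("vcenter.example.com") is False.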
When the number of hops of connectivity is increased, you may experience performance issues in the topology maps.
There might be performance issues in the rendering of Redundancy Group, MPLS, Metro-E, and SDN connectivity map types in the Map Explorer view. This issue is observed on deployments with a complex topology where the topology maps may stop working when the number of hops of connectivity is increased.
When you create a Metrics Collector with incorrect inputs for Smart Assurance, it indicates that the collector is running, even though the connection is not established.
When you provide an incorrect input during collector configuration, the collector appears to have started but it does not start. Verify the log to check the actual collector status.
VMware Telco Cloud Operations currently does not support connections to SAM server with broker authentication, EDAA authentication, and Edge Kafka authentication.

For a workaround, see the Security Recommendation section in the VMware Telco Cloud Operations Deployment Guide.

Note: EDAA related operations including the Acknowledge, Ownership, Server Tools, Browse Details > Containment and Browse Details > Domain Manager are not supported when Smarts Broker is configured in secure mode.
Broker failover is not supported in VMware Telco Cloud Operations.
Primary Broker fails in the Smart Assurance failover environment.

Workaround: When a Broker (multi-broker) failover happens in Smart Assurance, manual intervention is required: log in to VMware Telco Cloud Operations and change the Broker IP address to point to the new Broker.
Procedure:
1. Go to https://<IP address of the Control Plane Node>.
2. Navigate to Administration > Configuration > Smarts Integration.
3. Delete the existing Smarts Integration Details.
4. Re-add the Smarts Integration Details, pointing them to the secondary Broker.
The Statistics - Tunnel reports for SDWAN display an unknown Elastic error if a specific device is not selected in the Edge filter.

Workaround: To avoid the error, remove the ALL option for Edge.

Procedure to disable the ALL option: Statistics Tunnel > Dashboard Settings > Variables > Edge > Disable Include All option.
When Smarts is restarted without repos multiple times, the Viptela ControlNode controller status goes to the OTHER/UNKNOWN state.

Workaround: Use the following command on the control plane node to delete the respective stale Viptela collectors:

kubectl delete deployments.apps <viptela deployment app instance>
The VMware Telco Cloud Operations Health Status Pod report displays empty values for some pods. This indicates that those pods ran for some time and consumed some CPU and memory resources, but no longer exist.

Workaround: To select a smaller time range, go to the Gear icon at the top right of the report, clear the Hide time picker option, and return to the report.
Weekly indexes are not displayed while creating custom reports; only daily and hourly indexes are shown as part of reports.

Workaround:
1. Select Configurations > Data Sources from the left side menu bar.
2. Click Add Data Source.
3. Select Elasticsearch.
4. Enter a relevant name based on the metric type for which the weekly index needs to be created (for example: Week-Network-Interface) and enter the Elasticsearch HTTP URL as http://elasticsearch:9200. For reference, see any other VMware Telco Cloud Operations data source.
5. Enter the Index Name based on the metric type for which the weekly index needs to be created (for example: [vsametrics-week-networkinterface-]YYYY.MM) and select the Pattern "Monthly".
6. Enter timestamp as the Time Field Name and select 7+ as the Version.
7. Keep the default values for the rest of the fields.
8. Click Save & Test.
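
To see which index names the pattern in step 5 resolves to, the following sketch expands a monthly pattern over a date range. It assumes the bracketed text is treated as a literal prefix and YYYY.MM as the year and month, which is how such date-based index patterns are commonly interpreted. The function is illustrative and not part of the product.

```python
from datetime import date
from typing import List


def monthly_indices(pattern: str, start: date, end: date) -> List[str]:
    """Expand a '[literal-]YYYY.MM' pattern into one index name per month in [start, end]."""
    if not pattern.startswith("[") or not pattern.endswith("]YYYY.MM"):
        raise ValueError("expected a pattern of the form [prefix]YYYY.MM")
    prefix = pattern[1:pattern.index("]")]  # text inside the brackets is literal
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}{year:04d}.{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

For a range from mid-November 2021 to early January 2022, the example pattern from step 5 expands to three names ending in 2021.11, 2021.12, and 2022.01.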
The notification count differs between SAM and the VMware Telco Cloud Operations UI because notifications with the Owner field set to SYSTEM are not filtered out. By default, no filters are set in VMware Telco Cloud Operations.

Workaround: In the VMware Telco Cloud Operations Notification Console window, manually apply a filter so that only notifications whose Owner field does not contain SYSTEM are shown:
1. Go to Default Notification Console.
2. Click Customize View.
3. Go to Filters and provide Filter Set Name, for example Filterout SYSTEM Notifications.
4. In the Filter section, add an attribute with the following condition:
  Property = Owner
  Expression = regex
  Value = ~(SYSTEM+)
5. Click Update.
Verify that the Default Notification Console shows only notifications whose owner is not set to SYSTEM. The default notification count must match between SAM and the VMware Telco Cloud Operations UI.
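
The intended effect of the filter can be illustrated with a short sketch. It mimics only the outcome (dropping notifications whose Owner contains SYSTEM) and does not implement the console's pattern syntax. The dictionary shape used for a notification is a hypothetical stand-in.

```python
from typing import Dict, Iterable, List


def exclude_system_owned(notifications: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only notifications whose Owner field does not contain 'SYSTEM'."""
    return [n for n in notifications if "SYSTEM" not in n.get("Owner", "")]
```

Comparing the length of the filtered list with the count shown in SAM is one way to confirm that the two now agree.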
The Netflow-9 Statistics, Netflow-9 Trends, Netflow-5 Statistics, and Netflow-5 Trends reports display the error message "Failed to parse query" with the default time interval of 3 hours.

Workaround: Select a smaller time interval, for example 15 minutes, 30 minutes, or 1 hour.
When the Kafka server is configured with a wrong IP address or the Kafka node goes down during discovery, the VeloCloud discovery hangs for 20 minutes before exiting. This happens even when the messagePollTimeout of the VCO Access setting is set to a lower value.

Workaround: In the esm-param.conf file, add the following line, replacing <kafka ip address> and <time in seconds> with your values, and restart the server.

MessagePollTimeoutPeriodInSeconds-<kafka ip address> <time in seconds>
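
As an illustration of how the placeholders are filled in, the sketch below renders the line from the template above. The helper name is hypothetical, and the address and timeout in the usage note are example values only.

```python
def poll_timeout_line(kafka_ip: str, seconds: int) -> str:
    """Render the esm-param.conf line shown above for one Kafka IP address."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return f"MessagePollTimeoutPeriodInSeconds-{kafka_ip} {seconds}"
```

For instance, poll_timeout_line("192.0.2.20", 60) produces the text MessagePollTimeoutPeriodInSeconds-192.0.2.20 60.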
Containment, Browse Detail, and Notification Acknowledge/Unacknowledge do not work when the primary Tomcat server fails in a Smart Assurance HA environment.
In a Smart Assurance failover deployment, when the primary Tomcat fails, UI operations including Notification Acknowledgement, Containment, Browse Detail, and Domain Managers fail.

Workaround: When the primary Tomcat instance fails in a Smart Assurance failover environment, you can manually point VMware Telco Cloud Operations to a secondary Tomcat instance.
Procedure:
1. Go to https://<IP address of the Control Plane Node>.
2. Navigate to Administration > Configuration > Smarts Integration.
3. Delete the existing Smarts Integration Details.
4. Re-add the Smarts Integration Details, editing the EDAA URL to point to the secondary Tomcat instance.
An error message appears in a Grafana report.
When a user logs out from the Operational UI and tries to launch a report from the Grafana user interface, an error message appears.

Workaround: Refresh or relaunch the Grafana UI to log out.
The SDWAN Flow Top N Summary report displays an error message.
For the SDWAN Flow Top N Summary report, the Grafana Bar Gauge widget does not support substantial time intervals.

Workaround: Set a smaller time interval (24 hours) for the flow reports. To use a substantial time interval, follow this procedure:
1. Click Edit from the report.
2. Expand the Interval in the last row (Date Histogram) of the query, and set it to a higher interval (for example, 7d).
3. Save the report.
The SAM server is listed in the Domain Manager section instead of the Presentation SAM section.
During Smarts integration and configuration, INCHARGE SA (the SAM server) is listed in the Domain Manager section. This problem occurs only when the SAM server is started in Non-EDAA Mode.

Workaround: To have the SAM server listed under the Presentation SAM section, start it in EDAA Mode.
The enrichment stream name field is not editable.
After an enrichment stream is created, the option to edit its name is not available.

Workaround: Choose the Clone option and edit the name.
The node name is not shown for disk usage in the VMware Telco Cloud Operations Health Status Node report.
In the Health Status Node report, the disk usage panel does not specify which Kubernetes cluster node (Controlplane, Arango, Elasticsearch, Domain Manager, Kafka, and so on) the disk usage belongs to.

Workaround:
1. Click Edit in the panel of Disk usage.
2. Click Field tab.
3. Click Display Name and No value fields (no need to enter any value).
  Node names appear.
4. Click Save and Apply.
Some of the DataCenter Summary reports take a long time to display.
On a 100k footprint with 10 million records sent per polling, the DataCenter reports take longer than usual to display.

Workaround: Perform the following procedure on the report side:
1. Reduce the default time interval from 24 hours to 12 hours or 6 hours.
2. If the issue persists, point the data source to the hourly index for the panel that shows the error.
The IP server goes down with ACI and Viptela features discovered on a longevity setup.
This happens only when both ACI and Viptela are discovered in the same IP server.

Workaround: Run separate IP servers for monitoring ACI and Viptela, with a different Kafka topic for each IP server. This issue does not occur if ACI and Viptela monitoring is separated between different IP servers.

For example: an IP1 server monitoring Cisco ACI with the Kafka topic names ACI-Discovery-Topic and ACI-Monitoring-Topic, and an IP2 server monitoring Viptela with the Kafka topic names Viptela-Discovery-Topic and Viptela-Monitoring-Topic.
The Topology pod is down due to a Redis service failure.
In one 100k deployment, the Topology pod is down due to a Redis service failure, and notification sync in VMware Telco Cloud Operations is very slow.

Workaround: Use the following procedure to restart the Redis cluster and the dependent services. On the control plane node, perform these steps:
1. Scale down the events pods: kubectl scale deployment <events_POD> --replicas=0
2. Scale down the topology pods: kubectl scale deployment <topology_POD> --replicas=0
3. Delete the Redis deployment: kubectl delete deployment redis
4. Change to the /home/clusteradmin/kubernetes directory and re-create Redis:
  - kubectl apply -f redis.yaml
5. Once Redis comes up, scale up the events and topology pods:
  - kubectl scale deployment <events_POD> --replicas=1
  - kubectl scale deployment <topology_POD> --replicas=1
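
The order of the steps above matters, so it can help to write them down as data before running anything. The sketch below only builds the command list and executes nothing. The function name is illustrative, and the deployment names passed in are whatever your cluster uses for the events and topology deployments.

```python
from typing import List

REDIS_MANIFEST = "/home/clusteradmin/kubernetes/redis.yaml"  # path from step 4


def redis_restart_plan(events_deployment: str, topology_deployment: str) -> List[List[str]]:
    """Return the kubectl commands from the workaround, in order, as argument lists.

    The manifest is referenced by absolute path instead of changing directory.
    The caller must wait for Redis to come up between the fourth and fifth
    commands; this function does not check pod readiness.
    """
    return [
        ["kubectl", "scale", "deployment", events_deployment, "--replicas=0"],
        ["kubectl", "scale", "deployment", topology_deployment, "--replicas=0"],
        ["kubectl", "delete", "deployment", "redis"],
        ["kubectl", "apply", "-f", REDIS_MANIFEST],
        ["kubectl", "scale", "deployment", events_deployment, "--replicas=1"],
        ["kubectl", "scale", "deployment", topology_deployment, "--replicas=1"],
    ]
```

Printing each entry with " ".join(cmd) gives a checklist to review before running the commands by hand on the control plane node.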
Security Vulnerability
CVE-2021-3449 -- An OpenSSL TLS server may crash if sent a maliciously crafted renegotiation ClientHello message from a client. If a TLSv1.2 renegotiation ClientHello omits the signature_algorithms extension (where it was present in the initial ClientHello), but includes a signature_algorithms_cert extension then a NULL pointer dereference will result, leading to a crash.