VMware Telco Cloud Service Assurance 2.2.0 | 30 MAY 2023 Check for additions and updates to these release notes. |
VMware Telco Cloud Service Assurance is a real-time automated service assurance solution designed to holistically monitor and manage complex virtual and physical infrastructure and services end to end, from the mobile core to the RAN to the edge. From a single pane of glass, VMware Telco Cloud Service Assurance provides cross‑domain, multi‑layer, automated assurance in a multi‑vendor and multi‑cloud environment. It provides operational intelligence to reduce complexity, perform rapid root cause analysis, and see how problems impact services and customers across all layers, lowering costs and improving customer experience.
For information about setting up and using VMware Telco Cloud Service Assurance, see the VMware Telco Cloud Service Assurance Documentation.
VMware Telco Cloud Service Assurance release 2.2.0 brings together various features and enhancements across platforms, networking, and virtual infrastructure management areas. This release introduces the following major features:
VMware Telco Cloud Service Assurance version 2.2.0 enables the discovery and close monitoring of RAN sites running on Samsung vDU network functions. The discovery and monitoring of the underlying infrastructure components of the vDU are achieved through a combination of network management protocols, software-defined networking technologies, and network function virtualization (NFV) techniques. The vDUs are dynamically discovered, and a complete topology diagram is built showing the connectivity to CUs and DUs. This is done using VMware Telco Cloud Automation and mapping the configuration information, interfacing with the Samsung USM REST APIs.
The Remote Data Collector allows you to deploy collectors at remote data center locations. Once deployed, the collectors support seamless remote configuration and remote upgrade from the central VMware Telco Cloud Service Assurance over a secure channel.
VMware Telco Cloud Service Assurance now includes a wizard-driven SNMP Collector User Interface to configure masks, agents, and polling groups. The VMware Telco Cloud Service Assurance Data Collection SDK (Python) has been enhanced to support Topology ingestion, and the VMware Telco Cloud Service Assurance Data Collection SDK (XML) has been enhanced to support vCenter and Stream Collectors.
Data-Driven Alarm, Anomaly, Enrichment, and Remediation Management enables you to define filters and actions based on metrics received by VMware Telco Cloud Service Assurance from remote data collectors and remote Domain Managers.
Flexible Scaling provides you the ability to independently scale up VMware Telco Cloud Service Assurance services on top of the base footprints such as 25K, 50K, and so on. Scaling is performed based on the number of Devices, Events, Metrics, Retention Interval, Alarms, Anomaly, and Enrichment definitions. For example, a 25K base footprint can further be independently scaled to 500K devices, 5000K events, and 80 million metrics.
With the VMware Telco Cloud Service Assurance Sizing Sheet, you can get the recommendation of the footprint to be deployed and scale the deployment based on the following:
The number of devices to be discovered and monitored.
Amount of data and metrics to be collected.
The number of incoming events.
Retention Interval and so on.
For more information, see VMware Telco Cloud Service Assurance Sizing Sheet.
Simplified VM-based demo footprint deployment allows you to deploy VMware Telco Cloud Service Assurance seamlessly for demo deployments. This deployment runs on a RHEL-compatible OS and uses mainstream Kubernetes. The entire deployment procedure is fully automated and easy to run with a minimal set of prerequisites.
Now operators can select events of interest and define rules for manual or automatic remediation, configured using the User Interface or API. The Remediation Rules User Interface supports the following functionalities:
Support for live data.
Simplified User Interface for selecting Actions.
Custom tagging for each rule.
Filtering based on target class name.
Enable or disable Remediation Rules.
Remediation Rules syntax validation to avoid configuration issues.
Flexible 3rd party actions:
Custom JIRA ticket prefix.
CHG request format.
Slack channel name.
Actions can now automatically open a ServiceNow ticket when a notification is in active state and close the same ticket when the notification becomes inactive.
A redesigned Filter component makes it easier for users to create filter conditions based on incoming live data. You can also select tags, metrics, and regular expressions to support advanced filter use cases.
Users can create Custom notifications based on filter criteria and also provide a custom name for each notification for easy identification.
VMware vRealize Operations is rebranded as VMware Aria Operations. In VMware Telco Cloud Service Assurance, the application and documentation rebranding will take effect in a future release.
DCF credentials addition fails when the broker is started on a non-default port.
VMware vRealize Operations service to Kubernetes Master relationship is not getting created when a hostname is used instead of an IP address to discover VMware vRealize Operations.
VMware vRealize Operations discovery is failing when there are special characters configured as part of the VMware vRealize Operations password.
Exceptions observed during vCenter discovery resulting in incomplete discovery of vCenter components.
Remediation action fails on a 100K footprint based deployment.
VMware Telco Cloud Service Assurance deployment is failing when the Harbor password contains special characters.
Service offering not getting cleared along with problem alarm.
Smarts elasticsearch does not start as a service at times.
Cisco ACI device access SNMPAddress keeps appending values.
Issue with NOTIF configuration in SAM 10.1.5 deployment.
License Manager is not getting instrumented for the Cisco Catalyst 36xx stackable Ethernet switch .1.3.6.1.4.1.9.1.2066.
Custom or user-specific Threshold group settings for Memory are not working.
Sub-Interfaces lose instrumentation during post-processing phase of pending discovery and do not come back.
While creating an OSPF domain, an error message appears repeatedly in the log file.
ASL error during reconfigure in IPAM 10.1.2.0.
Uninstall is not removing the default-tcops-scheduledbackup pod.
Prometheus pod is down in Longevity environment.
Many pods are in evicted state in VM mode of VMware Telco Cloud Service Assurance deployment.
You need to configure docker log rotation on all VMs after K8s installation.
Log rotation must be configured on all the VMs to keep the disk usage in check. Without log rotation, the logs generated by pods/containers are not cleaned up and eventually fill up the root file system on the VMs.
The consequence is that the VM becomes unresponsive and VMware Telco Cloud Service Assurance itself can become inaccessible.
Workaround:
To configure docker log rotation on all VMs after K8s installation, follow the procedure:
Configure the log rollover for all the container logs on all the VMs where the native Kubernetes is deployed.
Create a file daemon.json in the /etc/docker directory, with the following contents, on all the VMs where the Kubernetes cluster is deployed:
[tco@node1 ~]$ sudo cat /etc/docker/daemon.json
{
  "data-root": "/var/lib/docker",
  "log-driver": "json-file",
  "log-opts": {
    "max-file": "5",
    "max-size": "50m"
  }
}
Restart the docker systemd service using the following command:
sudo systemctl restart docker
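To confirm that the new logging options took effect after the restart, you can query the Docker daemon. A minimal check, assuming a standard Docker CLI; the container name is a placeholder:
# Print the active log driver (expected output: json-file).
sudo docker info --format '{{.LoggingDriver}}'
# Inspect the effective log options on any running container.
sudo docker inspect --format '{{.HostConfig.LogConfig}}' <container-name>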
Multiple pods are getting into CrashLoopBackOff state in a VM-based VMware Telco Cloud Service Assurance deployment.
Workaround: Manually delete the evicted admin-operator pod using kubectl delete pod; this allows the new pod to come up. A command sketch follows.
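A minimal sketch of the cleanup, assuming the pod runs in the current namespace; the pod name is a placeholder to be read from the first command's output:
# Locate the evicted admin-operator pod.
kubectl get pods | grep admin-operator
# Delete it by name so the controller can schedule a fresh replacement.
kubectl delete pod <admin-operator-pod-name>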
Harbor Registry and Harbor Core pods are getting into Evicted state and restarting in a VM-based demo deployment.
There is no functionality impact, as new Harbor pods are spun up when the restart happens.
Post upgrade from 2.1.0 to 2.2.0, the Enrichment jobs are not getting started.
Workaround:
Log in to VMware Telco Cloud Service Assurance.
Navigate to Administration > Configuration > Enrichment.
If the enrichment jobs are not started post-upgrade, start the Enrichment jobs such as event-stream, metric-stream, and topology-stream manually.
Airflow application is not getting reconciled.
During deployment, the Airflow application is not getting reconciled.
Workaround: If reconciliation for some of the VMware Telco Cloud Service Assurance applications fails with the error (Failed with reason BackoffLimitExceeded), perform the following steps to recover from the failure.
For the Airflow service, delete the jobs using the following command:
kubectl delete job airflow-run-airflow-migrations
kubectl delete job airflow-create-user
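After deleting the jobs, they can be recreated on the next reconciliation attempt. A quick check, assuming the Airflow jobs run in the current namespace:
# Confirm the old jobs are gone and watch for the recreated ones.
kubectl get jobs | grep airflow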
All the classes and instances are not visible in Topology Explorer/Topology Maps.
When a large number of SAMs are added to VMware Telco Cloud Service Assurance, the classes and instances are not visible in Topology Explorer/Topology Maps.
Unable to recognize the base install during MPLS upgrade from 11.1.0.0 to 11.2.0.0.
This is applicable only when versions prior to DM 11.2.0.0 are installed.
Workaround:
Open the /var/.com.zerog.registry.xml file from the machine where Smarts is installed and perform the following steps:
MPLS independent upgrade: Under the <product> tag, search for the old MPLS installation and modify the id from 28322b95-1f3b-11b2-bb9a-eb6ec3979369 to 28322b95-1f3b-11b2-bb9a-eb6ec3979370.
When NPM and MPLS installations are present in the same directory, and the user wants to upgrade NPM and then MPLS: add the below <product> tag content under the <products> tag, just before the MPLS upgrade:
<product name="MPLS" id="28322b95-1f3b-11b2-bb9a-eb6ec3979370" upgrade_id="db14898f-1f3a-11b2-a90c-eb6ec3979369" version="11.1.0.0" copyright="2019" info_url="VMware Inc" support_url="www.http://vmware.com" location="/opt/InCharge" last_modified="2022-07-29 05:02:08"><![CDATA[]]><vendor name="InstallAnywhere" id="2832bde7-1f3b-11b2-bb9a-eb6ec3979369" home_page="http://www.installanywhere.com" email="[email protected]"/><feature short_name="Application" name="Application" last_modified="2022-07-29 05:02:08"><![CDATA[This installs the application feature.]]><component ref_id="db1488e4-1f3a-11b2-a8c7-eb6ec3979369" version="1.0.0.0" location="/tmp/install.dir.4793/./devstat_err-javadoc.jar"/><component ref_id="db1488e3-1f3a-11b2-a8c8-eb6ec3979369" version="1.0.0.0" location="/opt/InCharge/MPLS/jre"/></feature></product>
When MPLS and NPM installations are present in the same directory, and the user wants to upgrade MPLS and then NPM: add the below <product> tag content under the <products> tag, just before the MPLS upgrade:
<product name="MPLS" id="28322b95-1f3b-11b2-bb9a-eb6ec3979370" upgrade_id="db14898f-1f3a-11b2-a90c-eb6ec3979369" version="11.1.0.0" copyright="2019" info_url="VMware Inc" support_url="www.http://vmware.com" location="/opt/InCharge" last_modified="2022-07-29 05:02:08"><![CDATA[]]><vendor name="InstallAnywhere" id="2832bde7-1f3b-11b2-bb9a-eb6ec3979369" home_page="http://www.installanywhere.com" email="[email protected]"/><feature short_name="Application" name="Application" last_modified="2022-07-29 05:02:08"><![CDATA[This installs the application feature.]]><component ref_id="db1488e4-1f3a-11b2-a8c7-eb6ec3979369" version="1.0.0.0" location="/tmp/install.dir.4793/./devstat_err-javadoc.jar"/><component ref_id="db1488e3-1f3a-11b2-a8c8-eb6ec3979369" version="1.0.0.0" location="/opt/InCharge/MPLS/jre"/></feature></product>
Add the below <product> tag before the NPM upgrade, if not present:
<product name="NPM" id="28322b95-1f3b-11b2-bb9a-eb6ec3979369" upgrade_id="db14898f-1f3a-11b2-a90c-eb6ec3979369" version="11.1.0.0" copyright="2019" info_url="VMware Inc" support_url="www.http://vmware.com" location="/opt/InCharge" last_modified="2022-08-02 02:28:34"><![CDATA[]]><vendor name="InstallAnywhere" id="2832bde7-1f3b-11b2-bb9a-eb6ec3979369" home_page="http://www.installanywhere.com" email="[email protected]"/><feature short_name="Application" name="Application" last_modified="2022-08-02 02:28:34"><![CDATA[This installs the application feature.]]><component ref_id="db1488e4-1f3a-11b2-a8c7-eb6ec3979369" version="1.0.0.0" location="/tmp/install.dir.7209/./devstat_err-javadoc.jar"/><component ref_id="db1488e4-1f3a-11b2-a8c7-eb6ec3979369" version="1.0.0.0" location="/opt/InCharge/NPM/_uninst/uninstaller"/></feature></product>
Ensure that the 'location' and 'version' attributes are updated correctly as per the existing Smarts deployment location and version.
Metric file names in filters are not segregating tags and properties.
While creating an alarm, when properties and tags of the incoming metric are the same, duplicate entries appear in the drop-down.
Workaround: Ensure that there are no duplicates in the properties and tags section of incoming metrics.
Adding Enrichment other than default causes the duplication of records in VMware Telco Cloud Service Assurance topics.
When a user adds new enrichers, the VMware Telco Cloud Service Assurance ends up duplicating the records, as all of the records go through default as well as the new enricher.
Alarms are not getting generated for multiple devices when broader filter is given.
A value whose name is longer than the value field is not shown in its entirety.
Post upgrade, the Alarm page throws an error: Alarm list cannot be loaded.
Perform the following pre-upgrade and post-upgrade procedure if the Alarm list is not loaded.
Workaround:
Before Upgrade:
Before you begin the upgrade procedure, ensure that you disable and delete all the Alarming definitions, including the pre-defined definitions from the VMware Telco Cloud Service Assurance UI. To disable and delete the Alarming definitions, navigate to Administration > Alarms Management > Alarming in the VMware Telco Cloud Service Assurance UI.
After Upgrade:
If you have not deleted the alarms as suggested in the Before Upgrade section, perform the following steps (a verification sketch follows the commands):
kubectl get pods | grep alerting-rest
kubectl exec -i <alerting-rest-pod-name> -- bash -c "curl -X DELETE http://localhost:8080/omega-alerting/v1/alert"
kubectl scale deployments alerting-rest --replicas=0
kubectl scale deployments alerting-rest --replicas=1
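To confirm that the service came back after scaling it up, a minimal check, assuming the deployment runs in the current namespace:
# The alerting-rest deployment should report 1/1 ready replicas.
kubectl get deployment alerting-rest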
Alarms and Anomalies do not allow static and user-defined metrics to be used to define alarms and anomalies.
Alarm is not getting generated with range regex filter "<0-9>".
While creating an alarm definition, we are able to get metrics by passing '<0-9>' for a range in regex patterns, but no alarm is generated for it. The other general regex range pattern "[0-9]" works fine.
The Notification resulting from an Alarm does not have the Ticket ID specified in the Notification definition.
"Unsigned plugins were found" warning appears, while adding new datasource.
Events Logs does not display the status as Completed or the End time, even though the Notifications process completed.
When the user executes the command kubectl get pods, stale entries with the status 'Init:Error' or 'Error' appear for init jobs; a cleanup sketch follows.
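These leftover entries can be cleared in bulk. A minimal sketch, assuming the failed init pods are in the current namespace and are safe to delete:
# List pods stuck in a failed phase.
kubectl get pods --field-selector=status.phase=Failed
# Remove all failed pod records in one shot.
kubectl delete pods --field-selector=status.phase=Failed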
Cloudify discovery does not work when complex DCF passwords are passed.
In the out-of-the-box cisco-aci collector, the deviceType/deviceName values are empty.
This is a legacy issue; no reports or functionality are impacted.
Warning message java.io.FileNotFoundException appears in the SNMP config collector logs.
There is no functional impact; the warning messages can be ignored.
Grafana reports are not getting populated with daily or weekly VMware Smart Assurance metrics indexes.
Workaround:
Select Configurations > Data Sources from the left side menu bar.
Click Add Data Source.
Select OpenSearch.
Enter a relevant name based on the metric type for which the weekly index is created, for example, Week-Network-Interface. Use the HTTP URL http://nginx:8099/esdb-proxy/. You can also refer to any other VMware Telco Cloud Service Assurance data sources.
Under Auth, check Skip TLS Verify and Forward OAuth Identity.
Enter the Index Name based on the metric type for which the weekly index is created; check the OpenSearch DB for the availability of the indexes. For example, use [vsametrics-week-networkinterface-]YYYY.MM and select Pattern "Monthly", or use [vsametrics-week-networkinterface-]YYYY and select Pattern "Yearly".
Enter the Time Field Name timestamp and select the Version OpenSearch 1.0.x.
Retain the rest of the fields at their default values.
Click Save & Test.
Now, in the corresponding report settings, change the default data source to the newly created data source. For example, for the report Home > Top 10 Bandwidth Utilization:
Navigate to settings, and change the Datasource "Network-Interface" to "Week-Network-Interface".
Change the metric "CurrentUtilization" to "CurrentUtilization.avg".
Click Save.
Exception appears in the user interface while adding more than 30 SAMs or Domain Managers at a time.
Sometimes the user interface times out with an exception while adding a large number of SAMs or Domain Managers in a single Smarts Integration Create Wizard flow.
Though the timeout happens, the collectors are created successfully. You can validate this by going to the details of the Smarts Integration.
Workaround: To avoid the timeout exception, add a limited number of SAMs or Domain Managers (around 30) in the initial Smarts Integration create wizard flow, and subsequently add additional SAMs or Domain Managers to the existing Smarts Integration.
The SDK REST Custom collector pod is spun up but stays idle when the REST simulator is down.
The SDK REST Custom collector is up and running after the simulator is up.
However, the spun-up REST pods that are in error state remain in that state. These pods do not consume any memory or CPU resources, and there is no functional impact.
ElasticSearch Grafana dashboard cluster status shows YELLOW with the number 23 in the upgraded setup.
The expression for yellow is wrong: it must be +2, not +22.
"expr": "elasticsearch_cluster_health_status{job=\"$job\",instance=\"$instance\",cluster=\"$cluster\",color=\"red\"}==1 or (elasticsearch_cluster_health_status{job=\"$job\",instance=\"$instance\",cluster=\"$cluster\",color=\"green\"}==1)+4 or (elasticsearch_cluster_health_status{job=\"$job\",instance=~\"$instance\",cluster=\"$cluster\",color=\"yellow\"}==1)+22",
Instance count mismatch observed.
Instance count mismatch observed between SAM and VMware Telco Cloud Service Assurance.
Execute the following script from the ArangoDB coordinator pod:
kubectl get pods | grep arangodb-crdn
kubectl exec -it <arangodb-crdn-pod> -- bash
cd /opt/vmware/vsa/infra/install/Scripts
./viewUpdate.sh
After upgrade and migration, the ESM Server hangs.
The ESM Server hangs the first time it starts after an upgrade or migration from earlier versions. The server does not respond to any dmctl commands, and the SAM Console freezes; you are unable to navigate in the SAM Console.
You can stop the ESM Server using dmquit, or kill the ESM Server process, after the ESM Server is started for the first time post upgrade or migration. Once the server is completely stopped, restart the ESM Server. After the server is restarted, the hang issue is no longer observed, and the dmctl commands start responding. A command sketch follows.
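A minimal sketch of the stop/restart cycle, assuming a standard Smarts layout under /opt/InCharge/ESM, the server name INCHARGE-ESM, and the service name ic-esm-server (all three are assumptions; adjust to your deployment):
cd /opt/InCharge/ESM/smarts/bin
# Ask the hung server to quit (dmquit option syntax may vary; check dmquit --help).
./dmquit --server=INCHARGE-ESM
# If the server does not respond, kill the sm_server process instead.
pkill -f 'sm_server.*INCHARGE-ESM'
# Restart the server once it has fully stopped.
./sm_service start ic-esm-server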
Sometimes in Longevity environment, ElasticSearch, Nginx, and Prometheus pods are down in the VMware Telco Cloud Service Assurance.
After some days, a few services are down although the TKG cluster shows healthy.
To restore the pods, follow the procedure:
Get the list of PVCs by executing:
kubectl get pvc | grep elasticsearch
Once you determine which pvc needs to be removed, first delete the finalizers by editing the pvc with the following command:
kubectl edit pvc data-elasticsearch-data-1Y
In edit mode, the object looks like the following. Remove the finalizers lines and save:
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kapp.k14s.io/delete-strategy: orphan
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
    volume.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
  creationTimestamp: "2022-09-13T19:27:52Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    adminoperator.tcx.product/delete-strategy: "true"
    es-operator-dataset: elasticsearch-data
After that, the deletion of the PVC succeeds, and the creation of a new PVC allows the pod to be restored. A command sketch follows.
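A minimal sketch of this final step, reusing the PVC name from the example above (yours will differ):
# Delete the stuck PVC once its finalizers are removed.
kubectl delete pvc data-elasticsearch-data-1Y
# Verify that a new PVC is created and the pod recovers.
kubectl get pvc | grep elasticsearch
kubectl get pods | grep elasticsearch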
Post upgrade, Kibana-init reconcile failed.
There is no functional impact.
VMware Telco Cloud Service Assurance job Instantiation and Terminate status is not shown correctly in the VMware Telco Cloud Automation user interface.
The VMware Telco Cloud Service Assurance job Instantiation and Terminate status is shown as success, even though the VMware Telco Cloud Service Assurance deployment is in progress.
To check the VMware Telco Cloud Service Assurance job Instantiation and Terminate status, use the following kubectl command:
root [ ~/tcx-deployer/scripts ]# kubectl get tcxproduct
Getting an error message during migration of the IP, SAM, and ESM Domain Managers.
The following error message appears during migration of IP, SAM, and ESM, and .conflict files are created for sm_merge and version.pm:
----
Merge Process aborted/opt/InCharge11/IP/smarts/local/bin/system/sm_merge:
line 1: $'\177ELF\002\001\001': command not found/opt/InCharge11/IP/smarts/local/bin/system/sm_merge:
line 2: $'w\267P\343\316\301\024W\026': command not found
------
There is no functional impact, and the errors can be ignored. It is safe to delete the sm_merge.conflict and version.pm.conflict files; they must be deleted before starting the server.
Incremental scale fails when VMware Telco Cloud Service Assurance scale is triggered without the Node or VM scale up.
Post deployment, if the incremental scale is triggered without scaling up the VM or Node, the incremental scale fails with the error: Insufficient CPU capacity.
After that, increase the VM or Node capacity as per the footprint and re-trigger the incremental scale. The incremental scale fails again even though sufficient resource capacity is provided.
Ensure that the Node or VM scale is done as per the destination footprint:
Run the command: kubectl delete validatingwebhookconfiguration admin-operator-webhook
.
Re-trigger the incremental scale operation.
The incremental scale passes and all the apps are scaled up as per the destination footprint specified.
The log_level messages display 'unknown' in Service logs (Kibana logs).
When the user navigates to Administration > Service Logs and clicks on an application's service logs, the log-level filter displays 'unknown' fields for the log_level messages of any service, for example, Apiservice, elasticsearch, and so on.
Unable to delete the cloned console of the default Summary View.
Note: You can perform all required operations using the Edit option.
VMware Telco Cloud Service Assurance currently does not support connections to SAM server when Broker is configured in secure mode.
Currently there is no workaround. The Broker must be configured in non-authenticate mode.
Note: EDAA related operations including the Acknowledge, Ownership, Server Tools, Browse Details > Containment and Browse Details > Domain Manager are not supported when Broker is configured in secure mode.
When the number of hops of connectivity is increased, you may experience performance issues in the topology maps.
There might be performance issues in the rendering of Redundancy Group and SDN connectivity map types in the Map Explorer view. This issue is observed on deployments with a complex topology where the topology maps may stop working when the number of hops of connectivity is increased.
Broker failover is not supported in VMware Telco Cloud Service Assurance.
The primary Broker fails in the Domain Manager failover environment.
Currently, when a Broker (multi-broker) failover happens, manual intervention is required: you need to log in to VMware Telco Cloud Service Assurance and change the Broker IP address to point to the new Broker IP.
Procedure:
Go to https://IPaddress of the Control Plane Node.
Navigate to Administration > Configuration > Smarts Integration.
Delete the existing Smarts Integration Details.
Re-add the Smarts Integration Details by pointing them to the secondary Broker.
Weekly indexes are not displayed while creating custom reports; only daily and hourly indexes are shown as part of reports.
Procedure for workaround:
Select Configurations > Data Sources from the left side menu bar.
Click Add Data Source.
Select Elasticsearch.
Enter a relevant name based on the metric type for which the weekly index needs to be created (for example: Week-Network-Interface), and enter the Elasticsearch HTTP URL http://elasticsearch:9200. You can refer to any other VMware Telco Cloud Service Assurance data sources.
Enter the Index Name based on the metric type for which the weekly index needs to be created ([vsametrics-week-networkinterface-]YYYY.MM) and select Pattern "Monthly".
Enter the Time Field Name timestamp and Version 7+.
Keep the rest of the fields at their default values.
Click Save & Test.
Notification count mismatch between SAM and the VMware Telco Cloud Service Assurance UI due to non-filtering of notifications with the Owner field set to SYSTEM. By default, there are no filters set in VMware Telco Cloud Service Assurance.
Manually apply a filter in the VMware Telco Cloud Service Assurance Notification Console window to filter out notifications whose Owner field contains SYSTEM, by following the below steps:
Go to Default Notification Console.
Click Customize View.
Go to Filters and provide Filter Set Name, for example Filterout SYSTEM Notifications.
In the Filter section, add an attribute with the below condition:
Property = Owner
Expression = regex
Value = ~(SYSTEM+)
Click Update.
Verify that the Default Notification Console has only those notifications whose owner is not set to SYSTEM. The default notification count must match between SAM and the VMware Telco Cloud Service Assurance UI.
The Containment, Browse Detail, and Notification Acknowledge/Unacknowledge operations do not work when the primary Tomcat server fails in an HA environment.
In a failover deployment, when the primary Tomcat fails, the UI operations including Notification Acknowledgement, Containment, Browse Detail, and Domain Managers fail.
When the primary Tomcat instance fails in a failover environment, you can manually point VMware Telco Cloud Service Assurance to a secondary Tomcat instance.
Procedure:
Go to https://IPaddress of the Control Plane Node.
Navigate to Administration > Configuration > Smarts Integration.
Delete the existing Smarts Integration Details.
Re-add the Smarts Integration Details by editing the EDAA URL and pointing it to the secondary Tomcat Instance.
The SAM server is listed in the Domain Manager section instead of the Presentation SAM section.
During Smarts integration and configuration, INCHARGE-SA (the SAM server) is listed in the Domain Manager section. This problem occurs only when the SAM server is started in non-EDAA mode.
To get it listed under the Presentation SAM section, start the SAM server in EDAA mode.
While starting the server, the Map error warning message appears for INCHARGE-SA and INCHARGE-OI in the respective logs.
No functional impacts.
You must discover ESX Servers to get the Virtual Machine Down event. Currently, the Virtual Machine Down event is not generated if the corresponding ESX Servers are not discovered in the IP Server. Therefore, it is recommended to discover the ESX Servers along with the Virtual Machines to get proper root-cause events.
On RHEL 7.8 machines, when SAM services are started, brcontrol shows an IPv6 entry for the servers, due to which communication between the servers is impacted.
On RHEL 7.8, if you start any Domain Manager as a service, the domain gets registered to the Broker using both the v4 and v6 IP address spaces. Due to this issue, the Domain Manager v6 entry goes to the DEAD state in the brcontrol output, and communication between the servers sometimes fails.
Note: Issue also detected on some machines with RHEL 7.2 and 7.6.
To avoid a domain running in v6 mode, allow only v4 by setting the below flag in the runcmd_env.sh file:
SM_IP_VERSIONS=v4
Restart the domain manager after updating the runcmd_env.sh file. A command sketch follows.
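A minimal sketch of the change, assuming the standard Smarts layout where runcmd_env.sh lives under BASEDIR/smarts/local/conf (the SAM path below is an assumption; adjust for your install):
# Append the flag to runcmd_env.sh (path assumed; verify for your deployment).
echo 'SM_IP_VERSIONS=v4' >> /opt/InCharge/SAM/smarts/local/conf/runcmd_env.sh
# After restarting the domain manager as a service, confirm only v4 entries are registered.
brcontrol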
Topology synchronization is taking more than 10 minutes for 25K devices, when latency between SAM and VMware Telco Cloud Service Assurance is more than 5 milliseconds.
When the latency increases, the topology synchronization time increases.
Ensure that the latency between VMware Telco Cloud Service Assurance (Topology Collector) and SAM Presentation server is less than 5 milliseconds.
Notification processing rate is slower, when the latency between SAM and VMware Telco Cloud Service Assurance is greater than 5 milliseconds.
When the latency between VMware Telco Cloud Service Assurance and SAM Presentation server increases, notification processing rate goes down.
Ensure that the latency between VMware Telco Cloud Service Assurance (Notification Collector) and SAM Presentation server is less than 5 milliseconds.
The KPI feature will be supported in an upcoming release of VMware Telco Cloud Service Assurance.