This section describes common troubleshooting steps to follow when discovery or monitoring is not working as expected.

Troubleshooting Methodologies - 1

DCF Collector (Discovery & Monitoring)
  • Getting Collector Id for a VCO:
    • Run "dmctl -s <server name> invoke ICF_PersistentDataSet::VCOTopologyCollectorInstanceIds get"
      • Returns two blocks for each VCO. Each block has a collector Id and a tag that indicates whether it is the discovery or the monitoring collector.
  • Logs and log locations:
    • Collector logs are located at /opt/DCF/Collecting/Collector-Manager/<collector Id>/logs/
  • Logging and troubleshooting:
    • Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/logging.properties to increase the log level
    • Change the default INFO level to one of CONFIG, FINE, FINER, or FINEST (lowest level, most detailed output)
    • The new log level takes effect in the next discovery
  • The Kafka messages can also be configured to be written to a file:
    • Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/collecting.xml. Set enabled="true" for the File connector:
      • <connector enabled="true" name="File" type="File-Connector" config="Collector-Manager/<collector Id>/conf/file-connector.xml" />
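
The File connector change described above can also be scripted. The following is a minimal Python sketch, not part of the product, that sets enabled="true" on every connector element named File. It assumes collecting.xml is well-formed XML; the sample document, its root element, and the second connector are hypothetical placeholders, and only the File connector attributes come from the text above.

```python
import xml.etree.ElementTree as ET


def enable_file_connector(xml_text: str) -> str:
    """Return xml_text with enabled="true" set on every connector named "File"."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Compare only the local tag name so a namespaced document still matches.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "connector" and elem.get("name") == "File":
            elem.set("enabled", "true")
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for collecting.xml; the real file has more content.
SAMPLE = """<collecting>
  <connector enabled="false" name="File" type="File-Connector"
             config="Collector-Manager/example-id/conf/file-connector.xml" />
  <connector enabled="false" name="Example" type="Example-Connector" />
</collecting>"""

if __name__ == "__main__":
    print(enable_file_connector(SAMPLE))
```

ElementTree discards XML comments when it parses a document and may change the formatting when it writes it back, so keep a backup of the original collecting.xml if you adapt this sketch.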

Troubleshooting Methodologies - 2

  • Check that /opt/DCF/Collecting/Stream-Collector/<Collector Id>/velo/conf/requests/context-vco.xml has the correct values (username, password, port, and protocol) for VeloCloud access.
  • For specific discovery or monitoring issues, use the REST API to get the data directly from the VCO and confirm that the VCO is returning the expected values. The base URI for the REST API is https://<VCO IP>/portal/rest/
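
As an illustration of how a query URL is formed from the base URI, the following Python sketch builds a request for one of the API paths listed in Table 1, without sending it. It assumes that the calls are JSON POST requests and that the { "with" : ["edges"] } entry in Table 1 is the request body; the function name and the host address are hypothetical. Sending the request also requires a session established through the login API, which is not shown.

```python
import json
import urllib.request


def vco_request(vco_host: str, api_path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for one VCO REST API call."""
    # Table 1 lists paths with and without a leading slash; accept both forms.
    url = f"https://{vco_host}/portal/rest/{api_path.lstrip('/')}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; replace it with the VCO IP.
    req = vco_request("192.0.2.10", "/network/getNetworkEnterprises", {"with": ["edges"]})
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Comparing the response of such a query with what the collector reports helps narrow down whether a problem lies in the VCO data or in the collection pipeline.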

DCF Collector (Discovery & Monitoring)

Table 1. REST API used for Discovery and monitoring for different user types

| Category | Topology/Status | REST API | User type: Operator | User type: MSP | User type: Enterprise |
|---|---|---|---|---|---|
| Login | Login | /login/operatorLogin | Yes | NA | NA |
| Login | Login | /login/enterpriseLogin | Not used | Yes | Yes |
| Discovery | Enterprises | /network/getNetworkEnterprises | Yes | NA | NA |
| Discovery | Enterprises | enterpriseProxy/getEnterpriseProxyEnterprises | NA | Yes | NA |
| Discovery | Enterprises | enterprise/getEnterprise | Not used | Not used | Yes |
| Discovery | Edge | /network/getNetworkEnterprises { "with" : ["edges"] } | Yes | NA | NA |
| Discovery | Edge | /enterprise/getEnterpriseEdges | Not used | Yes | Yes |
| Discovery | Edge interfaces (WAN and LAN) | /edge/getEdgeConfigurationStack | Yes | Yes | Yes |
| Discovery | Edge links | /enterprise/getEnterpriseEdges | Yes | Yes | Yes |
| Discovery | Gateway | /network/getNetworkGatewayPools | Yes | Yes | NA |
| Discovery | GatewayPool | /network/getNetworkGatewayPools | Yes | Yes | NA |
| Discovery | Non VeloCloud Site and the Tunnels | /enterprise/getEnterpriseDataCenters | Yes | Yes | Yes |
| Discovery | Edge and Gateway relationship | /network/getNetworkGatewayPools (connectedEdgeList field in the response) | Yes | NA | NA |
| Discovery | Edge and Gateway relationship | gateway/getGatewayEdgeAssignments | Not used | Yes | NA |
| Discovery | Edge and Gateway relationship | /edge/getEdgeConfigurationStack | Not used | Not used | Yes |
| Discovery | Edge Clusters | /edge/getEdgeConfigurationStack, /enterprise/getEnterpriseEdgeServices, /enterprise/getEnterpriseServices | Yes | Yes | Yes |
| Discovery | Edge Hub and Spoke Relationship | /edge/getEdgeConfigurationStack, /enterprise/getEnterpriseEdgeServices, /enterprise/getEnterpriseServices | Yes | Yes | Yes |
| Monitoring | Edge Status | enterprise/getEnterpriseEdges | Yes | Yes | Yes |
| Monitoring | Edge link Status | enterprise/getEnterpriseEdges | Yes | Yes | Yes |
| Monitoring | Edge link Performance data | monitoring/getAggregateEdgeLinkMetrics | Yes | Yes | NA |
| Monitoring | Gateway Status | network/getNetworkGatewayPools | Yes | Yes | NA |
| Monitoring | Non VC site Tunnel Status | enterprise/getEnterpriseDataCenters | Yes | Yes | Yes |

Troubleshooting Methodologies - 3

Smarts
  • Logs and log locations:
    • Server logs are located at $INSTALL_DIR/smarts/local/logs
  • Logging and troubleshooting:
    • Enable Discovery logging:
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudDiscoveryDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::IsDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
        Note: This logs all the Kafka messages for all VCOs.
    • Enable Monitoring logging:
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudMonitoringDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
        Note: This logs all the Kafka messages for all VCOs.
  • Make sure the VeloCloud feature is enabled (it is enabled by default):
    • Run dmctl -s <server> get ESM_Manager::ESM-Manager::IsVC
  • Smarts raises a DiscoveryFailure or MonitoringFailure event on the Orchestrator, with the reason for the failure set in the event details.
  • Log analysis:
    • Discovery Kafka message:
      • Sample:
        {"groupName":"group","discoveryID":null,"jobID":"9999","type":"Tenant","timestamp":1554472776,"value":0.0,"action":"r","properties":{"VelocloudOrchestrator":"10.106.124.199","Tenant-name":"testTenant_1","source":"velocloud-sdwan4dfefd0c-c8b6-41ac-a83f-96776c9001c4","type":"Tenant","Tenant-Id":"1","EnterpriseLogicalId":"c8cc3bc7-df6e-407c-ac30-352b4b3de93a"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":8080.0}},"relations":[{"type":"Orchestrator","element":"VCO-a.b.c.d","relationName":"SupervisedBy"}],"initialized":true,"forceRefresh":true,"name":"testTenant_1"}
        Creates Tenant::testTenant_1
        Creates Orchestrator::VCO-a.b.c.d
        Establishes the relationship Tenant::testTenant_1 SupervisedBy Orchestrator::VCO-a.b.c.d
        
    • Monitoring Kafka message:
      • Sample:
        {"groupName":"group","discoveryID":null,"jobID":"9999","type":"VGateway","timestamp":1554753805,"value":0.0,"action":"\u0000","properties":{"instanceName":"GatewayName","attributes":"Status,IsActivated","source":"velocloud-sdwan048048cb-701e-440d-85e9-e46110965181","GatewayName":"VGateway-OneCloud-GW1-Bangalore","type":"VGateway","attributevalues":"OK,TRUE"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":0.0}},"relations":[],"initialized":true,"forceRefresh":false,"name":"VGateway-OneCloud-GW1-Bangalore"}
        VGateway::VGateway-OneCloud-GW1-Bangalore is the instance name
        Status is OK
        IsActivated is TRUE
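
The interpretation of the two sample messages above can be sketched in Python. This is an illustrative reading of the JSON fields shown in the samples, not the logic Smarts itself uses. It assumes that a message carrying an "attributes" property is a monitoring message and that any other message is a discovery message; the function name is hypothetical, and the two inputs are trimmed copies of the samples.

```python
import json


def summarize(message: str) -> list[str]:
    """Describe what one discovery or monitoring message reports."""
    msg = json.loads(message)
    instance = f'{msg["type"]}::{msg["name"]}'
    props = msg.get("properties", {})
    lines = []
    if "attributes" in props:
        # Monitoring: attribute names and values are parallel comma-separated lists.
        names = props["attributes"].split(",")
        values = props["attributevalues"].split(",")
        lines.append(f"{instance} is the instance name")
        lines += [f"{name} is {value}" for name, value in zip(names, values)]
    else:
        # Discovery: one instance plus the instances it is related to.
        lines.append(f"Creates {instance}")
        for rel in msg.get("relations", []):
            target = f'{rel["type"]}::{rel["element"]}'
            lines.append(f"Creates {target}")
            lines.append(f'Establishes {instance} {rel["relationName"]} {target}')
    return lines


# Trimmed copies of the sample messages, keeping only the fields used here.
DISCOVERY = (
    '{"type":"Tenant","name":"testTenant_1","properties":{"Tenant-Id":"1"},'
    '"relations":[{"type":"Orchestrator","element":"VCO-a.b.c.d",'
    '"relationName":"SupervisedBy"}]}'
)
MONITORING = (
    '{"type":"VGateway","name":"VGateway-OneCloud-GW1-Bangalore","properties":'
    '{"attributes":"Status,IsActivated","attributevalues":"OK,TRUE"},"relations":[]}'
)

if __name__ == "__main__":
    for sample in (DISCOVERY, MONITORING):
        print("\n".join(summarize(sample)))
```

Running such a summary over messages taken from the debug log or the File connector output makes it easier to spot an instance or attribute value that differs from what the VCO reports.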
        
  1. DCF Health Check
    1. How does the user know that the Orchestrator is not accepting ICMP pings?
    2. Have the VeloCloud collectors been deployed and enabled?
    3. Is the DCF connection to Kafka alive?
    4. Can DCF communicate with the Orchestrator?
    5. Can the UI communicate with DCF and the Orchestrator?
    6. Is DCF collecting data from the data source?
    7. When did Smarts discovery and monitoring last run?
    8. When did DCF last collect data from the data source?
    9. When did DCF last publish data to Kafka?

      The logs contain all operational data. For details, check the specific collector log at <DCF_Install>/Collecting/Collector-Manager/<Collector-ID>/logs/collecting-0-0.log

  2. KAFKA Health Check
    1. Kafka topics created
    2. Kafka messages not published from DCF or the KPI engine
    3. Kafka authentication issues
    4. Kafka not reachable

      The logs contain all operational data. For details, check the specific collector log at <DCF_Install>/Collecting/Collector-Manager/<Collector-ID>/logs/collecting-0-0.log
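
For the reachability questions in the two health checks above, a plain TCP connection test from the DCF host is a quick first step before reading the logs. The following Python sketch is a generic check, not a DCF or Kafka tool, and the function names are hypothetical. It only shows that a port accepts connections; it does not test ICMP, authentication, or topic configuration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_targets(targets: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each label to whether its host:port accepted a TCP connection."""
    return {label: tcp_reachable(host, port) for label, (host, port) in targets.items()}
```

A call such as check_targets({"Kafka": ("kafka.example.com", 9092), "VCO": ("vco.example.com", 443)}) returns one True or False per target. The host names are placeholders, and the ports assume the default Kafka listener and HTTPS access to the VCO; adjust both to match the deployment.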

  3. Smarts Domain Managers
    1. Getting Collector Id for a VCO:

      Run "dmctl -s <server name> invoke ICF_PersistentDataSet::VCOTopologyCollectorInstanceIds get"

      The command returns two blocks for each VCO. Each block has a collector Id and a tag that indicates whether it is the discovery or the monitoring collector.

    2. DCF Logs and log locations:

      Collector logs are located at /opt/DCF/Collecting/Collector-Manager/<collector Id>/logs/

    3. DCF Logging and troubleshooting:

      Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/logging.properties to increase the log level.

      Change the default INFO level to one of CONFIG, FINE, FINER, or FINEST (lowest level, most detailed output).

      The new log level takes effect in the next discovery.

    4. The Kafka messages can also be configured to be written to a file.

      Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/collecting.xml.

      Set enabled="true" for the File connector:

      <connector enabled="true" name="File" type="File-Connector" config="Collector-Manager/<collector Id>/conf/file-connector.xml" />

    5. Check that /opt/DCF/Collecting/Stream-Collector/<Collector Id>/velo/conf/requests/context-vco.xml has the correct values (username, password, port, and protocol) for VeloCloud access.
    6. For specific discovery or monitoring issues, use the REST API to get the data directly from the VCO and confirm that the VCO is returning the expected values. The base URI for the REST API is https://<VCO IP>/portal/rest/. A REST client (for example, Postman, Insomnia, or SoapUI) can be used to issue these REST queries.
      Note: The VCO REST APIs used for discovery and monitoring for the different user types are listed in Table 1, REST API used for Discovery and monitoring for different user types.
    7. Smarts Logging:
      1. Server logs are located at $INSTALL_DIR/smarts/local/logs
        1. Enable Discovery logging:
          1. Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudDiscoveryDebug TRUE
          2. Run dmctl -s <server> put ESM_Manager::ESM-Manager::IsDebug TRUE
          3. Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
        2. Enable Monitoring Logging:
          1. Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudMonitoringDebug TRUE
          2. Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
            Note: This logs all the Kafka messages for all VCOs.
    8. Smarts Log analysis.
      1. Discovery Kafka message:

        -Sample: {"groupName":"group","discoveryID":null,"jobID":"9999","type":"Tenant","timestamp":1554472776,"value":0.0,"action":"r","properties":{"VelocloudOrchestrator":"10.106.124.199","Tenant-name":"testTenant_1","source":"velocloud-sdwan4dfefd0c-c8b6-41ac-a83f-96776c9001c4","type":"Tenant","Tenant-Id":"1","EnterpriseLogicalId":"c8cc3bc7-df6e-407c-ac30-352b4b3de93a"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":8080.0}},"relations":[{"type":"Orchestrator","element":"VCO-a.b.c.d","relationName":"SupervisedBy"}],"initialized":true,"forceRefresh":true,"name":"testTenant_1"}

        Creates Tenant::testTenant_1

        Creates Orchestrator::VCO-a.b.c.d

        Establishes the relationship Tenant::testTenant_1 SupervisedBy Orchestrator::VCO-a.b.c.d

      2. Monitoring Kafka message:

        -Sample: {"groupName":"group","discoveryID":null,"jobID":"9999","type":"VGateway","timestamp":1554753805,"value":0.0,"action":"\u0000","properties":{"instanceName":"GatewayName","attributes":"Status,IsActivated","source":"velocloud-sdwan048048cb-701e-440d-85e9-e46110965181","GatewayName":"VGateway-OneCloud-GW1-Bangalore","type":"VGateway","attributevalues":"OK,TRUE"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":0.0}},"relations":[],"initialized":true,"forceRefresh":false,"name":"VGateway-OneCloud-GW1-Bangalore"}

        VGateway::VGateway-OneCloud-GW1-Bangalore is the instance name

        Status is OK

        IsActivated is TRUE

    9. Make sure the VeloCloud feature is enabled (it is enabled by default):

      Run dmctl -s <server> get ESM_Manager::ESM-Manager::IsVC

    10. Smarts ESM server raises a DiscoveryFailure/MonitoringFailure event on the Orchestrator instance when there is an internal error or communication failure between the components. The event has the reason for the failure set in the event details.
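
The dmctl settings listed in the Smarts logging steps above can be collected into a small script so that they are applied consistently. The following Python sketch only builds the argument lists from the commands given in this section; the helper names and the server name are hypothetical, and run_all assumes that dmctl is on the PATH and returns a non-zero exit status on failure.

```python
import subprocess

ESM = "ESM_Manager::ESM-Manager"

# Attribute settings taken from the discovery and monitoring logging steps.
DISCOVERY_DEBUG = [
    ("EnableVeloCloudDiscoveryDebug", "TRUE"),
    ("IsDebug", "TRUE"),
    ("LogLevel", "DEBUG"),
]
MONITORING_DEBUG = [
    ("EnableVeloCloudMonitoringDebug", "TRUE"),
    ("LogLevel", "DEBUG"),
]


def put_commands(server: str, settings: list[tuple[str, str]]) -> list[list[str]]:
    """Build one dmctl 'put' argument list per attribute setting."""
    return [
        ["dmctl", "-s", server, "put", f"{ESM}::{attr}", value]
        for attr, value in settings
    ]


def get_command(server: str, attr: str) -> list[str]:
    """Build the dmctl 'get' argument list for one ESM-Manager attribute."""
    return ["dmctl", "-s", server, "get", f"{ESM}::{attr}"]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order; raise on the first non-zero exit status."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # EXAMPLE-ESM is a placeholder server name; printing is a dry run.
    server = "EXAMPLE-ESM"
    for cmd in put_commands(server, DISCOVERY_DEBUG) + [get_command(server, "IsVC")]:
        print(" ".join(cmd))
```

Printing the commands first, as the dry run does, allows them to be reviewed before run_all executes them against a live server.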