Follow these common troubleshooting steps if discovery or monitoring is not working.

Troubleshooting Methodologies - 1

DCF Collector (Discovery & Monitoring)
  • Getting Collector Id for a VCO
    • Run "dmctl -s <server name> invoke ICF_PersistentDataSet::VCOTopologyCollectorInstanceIds get"
      • Returns two blocks for each VCO. Each block has a collector Id and a tag that indicates whether it is the discovery or the monitoring collector.
  • Logs and log locations:
    • Collector logs are located at /opt/DCF/Collecting/Collector-Manager/<collector Id>/logs/
  • Logging and troubleshooting:
    • Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/logging.properties to increase the log level.
    • Change the default INFO level to one of CONFIG, FINE, FINER, or FINEST (lowest value, most detailed output).
    • The new log level takes effect at the next discovery.
  • The Kafka messages can also be written to a file:
    • Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/collecting.xml and set enabled="true" for the File connector:
      • <connector enabled="true" name="File" type="File-Connector" config="Collector-Manager/<collector Id>/conf/file-connector.xml" />
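The log-level change described above can be scripted. The following is a minimal sketch, assuming that logging.properties stores levels as java.util.logging style entries such as `.level = INFO` or `<logger>.level=INFO`; that key layout is an assumption and should be checked against the actual file. The function name `set_log_level` is hypothetical.

```python
import re

# Levels named in the text above; FINEST is the most detailed.
VALID_LEVELS = {"INFO", "CONFIG", "FINE", "FINER", "FINEST"}

# Assumption: levels are stored as "<key>.level = INFO" (java.util.logging style).
_LEVEL_LINE = re.compile(
    r"^([ \t]*[^#\s=]*\.level[ \t]*=[ \t]*)INFO[ \t]*$", re.MULTILINE
)


def set_log_level(properties_text: str, new_level: str) -> str:
    """Return the properties text with every '*.level = INFO' entry set to new_level."""
    if new_level not in VALID_LEVELS:
        raise ValueError(f"unsupported level: {new_level}")
    # Commented-out lines (starting with '#') are left untouched.
    return _LEVEL_LINE.sub(lambda m: m.group(1) + new_level, properties_text)
```

To apply it, read the collector's logging.properties, pass the text through `set_log_level(text, "FINE")`, and write the result back. Keeping a copy of the original file makes it easy to return to INFO afterwards.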

Troubleshooting Methodologies - 2

DCF Collector (Discovery & Monitoring)
  • Check that /opt/DCF/Collecting/Stream-Collector/<Collector Id>/velo/conf/requests/context-vco.xml has the correct values (username, password, port, and protocol) for VeloCloud access.
  • For specific discovery or monitoring issues, use the REST API to get the data directly from the VCO and confirm that the VCO returns the expected values. The base URI for the REST API is https://<VCO IP>/portal/rest/
Object | REST API for Discovery | POST Payload
Tenant | network/getNetworkEnterprises | {"with": ["edges"]}
Gateway Pools | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
Gateways | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
Edge | enterprise/getEnterpriseEdges | {"enterpriseId": <enterprise_id>,"with":["recentLinks"]}
Edge | edge/getEdgeConfigurationStack | {"enterpriseId": <entId>, "edgeId": <edgeId>}
Non VC Site | enterprise/getEnterpriseDataCenters | {"enterpriseId": <enterprise_id>}
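As an alternative to a graphical REST client, a request from the table above can be assembled in a few lines of Python. This is a sketch only: it builds the POST request but does not send it, because authentication against the VCO is not covered in this section and is deliberately left out. The function name `build_vco_request` and the address 192.0.2.10 are placeholders.

```python
import json
import urllib.request

# Endpoints copied from the discovery table above.
DISCOVERY_APIS = {
    "Tenant": "network/getNetworkEnterprises",
    "Gateway Pools": "network/getNetworkGatewayPools",
    "Edge": "enterprise/getEnterpriseEdges",
    "Non VC Site": "enterprise/getEnterpriseDataCenters",
}


def build_vco_request(vco_ip: str, api: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request against the VCO REST base URI."""
    url = f"https://{vco_ip}/portal/rest/{api}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder VCO address; the payload is the Tenant row of the table.
req = build_vco_request("192.0.2.10", DISCOVERY_APIS["Tenant"], {"with": ["edges"]})
print(req.get_method(), req.full_url)
```

Once an authenticated session is available, the same request object can be passed to `urllib.request.urlopen` and the JSON response compared with what the collector reports.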

Troubleshooting Methodologies - 3

DCF Collector (Discovery & Monitoring)
  • Monitoring REST APIs with payload.
  • A REST client (e.g., Postman, Insomnia, or SoapUI) can be used to issue these REST queries.
Object | REST API for Monitoring | POST Payload
Gateways | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
Edge | enterprise/getEnterpriseEdges (also used to get link status) | {"enterpriseId": <enterprise_id>,"with":["recentLinks"]}
Non VC Site | enterprise/getEnterpriseDataCenters | {"enterpriseId": <enterprise_id>}

Troubleshooting Methodologies - 4

Smarts
  • Logs and log locations:
    • Server logs are located at $INSTALL_DIR/smarts/local/logs
  • Logging and troubleshooting:
    • Enable Discovery logging:
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudDiscoveryDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::IsDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
        Note: This logs all the Kafka messages for all VCOs.
    • Enable Monitoring logging:
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudMonitoringDebug TRUE
      • Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
        Note: This logs all the Kafka messages for all VCOs.
  • Make sure the VeloCloud feature is enabled (enabled by default):
    • Run dmctl -s <server> get ESM_Manager::ESM-Manager::IsVC
  • Smarts raises a DiscoveryFailure or MonitoringFailure event on the Orchestrator, with the reason for the failure set in the event details.
  • Log analysis:
    • Discovery Kafka message:
      • Sample:
        {"groupName":"group","discoveryID":null,"jobID":"9999","type":"Tenant","timestamp":1554472776,"value":0.0,"action":"r","properties":{"VelocloudOrchestrator":"10.106.124.199","Tenant-name":"testTenant_1","source":"velocloud-sdwan4dfefd0c-c8b6-41ac-a83f-96776c9001c4","type":"Tenant","Tenant-Id":"1","EnterpriseLogicalId":"c8cc3bc7-df6e-407c-ac30-352b4b3de93a"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":8080.0}},"relations":[{"type":"Orchestrator","element":"VCO-a.b.c.d","relationName":"SupervisedBy"}],"initialized":true,"forceRefresh":true,"name":"testTenant_1"}
        Creates Tenant::testTenant_1
        Creates Orchestrator::VCO-a.b.c.d
        Establishes the relationship Tenant::testTenant_1 SupervisedBy Orchestrator::VCO-a.b.c.d
        
    • Monitoring Kafka message:
      • Sample:
        {"groupName":"group","discoveryID":null,"jobID":"9999","type":"VGateway","timestamp":1554753805,"value":0.0,"action":"\u0000","properties":{"instanceName":"GatewayName","attributes":"Status,IsActivated","source":"velocloud-sdwan048048cb-701e-440d-85e9-e46110965181","GatewayName":"VGateway-OneCloud-GW1-Bangalore","type":"VGateway","attributevalues":"OK,TRUE"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":0.0}},"relations":[],"initialized":true,"forceRefresh":false,"name":"VGateway-OneCloud-GW1-Bangalore"}
        VGateway::VGateway-OneCloud-GW1-Bangalore is the instance name
        Status is OK
        IsActivated is TRUE
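
When many messages have to be read, the interpretation shown for the two samples above can be automated. The sketch below is one reading of the sample fields, not the Smarts implementation: it assumes that "type" and "name" identify the instance, that each entry in "relations" names a related object, and that "attributes" and "attributevalues" are parallel comma-separated lists. The function names are hypothetical.

```python
import json


def discovery_actions(message: str) -> list[str]:
    """List the objects and relationships implied by a discovery message."""
    msg = json.loads(message)
    instance = f'{msg["type"]}::{msg["name"]}'
    actions = [f"create {instance}"]
    for rel in msg.get("relations", []):
        target = f'{rel["type"]}::{rel["element"]}'
        actions.append(f"create {target}")
        actions.append(f'relate {instance} {rel["relationName"]} {target}')
    return actions


def monitoring_values(message: str) -> tuple[str, dict[str, str]]:
    """Return (instance, {attribute: value}) from a monitoring message."""
    msg = json.loads(message)
    props = msg["properties"]
    # "instanceName" holds the name of the property that carries the instance name.
    instance = f'{msg["type"]}::{props[props["instanceName"]]}'
    names = props["attributes"].split(",")
    values = props["attributevalues"].split(",")
    return instance, dict(zip(names, values))


# Trimmed versions of the two samples above.
discovery_sample = json.dumps({
    "type": "Tenant",
    "name": "testTenant_1",
    "relations": [{"type": "Orchestrator", "element": "VCO-a.b.c.d",
                   "relationName": "SupervisedBy"}],
})
monitoring_sample = json.dumps({
    "type": "VGateway",
    "properties": {"instanceName": "GatewayName",
                   "attributes": "Status,IsActivated",
                   "GatewayName": "VGateway-OneCloud-GW1-Bangalore",
                   "attributevalues": "OK,TRUE"},
})

for action in discovery_actions(discovery_sample):
    print(action)
print(monitoring_values(monitoring_sample))
```

For the trimmed samples, the discovery message yields the same three steps listed above (two creations and one SupervisedBy relationship), and the monitoring message yields the VGateway instance with Status OK and IsActivated TRUE.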
        
  1. DCF Health Check
    1. How does the user know that the Orchestrator is not accepting ICMP pings?
    2. Have the VeloCloud collectors been deployed and enabled?
    3. Is the DCF connection to Kafka alive?
    4. Can DCF communicate with the Orchestrator?
    5. Can the UI communicate with DCF and the Orchestrator?
    6. Is DCF collecting data from the data source?
    7. When did Smarts discovery and monitoring last run?
    8. When did DCF last collect data from the data source?
    9. When did DCF last publish data to Kafka?

      Logs contain all operational data. For full details, check the specific collector log at <DCF_Install>/Collecting/Collector-Manager/<Collector-ID>/logs/collecting-0-0.log
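
A rough first answer to the "when did DCF last do anything" questions above is the time at which the collector log was last written. The sketch below uses the file modification time as that proxy. This is an assumption: it shows only that the collector wrote something, not what it wrote, so the log content still has to be read for details. The function names are hypothetical.

```python
import os
from datetime import datetime, timezone


def last_log_write(log_path: str) -> datetime:
    """Return the UTC time at which the log file was last modified."""
    return datetime.fromtimestamp(os.path.getmtime(log_path), tz=timezone.utc)


def is_stale(log_path: str, max_age_seconds: float) -> bool:
    """Return True if the log has not been written within max_age_seconds."""
    age = datetime.now(timezone.utc) - last_log_write(log_path)
    return age.total_seconds() > max_age_seconds
```

For example, `is_stale(path_to_collecting_log, 3600)` returns True when the collector has logged nothing for more than an hour, which is a hint to look at the collector state before digging into Kafka or Smarts.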

  2. Kafka Health Check
    1. Have the Kafka topics been created?
    2. Are Kafka messages being published from DCF and the KPI engine?
    3. Are there Kafka authentication issues?
    4. Is Kafka reachable?

      Logs contain all operational data. For full details, check the specific collector log at <DCF_Install>/Collecting/Collector-Manager/<Collector-ID>/logs/collecting-0-0.log
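
The reachability check in the list above can be approximated with a plain TCP connection test from the DCF host. This is a minimal sketch: it shows only that the broker port accepts connections and says nothing about topics or authentication. The function name `tcp_reachable` is hypothetical, and the broker host and port must be taken from the actual deployment.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A call such as `tcp_reachable("<kafka host>", 9092)` (9092 being the usual Kafka default port, which may differ in a given installation) separates a network or firewall problem from a topic or authentication problem.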

  3. Smarts Domain Managers
    1. Getting Collector Id for a VCO:

      Run "dmctl -s <server name> invoke ICF_PersistentDataSet::VCOTopologyCollectorInstanceIds get"

      Returns two blocks for each VCO. Each block has a collector Id and a tag that indicates whether it is the discovery or the monitoring collector.

    2. DCF Logs and log locations:

      Collector logs are located at /opt/DCF/Collecting/Collector-Manager/<collector Id>/logs/

    3. DCF Logging and troubleshooting:

      Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/logging.properties to increase the log level.

      Change the default INFO level to one of CONFIG, FINE, FINER, or FINEST (lowest value, most detailed output).

      The new log level takes effect at the next discovery.

    4. The Kafka messages can also be written to a file.

      Edit /opt/DCF/Collecting/Collector-Manager/<collector Id>/conf/collecting.xml and set enabled="true" for the File connector:

      <connector enabled="true" name="File" type="File-Connector" config="Collector-Manager/<collector Id>/conf/file-connector.xml" />

    5. Check that /opt/DCF/Collecting/Stream-Collector/<Collector Id>/velo/conf/requests/context-vco.xml has the correct values (username, password, port, and protocol) for VeloCloud access.
    6. For specific discovery or monitoring issues, use the REST API to get the data directly from the VCO and confirm that the VCO returns the expected values. The base URI for the REST API is https://<VCO IP>/portal/rest/. A REST client (e.g., Postman, Insomnia, or SoapUI) can be used to issue these REST queries.
    7. Discovery REST APIs with payload.

      Object | REST API for Discovery | POST payload
      Tenant | network/getNetworkEnterprises | {"with": ["edges"]}
      Gateway Pools | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
      Gateways | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
      Edge | enterprise/getEnterpriseEdges | {"enterpriseId": <enterprise_id>,"with":["recentLinks"]}
      Edge | edge/getEdgeConfigurationStack | {"enterpriseId": <entId>, "edgeId": <edgeId>}
      Non VC Site | enterprise/getEnterpriseDataCenters | {"enterpriseId": <enterprise_id>}

    8. Monitoring REST APIs with payload.

      Object | REST API for Monitoring | POST payload
      Gateways | network/getNetworkGatewayPools | {"with": ["gateways", "enterprises"]}
      Edge | enterprise/getEnterpriseEdges (also used to get link status) | {"enterpriseId": <enterprise_id>,"with":["recentLinks"]}
      Non VC Site | enterprise/getEnterpriseDataCenters | {"enterpriseId": <enterprise_id>}

    9. Smarts Logging
      1. Server logs are located at $INSTALL_DIR/smarts/local/logs
      2. Enable Discovery logging
        1. Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudDiscoveryDebug TRUE
        2. Run dmctl -s <server> put ESM_Manager::ESM-Manager::IsDebug TRUE
        3. Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
      3. Enable Monitoring logging
        1. Run dmctl -s <server> put ESM_Manager::ESM-Manager::EnableVeloCloudMonitoringDebug TRUE
        2. Run dmctl -s <server> put ESM_Manager::ESM-Manager::LogLevel DEBUG
          Note: This logs all the Kafka messages for all VCOs.
    10. Smarts Log analysis
      1. Discovery Kafka message:

        -Sample: {"groupName":"group","discoveryID":null,"jobID":"9999","type":"Tenant","timestamp":1554472776,"value":0.0,"action":"r","properties":{"VelocloudOrchestrator":"10.106.124.199","Tenant-name":"testTenant_1","source":"velocloud-sdwan4dfefd0c-c8b6-41ac-a83f-96776c9001c4","type":"Tenant","Tenant-Id":"1","EnterpriseLogicalId":"c8cc3bc7-df6e-407c-ac30-352b4b3de93a"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":8080.0}},"relations":[{"type":"Orchestrator","element":"VCO-a.b.c.d","relationName":"SupervisedBy"}],"initialized":true,"forceRefresh":true,"name":"testTenant_1"}

        Creates Tenant::testTenant_1

        Creates Orchestrator::VCO-a.b.c.d

        Establishes the relationship Tenant::testTenant_1 SupervisedBy Orchestrator::VCO-a.b.c.d

      2. Monitoring Kafka message:

        -Sample: {"groupName":"group","discoveryID":null,"jobID":"9999","type":"VGateway","timestamp":1554753805,"value":0.0,"action":"\u0000","properties":{"instanceName":"GatewayName","attributes":"Status,IsActivated","source":"velocloud-sdwan048048cb-701e-440d-85e9-e46110965181","GatewayName":"VGateway-OneCloud-GW1-Bangalore","type":"VGateway","attributevalues":"OK,TRUE"},"metrics":{"VCO-Req":{"properties":{"unit":"code","name":"VCO-Req"},"value":0.0}},"relations":[],"initialized":true,"forceRefresh":false,"name":"VGateway-OneCloud-GW1-Bangalore"}

        VGateway::VGateway-OneCloud-GW1-Bangalore is the instance name

        Status is OK

        IsActivated is TRUE

    11. Make sure the VeloCloud feature is enabled (enabled by default)

      Run dmctl -s <server> get ESM_Manager::ESM-Manager::IsVC

    12. Smarts ESM server raises a DiscoveryFailure/MonitoringFailure event on the Orchestrator instance when there is an internal error or communication failure between the components. The event has the reason for the failure set in the event details.
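
The dmctl put commands listed under Smarts Logging are easy to mistype (a stray space or a typographic dash is enough to break them), so it can help to generate them. The sketch below builds the argument lists for the discovery debug settings using only the attribute names given in this document. The helper names and the server name in the usage note are hypothetical, and `run_all` assumes that dmctl is on the PATH.

```python
import subprocess

ESM = "ESM_Manager::ESM-Manager"


def dmctl_put(server: str, attribute: str, value: str) -> list[str]:
    """Build the argument list for: dmctl -s <server> put <ESM>::<attribute> <value>."""
    return ["dmctl", "-s", server, "put", f"{ESM}::{attribute}", value]


def discovery_debug_commands(server: str) -> list[list[str]]:
    """Commands that enable discovery debug logging, in the documented order."""
    return [
        dmctl_put(server, "EnableVeloCloudDiscoveryDebug", "TRUE"),
        dmctl_put(server, "IsDebug", "TRUE"),
        dmctl_put(server, "LogLevel", "DEBUG"),
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order; raises CalledProcessError on a non-zero exit."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(discovery_debug_commands("<server>"))` on the Smarts host applies the three settings in sequence. Passing the arguments as a list avoids shell quoting issues.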