The SoS tool is a command-line Python tool used primarily to collect logs and take configuration backups from all of the components in your Cloud Foundation environment.

The SoS tool is installed in /opt/vmware/evosddc-support in the SDDC Manager instance's file system. Only the root account can run the SoS tool. To run a command, change to the /opt/vmware/evosddc-support directory and type ./sos followed by the options required for your desired operation.

Note:

When using the tool to collect logs, it is preferable to initiate the command in the SDDC Manager instance that has the VIP address assigned to it.

When using the tool to take configuration backups:

  • Initiating the command in the SDDC Manager instance that is assigned the VIP address saves backup configurations for all of the racks in the installation.

  • Initiating the command in an SDDC Manager instance that is not assigned the VIP address saves backup configurations only for the rack in which that instance is deployed.

For a description of the VIP address and how to determine which SDDC Manager instance has it, see About the Primary Rack and the SDDC Manager Virtual IP Address.

./sos --option-1 --option-2 --option-3 ... --option-n

To list the available command options, use the --help long option or the -h short option.

./sos --help
./sos -h
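
The invocation pattern described above (change to the installation directory, then run ./sos with the required options as root) could be wrapped in a small script. The following is a minimal Python sketch and is not part of the SoS tool. The helper names build_sos_command and run_sos are hypothetical; only the installation path and the --help option are taken from this topic.

```python
import os
import subprocess

# Installation directory stated in this topic.
SOS_DIR = "/opt/vmware/evosddc-support"


def build_sos_command(*options):
    """Return the argument list for ./sos followed by the given options."""
    return ["./sos", *options]


def run_sos(*options, dry_run=True):
    """Run ./sos from its installation directory, or only return the command.

    dry_run defaults to True so that the sketch has no side effects
    (hypothetical helper, not an SoS feature).
    """
    command = build_sos_command(*options)
    if dry_run:
        return command
    if os.geteuid() != 0:
        # This topic states that only the root account can run the tool.
        raise PermissionError("the SoS tool must be run as root")
    # cwd makes the relative ./sos path resolve inside SOS_DIR.
    subprocess.run(command, cwd=SOS_DIR, check=True)
    return command


print(" ".join(run_sos("--help")))
```

With dry_run left at its default, the sketch only prints the command line it would run, which makes it safe to try outside an SDDC Manager instance.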

Log files for the vRealize Log Insight agent in vCenter Server are collected when vCenter Server log files are collected.

Note:

You can specify some options in the conventional GNU/POSIX syntax, using -- for the long option and - for the short option.

SoS Options for Information About the SoS Tool

Use these options to see information about the SoS tool itself.

Table 1. SoS Information Options

Option

Description

--help

-h

Provides a summary of the available SoS tool options.

--version

-v

Provides the SoS tool's version number.

SoS Tool Options Used When Retrieving Support Log Files

Use these options when retrieving support logs from your environment's various components.

  • To collect all logs from all components except VDI-specific components, you can run the SoS tool without specifying any component-specific options.

  • To collect logs for a specific component, run the tool with the appropriate options.

  • When you have a VDI workload domain in the environment and you want the SoS tool to collect logs from the VDI-specific server components, you must include the --vdi-pass vdi-password option. The SoS tool uses the specified vdi-password to log in as the Administrator user to the VDI environment's server VMs, such as the View Composer instances, View Connection Server instances, security server instances, App Volumes instances, and AD Domain Server VM, and retrieves their support bundles.

For steps to collect all logs, see Collect Logs for Your Cloud Foundation Environment.

Table 2. SoS Tool Log File Options

Option

Description

--log-dir logdirectory

Use this option to specify an output directory to which the SoS tool will write the log files, such as /home/sos-logs.

If this option is not specified, the tool writes the output files to /var/tmp in the filesystem of the VM in which the command was run.

For a description of the output directory structure, see Component Log Files Collected By the SoS Tool.

--no-clean-old-logs

Use this option to prevent the tool from removing any output from a previous collection run.

By default, before writing the output to the directory, the tool deletes the prior run's output files that might be present. If you want to retain the older output files, specify this option.

--vdi-pass vdi-password

You must specify this option if you want the logs collected from any VDI workload domains' server VMs in the environment. For vdi-password, specify the password used for the account for logging in to View Administrator, the VDI environment's Web interface.

--esx-logs

Use this option to collect logs from the ESXi hosts only.

--vc-logs

Use this option to collect logs from the vCenter Server instances only.

The logs from the vRealize Log Insight agents corresponding to the vCenter Server instances are also collected when this option is used.

--switch-logs

Use this option to collect logs from the switches only. Logs from all switches are collected: management, ToR, and, in a multirack installation, inter-rack switches.

--vrm-logs

Use this option to collect logs from the SDDC Manager instances only.

--zk-logs

Use this option to collect logs from the Zookeeper server instances only.

Zookeeper server processes run in each of the infrastructure virtual machines, the ones with ISVM in their names. These ISVM VMs run in your installation's primary rack. For more details about Zookeeper in the environment, see the VMware Cloud Foundation Overview and Bring-Up Guide.

--cassandra-logs

Use this option to collect logs from the Apache Cassandra database only.

Apache Cassandra processes run in each of the infrastructure virtual machines, the ones with ISVM in their names. These ISVM VMs run in your installation's primary rack.

--via-logs

When the VIA virtual machine is reachable from the SDDC Manager instance where you are issuing the SoS tool command to collect the logs, you can use this option to collect logs only from the VIA virtual machine.

--psc-logs

Use this option to collect logs from the Platform Services Controller instances only.

--nsx-logs

Use this option to collect logs from the NSX Manager and NSX Controller instances only.

--li-logs

When there are vRealize Log Insight instances in your installation, use this option to collect logs from those instances only.

--vrops-logs

When there are vRealize Operations Manager instances in your installation, use this option to collect logs from those instances only.

--hms-logs

Use this option to collect logs from the HMS software component only.

--rack rackname

In a multirack environment, use this option to collect logs from a specific rack.

Without this option, the SoS tool collects logs from all of the racks in the environment.

--vrm-ip VRM-VM-IP-address

In a multirack environment, use this option to collect logs from an SDDC Manager instance different from the one in which you are running the SoS tool.

You run the SoS tool in a specific SDDC Manager instance, usually the one in the primary rack. When you want to run the tool in one SDDC Manager instance but collect the logs from another instance, you use this option to specify that other instance's IP address.

Without this option, the SoS tool collects logs from all of the SDDC Manager instances in the environment.

--vrm-pwd VRM-VM-root-password

In a multirack environment, when running the SoS tool in one SDDC Manager instance to collect logs from another instance, use this option to specify the password for that other instance's root account.

When running the SoS tool in one SDDC Manager instance to collect logs from another instance using the --vrm-ip VRM-VM-IP-address option, the SoS tool logs in to that other SDDC Manager instance using the root account to initiate log collection in that instance. The SoS tool therefore requires the password of that other instance's root account.

--dump-only-vrm-java-threads

Use this option to collect only the Java thread information from the SDDC Manager instances.

--debug-mode

Use this option to run the log collection process in debug mode.
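
The log collection options in Table 2 can be assembled programmatically. The following Python sketch is illustrative only. The function name log_collection_args and the dictionary keys are hypothetical; the option strings are the ones listed in Table 2. The sketch assumes that component options can be combined on one command line, as the general ./sos syntax suggests, and it infers from the descriptions above that --vrm-ip is used together with --vrm-pwd.

```python
import ipaddress

# Hypothetical component names mapped to the options listed in Table 2.
COMPONENT_OPTIONS = {
    "esx": "--esx-logs",
    "vcenter": "--vc-logs",
    "switch": "--switch-logs",
    "sddc-manager": "--vrm-logs",
    "zookeeper": "--zk-logs",
    "cassandra": "--cassandra-logs",
    "via": "--via-logs",
    "psc": "--psc-logs",
    "nsx": "--nsx-logs",
    "log-insight": "--li-logs",
    "vrops": "--vrops-logs",
    "hms": "--hms-logs",
}


def log_collection_args(components=(), log_dir=None, rack=None,
                        vrm_ip=None, vrm_pwd=None, vdi_pass=None):
    """Build the option list for one log collection run."""
    args = [COMPONENT_OPTIONS[name] for name in components]
    if log_dir:
        args += ["--log-dir", log_dir]
    if rack:
        args += ["--rack", rack]
    if vrm_ip:
        ipaddress.ip_address(vrm_ip)  # raises ValueError if malformed
        if not vrm_pwd:
            # Pairing inferred from the --vrm-pwd description.
            raise ValueError("--vrm-ip needs the other instance's root password")
        args += ["--vrm-ip", vrm_ip, "--vrm-pwd", vrm_pwd]
    if vdi_pass:
        args += ["--vdi-pass", vdi_pass]
    return args


print(" ".join(["./sos"] + log_collection_args(
    ["esx", "vcenter"], log_dir="/home/sos-logs")))
```

Calling the function with no arguments returns an empty list, which corresponds to running the tool without component-specific options.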

SoS Tool Options Used for Backing Up Component Configurations

Use this option to create backup files of the configurations for various components. For the steps to run the tool using this option, see Back Up Component Configurations Using the SoS Tool.

When the environment has more than one rack and the command is initiated in the SDDC Manager instance that currently has the VIP address, the tool also initiates the backup command on the other racks. Each rack's output is written into its own SDDC Manager instance's filesystem. If you initiate the command in an SDDC Manager instance that does not have the VIP, the backup command is initiated only for that rack. For a description of how to determine which SDDC Manager instance has the VIP, see About the Primary Rack and the SDDC Manager Virtual IP Address.

By default, the tool writes the backup files for a rack into the /var/tmp directory in the filesystem of that rack's SDDC Manager instance. For example, the backup files for the first rack are written into its SDDC Manager instance's /var/tmp directory, the backup files for the second rack are written into its SDDC Manager instance's /var/tmp directory, and so on. When you log in to the first rack's SDDC Manager instance, change directories to the /var/tmp directory, and list the directory contents, you see the collected set of backups that the tool has written for that rack, for example:

rack-1-vrm-1:/var/tmp # ls -l
drwxr-xr-x 3 root root 4096 Nov 23 00:48 backup-2016-11-23-00-46-01-20678
drwxr-xr-x 3 root root 4096 Nov 23 03:48 backup-2016-11-23-03-48-15-6185
drwxr-xr-x 3 root root 4096 Nov 24 13:56 backup-2016-11-24-13-56-22-25040
drwxr-xr-x 3 root root 4096 Nov 25 12:24 backup-2016-11-25-12-22-54-17065
drwxr-xr-x 3 root root 4096 Nov 28 13:18 backup-2016-11-28-13-16-57-14030
drwxr-xr-x 3 root root 4096 Nov 28 18:37 backup-2016-11-28-18-35-33-12228
drwxr-xr-x 3 root root 4096 Nov 28 18:51 backup-2016-11-28-18-50-28-17743
drwxr-xr-x 3 root root 4096 Nov 29 13:12 backup-2016-11-29-13-10-56-8848

Then when you log in to the second rack's SDDC Manager instance, change directories to the /var/tmp directory, and list the directory contents, you see the collected set of backups that the tool has written for that second rack, for example:

rack-2-vrm-1:/var/tmp # ls -l
drwxr-xr-x 3 root root 4096 Nov 24 11:38 backup-2016-11-24-11-38-08-32210
drwxr-xr-x 3 root root 4096 Nov 24 13:56 backup-2016-11-24-13-56-32-14703
drwxr-xr-x 3 root root 4096 Nov 25 12:25 backup-2016-11-25-12-24-22-17923
drwxr-xr-x 3 root root 4096 Nov 25 20:46 backup-2016-11-25-20-45-20-28378
drwxr-xr-x 3 root root 4096 Nov 28 13:19 backup-2016-11-28-13-18-25-21909
drwxr-xr-x 3 root root 4096 Nov 28 18:36 backup-2016-11-28-18-34-52-23231
drwxr-xr-x 3 root root 4096 Nov 28 18:38 backup-2016-11-28-18-37-01-23891
drwxr-xr-x 3 root root 4096 Nov 28 18:53 backup-2016-11-28-18-51-56-29795
drwxr-xr-x 3 root root 4096 Nov 29 13:13 backup-2016-11-29-13-12-24-27142
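
Because several backup directories can accumulate in /var/tmp, it can be useful to identify the most recent one. The following Python sketch is illustrative only. It assumes the naming pattern backup-YYYY-MM-DD-HH-MM-SS-number, which is inferred from the listings above and is not documented as a stable format. The trailing number is not explained in this topic and is ignored. The function names are hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the listings above (an assumption, not a documented format).
BACKUP_NAME = re.compile(r"^backup-(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})-(\d+)$")


def backup_timestamp(name):
    """Return the datetime encoded in a backup directory name, or None."""
    match = BACKUP_NAME.match(name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M-%S")


def latest_backup(names):
    """Return the name with the most recent encoded timestamp, or None."""
    dated = [(backup_timestamp(name), name) for name in names]
    dated = [(stamp, name) for stamp, name in dated if stamp is not None]
    return max(dated)[1] if dated else None


names = [
    "backup-2016-11-28-18-50-28-17743",
    "backup-2016-11-29-13-10-56-8848",
    "backup-2016-11-23-00-46-01-20678",
]
print(latest_backup(names))  # prints backup-2016-11-29-13-10-56-8848
```

In practice, the list of names could come from os.listdir("/var/tmp") on the SDDC Manager instance; entries that do not match the pattern are skipped.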

Table 3. SoS Tool Backup Options

Option

Description

--backup

Use this option to take a backup of the configurations of these components:

  • ESXi hosts

  • Switches (management, ToR, inter-rack)

  • The three infrastructure (ISVM) virtual machines' Zookeeper server instances and Cassandra datastore

  • SDDC Manager instances (the virtual machines, one per rack, that have vrm in their names)

  • The SDDC Manager instances' HMS software components

The output is written to the /var/tmp directory in each SDDC Manager instance's filesystem, following this directory structure:

backup-datetimestamp
  sos.log
  rack-1
    esx
      configBundle-hostname.domain.tgz #One per host
    switch
      ToR-or-inter-rack-switch-ip-address-manufacturername-running-config.gz #File named according to the switch's IP address and manufacturer
      cumulus-192.168.100.1.tgz #Management switch configuration file
    zk
      isvm-ip-address #Three directories in the zk directory, each named using the IP address of an ISVM VM, such as 192.168.100.43
        cassandra-db-backup.tgz
        zk-db-backup.tgz
    vrm.properties
      hms_ib_inventory.json
      vrm.properties
      vrm.properties.vRack
      vrm-security.keystore
    hms.tar.gz #HMS component's configuration data
    vrm-datetimestamp.tgz #Postgres database configuration data
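
After a backup run, the directory structure above can serve as a checklist. The following Python sketch reports which of the expected top-level entries are absent from a backup directory. It is illustrative only. The function name missing_backup_entries is hypothetical, the rack directory name is passed in by the caller, and the demonstration builds a throwaway directory with a placeholder archive name rather than reading a real backup.

```python
import tempfile
from pathlib import Path

# Entries taken from the directory structure shown above.
EXPECTED_IN_RACK = ["esx", "switch", "zk", "vrm.properties", "hms.tar.gz"]


def missing_backup_entries(backup_dir, rack="rack-1"):
    """Return the expected entries that are absent from a backup directory."""
    root = Path(backup_dir)
    missing = []
    if not (root / "sos.log").is_file():
        missing.append("sos.log")
    rack_dir = root / rack
    for entry in EXPECTED_IN_RACK:
        if not (rack_dir / entry).exists():
            missing.append(f"{rack}/{entry}")
    # The Postgres archive name contains a timestamp, so match it by pattern.
    has_archive = rack_dir.is_dir() and any(rack_dir.glob("vrm-*.tgz"))
    if not has_archive:
        missing.append(f"{rack}/vrm-<datetimestamp>.tgz")
    return missing


# Demonstration on a temporary directory that deliberately lacks hms.tar.gz.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "backup-2016-11-23-00-46-01-20678"
    rack = backup / "rack-1"
    for sub in ("esx", "switch", "zk", "vrm.properties"):
        (rack / sub).mkdir(parents=True)
    (backup / "sos.log").touch()
    (rack / "vrm-example.tgz").touch()  # placeholder archive name
    demo_missing = missing_backup_entries(backup)

print(demo_missing)  # prints ['rack-1/hms.tar.gz']
```

The check covers only the presence of the top-level entries. It does not inspect the per-host or per-switch files inside the esx, switch, and zk directories.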

SoS Tool Options That Directly Alter the SDDC Manager Configuration

These SoS command options are used for specific troubleshooting tasks in very particular situations. These commands alter the SDDC Manager configuration by directly changing specific values set in the underlying distributed database.

Caution:

Using these options is not recommended unless under guidance from VMware Technical Support. Use these options only when VMware Technical Support instructs you to do so.

Table 4. SoS Tool Options that Directly Alter the SDDC Manager Configuration

Option

Description

--change-ntp NTP-IP-address

This option updates the SDDC Manager configuration to replace the existing NTP server IP address with a new one.

During the bring-up process on the first rack in a Cloud Foundation installation, an NTP server IP address is entered in the bring-up wizard and is saved to the distributed database. This SoS tool option updates that stored NTP server IP address.

--change-uplink-db uplink-port-1, uplink-port-2, ...

This option changes the uplink port information that is stored in the distributed database.

This option is deprecated in this release. To update the uplink ports, use the Uplink screen in the SDDC Manager client. See Manage Uplink Connectivity Settings Using the SDDC Manager Client.

--remove-esx-host-in-db ESXi-host-ip

--remove-esx-host-in-db ESXi-hostname

After decommissioning an ESXi host, this option updates the distributed database to remove the information for the ESXi host specified in the option, either by IP address or hostname.

This option is deprecated in this release. To decommission an ESXi host from the environment, follow the steps documented in Replace Dead Host or SAS Controller or Expander when Host Belongs to a Workload Domain.

SoS Tool Options for Audit Data Collection and Diff Generation

These SoS options are used to collect audit data and to generate a diff between collected audit data files. Audit data consists of version and configuration details obtained from the various physical and logical components that constitute VMware Cloud Foundation, including racks, servers, switches, domains, and VMs.

Note:

The audit options work only after second boot has completed successfully on the rack.

Table 5. SoS Tool Options for Audit Data Collection and Diff Generation

Option

Description

--audit

This option collects audit information from all the components of Cloud Foundation.

By default, audit data is saved in the /var/tmp/audit-compliance/audit directory as a JSON file. The log file is saved under /var/tmp/audit-compliance/logs.

--audit-diff

This option generates a diff between two audit data JSON files.

This option picks the latest and the penultimate audit data JSON files from the /var/tmp/audit-compliance/audit directory and generates the diff. By default, the diff is stored as a JSON file in the /var/tmp/audit-compliance/diff directory.

--audit-output-dir <path-to-audit-parent-directory>

Use this option to save audit data JSON and diff JSON files to a directory other than the default /var/tmp/audit-compliance parent directory.

Note:

This option can be used with the --audit and --audit-diff options.

This option creates the following directory structure:

  • path-to-audit-parent-directory/audit-compliance

  • path-to-audit-parent-directory/audit-compliance/audit

  • path-to-audit-parent-directory/audit-compliance/diff

Audit data JSON files are saved in the path-to-audit-parent-directory/audit-compliance/audit directory.

Audit diff JSON files are saved in the path-to-audit-parent-directory/audit-compliance/diff directory.

--audit-files <full-path-to-audit-json-file-1> <full-path-to-audit-json-file-2>

Use this option to generate a diff file between two specific audit data JSON files.

Note:

This option must be used with the --audit-diff option.

By allowing the user to specify the audit files to be diffed, this option bypasses the default behavior of the --audit-diff option, described above.

--no-audit

Use this option to prevent audit data collection during SoS log collection.

By default, audit data collection runs when SoS log collection runs. This option prevents this default behavior.
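
The default --audit-diff behavior (select the two most recent audit JSON files, then compare them) can be pictured with the following Python sketch. It illustrates the idea only and is not the SoS tool's implementation. It assumes that "latest" can be judged by file modification time and that each audit file holds a JSON object at the top level. The function names and the sample keys are hypothetical, and the format of the tool's real diff output is not described in this topic.

```python
import json
from pathlib import Path


def two_latest_audit_files(audit_dir):
    """Return (penultimate, latest) JSON files in audit_dir by modification time."""
    files = sorted(Path(audit_dir).glob("*.json"),
                   key=lambda path: path.stat().st_mtime)
    if len(files) < 2:
        raise ValueError("need at least two audit JSON files to diff")
    return files[-2], files[-1]


def shallow_diff(old, new):
    """Compare two dictionaries key by key (top level only)."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {k: {"old": old[k], "new": new[k]}
                    for k in old.keys() & new.keys() if old[k] != new[k]},
    }


def diff_audit_files(older_path, newer_path):
    """Load two audit JSON files and return their shallow diff."""
    with open(older_path) as older, open(newer_path) as newer:
        return shallow_diff(json.load(older), json.load(newer))


# Hypothetical sample data; real audit files have their own schema.
old = {"component_a_version": "1.0", "ntp_server": "192.168.100.1"}
new = {"component_a_version": "1.1", "ntp_server": "192.168.100.1", "racks": 2}
print(json.dumps(shallow_diff(old, new), sort_keys=True, indent=2))
```

Passing two explicit paths to diff_audit_files mirrors what --audit-files does for --audit-diff, while two_latest_audit_files mirrors the default selection.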

SoS Tool Options for Health Check

These SoS commands are used for checking the health status of various components or services, including connectivity, compute, storage, database, domains, and networks, among others.

Table 6. SoS Tool Options for Health Check

Option

Description

--health-check

This option performs all available health checks.

--connectivity-health

This option performs a connectivity health check.

--services-health

This option performs a services health check.

--compute-health

This option performs a compute health check.

--storage-health

This option performs a storage health check.

--db-health

This option performs a database health check.

--ntp-health

This option performs an NTP health check.

--general-health

This option performs a general health check.

--network-wire-map

This option performs a network wire map health check.

--network-health

This option performs a network health check.

--certificate-health

This option performs a certificate health check.

--domain-health, plus optional --domain-name <domain-name>

The --domain-health option performs a domain health check.

To perform a health check on a specific domain, include the optional --domain-name <domain-name> flag with the domain name.

--get-host-ips

This option returns server information.
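
The health check options in Table 6 can be selected by name in a script. The following Python sketch is illustrative only. The function name health_check_args and the dictionary keys are hypothetical, and the domain name in the example is a placeholder. The sketch enforces one rule taken from the table: --domain-name is supplied only together with --domain-health.

```python
# Hypothetical check names mapped to the options listed in Table 6.
HEALTH_OPTIONS = {
    "all": "--health-check",
    "connectivity": "--connectivity-health",
    "services": "--services-health",
    "compute": "--compute-health",
    "storage": "--storage-health",
    "db": "--db-health",
    "ntp": "--ntp-health",
    "general": "--general-health",
    "wire-map": "--network-wire-map",
    "network": "--network-health",
    "certificate": "--certificate-health",
    "domain": "--domain-health",
}


def health_check_args(checks, domain_name=None):
    """Build the option list for the requested health checks."""
    args = [HEALTH_OPTIONS[name] for name in checks]
    if domain_name is not None:
        if "domain" not in checks:
            raise ValueError("--domain-name is used with --domain-health")
        args += ["--domain-name", domain_name]
    return args


# "example-domain" is a placeholder, not a real workload domain name.
print(" ".join(["./sos"] + health_check_args(
    ["connectivity", "domain"], domain_name="example-domain")))
```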