The following list describes the most important vSphere Bitfusion commands and the tasks they perform. If necessary, the VMware support team can provide additional CLI commands.

Allocate GPUs in vSphere Bitfusion

To allocate multiple GPUs to a single application, run the bitfusion run command.

To allocate a number of GPUs and start a session in which you can run multiple applications on the same GPUs, run the bitfusion request_gpus command.

Run applications in vSphere Bitfusion

To start a single application, run the bitfusion run command.

To start multiple applications in a session started with the bitfusion request_gpus command, run the bitfusion client command for each application.

Deallocate GPUs in vSphere Bitfusion

To deallocate the GPUs in a session started with the bitfusion request_gpus command, run the bitfusion release_gpus command.
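
The request, run, and release sequence can be wrapped so that the GPUs are always released. The following is a minimal Python sketch, not an official workflow. The function name run_in_session and the runner parameter are hypothetical, the -n option is taken from the examples in the help output, and calling release_gpus without options is an assumption, because the help text only states that its options must match the earlier request_gpus command.

```python
import subprocess
from typing import Callable, Sequence


def run_in_session(
    apps: Sequence[Sequence[str]],
    num_gpus: int = 1,
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Request GPUs, run each application with `bitfusion client`, then release.

    `runner` defaults to subprocess.run; it is a parameter so the command
    sequence can be inspected without a Bitfusion installation.
    """
    # `-n <count>` appears in the EXAMPLES section of `bitfusion help`.
    runner(["bitfusion", "request_gpus", "-n", str(num_gpus)], check=True)
    try:
        for app in apps:
            # The `--` separator follows the usage line
            # `bitfusion <command> <options> -- [application]`.
            runner(["bitfusion", "client", "--", *app], check=True)
    finally:
        # Always hand the GPUs back, even if an application fails.
        # Assumption: no options are needed here for a default session.
        runner(["bitfusion", "release_gpus"], check=True)
```

For example, run_in_session([["nvidia-smi"]], num_gpus=1) would issue request_gpus, one client invocation, and release_gpus in that order.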

List available GPUs in vSphere Bitfusion

To verify a vSphere Bitfusion server installation and list the available GPUs, run the bitfusion list_gpus command.

 - server 0 [172.16.31.162:56001]: running 0 tasks
   |- GPU [0]: free memory (15109 / 15109MiB) Tesla T4 (7.5)
 - server 1 (leader)  [172.16.31.156:56001]: running 0 tasks
   |- GPU [0]: free memory (15109 / 15109MiB) Tesla T4 (7.5)
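
If the list needs to be processed by a script, the text output can be parsed. The following Python sketch assumes the output keeps exactly the layout shown above; the function name parse_list_gpus and the dictionary keys are hypothetical, and reading the trailing value in parentheses (7.5) as the CUDA compute capability is an assumption.

```python
import re

# Sample copied from the `bitfusion list_gpus` output shown above.
SAMPLE = """\
 - server 0 [172.16.31.162:56001]: running 0 tasks
   |- GPU [0]: free memory (15109 / 15109MiB) Tesla T4 (7.5)
 - server 1 (leader)  [172.16.31.156:56001]: running 0 tasks
   |- GPU [0]: free memory (15109 / 15109MiB) Tesla T4 (7.5)
"""

SERVER_RE = re.compile(r"-\s*server\s+(\d+).*?\[([\d.]+):(\d+)\]")
GPU_RE = re.compile(
    r"\|-\s*GPU\s*\[(\d+)\]:\s*free memory\s*\((\d+)\s*/\s*(\d+)MiB\)"
    r"\s*(.+?)\s*\(([\d.]+)\)"
)


def parse_list_gpus(text):
    """Return a list of servers, each with the GPUs listed under it."""
    servers = []
    for line in text.splitlines():
        server = SERVER_RE.search(line)
        if server:
            servers.append({
                "index": int(server.group(1)),
                "address": f"{server.group(2)}:{server.group(3)}",
                "leader": "(leader)" in line,
                "gpus": [],
            })
            continue
        gpu = GPU_RE.search(line)
        if gpu and servers:
            # A GPU line belongs to the most recent server line.
            servers[-1]["gpus"].append({
                "index": int(gpu.group(1)),
                "free_mib": int(gpu.group(2)),
                "total_mib": int(gpu.group(3)),
                "name": gpu.group(4),
                "capability": gpu.group(5),  # assumed: compute capability
            })
    return servers


servers = parse_list_gpus(SAMPLE)
total_free = sum(g["free_mib"] for s in servers for g in s["gpus"])
print(len(servers), "servers,", total_free, "MiB free")
```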

Run a health check in vSphere Bitfusion

You can run the health check from the command line.
  • To check the health of all vSphere Bitfusion servers and the Bitfusion client, run the bitfusion health command.
  • To check the health of only the current node, whether it is a vSphere Bitfusion client or server, run the bitfusion localhealth command.

Check your vSphere Bitfusion version

To display the installed version of vSphere Bitfusion, run the bitfusion version command.

Bitfusion version: 4.0.0 release

Display GPU information in vSphere Bitfusion

To view GPU information, run the bitfusion smi command. Alternatively, to get similar output, you can start the nvidia-smi application with the bitfusion run command.

+----------------------------------------------------------------------------------------+
| 172.16.31.162:56001                                          Driver Version: 460.73.01 |
+--------------------------------------+-------------------------+-----------------------+
| GPU  Name              Persistence-M | Virt Mem    Alloc / All | BusId  Vol Uncorr ECC |
| Fan  Temp  Perf        Pwr:Usage/Cap | Phy Mem     Used  / All | GPU-Util   Compute M. |
|======================================+=========================+=======================|
| 0    Tesla T4               Enabled  | 0       MB / 15109   MB | 00000000:13:00.0    0 |
| 0 %   28C  P8             10W /  70W | 3       MB / 15109   MB |   0%          Default |
+--------------------------------------+-------------------------+-----------------------+
+----------------------------------------------------------------------------------------+
| 172.16.31.156:56001                                          Driver Version: 460.73.01 |
+--------------------------------------+-------------------------+-----------------------+
| GPU  Name              Persistence-M | Virt Mem    Alloc / All | BusId  Vol Uncorr ECC |
| Fan  Temp  Perf        Pwr:Usage/Cap | Phy Mem     Used  / All | GPU-Util   Compute M. |
|======================================+=========================+=======================|
| 0    Tesla T4               Enabled  | 0       MB / 15109   MB | 00000000:13:00.0    0 |
| 0 %   34C  P8             10W /  70W | 3       MB / 15109   MB |   0%          Default |
+--------------------------------------+-------------------------+-----------------------+

Test bandwidth in vSphere Bitfusion

To test the bandwidth and latency between the client and the vSphere Bitfusion servers, run the bitfusion net_perf command.

Single network interface
Displayed results are calculated from round-trip measurements
BW(1MB) = 1000/(LAT(1MB) - LAT(1B))

[ <client>] ens160 => [10.202.8.169] net1 ( tcp) Single packet lat = 51 us, bw(1MB) = 1.71 GB/s
[ <client>] ens160 => [10.202.8.185] net1 ( tcp) Single packet lat = 48 us, bw(1MB) = 1.09 GB/s
[ <client>] ens160 => [10.202.8.233] net1 ( tcp) Single packet lat = 50 us, bw(1MB) = 0.87 GB/s

Multiple network interfaces
Displayed results are calculated from round-trip measurements
BW(1MB) = 1000/(LAT(1MB) - LAT(1B))

[ <client>] ens160 => [10.202.8.169] net1 ( tcp) Single packet lat = 51 us, bw(1MB) = 1.71 GB/s
[ <client>] ens160 => [10.202.8.185] net1 ( tcp) Single packet lat = 48 us, bw(1MB) = 1.09 GB/s
[ <client>] ens160 => [10.202.8.233] net1 ( tcp) Single packet lat = 50 us, bw(1MB) = 0.87 GB/s
[ <client>] ens192f0 => [10.202.8.169] net2 ( tcp) Single packet lat = 47 us, bw(1MB) = 2.14 GB/s
[ <client>] ens192f0 => [10.202.8.185] net2 ( tcp) Single packet lat = 49 us, bw(1MB) = 1.11 GB/s
[ <client>] ens192f0 => [10.202.8.233] net2 ( tcp) Single packet lat = 50 us, bw(1MB) = 1.15 GB/s
[ <client>] vmw_pvrdma0 => [10.202.8.169] vmw_pvrdma0 (infiniband) Single packet lat = 19 us, bw(1MB) = 3.66 GB/s Single packet Write lat = 8 us, bw = 10.101 GB/s
[ <client>] vmw_pvrdma0 => [10.202.8.185] vmw_pvrdma0 (infiniband) Single packet lat = 21 us, bw(1MB) = 3.45 GB/s Single packet Write lat = 8 us, bw = 10.5263 GB/s
[ <client>] vmw_pvrdma0 => [10.202.8.233] vmw_pvrdma0 (infiniband) Single packet lat = 21 us, bw(1MB) = 3.46 GB/s Single packet Write lat = 8 us, bw = 10.4167 GB/s
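
To compare interfaces programmatically, the result lines can be parsed and the fastest interface selected for each server. The following Python sketch assumes the line layout shown above; the function names are hypothetical. It also evaluates the formula printed in the output header, assuming latencies in microseconds and bandwidth in GB/s, as the units in the result lines suggest. The latency values passed to bandwidth_gbps are made up for illustration.

```python
import re

# A subset of the result lines shown above.
SAMPLE = """\
[ <client>] ens160 => [10.202.8.169] net1 ( tcp) Single packet lat = 51 us, bw(1MB) = 1.71 GB/s
[ <client>] ens160 => [10.202.8.185] net1 ( tcp) Single packet lat = 48 us, bw(1MB) = 1.09 GB/s
[ <client>] ens192f0 => [10.202.8.169] net2 ( tcp) Single packet lat = 47 us, bw(1MB) = 2.14 GB/s
[ <client>] vmw_pvrdma0 => [10.202.8.169] vmw_pvrdma0 (infiniband) Single packet lat = 19 us, bw(1MB) = 3.66 GB/s Single packet Write lat = 8 us, bw = 10.101 GB/s
[ <client>] vmw_pvrdma0 => [10.202.8.185] vmw_pvrdma0 (infiniband) Single packet lat = 21 us, bw(1MB) = 3.45 GB/s Single packet Write lat = 8 us, bw = 10.5263 GB/s
"""

LINE_RE = re.compile(
    r"\]\s*(\S+)\s*=>\s*\[([\d.]+)\]\s*(\S+)\s*\(\s*(\w+)\)\s*"
    r"Single packet lat = (\d+) us, bw\(1MB\) = ([\d.]+) GB/s"
)


def parse_net_perf(text):
    """Extract one record per result line; other lines are ignored."""
    records = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            records.append({
                "client_if": m.group(1),
                "server": m.group(2),
                "server_if": m.group(3),
                "transport": m.group(4),
                "lat_us": int(m.group(5)),
                "bw_gbps": float(m.group(6)),
            })
    return records


def best_per_server(records):
    """Map each server address to its highest-bandwidth record."""
    best = {}
    for rec in records:
        current = best.get(rec["server"])
        if current is None or rec["bw_gbps"] > current["bw_gbps"]:
            best[rec["server"]] = rec
    return best


def bandwidth_gbps(lat_1mb_us, lat_1b_us):
    """BW(1MB) = 1000 / (LAT(1MB) - LAT(1B)), latencies in microseconds."""
    return 1000 / (lat_1mb_us - lat_1b_us)


for server, rec in best_per_server(parse_net_perf(SAMPLE)).items():
    print(server, rec["client_if"], rec["bw_gbps"], "GB/s")

# Hypothetical latencies: 551 us for 1 MB and 51 us for 1 byte.
print(bandwidth_gbps(551, 51), "GB/s")
```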

Get help in vSphere Bitfusion

To get the full list of vSphere Bitfusion CLI commands, run the bitfusion help command. For more information about a specific command, run bitfusion help followed by the command name.

NAME:
   Bitfusion - Run application with VMware Bitfusion

USAGE:
   bitfusion <command> <options> "application"
   bitfusion <command> <options> -- [application]
   bitfusion help [command]

   For more information, system requirements, and advanced usage please visit docs.bitfusion.io

COMMANDS:
        tls-certs, TC    Manage TLS certificates used by bitfusion server.  Requires root privileges.
        version, v       Display full Bitfusion version
        localhealth, LH  Run health check on current node only
        dealloc          Deallocate license certificate.  Requires root priviledges.
        crashreport      Send crash report to bitfusion
        list_gpus        List the available GPUs in a shared pool
        initdb           Init database setup
        token            Fetch and manipulate tokens
        register         Register remote server as the plugin
        unregister       Unregister remote plugin
        removenode       Remove unavailable nodes
        user             Manage bitfusion users
        help, h          Shows a list of commands or help for one command
   Client Commands:
        client, c     Run application
        health, H     Run health check on all specified servers and current node
        request_gpus  Request GPUs from a shared pool
        release_gpus  Release GPUs back into a shared pool. Options must match a previous request_gpus command
        run           Request GPUs from a shared pool, run a client command, then release the GPUs
        stats         Gather stats from all servers.
        smi           Display smi-like info for all servers.
        local         Run a CUDA application locally
        net_perf      Gather network performance data from all SRS servers.
   Server Commands:
        server, s                Run dispatcher service - listens for 'bitfusion client' commands
        resource_scheduler, srs  Run Bitfusion resource scheduler (SRS) on GPU server
        analytics                Run Bitfusion analytics server
        manager                  Run Bitfusion manager server

EXAMPLES:
   $ bitfusion resource_scheduler --srs_port 50001

   $ bitfusion run -n 4 -- <application>

   $ bitfusion request_gpus -n 1 -p 0.25