You can run your application in a dedicated partition of a GPU's memory, while other applications use the remaining GPU memory.

The GPU partitioning arguments are optional arguments of the run command. Use them to run your application in a partition of a GPU's memory.

  • The GPU partitioning process is dynamic. When you start a run command with a partitioning argument, vSphere Bitfusion allocates a partition before the application runs and deallocates the partition afterwards.
  • Applications that share GPUs concurrently are isolated from each other by separate client processes, network streams, server processes, and memory partitions.
  • vSphere Bitfusion partitions only the memory of the GPU and not the compute resource. An application is strictly confined to its assigned memory partition, but it can access the complete compute resource if needed. When applications require the same compute cells, they compete for compute resources; otherwise, they run concurrently.

You can specify the partition size in MB or as a fraction of the total GPU memory.

Partitioning GPU memory by fraction (a number greater than 0.0 and less than or equal to 1.0, for example, 0.37)

bitfusion run -n num_gpus -p gpu_fraction -- applications and arguments
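
As an illustration of the fraction form, the following minimal Python sketch checks that a value lies in the allowed range (greater than 0.0 and at most 1.0) and assembles the corresponding command line. The helper name fraction_run_command and the application python3 train.py are hypothetical and chosen only for this example. The sketch builds and prints the command; it does not run it.

```python
import shlex


def fraction_run_command(num_gpus, gpu_fraction, app_args):
    """Build the argument list for a run that uses a fractional memory partition.

    Hypothetical helper for illustration; it is not part of vSphere Bitfusion.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # The documented range for the fraction is > 0.0 and <= 1.0.
    if not (0.0 < gpu_fraction <= 1.0):
        raise ValueError("gpu_fraction must be > 0.0 and <= 1.0")
    return [
        "bitfusion", "run",
        "-n", str(num_gpus),
        "-p", str(gpu_fraction),
        "--", *app_args,
    ]


# Example: one GPU, 37% of its memory, for a hypothetical training script.
cmd = fraction_run_command(1, 0.37, ["python3", "train.py"])
print(shlex.join(cmd))
# prints: bitfusion run -n 1 -p 0.37 -- python3 train.py
```

On a machine where the bitfusion client is installed, the resulting list could be passed to subprocess.run to start the application.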

Partitioning GPU memory by size in MB

bitfusion run -n num_gpus -m MBs_per_gpu -- applications and arguments
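
To compare the two ways of specifying a partition size, the following Python sketch converts between a size in MB and a fraction of the total GPU memory. The total of 16384 MB is an assumed value used only for the arithmetic, and the function names are hypothetical. How vSphere Bitfusion itself rounds a fractional size is not covered here.

```python
def mb_to_fraction(partition_mb, total_gpu_mb):
    """Return the fraction of total GPU memory that partition_mb represents."""
    if total_gpu_mb <= 0:
        raise ValueError("total_gpu_mb must be positive")
    if not (0 < partition_mb <= total_gpu_mb):
        raise ValueError("partition_mb must be > 0 and <= total_gpu_mb")
    return partition_mb / total_gpu_mb


def fraction_to_mb(gpu_fraction, total_gpu_mb):
    """Return the size in MB that gpu_fraction of the total GPU memory represents."""
    if total_gpu_mb <= 0:
        raise ValueError("total_gpu_mb must be positive")
    if not (0.0 < gpu_fraction <= 1.0):
        raise ValueError("gpu_fraction must be > 0.0 and <= 1.0")
    return gpu_fraction * total_gpu_mb


TOTAL_MB = 16384  # assumed total GPU memory, for illustration only

print(mb_to_fraction(4096, TOTAL_MB))         # prints: 0.25
print(round(fraction_to_mb(0.37, TOTAL_MB)))  # prints: 6062
```

Under this assumption, a 4096 MB partition corresponds to a fraction of 0.25, and a fraction of 0.37 corresponds to roughly 6062 MB.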

For more information, see GPU Partitioning Examples.