Compute Unified Device Architecture (CUDA) is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). CUDA speeds up computing applications by offloading parallel work to the GPU. For example, CUDA is used by TensorFlow and PyTorch benchmarks.

Install NVIDIA CUDA on Ubuntu

To run AI and ML workflows in vSphere Bitfusion, you must install CUDA on the Ubuntu Linux operating system of your vSphere Bitfusion client.

Prerequisites

Verify that you have installed the vSphere Bitfusion client on an Ubuntu Linux operating system.

Procedure

  1. Navigate to a directory on the virtual machine in which to download the NVIDIA CUDA distribution.
    cd <download_directory>
  2. Download the cuda-ubuntu2004.pin file and move it to the apt preferences directory.
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
  3. Download the NVIDIA CUDA distribution for Ubuntu 20.04 by using the wget command.
    wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda-repo-ubuntu2004-11-0-local_11.0.3-450.51.06-1_amd64.deb
  4. Install the CUDA 11 package for Ubuntu 20.04 by using the dpkg -i command.
    sudo dpkg -i cuda-repo-ubuntu2004-11-0-local_11.0.3-450.51.06-1_amd64.deb
  5. Install the keys to authenticate the software package by using the apt-key command.
    The apt-key command manages the list of keys that apt uses to authenticate packages. Packages that have been authenticated with these keys are considered trusted.
    sudo apt-key add /var/cuda-repo-ubuntu2004-11-0-local/7fa2af80.pub
  6. Update and install the CUDA software package.
    sudo apt-get update
    sudo apt-get install cuda
  7. (Optional) To confirm your GPU partition size or verify the resources available on your vSphere Bitfusion deployment, run the NVIDIA System Management Interface (nvidia-smi) monitoring application.
    bitfusion run -n 1 nvidia-smi
  8. Navigate to the directory that contains the CUDA Matrix Multiplication (matrixMul) sample files.
    cd /usr/local/cuda/samples/0_Simple/matrixMul
  9. Run the make and bitfusion run commands against the matrixMul sample file.
    sudo make
    bitfusion run -n 1 ./matrixMul
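
The download and install steps above (steps 2 through 6) can be collected into one script. The following is a minimal sketch, not an official installer: the names DRY_RUN, run, and install_cuda_ubuntu are illustrative and are not part of CUDA or vSphere Bitfusion. The URLs, file names, and commands are the ones listed in the procedure. By default the script only prints each command, so you can review the sequence before executing it.

```shell
#!/usr/bin/env bash
# Sketch of the Ubuntu 20.04 CUDA install steps as a single script.
# DRY_RUN, run, and install_cuda_ubuntu are illustrative names (assumptions),
# not part of CUDA or vSphere Bitfusion.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

BASE_URL="https://developer.download.nvidia.com/compute/cuda"
PIN_URL="${BASE_URL}/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin"
DEB_URL="${BASE_URL}/11.0.3/local_installers/cuda-repo-ubuntu2004-11-0-local_11.0.3-450.51.06-1_amd64.deb"

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

install_cuda_ubuntu() {
  local deb_file
  deb_file="$(basename "$DEB_URL")"   # file name that wget saves locally

  run wget "$PIN_URL"
  run sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
  run wget "$DEB_URL"
  run sudo dpkg -i "$deb_file"
  run sudo apt-key add /var/cuda-repo-ubuntu2004-11-0-local/7fa2af80.pub
  run sudo apt-get update
  run sudo apt-get install cuda
}

install_cuda_ubuntu
```

Run the script from the download directory. To execute the commands instead of printing them, set DRY_RUN=0 in the environment when you start the script. Because the script uses set -e, it stops at the first command that fails.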

What to do next

Install and configure NVIDIA cuDNN. See Install NVIDIA cuDNN.

Install NVIDIA CUDA on CentOS or Red Hat Linux

To run AI and ML workflows in vSphere Bitfusion, you must install CUDA on the CentOS or Red Hat Linux operating system of your vSphere Bitfusion client.

Prerequisites

Verify that you have installed the vSphere Bitfusion client on a CentOS or Red Hat Linux operating system.

Procedure

  1. Navigate to a directory on the virtual machine in which to download the NVIDIA CUDA distribution.
    cd <download_directory>
  2. To download the NVIDIA CUDA 11 package for CentOS 8 or Red Hat Linux 8, run the wget command.
    wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda-repo-rhel8-11-0-local-11.0.3_450.51.06-1.x86_64.rpm
  3. To install the CUDA package, run the rpm -i command.
    sudo rpm -i cuda-repo-rhel8-11-0-local-11.0.3_450.51.06-1.x86_64.rpm
  4. Run the yum clean all and yum -y install commands as shown to update your environment and install the CUDA software package.
    sudo yum clean all
    sudo yum -y install cuda
  5. (Optional) To confirm your GPU partition size or verify the resources available on your vSphere Bitfusion deployment, run the NVIDIA System Management Interface (nvidia-smi) monitoring application.
    bitfusion run -n 1 nvidia-smi
  6. Navigate to the directory that contains the CUDA Matrix Multiplication (matrixMul) sample files.
    cd /usr/local/cuda/samples/0_Simple/matrixMul
  7. Run the make and bitfusion run commands against the matrixMul sample file.
    sudo make
    bitfusion run -n 1 ./matrixMul
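
The final two steps build and run the matrixMul sample, which prints "Result = PASS" when the computation succeeds. The following is a minimal sketch of how that check could be scripted. The names sample_passed and verify_cuda_sample and the "verify" argument are illustrative assumptions, not part of CUDA or vSphere Bitfusion. The same check applies to the Ubuntu procedure.

```shell
#!/usr/bin/env bash
# Sketch: build the matrixMul sample, run it through Bitfusion, and check
# for the pass marker. sample_passed, verify_cuda_sample, and the "verify"
# argument are illustrative names (assumptions).
set -euo pipefail

# Succeed if the text on stdin contains the matrixMul pass marker.
# grep reads all of its input here, so the producing command is not cut short.
sample_passed() {
  grep "Result = PASS" >/dev/null
}

# Build and run the sample from the given directory (defaults to the path
# used in the procedure) and test its output.
verify_cuda_sample() {
  local sample_dir="${1:-/usr/local/cuda/samples/0_Simple/matrixMul}"
  cd "$sample_dir"
  sudo make
  bitfusion run -n 1 ./matrixMul | sample_passed
}

# Run the check only when the script is called with the "verify" argument.
if [ "${1:-}" = "verify" ]; then
  if verify_cuda_sample; then
    echo "matrixMul passed"
  else
    echo "matrixMul did not report a pass" >&2
    exit 1
  fi
fi
```

Calling the script with the verify argument runs the build and the test. Without an argument it only defines the functions, which makes it safe to source from another script.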

What to do next

Install and configure NVIDIA cuDNN. See Install NVIDIA cuDNN.