As a DevOps engineer, you can deploy a deep learning VM with a RAG reference solution from the self-service Automation Service Broker catalog.

Procedure

  1. On the Catalog page in Automation Service Broker, locate the AI RAG Workstation card and click Request.
  2. Select a project.
  3. Enter a name and description for your deployment.
  4. Configure the RAG workstation parameters.
    Setting: Sample value
    - VM class: A100 Small - 1 vGPU (16 GB), 8 CPUs and 16 GB Memory
      Minimum VM class specification:
      • CPU: 10 vCPUs
      • CPU RAM: 64 GB
      • GPU: 2xH100
      • GPU memory: 50 GB
    - Data disk size: 3 Gi
    - User password: Enter a password for the default user. You might be prompted to reset your password when you first log in.
    - SSH public key: This setting is optional.
  5. Install software customizations.
    1. (Optional) If you want to run a custom cloud-init configuration in addition to the cloud-init defined for the RAG software bundle, select the checkbox and paste the contents of your configuration.
      VMware Aria Automation merges the cloud-init from the RAG software bundle with your custom cloud-init.
    2. Provide your NVIDIA NGC Portal access key.
    3. Enter Docker Hub credentials.
  6. Click Submit.
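As a minimal sketch of what a custom cloud-init from step 5 might contain: the package and log path below are illustrative assumptions, not part of the RAG software bundle, and VMware Aria Automation merges this configuration with the bundle's own cloud-init.

```yaml
#cloud-config
# Hypothetical custom cloud-init; merged with the cloud-init
# defined in the RAG software bundle at deployment time.
package_update: true
packages:
  - jq                      # example extra tool, not required by the bundle
runcmd:
  - echo "custom cloud-init applied" >> /var/log/custom-init.log
```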

Results

Your deep learning VM includes Ubuntu 22.04, an NVIDIA vGPU driver, Docker Engine, NVIDIA Container Toolkit, and a reference RAG solution that uses the Llama-2-13b-chat model.
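As a quick sanity check after the deployment finishes, you can SSH into the VM and confirm that the bundled components are on the PATH. This is a sketch, assuming a POSIX shell inside the VM; it only checks that the component binaries exist and does not validate the RAG solution itself.

```shell
# Sanity-check sketch: run inside the deployed VM over SSH.
# Checks only that expected component binaries are present.
missing=""
for cmd in nvidia-smi docker; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
done

if [ -n "$missing" ]; then
    echo "Missing components:$missing"
else
    # nvidia-smi reports the vGPU driver; docker is the container engine
    echo "All expected component binaries found"
fi
```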