As a data scientist, you can deploy a GPU-enabled RAG workstation with a pgvector PostgreSQL database managed by VMware Data Services Manager (DSM) from the self-service Automation Service Broker catalog.

When you request the AI RAG Workstation with DSM in the catalog, you can use an existing database instance or create a new one. If you create a new one, both the deep learning VM and the database VM that it connects to are provisioned during deployment. The database is provisioned outside the RAG deployment.

  • If you select the Existing Database option, you use a pre-deployed database, which can be an external database or a database that was provisioned by another AI RAG Workstation with DSM deployment. No new database instance is provisioned during deployment.
  • If no existing database is available to you, or you want a private database for a specific use case, select the New Database option.
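
If you plan to use the Existing Database option, it can help to sanity-check the connection string before you paste it into the request form. The following is a minimal sketch, assuming the DSM connection string is a standard PostgreSQL-style URI of the form postgres://user:password@host:port/database. The function name parse_connection_string and the example value are hypothetical and used for illustration only.

```python
from urllib.parse import urlsplit, unquote


def parse_connection_string(conn: str) -> dict:
    """Split a PostgreSQL-style URI into its parts and reject incomplete values.

    Assumption: the string follows the usual postgres://user:password@host:port/db
    layout. Adjust the checks if your DSM deployment reports a different format.
    """
    parts = urlsplit(conn.strip())
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("missing database name")
    return {
        "user": unquote(parts.username or ""),
        # Report only whether a password is present; never echo it.
        "password_set": parts.password is not None,
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the PostgreSQL default port
        "database": database,
    }


# Hypothetical value for illustration only; copy the real string from DSM.
example = "postgres://rag_admin:s3cr%40t@10.0.0.25:5432/ragdb"
info = parse_connection_string(example)
print(info)
```

The sketch only checks the shape of the string. It does not contact the database, so a value that passes can still fail at deployment time if the host is unreachable or the credentials are wrong.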

Procedure

  1. On the Catalog page in Automation Service Broker, locate the AI RAG Workstation with DSM card and click Request.
  2. Select a project.
  3. Enter a name and description for your deployment.
  4. Configure the RAG workstation parameters.
    Setting          Sample value
    VM class         vgpu-1xa100-40c
                     Minimum VM class specification:
                     • CPU: 10 vCPUs
                     • CPU RAM: 64 GB
                     • GPU: 2xH100
                     • GPU memory: 50 GB
    Data disk size   32 Gi
                     Select a disk size between 20 GB and 1 TB.
    User password    Enter a password for the default user. You might be prompted to reset your password when you first log in.
    SSH public key   This setting is optional.
  5. Configure the workstation database.
    Setting             Sample value
    Database instance   Existing database
    Connection string   Provide the DSM connection string from the DSM deployment overview.

  6. Install software customizations.
    1. (Optional) If you want to install a custom cloud-init in addition to the cloud-init defined for the RAG software bundle, select the checkbox and paste the contents of the configuration package.
      VMware Aria Automation merges the cloud-init from the RAG software bundle and the custom cloud-init.
    2. Provide your NVIDIA NGC Portal access key.
    3. (Optional) Expose NVIDIA Data Center GPU Manager (DCGM) metrics via a load balancer.
      NVIDIA DCGM manages and monitors GPUs in data center environments.
    4. Enter Docker Hub credentials.
  7. Click Submit.
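
If you chose to expose DCGM metrics in step 6, you might want to read them programmatically after the deployment completes. The following is a minimal sketch, assuming the metrics are served in the Prometheus text exposition format, as the NVIDIA DCGM exporter does. The function name parse_metrics and the sample scrape output are hypothetical. The sketch parses text only and does not fetch anything, because the load balancer address depends on your deployment.

```python
import re

# Matches sample lines such as: metric_name{label="value",...} 12.5
_SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict, float]]:
    """Return (metric name, labels, value) for each sample line in the text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hypothetical scrape output for illustration; real values come from your deployment.
sample_text = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",modelName="example"} 37
DCGM_FI_DEV_FB_USED{gpu="0",modelName="example"} 2048
"""

for name, labels, value in parse_metrics(sample_text):
    print(f"gpu={labels.get('gpu')} {name}={value}")
```

The regular expressions cover the common case of simple label values. A full Prometheus client library is a better choice if label values can contain escaped quotes or braces.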