As a data scientist, you can use the self-service Automation Service Broker catalog to deploy a deep learning-capable Tanzu Kubernetes Grid cluster that uses a pgvector PostgreSQL database managed by VMware Data Services Manager (DSM). You can use an existing database instance or create a new one.

When you request the AI Kubernetes RAG Cluster with DSM catalog item, you choose between an existing database instance and a new one. During deployment, the Tanzu Kubernetes Grid cluster is provisioned and connected to the database VM that DSM manages. The database itself is provisioned outside the RAG deployment.

  • If you select the existing database option, you use a pre-deployed database, which can be an external database or a database that was provisioned by another AI RAG Workstation with DSM deployment. During deployment, a new database instance is not provisioned.
  • If there is no existing database that you can use or you want your own private database for a specific use case, then select the new database option.
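
The deployed RAG workload reaches the DSM-managed database through the connection string shown in the DSM deployment overview. The following minimal sketch, which is not part of the catalog item, shows how a client can verify that the pgvector extension is active and run a similarity query. The connection string, table, and column names are placeholders.

    # Minimal sketch: connect to the DSM-managed pgvector PostgreSQL database
    # and run a similarity search. Replace the placeholder connection string
    # with the value from the DSM deployment overview. The documents table and
    # embedding column are hypothetical examples.
    import psycopg2

    DSM_CONNECTION_STRING = "postgresql://pgadmin:<password>@<dsm-db-host>:5432/<dbname>"

    conn = psycopg2.connect(DSM_CONNECTION_STRING)
    with conn, conn.cursor() as cur:
        # Confirm that the pgvector extension is installed in the database.
        cur.execute("SELECT extname FROM pg_extension WHERE extname = 'vector';")
        print("pgvector installed:", cur.fetchone() is not None)

        # Example nearest-neighbor query against a hypothetical embeddings table.
        query_embedding = "[0.1, 0.2, 0.3]"  # replace with a real embedding vector
        cur.execute(
            "SELECT id, content FROM documents "
            "ORDER BY embedding <-> %s::vector LIMIT 5;",
            (query_embedding,),
        )
        for row in cur.fetchall():
            print(row)
    conn.close()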

Procedure

  1. On the Catalog page in Automation Service Broker, locate the AI Kubernetes RAG Cluster with DSM card and click Request.
  2. Select a project.
  3. Enter a name and description for your deployment.
  4. Select the number of control plane nodes.
    Setting       Sample value
    Node count    1
    VM class      best-effort-large

    The class selection defines the resources available within the virtual machine.

    For a single worker node, you can assign 1 or 2 vGPUs. For two worker nodes, select 1 vGPU per node. An optional post-deployment check of the worker node vGPUs is sketched after this procedure.

  5. Configure the database.
    Setting              Sample value
    Database instance    Existing database
    Connection string    Provide the DSM connection string from the DSM deployment overview.
  6. Install software customizations.
    1. Provide your NVIDIA AI Enterprise API key (see the registry access sketch after this procedure).
    2. Select a NIM Model profile.
      The NIM Model profile defines what model engines NIM can use and the criteria to choose those engines.
    3. Enter Docker Hub credentials.
  7. Click Submit.
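
To verify the node configuration from step 4 after the deployment completes, you can inspect the allocatable GPU resources on the cluster nodes. The following sketch assumes you have a kubeconfig for the new Tanzu Kubernetes Grid cluster and that vGPUs are exposed under the standard nvidia.com/gpu resource name; it is an optional check, not part of the catalog item.

    # Optional post-deployment check: list nodes and their allocatable NVIDIA
    # vGPUs. Assumes a kubeconfig for the deployed cluster is available locally.
    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() inside the cluster
    v1 = client.CoreV1Api()

    for node in v1.list_node().items:
        allocatable = node.status.allocatable or {}
        gpus = allocatable.get("nvidia.com/gpu", "0")
        labels = node.metadata.labels or {}
        roles = [
            key.replace("node-role.kubernetes.io/", "")
            for key in labels
            if key.startswith("node-role.kubernetes.io/")
        ] or ["worker"]
        print(f"{node.metadata.name}: roles={roles}, nvidia.com/gpu={gpus}")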
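
The NVIDIA AI Enterprise API key and Docker Hub credentials from step 6 are typically used to authenticate container image pulls, and the catalog deployment configures this access for you. For illustration only, the following sketch shows the conventional way an NGC-style API key is wrapped into a Kubernetes image pull secret for nvcr.io, where NIM container images are hosted. The secret name and namespace are hypothetical.

    # Illustration only: how an NGC-style API key is conventionally wrapped
    # into a Kubernetes image pull secret for nvcr.io. The catalog deployment
    # handles registry access for you; the secret and namespace names here are
    # hypothetical.
    import base64
    import json

    from kubernetes import client, config

    NGC_API_KEY = "<your NVIDIA AI Enterprise API key>"  # placeholder

    docker_config = {
        "auths": {
            "nvcr.io": {
                "username": "$oauthtoken",  # NGC convention: literal user name
                "password": NGC_API_KEY,
                "auth": base64.b64encode(
                    f"$oauthtoken:{NGC_API_KEY}".encode()
                ).decode(),
            }
        }
    }

    config.load_kube_config()
    client.CoreV1Api().create_namespaced_secret(
        namespace="default",
        body=client.V1Secret(
            metadata=client.V1ObjectMeta(name="ngc-registry-secret"),
            type="kubernetes.io/dockerconfigjson",
            string_data={".dockerconfigjson": json.dumps(docker_config)},
        ),
    )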