As a data scientist, you can deploy a deep learning-capable Tanzu Kubernetes Grid cluster that uses a pgvector PostgreSQL database managed by VMware Data Services Manager (DSM) from the self-service Automation Service Broker catalog.
When you request the AI Kubernetes RAG Cluster with DSM item in the catalog, you can use an existing database instance or create a new one. During deployment, the Tanzu Kubernetes Grid cluster is provisioned and connected to the database; the database itself is provisioned and managed by DSM outside of the RAG deployment.
- If you select the existing database option, you use a pre-deployed database, which can be an external database or a database that was provisioned by another AI RAG Workstation with DSM deployment. During deployment, a new database instance is not provisioned.
- If there is no existing database that you can use or you want your own private database for a specific use case, then select the new database option.
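The pgvector extension adds a vector column type and similarity operators to PostgreSQL, which is what lets the RAG workload store and query document embeddings in the DSM-managed database. The following sketch is illustrative only and is not part of the deployment: it assumes the `psycopg` driver, a hypothetical `docs` table with 768-dimensional embeddings, and placeholder connection values.

```python
# Minimal pgvector similarity-search sketch (illustrative only).
# Assumes the pgvector extension is available in the DSM-managed PostgreSQL
# instance; the connection values and the docs table are placeholders.
import psycopg

conn = psycopg.connect(
    host="dsm-postgres.example.internal",  # placeholder host
    port=5432,
    dbname="ragdb",
    user="raguser",
    password="example-password",
)

with conn, conn.cursor() as cur:
    # The extension may already be enabled by DSM; this is a no-op in that case.
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    # Hypothetical table: one row per document chunk, 768-dimensional embedding.
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(768)
        );
        """
    )
    # Nearest-neighbor search with the pgvector cosine-distance operator (<=>).
    query_embedding = [0.0] * 768  # stand-in for a real query embedding
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    cur.execute(
        "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT 5;",
        (vector_literal,),
    )
    for (content,) in cur.fetchall():
        print(content)
```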
Procedure
- On the Catalog page in Automation Service Broker, locate the AI Kubernetes RAG Cluster with DSM card and click Request.
- Select a project.
- Enter a name and description for your deployment.
- Select the number of control plane nodes.

| Setting | Sample value |
| --- | --- |
| Node count | 1 |
| VM class | best-effort-large |

The class selection defines the resources available within the virtual machine.

For one worker node, you can assign 1 or 2 vGPUs per node. For two worker nodes, select 1 vGPU per node.
- Configure the database.
| Setting | Sample value |
| --- | --- |
| Database instance | Existing database |
| Connection string | Provide the DSM connection string from the DSM deployment overview. |

To verify the connection string against the database, see the connectivity sketch after this procedure.

- Install software customizations.
- Provide your NVIDIA AI Enterprise API key.
- Select a NIM Model profile.
The NIM Model profile defines which model engines NIM can use and the criteria for choosing those engines. For a sample request against the deployed NIM endpoint, see the sketch after this procedure.
- Enter Docker Hub credentials.
- Click Submit.
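The connection string that you paste in the database configuration step is typically a standard PostgreSQL URI; the exact format shown in the DSM deployment overview can differ by DSM version, so treat the URI in this minimal sketch as a placeholder assumption. The sketch only checks that the database is reachable and that the pgvector extension is present.

```python
# Connectivity check for a DSM-managed PostgreSQL instance (illustrative only).
# Assumes the DSM connection string is a standard PostgreSQL URI such as
# postgresql://user:password@host:5432/dbname; the value below is a placeholder.
import psycopg

DSM_CONNECTION_STRING = (
    "postgresql://raguser:example-password@dsm-postgres.example.internal:5432/ragdb"
)

with psycopg.connect(DSM_CONNECTION_STRING) as conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])

    # Confirm that the pgvector extension is installed in this database.
    cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector';")
    row = cur.fetchone()
    print("pgvector version:", row[0] if row else "not installed")
```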
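After the cluster is deployed and the NIM service is running, large language model NIMs serve an OpenAI-compatible HTTP API. The endpoint URL and model name in the sketch below are placeholders for whatever your deployment exposes; check the service address in your cluster before using it.

```python
# Sample chat request to a deployed NIM inference endpoint (illustrative only).
# LLM NIM containers serve an OpenAI-compatible API; the endpoint URL and
# model name are placeholders for your own deployment.
import json
import urllib.request

NIM_ENDPOINT = "http://nim.example.internal:8000/v1/chat/completions"  # placeholder

payload = {
    "model": "meta/llama-3.1-8b-instruct",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Summarize what a RAG pipeline does."}
    ],
    "max_tokens": 128,
}

request = urllib.request.Request(
    NIM_ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read())
    print(body["choices"][0]["message"]["content"])
```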