As a data scientist, you can deploy a deep learning-capable Tanzu Kubernetes Grid cluster that uses a pgvector PostgreSQL database managed by VMware Data Services Manager (DSM) from the self-service Automation Service Broker catalog.
When you request the AI Kubernetes RAG Cluster with DSM item in the catalog, you can use an existing database instance or create a new one. During deployment, the Tanzu Kubernetes Grid cluster is provisioned and connected to the database; the database itself is provisioned and managed by DSM outside of the RAG deployment.
- If you select the existing database option, you use a pre-deployed database, which can be an external database or a database that was provisioned by another AI RAG Workstation with DSM deployment. During deployment, a new database instance is not provisioned.
- If there is no existing database that you can use or you want your own private database for a specific use case, then select the new database option.
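The pgvector extension adds a vector column type and similarity operators to PostgreSQL, which is what lets the RAG workload store and query document embeddings in the DSM-managed database. The following sketch is illustrative only and is not part of the deployment: it assumes the `psycopg` driver, a hypothetical `docs` table with 768-dimensional embeddings, and placeholder connection values.

```python
# Minimal pgvector similarity-search sketch (illustrative only).
# Assumes the pgvector extension is available in the DSM-managed PostgreSQL
# instance; the connection values and the docs table are placeholders.
import psycopg

conn = psycopg.connect(
    host="dsm-postgres.example.internal",  # placeholder host
    port=5432,
    dbname="ragdb",
    user="raguser",
    password="example-password",
)

with conn, conn.cursor() as cur:
    # The extension may already be enabled by DSM; this is a no-op in that case.
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    # Hypothetical table: one row per document chunk, 768-dimensional embedding.
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(768)
        );
        """
    )
    # Nearest-neighbor search with the pgvector cosine-distance operator (<=>).
    query_embedding = [0.0] * 768  # stand-in for a real query embedding
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    cur.execute(
        "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT 5;",
        (vector_literal,),
    )
    for (content,) in cur.fetchall():
        print(content)
```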
Procedure
- On the Catalog page in Automation Service Broker, locate the AI Kubernetes RAG Cluster with DSM card and click Request.
- Select a project.
- Enter a name and description for your deployment.
- Select the number of control plane nodes.

| Setting | Sample value |
| --- | --- |
| Node count | 1 |
| VM class | best-effort-large |

The class selection defines the resources available within the virtual machine.

For one worker node, you can assign 1 or 2 vGPUs per node. For two worker nodes, select 1 vGPU per node.
- Configure the database.
| Setting | Sample value |
| --- | --- |
| Database instance | Existing database |
| Connection string | Provide the DSM connection string from the DSM deployment overview. |

To verify the connection string against the database, see the connectivity sketch after this procedure.

- Install software customizations.
- Provide your NVIDIA AI Enterprise API key.
- Select a NIM Model profile.
The NIM Model profile defines which model engines NIM can use and the criteria for choosing those engines. For a sample request against the deployed NIM endpoint, see the sketch after this procedure.
- Enter Docker Hub credentials.
- Click Submit.
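The connection string that you paste in the database configuration step is typically a standard PostgreSQL URI; the exact format shown in the DSM deployment overview can differ by DSM version, so treat the URI in this minimal sketch as a placeholder assumption. The sketch only checks that the database is reachable and that the pgvector extension is present.

```python
# Connectivity check for a DSM-managed PostgreSQL instance (illustrative only).
# Assumes the DSM connection string is a standard PostgreSQL URI such as
# postgresql://user:password@host:5432/dbname; the value below is a placeholder.
import psycopg

DSM_CONNECTION_STRING = (
    "postgresql://raguser:example-password@dsm-postgres.example.internal:5432/ragdb"
)

with psycopg.connect(DSM_CONNECTION_STRING) as conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])

    # Confirm that the pgvector extension is installed in this database.
    cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector';")
    row = cur.fetchone()
    print("pgvector version:", row[0] if row else "not installed")
```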
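After the cluster is deployed and the NIM service is running, large language model NIMs serve an OpenAI-compatible HTTP API. The endpoint URL and model name in the sketch below are placeholders for whatever your deployment exposes; check the service address in your cluster before using it.

```python
# Sample chat request to a deployed NIM inference endpoint (illustrative only).
# LLM NIM containers serve an OpenAI-compatible API; the endpoint URL and
# model name are placeholders for your own deployment.
import json
import urllib.request

NIM_ENDPOINT = "http://nim.example.internal:8000/v1/chat/completions"  # placeholder

payload = {
    "model": "meta/llama-3.1-8b-instruct",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Summarize what a RAG pipeline does."}
    ],
    "max_tokens": 128,
}

request = urllib.request.Request(
    NIM_ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read())
    print(body["choices"][0]["message"]["content"])
```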