As a data scientist, you can deploy a GPU-enabled RAG workstation with a pgvector PostgreSQL database managed by VMware Data Services Manager (DSM) from the self-service Automation Service Broker catalog.
When you request the AI RAG Workstation with DSM in the catalog, you can use an existing database instance or create a new one. During deployment, both the deep learning VM and the database VM that it connects to are provisioned. The database is provisioned outside the RAG deployment.
- If you select the Existing Database option, you use a pre-deployed database, which can be an external database or a database that was provisioned by another AI RAG Workstation with DSM deployment. During deployment, a new database instance is not provisioned.
- If there is no existing database that you can use or you want your own private database for a specific use case, then select the New Database option.
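If you plan to select the Existing Database option, you need that database's connection string when you fill in the request. The following is a minimal sketch for sanity-checking the string before you paste it. It assumes the connection string is a PostgreSQL-style URI of the form `postgres://user:password@host:port/database` with a percent-encoded password; that format, the function name `parse_connection_string`, and the sample values are assumptions for illustration, not values taken from a DSM deployment.

```python
from urllib.parse import urlsplit, unquote


def parse_connection_string(conn: str) -> dict:
    """Split a PostgreSQL-style URI into parts and reject obviously malformed input."""
    parts = urlsplit(conn.strip())
    # Assumption: the string uses the standard PostgreSQL URI schemes.
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("missing database name")
    return {
        "user": unquote(parts.username or ""),
        # Passwords with special characters are percent-encoded in a URI.
        "password": unquote(parts.password or ""),
        "host": parts.hostname,
        # Assumption: fall back to the default PostgreSQL port when none is given.
        "port": parts.port or 5432,
        "database": database,
    }


if __name__ == "__main__":
    # Placeholder values only; substitute the string from your own deployment.
    example = "postgres://db_admin:p%40ss@10.0.0.15:5432/ragdb"
    info = parse_connection_string(example)
    # Print everything except the password.
    print({key: value for key, value in info.items() if key != "password"})
```

A check like this only confirms that the string is well formed. It does not confirm that the database is reachable or that the pgvector extension is available.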
Procedure
- On the Catalog page in Automation Service Broker, locate the AI RAG Workstation with DSM card and click Request.
- Select a project.
- Enter a name and description for your deployment.
- Configure the RAG workstation parameters.
  Setting: VM class
  Sample value: vgpu-1xa100-40c
  Minimum VM class specification:
  - CPU: 10 vCPUs
  - CPU RAM: 64 GB
  - GPU: 2xH100
  - GPU memory: 50 GB

  Setting: Data disk size
  Sample value: 32 Gi
  Select a disk size between 20 GB and 1 TB.

  Setting: User password
  Sample value: Enter a password for the default user. You might be prompted to reset your password when you first log in.

  Setting: SSH public key
  Sample value: This setting is optional.

- Configure the workstation database.
  Setting: Database instance
  Sample value: Existing database

  Setting: Connection string
  Sample value: Provide the DSM connection string from the DSM deployment overview.
- Install software customizations.
  - (Optional) If you want to install a custom cloud-init in addition to the cloud-init defined for the RAG software bundle, select the checkbox and paste the contents of the configuration package.
    VMware Aria Automation merges the cloud-init from the RAG software bundle and the custom cloud-init.
  - Provide your NVIDIA NGC Portal access key.
  - (Optional) Expose NVIDIA Data Center GPU Manager (DCGM) metrics via a load balancer.
    NVIDIA DCGM manages and monitors GPUs in data center environments.
  - Enter Docker Hub credentials.
- Click Submit.
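The numeric limits in the workstation parameters (minimum VM class specification and the data disk range) can be checked before you submit a request. The sketch below is one way to do that offline. The `WorkstationRequest` class and `find_problems` function are hypothetical helpers, not part of Automation Service Broker, and the code assumes that 1 TB means 1024 GB.

```python
from dataclasses import dataclass

# Limits copied from the parameter settings in this procedure.
MIN_VCPUS = 10
MIN_RAM_GB = 64
MIN_GPU_MEMORY_GB = 50
MIN_DISK_GB = 20
MAX_DISK_GB = 1024  # Assumption: 1 TB is treated as 1024 GB.


@dataclass
class WorkstationRequest:
    """Hypothetical container for the values you intend to enter in the form."""
    vcpus: int
    ram_gb: int
    gpu_memory_gb: int
    data_disk_gb: int


def find_problems(req: WorkstationRequest) -> list[str]:
    """Return a human-readable message for each value outside the documented limits."""
    problems = []
    if req.vcpus < MIN_VCPUS:
        problems.append(f"CPU: {req.vcpus} vCPUs is below the minimum of {MIN_VCPUS}")
    if req.ram_gb < MIN_RAM_GB:
        problems.append(f"CPU RAM: {req.ram_gb} GB is below the minimum of {MIN_RAM_GB} GB")
    if req.gpu_memory_gb < MIN_GPU_MEMORY_GB:
        problems.append(
            f"GPU memory: {req.gpu_memory_gb} GB is below the minimum of {MIN_GPU_MEMORY_GB} GB"
        )
    if not MIN_DISK_GB <= req.data_disk_gb <= MAX_DISK_GB:
        problems.append(
            f"Data disk: {req.data_disk_gb} GB is outside {MIN_DISK_GB} GB to {MAX_DISK_GB} GB"
        )
    return problems


if __name__ == "__main__":
    # Example values only; replace them with the figures for your chosen VM class.
    request = WorkstationRequest(vcpus=8, ram_gb=64, gpu_memory_gb=80, data_disk_gb=32)
    for message in find_problems(request):
        print(message)
```

An empty list means that every value is within the documented limits. The catalog form remains the authority on what is accepted.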