On a TKG cluster in a Supervisor, you can deploy a RAG workload based on the RAG Sample Pipeline from NVIDIA that uses a pgvector PostgreSQL database managed by VMware Data Services Manager.
Procedure
- Provision a GPU-accelerated TKG cluster.
- Install the RAG LLM Operator.
- Download the manifests for the NVIDIA sample RAG pipeline.
- Configure the sample RAG pipeline with the pgvector PostgreSQL database.
  - Open the sample pipeline YAML file for editing.
  - In the YAML file, point the sample pipeline at the pgvector PostgreSQL database by entering the database's connection string.
  - To expose the sample chat application on an external IP address, in the YAML file, set frontend.service.type to LoadBalancer.
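For the connection-string sub-step above, the following is a minimal sketch of what such a string typically looks like, assuming the standard PostgreSQL URI form. The helper name pg_connection_string and all of the sample values are hypothetical. In practice, copy the connection string that VMware Data Services Manager reports for your database rather than assembling it by hand.

```shell
#!/bin/sh
# Hypothetical helper: assemble a PostgreSQL connection URI from its parts.
# Usage: pg_connection_string USER PASSWORD HOST PORT DBNAME
# Assumes the standard form postgresql://user:password@host:port/dbname.
pg_connection_string() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with placeholder values only (not real credentials or hosts):
pg_connection_string "pgadmin" "example-password" "192.0.2.20" "5432" "ragdb"
```

If the password contains characters that are reserved in a URI, such as @ or /, they must be percent-encoded before the string is placed in the YAML file.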
- Start the sample RAG pipeline.
- To access the sample chat application, run the following command to get the application's external IP address.
kubectl -n rag-sample get service rag-playground
- In a web browser, open the sample chat application at http://application_external_ip:3001/orgs/nvidia/models/text-qa-chatbot, replacing application_external_ip with the EXTERNAL-IP value from the command output.
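The last two steps can also be scripted. The sketch below assumes that the load balancer reports an IP address rather than a hostname, and it reuses the rag-sample namespace, the rag-playground service, and port 3001 from the steps above. The function names external_ip and chat_url are hypothetical.

```shell
#!/bin/sh
# Print the external IP address of the rag-playground service.
# Prints nothing while the load balancer address is still pending.
external_ip() {
  kubectl -n rag-sample get service rag-playground \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Build the chat application URL from an IP address passed as $1.
chat_url() {
  printf 'http://%s:3001/orgs/nvidia/models/text-qa-chatbot\n' "$1"
}
```

With both functions defined, running chat_url "$(external_ip)" prints the address to open in the browser. If the IP address part of the URL is empty, the service has not yet been assigned an address, so wait and run the command again.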