Deploying RAG Workloads in VMware Private AI Foundation with NVIDIA

A Retrieval-Augmented Generation (RAG) workload consists of a large language model (LLM) and an external knowledge base that holds the latest data, stored in a vector database. In VMware Private AI Foundation with NVIDIA, you can configure a RAG workload to use embeddings from a vector database managed by VMware Data Services Manager.
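To make the retrieval step concrete, the following minimal Python sketch shows the general RAG pattern: rank stored passages by similarity to a query embedding, then place the best matches in the prompt sent to the LLM. It is illustrative only and does not use any VMware or NVIDIA API. The in-memory list KNOWLEDGE_BASE, the three-dimensional embeddings, and the functions retrieve and build_prompt are all hypothetical stand-ins for a real vector database and embedding model.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, store, top_k=2):
    """Return the text of the top_k entries most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]

def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical toy knowledge base; a real deployment would query a vector
# database and use embeddings produced by an embedding model.
KNOWLEDGE_BASE = [
    {"text": "Policy A was updated in March.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "The cafeteria opens at 8 am.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Policy A applies to all contractors.", "embedding": [0.8, 0.3, 0.1]},
]

# Assumed embedding of the user's question (made up for illustration).
query_embedding = [1.0, 0.2, 0.0]
passages = retrieve(query_embedding, KNOWLEDGE_BASE, top_k=2)
prompt = build_prompt("What changed in Policy A?", passages)
print(prompt)
```

In this sketch the LLM call itself is omitted; the resulting prompt would be passed to whichever model the workload serves.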

Note: This documentation is based on VMware Cloud Foundation 5.2.1. For information on the VMware Private AI Foundation with NVIDIA functionality in VMware Cloud Foundation 5.2, see the VMware Private AI Foundation with NVIDIA Guide for VMware Cloud Foundation 5.2.