About This Tutorial

By default, models are pulled down from the Internet during installation of GenAI on Tanzu Platform. While this may be acceptable in some environments, it is unlikely to satisfy more stringent security requirements. In this tutorial you will learn how to download models, upload them somewhere on the internal network, and then configure GenAI on VMware Tanzu Platform for Cloud Foundry to pull models from there instead.

Target user role: Platform Operator
Complexity: Basic
Estimated time: 45-60 minutes
Pre-reqs: You have already uploaded the GenAI on VMware Tanzu Platform for Cloud Foundry tile into Tanzu Operations Manager and have access to an artifact repository in which to store models.
Topics covered: Deploying models using the Model URL configuration field.
Learning outcomes: An understanding of how to deploy models without requiring outbound Internet access during installation of GenAI on VMware Tanzu Platform for Cloud Foundry.

Step 1: Deciding where to host models on the Service Network

During this tutorial you will download a model from the Internet and then upload it somewhere that is accessible from the Service Network you have configured for GenAI on Tanzu Platform. Where you upload the models, and how you serve them, is entirely up to you. The only requirement is that the uploaded models can be accessed through HTTP(S) from the configured Service Network.
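For example, one minimal option, assuming you have a Linux VM on the internal network that the Service Network can reach, is to serve a directory of model files with a basic HTTP server. The host, port, and directory below are illustrative only; any HTTP(S) file server or artifact repository works equally well.

# On a VM reachable from the Service Network (directory and port are examples)
mkdir -p /var/www/models
python3 -m http.server 8080 --directory /var/www/models

Files placed in that directory would then be downloadable at a URL such as http://<vm-address>:8080/<file-name>.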

Step 2: Preparing the Model

This step must be performed from a device that is able to access HuggingFace. For the purposes of this tutorial, you download the casperhansen/llama-3-8b-instruct-awq model and deploy it with the vLLM model provider. A similar process can be followed for the Ollama model provider; however, note that each model provider expects a different file format: the vLLM model provider expects a .tar.gz file, whereas Ollama expects a .gguf file. Adapt this step according to the model and model provider you intend to use.
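If you intend to use the Ollama model provider instead, you typically download a single .gguf file rather than packaging a model directory. As an illustrative sketch only, using the huggingface-cli tool with an example repository and file name that are not otherwise used in this tutorial:

# Example only: download one .gguf file for use with the Ollama model provider
huggingface-cli download TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf --local-dir .

For the vLLM flow used in the remainder of this tutorial, continue as follows.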

First, use git to clone the model repository from HuggingFace. You must have git-lfs installed.

git lfs install
git clone https://huggingface.co/casperhansen/llama-3-8b-instruct-awq

Next, create a .tar.gz archive of the model’s files. Create the archive from inside the model directory so that the files sit at the root of the compressed archive rather than under a subdirectory:

cd llama-3-8b-instruct-awq
tar -cvzf ../llama-3-8b-instruct-awq.tar.gz *.json *.safetensors
cd ..

Then upload the llama-3-8b-instruct-awq.tar.gz file to your chosen artifact repository. The artifact must be accessible through HTTP(S). Take note of the direct download URL for the file.
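How you upload the file depends on your artifact repository. As one hedged sketch, many generic repositories (for example, a raw or generic repository in Artifactory or Nexus) accept uploads over HTTP PUT, in which case a command such as the following would work. The repository URL, path, and credentials are placeholders for your environment:

# Placeholder URL and credentials; adjust for your artifact repository
curl -u "$REPO_USER:$REPO_PASS" -T llama-3-8b-instruct-awq.tar.gz \
  https://artifacts.example.internal/repository/models/llama-3-8b-instruct-awq.tar.gz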

Step 3: Deploying the Model

Now that the model is available on the internal network, you can follow the steps in Add and Remove Models to configure the model. In the Model URL field, enter the direct download URL for the model artifact that you noted earlier.
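Before applying the configuration, you can optionally confirm from a host that shares network access with the Service Network that the URL serves the archive directly. The URL below is a placeholder for the direct download URL you noted in Step 2:

# Placeholder URL; a 200 response indicates the archive is directly downloadable
curl -fIL https://artifacts.example.internal/repository/models/llama-3-8b-instruct-awq.tar.gz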
