Setting up S3 for file storage

In this topic, you will learn how to set up an S3 bucket, how bucket permissions work, what we can store in a bucket, and how a pipeline may be set up to retrieve and store objects.

Why use S3?

Platform Automation Toolkit uses and produces file artifacts that are too large to store in git. For example, many .pivotal product files are several gigabytes in size. Exported installation files may also be quite large.
Some environments can't access the greater internet. This is a common security practice, but it also means that these environments can't connect directly to the Broadcom Support portal to access the latest product versions for your upgrades.

The integration of S3 and Concourse makes it possible to store large file artifacts and retrieve the latest product versions in offline environments.

With S3, you can place product files and new versions of Operations Manager into a network "allow-listed" S3 bucket to be used by Platform Automation Toolkit tasks. You can even create a Resources pipeline that gets the latest version of products from the Broadcom Support portal and places them into your S3 bucket automatically.

In addition, because your foundation backup may be quite large, it is advantageous to persist it in a blobstore automatically through Concourse. Exported installations can then be accessed later through the blobstore. Because most object stores implement secure, durable solutions, exported installations in buckets are persistent and easy to restore.

Prerequisites

An Amazon Web Services (AWS) account with access to S3

Note S3 blobstore compatibility: Many cloud storage options exist, including Amazon S3, Google Storage, Minio, and Azure Blob Storage. However, not all object stores are "S3 compatible." Because Amazon defines the S3 API for accessing blobstores, and because the Amazon S3 product has emerged as the dominant blob storage solution, not all "S3 compatible" object stores have exactly the same behavior. In general, if a storage solution claims to be "S3 compatible," it should work with the Concourse S3 resource integration. Note, however, that it may behave differently when you interact with its S3 API directly. Use the documentation for your preferred blobstore solution when setting up storage.

Set up S3: With your AWS account, go to the S3 console and sign up for S3. Follow the on-screen prompts. You are then ready to create buckets.

Important AWS root user: When you sign up for the S3 service on Amazon, the account with the email and password you use is the AWS account root user. As a best practice, you should not use the root user to access and manipulate services. Instead, use AWS Identity and Access Management (IAM) to create and manage users. For more information about how this works, see the Amazon IAM guide.
For simplicity, the rest of this guide uses the AWS root user to show how a bucket can be set up and used with Platform Automation Toolkit.

Your first bucket

S3 stores data as objects in buckets. An object is any file that can be stored on a file system. Buckets are the containers for objects. Buckets can have permissions for who can create, write, delete, and see objects in that bucket.

Go to the S3 console.
Click Create bucket.
Enter a DNS-compliant name for your new bucket.
- This name must be unique across all AWS S3 buckets and adhere to general URL guidelines. Make it something meaningful and memorable.
Enter the Region you want the bucket to reside in.
Click Create.
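
Before typing a name into the console, it can help to check it locally. The following Python sketch covers only a subset of the S3 bucket naming rules (length, allowed characters, first and last character, no adjacent periods, not shaped like an IP address). The function name is hypothetical, and the check cannot tell you whether the name is already taken by another AWS account.

```python
import re

# Subset of the S3 naming rules: 3-63 characters, lowercase letters,
# digits, periods, and hyphens, starting and ending with a letter or digit.
_BASIC_SHAPE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes these basic bucket naming checks."""
    if not _BASIC_SHAPE.fullmatch(name):
        return False  # wrong length, characters, or first/last character
    if ".." in name:
        return False  # adjacent periods are not allowed
    if _IP_LIKE.fullmatch(name):
        return False  # must not be formatted like an IPv4 address
    return True


if __name__ == "__main__":
    for candidate in ("my-platform-automation-bucket", "My_Bucket", "ab"):
        print(candidate, is_valid_bucket_name(candidate))
```

A name that passes this check can still be rejected by AWS, for example because another account already owns it.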

This creates a bucket with the default S3 settings. Bucket permissions and settings can be set during bucket creation or changed later. Bucket settings can even be copied from other buckets you have. For a detailed look at creating buckets and managing initial settings, see Creating a bucket.

Bucket permissions

By default, only the AWS account owner can access S3 resources, including buckets and objects. The resource owner may allow public access, grant permissions to specific IAM users, or create a custom access policy.

To view bucket permissions, from the S3 console, look at the "Access" column.

Amazon S3 has the following Access permissions:

Public: Everyone has access to one or more of the following: List objects, Write objects, Read and write permissions.
Objects can be public: The bucket is not public, but anyone with appropriate permissions can grant public access to objects.
Buckets and objects not public: The bucket and objects do not have any public access.
Only authorized users of this account: Access is isolated to IAM users and roles.

To change who can access buckets or objects in buckets:

Go to the S3 console.
Select the name of the bucket you created in the previous step.
In the top row, select Permissions.

In this tab, you can set the various permissions for an individual bucket. For simplicity, in this guide, we will use public permissions so that Concourse can access the files.

Under the permissions tab for a bucket, choose Public access settings.
Click Edit to change the public access settings.
Clear all check boxes to allow public access.

In general, the credentials used to access an S3-compatible blobstore through Concourse must have Read and Write permissions. You can use different user roles with different credentials to separate users who can read objects from the bucket from users who can write objects to it.
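
One way to separate readers from writers is to attach different IAM policies to different users. The following Python sketch builds two policy documents as plain dictionaries and prints them as JSON. The function names and the bucket name are hypothetical, and the exact set of actions your pipeline needs is an assumption; check it against your organization's security policy and the resource you use.

```python
import json


def read_only_policy(bucket: str) -> dict:
    """Policy document that allows listing the bucket and reading objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # bucket-level actions apply to the bucket ARN itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:ListBucketVersions"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {   # object-level actions apply to keys inside the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def read_write_policy(bucket: str) -> dict:
    """Same as the read-only policy, plus permission to upload objects."""
    policy = read_only_policy(bucket)
    policy["Statement"][1]["Action"].append("s3:PutObject")
    return policy


if __name__ == "__main__":
    # "my-platform-automation-bucket" is a placeholder bucket name.
    print(json.dumps(read_write_policy("my-platform-automation-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM policy editor for the user whose credentials Concourse uses to write to the bucket.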

Note Amazon S3 provides many permission settings for buckets. Specific IAM users can have access and objects can have their own permissions. In addition, buckets can have their own custom policies. See Configuring ACLs. Refer to your organization's security policy for the best way to set up your S3 bucket.

Object versions

By default, an S3 bucket is unversioned. An unversioned bucket does not keep different versions of the same object. To take full advantage of an S3 bucket with Platform Automation Toolkit, we will enable versioning. Versioning is not required, but it makes the process easier and requires fewer manual steps to rename files whenever they change.

Go to the S3 console.
Select the name of the bucket you created in the previous step.
Click the Properties tab.
Click the Versioning tile.
Select Enable Versioning.

Now that versioning is enabled, we can store multiple versions of a file. For example, given the following object:

my-exported-installation.zip

We can now have multiple versions of this object stored in our S3 bucket:

my-exported-installation.zip (version 111111)
my-exported-installation.zip (version 121212)
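
The difference between the two modes can be sketched with a small in-memory model. This is a toy illustration only, not an S3 client: the class name, method names, and version ID format are all hypothetical.

```python
from itertools import count


class ToyBucket:
    """In-memory stand-in for a bucket, used only to illustrate versioning."""

    def __init__(self, versioned: bool):
        self.versioned = versioned
        self._objects = {}      # key -> list of (version_id, data)
        self._ids = count(1)

    def put(self, key: str, data: bytes) -> str:
        version_id = f"v{next(self._ids)}"   # made-up ID format
        history = self._objects.setdefault(key, [])
        if not self.versioned:
            history.clear()                  # overwrite: the old copy is lost
        history.append((version_id, data))
        return version_id

    def get(self, key: str, version_id=None) -> bytes:
        # Newest first, so omitting version_id returns the latest version.
        for vid, data in reversed(self._objects.get(key, [])):
            if version_id is None or vid == version_id:
                return data
        raise KeyError((key, version_id))

    def versions(self, key: str) -> list:
        return [vid for vid, _ in self._objects.get(key, [])]


if __name__ == "__main__":
    bucket = ToyBucket(versioned=True)
    first = bucket.put("my-exported-installation.zip", b"export from Monday")
    bucket.put("my-exported-installation.zip", b"export from Tuesday")
    print(bucket.versions("my-exported-installation.zip"))
    print(bucket.get("my-exported-installation.zip", first))
```

With `versioned=False`, the second `put` replaces the first and only one version remains, which is why an unversioned bucket needs a distinct file name for every export you want to keep.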

Storing files in S3

Any file that can be stored on a computer can be stored on S3. S3 is especially good at storing large files because it is designed to scale with large amounts of data while still being durable and fast.

Platform Automation Toolkit users may want to store the following files in S3:

.pivotal product files
.tgz stemcell files
.ova Operations Manager files
.zip foundation exports

You should probably not store the following in S3:

.yaml configuration files - Git is better suited for this
secrets.yaml environment and secret files - There are a number of ways to handle these types of files, but they should not be stored in S3. See Using a secrets store to store credentials for information about working with these types of files.

Structuring your bucket

Buckets can have folders and any number of sub-folders. The following sample shows one way to set up your bucket file structure:

├── foundation-1
│   ├── products
│   │   ├── healthwatch
│   │   │     healthwatch.pivotal
│   │   ├── pas
│   │   │     pas.pivotal
│   │   └── ...
│   │
│   ├── stemcells
│   │   ├── healthwatch-stemcell
│   │   │     ubuntu-xenial.tgz
│   │   ├── pas-stemcell
│   │   │     ubuntu-xenial.tgz
│   │   └── ...
│   │
│   ├── foundation1-exports
│           foundation1-installation.zip

To create a folder, open the bucket in the AWS S3 console and click Create Folder. To create a sub-folder, open the new folder and click Create Folder again.

When attempting to access a specific object in a folder, include the folder structure before the object name:

foundation-1/products/healthwatch/healthwatch.pivotal
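
S3 keeps objects in a flat namespace, and the console shows key prefixes separated by / as folders. A minimal Python sketch for building such a key from its parts follows; the helper name is hypothetical and the folder names are taken from the sample structure above.

```python
def object_key(*parts: str) -> str:
    """Join folder names and a file name into a single S3 object key."""
    # Strip stray slashes so the key never starts with "/" or contains "//".
    cleaned = [part.strip("/") for part in parts if part.strip("/")]
    return "/".join(cleaned)


if __name__ == "__main__":
    key = object_key("foundation-1", "products", "healthwatch", "healthwatch.pivotal")
    print(key)  # foundation-1/products/healthwatch/healthwatch.pivotal
```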

Using a bucket

When using the Concourse S3 Resource, several configuration properties are available for retrieving objects. The bucket name is required.

For your Concourse to have access to your S3 bucket, ensure that you have the appropriate firewall and networking settings to allow your Concourse instance to make requests to your bucket. Concourse uses various "outside" resources to perform certain jobs. Ensure that Concourse can communicate with your S3 bucket.
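
As a rough illustration of the configuration properties, the following Python sketch assembles a Concourse S3 resource definition as a dictionary and prints it as JSON, which you would translate to YAML in a pipeline file. The property names (bucket, regexp, versioned_file, region_name, access_key_id, secret_access_key) follow the Concourse S3 Resource documentation as best understood here; the helper function, resource name, bucket name, and ((variable)) names are hypothetical.

```python
import json


def s3_resource(name, bucket, *, regexp=None, versioned_file=None, region_name=None):
    """Build a Concourse resource definition of type s3 as a dictionary."""
    if not bucket:
        raise ValueError("bucket is required")
    # The resource tracks versions either by a pattern in the file name
    # (regexp) or by S3 object versions (versioned_file), not both.
    if (regexp is None) == (versioned_file is None):
        raise ValueError("set exactly one of regexp or versioned_file")

    source = {
        "bucket": bucket,
        # Placeholders for values that come from your credential store.
        "access_key_id": "((s3_access_key_id))",
        "secret_access_key": "((s3_secret_access_key))",
    }
    if region_name:
        source["region_name"] = region_name
    if regexp:
        source["regexp"] = regexp
    else:
        source["versioned_file"] = versioned_file
    return {"name": name, "type": "s3", "source": source}


if __name__ == "__main__":
    resource = s3_resource(
        "exported-installation",
        "my-platform-automation-bucket",
        versioned_file="foundation-1/foundation1-exports/foundation1-installation.zip",
    )
    print(json.dumps(resource, indent=2))
```

Using versioned_file relies on bucket versioning being enabled; with an unversioned bucket, regexp is the option that applies.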

Reference resources pipeline

The resources pipeline may be used to download dependencies from the Broadcom Support portal and place them into a trusted S3 bucket. The various resource_types use the Concourse S3 Resource type and several Platform Automation Toolkit tasks to accomplish this. The following is an S3-specific breakdown of these components and where to find more information.

The download-product task

The download-product task lets you download products from the Broadcom Support portal. If S3 properties are set in the download config, these files can be put into an S3 bucket.

If S3 configurations are set, this task prepends metadata to the filename. For example, suppose you are downloading:

product Example Product from the Broadcom Support portal
with product slug example-product
and version 2.2.1

When downloaded directly from the Broadcom Support portal, the file might look like this:

product-2.2-build99.pivotal

Because the Broadcom Support portal file names do not always contain the metadata required by Platform Automation Toolkit, the download-product task prepends the necessary information to the filename before it is placed in the S3 bucket:

[example-product,2.2.1-build99]product-2.2-build99.pivotal

Important Do not change the meta information prepended by download-product. This information is required to properly parse product versions when using download-product with a blobstore (for example, AWS or GCS).
If you place a product file into a blobstore bucket manually, ensure that it has the proper file name format: an opening bracket, the product slug, a single comma, the product's version, and finally a closing bracket. There should be no spaces between the two brackets. For example, for a product with a slug of product-slug and a version of 1.1.1:
[product-slug,1.1.1]original-filename.pivotal
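
The naming format described above can be checked with a few lines of Python before you upload a file manually. This is a minimal sketch of the format as stated here, not the parser that download-product itself uses; the function names are hypothetical.

```python
import re

# "[slug,version]" followed by the original file name. Neither the slug
# nor the version may contain brackets, commas, or whitespace.
_PREFIXED = re.compile(
    r"\[(?P<slug>[^,\[\]\s]+),(?P<version>[^,\[\]\s]+)\](?P<filename>.+)"
)


def add_prefix(slug: str, version: str, filename: str) -> str:
    """Return the file name with the [slug,version] metadata prepended."""
    return f"[{slug},{version}]{filename}"


def parse_prefix(name: str) -> tuple:
    """Split a prefixed name into (slug, version, original filename)."""
    match = _PREFIXED.fullmatch(name)
    if match is None:
        raise ValueError(f"not in [slug,version]filename format: {name!r}")
    return match["slug"], match["version"], match["filename"]


if __name__ == "__main__":
    name = add_prefix("product-slug", "1.1.1", "original-filename.pivotal")
    print(name)                # [product-slug,1.1.1]original-filename.pivotal
    print(parse_prefix(name))  # ('product-slug', '1.1.1', 'original-filename.pivotal')
```

A name with a space after the comma, such as "[product-slug, 1.1.1]original-filename.pivotal", is rejected with a ValueError.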

The download-product task also lets you download products from a blobstore bucket if you define the SOURCE param. The prefixed metadata added by download-product with SOURCE: pivnet is used to find the appropriate file. This task uses the same download-product config file as download-product with SOURCE: pivnet to ensure consistency between what is put in the blobstore and what is accessed later.

download-product with SOURCE: pivnet and download-product with SOURCE: s3|gcs|azure are designed to be used together. The download-product config should not differ between the two tasks.

For complete information on this task and how it works, see the Task Reference.