This section covers the core components of Data Loss Prevention (DLP) for the Cloud Web Security service and how they are used to create rules that prevent data leakage for a customer Enterprise. The DLP section concludes with a workflow for configuring a DLP Rule and verifying that the rule works properly.
Overview
Prerequisites
- A customer enterprise on a production VMware SASE Orchestrator with Cloud Web Security activated. Both the Edges and Orchestrator must use VMware Release 4.5.0 or later.
- A customer must have a Cloud Web Security Advanced package to access the DLP feature.
Important: A customer with a Cloud Web Security Standard package cannot access DLP, and a lock icon appears next to all DLP options on the Orchestrator UI.
Overview of DLP Dictionaries
A DLP Dictionary uses matching expressions to identify sensitive data. For example, credit card numbers and social security numbers follow a specific format, so dictionaries can match against those patterns to determine whether sensitive data is present in a file upload or text input.
Predefined Dictionaries
Cloud Web Security predefined data dictionaries combine pattern matching, checksums, context scoring, and fuzzy logic to identify sensitive data. Cloud Web Security has more than 340 predefined data dictionaries covering the following major data categories:
- Document Classification
- Financial Data
- Health Care
- HIPAA
- Item Identifiers
- PCI DSS
- PII
Additionally, the predefined data dictionaries are region-specific to ensure correct pattern matching is applied across the globe. Data dictionaries can be set to 29 different countries or regions. Of those 29, two are reserved for Global and Other. These two options allow categorizing multinational data or data that does not neatly fit into a country or region category.
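To illustrate how pattern matching and a checksum can work together, the following is a minimal Python sketch for credit card numbers. It is not the Cloud Web Security implementation: the regular expression, the function names, and the choice of the Luhn checksum (the check digit scheme commonly used for payment card numbers) are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        # Double every second digit, counting from the right.
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidates that match the pattern and also pass the checksum."""
    matches = []
    for candidate in CARD_PATTERN.findall(text):
        digits = re.sub(r"[ -]", "", candidate)
        if luhn_valid(digits):
            matches.append(digits)
    return matches
```

The checksum step is what separates a plausible card number from an arbitrary 16-digit string, which is one way a dictionary can reduce false positives compared with pattern matching alone.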
On this page, users are shown all the dictionaries available for use in DLP policy. The dictionaries are organized in a table containing names, descriptions, types, categories, and region fields.
- Name is used to identify the dictionary for use in policy.
- Description provides a high-level overview of what the dictionary matches.
- Type distinguishes two different dictionary types:
- Predefined
- Custom
- Category includes:
- Canadian Health Service
- Document Classification
- Financial Data, HIPAA
- HIPAA/Health Care
- Health Care
- Item Identifiers
- Other
- PCI DSS
- Personally Identifiable Information
- UK National Health Service
- Region represents the geography the dictionary applies to, which includes:
- Australia
- Belgium
- Brazil
- Canada
- Denmark
- Finland
- France
- Germany
- Global
- Hong Kong
- India
- Indonesia
- Ireland
- Italy
- Japan
- Malaysia
- Netherlands
- New York
- New Zealand
- Norway
- Other
- Poland
- Singapore
- South Africa
- Spain
- Sweden
- United Kingdom (UK)
- United States of America (USA)
- The Search bar applies to all fields on the Dictionaries page and can be used to quickly display specific dictionaries users are interested in viewing.
- Each row contains a dictionary that can be clicked to explore further.
- The Dictionaries per Page can display up to 100 entries on a single page.
- Page Navigation buttons are provided for going back or skipping ahead.
To continue this examination, find the Postal addresses [Global] dictionary and click on the blue text to bring up the Edit Dictionary screen.
Clicking on the Next button in the modal brings users to the Threshold settings. It is not recommended to adjust the Threshold Details from their default values unless necessary.
The screenshot above shows that the weighted average number of violations for both File Uploads and User Inputs is set to 10. For predefined dictionaries, treat this value not as a simple occurrence count but as a computed score over all the information discovered in a document. This scoring mechanism helps reduce the number of false positives observed when using this data dictionary. When finished viewing this modal, click Cancel. If any editable values were changed, click Update instead to preserve those changes.
Custom Dictionaries
Cloud Web Security DLP Custom Dictionaries give users the flexibility to create data dictionaries pertinent to their organization. As with predefined dictionaries, custom dictionaries start by having users add four fields:
- Name
- Description
- Category
- Country/Region
These are the same four fields shown for predefined dictionaries, but with the ability to set each value to what is relevant for the dictionary users are creating. The data a custom dictionary matches is then defined using one of two match types:
- String is used to match an exact combination of alphanumeric and special characters. It can be set to match or ignore casing.
- Expression uses Perl regular expressions (regex) to find data patterns that are otherwise difficult to find with a simple string.
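The difference between the two match types can be sketched in Python as follows. This is an illustration only and makes several assumptions: the function names are hypothetical, and Python's `re` module stands in for the Perl regular expressions that the product uses (the two share most common syntax but are not identical).

```python
import re


def string_match_count(text: str, needle: str, ignore_case: bool = False) -> int:
    """Count exact occurrences of needle, optionally ignoring case."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes every character literal, so special characters match exactly.
    return len(re.findall(re.escape(needle), text, flags))


def expression_match_count(text: str, pattern: str) -> int:
    """Count non-overlapping matches of a regular expression."""
    return sum(1 for _ in re.finditer(pattern, text))


# A String match finds one fixed value; an Expression match finds a family of values.
print(string_match_count("Project-X and project-x", "Project-X"))
print(string_match_count("Project-X and project-x", "Project-X", ignore_case=True))
print(expression_match_count("IDs: AB-1234, CD-5678", r"[A-Z]{2}-\d{4}"))
```

A String entry suits a known code name or identifier, while an Expression entry suits data that varies but follows a predictable shape.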
To create a Custom Dictionary, click the New Dictionary button from the Dictionaries page.
The Dictionary Details screen prompts users to enter values for Name, Description, Category, and Country/Region.
The screenshot above indicates that this dictionary is meant to identify Sensitive IP Addresses and is For Internal Use Only. The selection of Other for both category and country/region indicates that the data matched by this dictionary either does not fit into one of the preexisting categories, or the additional metadata is not necessary.
For the Match Data screen, the example configuration is based on the IP address ranges 192.0.2.0/24, 198.51.100.0/24, and 203.0.113.0/24 (the documentation ranges defined in RFC 5737), which stand in for the sensitive data the company needs to protect. The regex used to look for any IP addresses in those ranges is: (192\.0\.2\..*|198\.51\.100\..*|203\.0\.113\..*)
The regex is not broken up over several lines using the Plus Icon to add another row because the dictionary logic across multiple rows is a logical AND. Had the Match Criteria been defined in this manner, the dictionary would trigger only when all three IP address ranges were present in a document.
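The effect of keeping the alternatives in a single row, rather than splitting them across rows, can be sketched in Python. The sketch assumes that Python's `re` module behaves like the product's Perl regex engine for this pattern, and the function names are hypothetical.

```python
import re

# The single-row expression from the example configuration.
SINGLE_ROW = r"(192\.0\.2\..*|198\.51\.100\..*|203\.0\.113\..*)"

# The same ranges split into one expression per row.
SEPARATE_ROWS = [
    r"192\.0\.2\..*",
    r"198\.51\.100\..*",
    r"203\.0\.113\..*",
]


def single_row_triggers(text: str) -> bool:
    """One row with alternation: any one of the ranges is enough."""
    return re.search(SINGLE_ROW, text) is not None


def all_rows_trigger(text: str) -> bool:
    """Multiple rows combined with a logical AND: every range must appear."""
    return all(re.search(row, text) for row in SEPARATE_ROWS)


sample = "The staging server is at 198.51.100.7"
print(single_row_triggers(sample))  # True: one range is present
print(all_rows_trigger(sample))     # False: the other two ranges are missing
```

A document that mentions only one of the protected ranges is caught by the single-row form but would slip past the multi-row form.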
After configuring the Custom Dictionary settings, click Finish to make the dictionary available for use in Cloud Web Security.
Auditors
An Auditor is someone in the organization designated to follow up on any incidents that pertain to the attempted exfiltration of data, whether intentional or accidental. The Orchestrator can notify this individual by email when a DLP rule has been violated. The email sent to the Auditor contains the name of the DLP rule, the user input or file name that contained sensitive data, the destination to which the user was trying to send the data, and the username of the person who tried to expose the data. Optionally, the user input or file can be sent to the Auditor in its original format, as a ZIP file, or as an encrypted ZIP file.
Users can add, edit, delete, and view auditors by logging into Cloud Web Security and navigating to:
In the Auditors screen, users can see that there are currently no auditors in the system. To add the first Auditor, select + NEW AUDITOR PROFILE. A pop-up will prompt users to provide the following information:
- Name (mandatory) is the name of Auditor.
- Email Address (mandatory) is a valid email address account for the individual.
- Description (optional) is any relevant information users would want to provide about the Auditor. For example, "PCI Auditor" if the Auditor's primary function is to monitor for PCI violations.
The next page will ask users for File Details. This page is completely optional, but it provides users with the option to send the offending file to the DLP Auditor for their review. Configuration options include:
- Send File to the Auditors, with the default behavior being to not send the file to the Auditor(s).
- File Format becomes available when users select Send the file to the Auditor(s). Users have the option of selecting the Original File, Zip, or Encrypted Zip. Since this file will contain sensitive information, it is recommended to use the Encrypted Zip option.
- Maximum File Size is the maximum size of the attachment included with the email that is sent by the system. The limit can be set up to 1 GB, but it is recommended to match the organization's email file size restrictions.
Important: If a file size exceeds the Maximum File Size value, then that file is bypassed. In other words, the file is not attached to the DLP violation alert, and the alert is sent without the file.
- Encrypted Zip Password is autogenerated by the system and can be regenerated if compromised. Users can also configure their own password if desired.
Click the Finish button to save the New DLP Auditor Profile configuration. The Auditor entry appears in the DLP Settings Auditor page. Optionally, users can view, edit, or delete the Auditor entry.
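The File Details options described above amount to a simple decision about whether a file accompanies the alert. The following Python sketch models that decision. The class, field, and function names are hypothetical and do not reflect an Orchestrator API, and treating 1 GB as 1024**3 bytes is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditorProfile:
    """Hypothetical model of a DLP Auditor profile."""
    name: str
    email: str
    send_file: bool = False                 # default: do not send the file
    file_format: str = "Encrypted Zip"      # "Original File", "Zip", or "Encrypted Zip"
    max_file_size_bytes: int = 1024 ** 3    # assumed meaning of the 1 GB upper limit


def attachment_format(profile: AuditorProfile, file_size_bytes: int) -> Optional[str]:
    """Return the format used to attach the file, or None if no file is attached.

    The alert email is sent in every case; only the attachment is conditional.
    """
    if not profile.send_file:
        return None
    if file_size_bytes > profile.max_file_size_bytes:
        # The file is bypassed and the alert goes out without it.
        return None
    return profile.file_format


profile = AuditorProfile(
    name="PCI Auditor",
    email="auditor@example.com",
    send_file=True,
    max_file_size_bytes=10 * 1024 ** 2,  # for example, a 10 MB mail limit
)
print(attachment_format(profile, 2 * 1024 ** 2))   # within the limit
print(attachment_format(profile, 50 * 1024 ** 2))  # over the limit
```

Setting the maximum to the mail system's own limit, as in the usage example, avoids alerts that the mail server would otherwise reject because of an oversized attachment.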
DLP Configuration Workflow
Having covered the two key components that comprise the Data Loss Prevention (DLP) feature, this section will cover the overall DLP workflow.
Create, Configure, and Apply a Security Policy
A DLP rule is part of a Security Policy and thus prior to configuring a DLP rule, there must first be a Security Policy. For details on Creating, Configuring, or Applying a Security Policy for the Cloud Web Security service, consult the relevant documentation in the Cloud Web Security Configuration Guide.
Create and Apply a DLP Rule
To create and apply a DLP Rule, see Configure Data Loss Prevention Rules.
Verify a DLP Rule is Working
- Cloud Web Security blocks the exfiltration of sensitive data that matches a DLP Rule.
- Cloud Web Security detects and logs the attempt to exfiltrate sensitive data.
- Cloud Web Security sends an email alert to a DLP Auditor when the rule is triggered.
- From an endpoint device (Windows, macOS, iOS, or Android) that sits behind an SD-WAN Edge, log in to a file hosting service (for example, Apple iCloud, Dropbox, Google Drive, Microsoft OneDrive, or similar).
- If the rule includes a Custom Dictionary, upload a Text Input, Text File, or PDF which matches the criteria set in the DLP Rule.
Note: Text Input is like a form post or text message. A Text File is an actual .txt attached to an upload.
- Alternatively, use any of the Predefined Dictionaries and their respective thresholds for PII data, Social Security numbers, Bank Account numbers, or something similar.
Note: With Predefined Dictionaries, the threshold to trigger the DLP violation is based on the combination of a sensitivity level and DLP engine heuristics. This contrasts with the Custom Dictionary which uses a specific repeat count.
- The text file/input or file upload is blocked.
- Verify in the DLP logs that the block action has been logged.
- The following is a sample log for a Text Input block in DLP Test which matches a Custom Dictionary the DLP Rule uses.
- The following is a sample log for a PDF file blocked in Dropbox for a Social Security number match from a Predefined Dictionary.
- Verify that a DLP Auditor has received an alert email based on the DLP Rule and the action configured for this rule.
- The following is a sample email for a Text Input block in DLP Test for a Custom Dictionary the DLP Rule uses.
- The following is a sample email for a PDF file blocked in Dropbox for a Social Security number match from a Predefined Dictionary.
Note: Non-text files may appear with an "Unknown" file name. As a result, the attached file in the Auditor email would also show as "Unknown".
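When the rule under test uses a Custom Dictionary like the Sensitive IP Addresses example, a text file for the upload step can be generated rather than typed by hand. The following Python sketch produces test content with a chosen number of matching lines and checks it locally before upload. The function names and the host label format are hypothetical, Python's `re` module stands in for the product's regex engine, and the way the DLP engine counts repeats may differ from this per-line count.

```python
import re

# The expression from the example Custom Dictionary.
SENSITIVE_IP_REGEX = r"(192\.0\.2\..*|198\.51\.100\..*|203\.0\.113\..*)"


def build_test_text(repeat_count: int) -> str:
    """Build text with one RFC 5737 documentation address per line."""
    lines = [f"test-host-{i} 192.0.2.{i}" for i in range(1, repeat_count + 1)]
    return "\n".join(lines) + "\n"


def count_matches(text: str) -> int:
    """Count regex matches; with this pattern each line yields at most one match."""
    return sum(1 for _ in re.finditer(SENSITIVE_IP_REGEX, text))


if __name__ == "__main__":
    content = build_test_text(5)
    print(count_matches(content))
    # Write the content to a .txt file to use as the Text File upload.
    with open("dlp_test_upload.txt", "w", encoding="utf-8") as handle:
        handle.write(content)
```

Choose a repeat count at or above the one configured in the Custom Dictionary so that the upload is expected to trigger the rule.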