A regular expression (commonly referred to as regex) is a sequence of characters that specifies a search pattern in text.
Become familiar with the following rules before using regular expressions:
- Platform search normalizes all tokens such that any uppercase characters are converted to lowercase, and all backslashes (\) are converted to forward slashes (/).
- Never include backslash or uppercase characters inside your regex statement.
- Any use of a backslash in a regex is valid only for escaping.
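The normalization rule above can be pictured with a short Python sketch. The function name `normalize_token` is hypothetical and is not part of any product API; it only imitates the two conversions described in the list (lowercasing, and backslash to forward slash).

```python
def normalize_token(token: str) -> str:
    """Approximate the normalization described above (hypothetical helper)."""
    # Uppercase characters become lowercase; backslashes become forward slashes.
    return token.lower().replace("\\", "/")


print(normalize_token(r"C:\Users\Bob\123.EXE"))  # c:/users/bob/123.exe
```

Because the stored token looks like the normalized form, a regex is written against lowercase text and forward slashes.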
Goal | Sample Search |
---|---|
Using a generic regex. Example: | |
Looking for process names with double extensions. Example: file.doc.txt | process_name:/\\.[^\\.]{2,3}\\.[^\\.]{2,3}/ |
 | process_name:/\\..{3}\\..{3}/ |
All PowerShell processes that have performed a crossproc to anything but a specified process. | |
Looking for netconns to any domain except a specified one. | process_name:winword.exe AND netconn_domain:/@~(microsoft.com)/ |
Looking for a file in a folder but not its subfolders. Example: C:\Users\<user>\123.exe, but not C:\Users\<user>\subfolder\123.exe | filemod_name:/c:\/users\/[^\/]+\/[^\/]+\.exe/ |
Looking for an exact filename and not that name as a substring. Example: find x64.exe BUT NOT installer-x64.exe | (process_original_filename:x64.exe AND -process_original_filename:/@&~(x64.exe)/) |
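As a rough check of the "file in a folder but not its subfolders" pattern from the table, the sketch below runs the same expression through Python's `re` module. This is only an approximation: Python's engine is not the one Platform Search uses, and the helper name `in_user_folder_only` is made up for illustration. It assumes the regex must match the whole normalized value, which is imitated here with `re.fullmatch`.

```python
import re

# Pattern from the table, written as a Python raw string.
SAME_FOLDER_EXE = re.compile(r"c:\/users\/[^\/]+\/[^\/]+\.exe")


def in_user_folder_only(path: str) -> bool:
    """Return True for an .exe directly inside c:/users/<user>/ (illustrative)."""
    # Imitate the documented normalization before matching.
    normalized = path.lower().replace("\\", "/")
    # fullmatch imitates a regex that must cover the entire value.
    return SAME_FOLDER_EXE.fullmatch(normalized) is not None


print(in_user_folder_only(r"C:\Users\bob\123.exe"))            # True
print(in_user_folder_only(r"C:\Users\bob\subfolder\123.exe"))  # False
```

The `[^\/]+` segments cannot cross a slash, which is what keeps files in subfolders from matching.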
Supported regex Syntax for Platform Search
When using Platform Search, any regex supported by Java is supported, with the Lucene syntax; thus: field:/regex/.
- This documentation is compatible with Platform Search: https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html.
- This regex validator produces results compatible with Platform Search: https://regex101.com
- Use caution when starting anything with field:/.*something/. These do not perform well on any fields that have a lot of values (also known as high cardinality).
- Leading wildcard searches, such as field:/*something/, do not perform well.
- All regex queries require an explicit fieldname, such as field:/regex/. Regex queries without a fieldname fail. For example, /regex/ is not a valid query.
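The fieldname and leading-wildcard rules above could be checked before a query is submitted. The following is a minimal sketch; both helper names are hypothetical, and the wildcard check covers only the two prefixes mentioned in the list.

```python
def build_regex_query(field: str, pattern: str) -> str:
    """Format a regex search as field:/regex/ (hypothetical helper)."""
    if not field:
        # A bare /regex/ is not a valid query, so refuse to build one.
        raise ValueError("regex queries require an explicit fieldname")
    return f"{field}:/{pattern}/"


def has_leading_wildcard(pattern: str) -> bool:
    """Flag patterns that start with a wildcard, which tend to perform poorly."""
    return pattern.startswith((".*", "*"))


print(build_regex_query("process_name", "power.+"))  # process_name:/power.+/
print(has_leading_wildcard(".*something"))           # True
```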
Supported Predefined Character Classes for Platform Search
Predefined character classes, such as \d, \D, \w, \W, \s, and \S, are not supported. Use a wildcard or an explicit character class, such as [a-z0-9], instead.
Works | process_name:/power.+?\..{3}/ |
Works | process_name:/power.+?\.[a-z0-9]{3}/ |
Does not Work | process_name:/power.+?\.\w{3}/ |
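One way to catch an unsupported shorthand class before running a search is to scan the pattern for it. The sketch below is an illustration only; `find_shorthand_classes` is a hypothetical helper, and it assumes the six shorthands listed above are the ones to flag.

```python
SHORTHAND_LETTERS = set("dDwWsS")


def find_shorthand_classes(pattern: str) -> list:
    """Return the shorthand classes (such as \\w) found in a pattern."""
    found = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            if pattern[i + 1] in SHORTHAND_LETTERS:
                found.append(pattern[i:i + 2])
            i += 2  # skip the escaped character so \\. is not misread
        else:
            i += 1
    return found


print(find_shorthand_classes(r"power.+?\.\w{3}"))         # ['\\w']
print(find_shorthand_classes(r"power.+?\.[a-z0-9]{3}"))   # []
```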
Use regex to Exclude a Specific String during a Platform Search
Example case:
You want to find any winword.exe processes that have connected to any domain other than microsoft.com.
The following query does not give you that result, because it excludes all processes that have connected to microsoft.com at any point:
process_name:winword.exe AND netconn_domain:* AND NOT netconn_domain:microsoft.com
Instead, use a regex that searches for any domain except the one provided:
process_name:winword.exe AND netconn_domain:/[^.]+(\.[^.]+)+&@&~(.*microsoft.com)/
The @ (ANYSTRING), & (intersection), and ~ (complement) operators used here are documented at https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html#regexp-optional-operators.
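The difference between the two queries can be illustrated in Python. This is a simplified model, not the search engine's behavior: Python's `re` has no complement or intersection operators, so the sketch emulates them with two separate checks, treats each process as a plain list of domain strings, and uses hypothetical function names.

```python
import re

ANY_DOMAIN = re.compile(r"[^.]+(\.[^.]+)+")  # left side of the intersection
EXCLUDED = re.compile(r".*microsoft.com")    # expression inside ~( ... )


def domain_matches(domain: str) -> bool:
    """Emulate the regex: looks like a domain AND is not the excluded one."""
    return bool(ANY_DOMAIN.fullmatch(domain)) and not EXCLUDED.fullmatch(domain)


def regex_query_hits(domains: list) -> bool:
    """The process matches if at least one of its domains passes the regex."""
    return any(domain_matches(d) for d in domains)


def not_query_hits(domains: list) -> bool:
    """Simplified model of the NOT query: no connection to microsoft.com at all."""
    return bool(domains) and "microsoft.com" not in domains


winword_domains = ["microsoft.com", "example.org"]
print(regex_query_hits(winword_domains))  # True
print(not_query_hits(winword_domains))    # False
```

A process that connected to both microsoft.com and another domain is found by the regex form but dropped by the NOT form, which is the case the example is about.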
Case-sensitive regex Searches
In Platform Search, all tokenized fields, such as process_name, regmod_name, and process_cmdline, have their tokens converted to lowercase letters. Therefore, any regex searches that you perform on tokenized fields require you to use lowercase characters.
For example, if you are searching for a file with the string clip in the filename:
Works | filemod_name:/clip\-[a-f0-9]{40}/ |
Works | filemod_name:/(clip|CLIP)\-[a-f0-9]{40}/ |
Does not Work | filemod_name:/CLIP\-[a-f0-9]{40}/ |
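The three rows above can be reproduced with a small Python approximation. It assumes the stored token is simply the lowercased filename and that the regex must match the whole token; `token_matches` is a hypothetical helper, and Python's `re` stands in for the real engine.

```python
import re


def token_matches(pattern: str, filename: str) -> bool:
    """Match a pattern against the lowercased token (illustrative only)."""
    token = filename.lower()  # tokenized fields are stored in lowercase
    return re.fullmatch(pattern, token) is not None


# Hypothetical filename: "CLIP-" followed by 40 hexadecimal characters.
name = "CLIP-" + "0a" * 20

print(token_matches(r"clip\-[a-f0-9]{40}", name))         # True
print(token_matches(r"(clip|CLIP)\-[a-f0-9]{40}", name))  # True
print(token_matches(r"CLIP\-[a-f0-9]{40}", name))         # False
```

The uppercase-only pattern fails because the token it is compared against no longer contains uppercase characters.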