A regular expression is a pattern that describes a set of strings, that is, sequences of text characters. You use regular expressions to search for, or match, specific strings or classes of strings in a body of text.

Using a regular expression is like performing a wildcard search, but regular expressions are far more powerful. Regular expressions can be very simple or very complex. An example of a simple regular expression is cat.

This finds the first instance of the letter sequence cat in any body of text that you apply it to. If you want to make sure it only finds the word cat, and not other strings like cats or hepcat, you could use this slightly more complex regular expression: \bcat\b.
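The difference between the two expressions can be sketched in a few lines of Python. This is only an illustration: it uses Python's built-in re module rather than the engine described here, on the assumption that \b behaves the same way for this simple case, and the sample sentence is made up.

```python
import re

# Hypothetical sample text containing "cat" inside longer words and on its own.
text = "The hepcat fed two cats and one cat."

# Plain pattern: matches the letters c-a-t wherever they occur,
# so the first hit lands inside "hepcat".
first_plain = re.search(r"cat", text)

# Word-bounded pattern: \b requires a word boundary on each side,
# so only the standalone word "cat" qualifies.
first_word = re.search(r"\bcat\b", text)

# Count every occurrence of each pattern in the sample text.
plain_count = len(re.findall(r"cat", text))
word_count = len(re.findall(r"\bcat\b", text))

print(first_plain.start(), first_word.start())
print(plain_count, word_count)
```

The plain pattern matches three times in the sample (hepcat, cats, and cat), while the bounded pattern matches only the final standalone word.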

This expression uses the \b metacharacter, which ensures a match occurs only if there is a word boundary on each side of the cat sequence. As another example, to perform a near equivalent to the typical wildcard search string c+t, you could use this regular expression: \bc\w+t\b.

This means: find a word boundary (\b), followed by a c, followed by one or more word characters (\w+, meaning letters, digits, and the underscore, but no whitespace or other punctuation), followed by a t, followed by a word boundary (\b). This expression finds cot, cat, and croat, but not crate.
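As a quick check of that reading, the following sketch runs the expression against the example words. It again uses Python's re module as a stand-in, on the assumption that \b and \w+ behave the same way for plain ASCII words; the extra test word ct is added here only for illustration.

```python
import re

# The expression from the text: word boundary, c, one or more word
# characters, t, word boundary.
pattern = re.compile(r"\bc\w+t\b")

# Example words from the text, plus "ct" (an added illustration) to show
# that \w+ needs at least one character between the c and the t.
words = ["cot", "cat", "croat", "crate", "ct"]

# Keep only the words in which the pattern finds a match.
matched = [w for w in words if pattern.search(w)]

print(matched)
```

The word crate is rejected because it ends in e rather than t, and ct is rejected because \w+ requires at least one character between the two letters.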

Expressions can be very complex. The following expression finds any valid email address.


For more information on creating regular expressions, see http://userguide.icu-project.org/strings/regexp.