Regular Expressions Tutorial
What are Regular Expressions?
Regular Expressions (regex or regexp) are sequences of characters that form a search pattern. They are used for pattern matching within strings, allowing you to check for specific formats, extract portions of strings, or replace parts of them. Regular expressions are widely used in programming, data validation, and text processing.
Basic Syntax
Regular expressions consist of literals and special characters. Here are some basic elements:
- Literals: Characters that match themselves (e.g., 'a' matches 'a').
- Metacharacters: Special characters that have specific meanings, such as:
  - . - Matches any single character (by default, any character except a newline).
  - ^ - Matches the start of a string.
  - $ - Matches the end of a string.
  - * - Matches 0 or more occurrences of the preceding element.
  - + - Matches 1 or more occurrences of the preceding element.
  - ? - Matches 0 or 1 occurrence of the preceding element.
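The metacharacters listed above can be tried out with a short sketch in Python's built-in re module. The sample strings and variable names are made up for illustration.

```python
import re

# '.' stands for any one character (a newline is excluded by default).
dot_hits = re.findall(r"c.t", "cat cot cut ct")

# '^' and '$' anchor the pattern to the start and end of the string.
starts_with_hello = bool(re.search(r"^hello", "hello world"))
ends_with_hello = bool(re.search(r"hello$", "hello world"))

# '*', '+', and '?' control how often the preceding element may repeat.
star_hits = re.findall(r"ab*", "a ab abbb")              # 'b' zero or more times
plus_hits = re.findall(r"ab+", "a ab abbb")              # 'b' one or more times
optional_hits = re.findall(r"colou?r", "color colour")   # 'u' is optional

print(dot_hits, starts_with_hello, ends_with_hello)
print(star_hits, plus_hits, optional_hits)
```

Note that 'ct' is not matched by c.t, because the dot requires exactly one character between 'c' and 't'.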
Character Classes
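Character classes allow you to match any one of a set of characters. They are defined using square brackets; for example, [abc] matches a single 'a', 'b', or 'c'.
You can also specify ranges with a hyphen: [a-z] matches any lowercase ASCII letter, and [0-9] matches any digit.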
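As a minimal illustration of sets and ranges, the following Python sketch runs a few character classes against an invented sample sentence.

```python
import re

text = "Gray or grey? Room B12, level 3."

# [Gg] and [ae] each match exactly one character from the listed set.
spellings = re.findall(r"[Gg]r[ae]y", text)

# [0-9] is a range covering every digit; '+' repeats it one or more times.
numbers = re.findall(r"[0-9]+", text)

# Several ranges can share one pair of brackets.
words = re.findall(r"[A-Za-z]+", text)

print(spellings)
print(numbers)
print(words)
```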
Quantifiers
Quantifiers specify how many instances of a character or group must be present for a match. Here are some common quantifiers:
- * - 0 or more times
- + - 1 or more times
- ? - 0 or 1 time
- {n} - Exactly n times
- {n,} - At least n times
- {n,m} - Between n and m times
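The counted quantifiers can be compared side by side with a small Python sketch. One assumption goes beyond what this tutorial covers: the \b token (a word boundary) is added so that each run of digits is tested as a whole instead of matching part of a longer number.

```python
import re

text = "7 42 123 4567 89012"

# {3}: exactly three digits.
exactly_three = re.findall(r"\b[0-9]{3}\b", text)

# {3,}: three digits or more.
at_least_three = re.findall(r"\b[0-9]{3,}\b", text)

# {2,4}: between two and four digits, inclusive.
two_to_four = re.findall(r"\b[0-9]{2,4}\b", text)

print(exactly_three)
print(at_least_three)
print(two_to_four)
```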
Groups and Capturing
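Parentheses are used to create groups and capture parts of the matched text. For example, in ([a-z]+)=([0-9]+) the first group captures the letters before the equals sign and the second group captures the digits after it.
You can also use the pipe symbol | to specify alternatives: cat|dog matches either 'cat' or 'dog'.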
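A brief Python sketch of capturing groups and alternation follows. The input strings are illustrative only.

```python
import re

# Two groups capture the key and the value separately.
match = re.search(r"([a-z]+)=([0-9]+)", "timeout=30")
whole = match.group(0)   # the entire match
key = match.group(1)     # first parenthesised group
value = match.group(2)   # second parenthesised group

# '|' inside a group chooses between alternatives. With a single group,
# re.findall returns the text captured by that group.
pets = re.findall(r"(cat|dog)s?", "cats, dog, bird")

print(whole, key, value)
print(pets)
```

Group 0 always refers to the whole match, and numbered groups count opening parentheses from left to right.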
Common Use Cases
Regular expressions can be used for various applications, including:
- Validation: Checking if input matches a specific format (e.g., email, phone numbers).
- Search: Finding specific patterns in text.
- Replace: Modifying parts of a string based on a pattern.
- Splitting: Dividing a string into an array based on a pattern.
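Splitting, the last use case in the list above, could look like the following in Python. The delimiter set (commas, semicolons, and spaces) is an assumption chosen for the example.

```python
import re

raw = "apples, pears;plums   figs"

# Split wherever one or more commas, semicolons, or spaces occur in a row.
fields = re.split(r"[,; ]+", raw)

print(fields)
```

Because the class is followed by +, a run of mixed delimiters such as ", " counts as a single separator and produces no empty strings between fields.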
Examples
Here are some practical examples of using regular expressions:
Example 1: Validating an Email Address
pattern = r'^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$'
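This pattern checks that the input has the basic shape of an email address: a local part, an @ sign, and a domain containing at least one dot. It does not guarantee that the address exists, and it does not cover every address the email standards allow.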
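One possible way to apply the pattern above is to compile it once and wrap it in a helper. The function name looks_like_email and the test addresses are hypothetical.

```python
import re

# The pattern from Example 1, compiled once for reuse.
EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$')

def looks_like_email(text):
    """Return True if text has the basic shape of an email address."""
    return EMAIL_PATTERN.match(text) is not None

print(looks_like_email("user.name+tag@example.com"))
print(looks_like_email("no-at-sign.example.com"))
print(looks_like_email("user@localhost"))
```

The last address is rejected because the pattern requires a literal dot somewhere after the @ sign.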
Example 2: Extracting Phone Numbers
A pattern such as \d{3}-\d{3}-\d{4} matches phone numbers in the format '123-456-7890'.
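The tutorial's own pattern for this example is not shown, so the sketch below assumes one that fits the stated format: \d stands for a digit, and \b (a word boundary) keeps the match from starting or ending inside a longer run of digits. The sample text is invented.

```python
import re

# Assumed pattern for the 123-456-7890 format: 3 digits, 3 digits, 4 digits.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

text = "Call 123-456-7890 or 555-867-5309 before noon."

# findall returns every non-overlapping match as a list of strings.
phone_numbers = PHONE_PATTERN.findall(text)

print(phone_numbers)
```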
Example 3: Replacing Text
A substitution with the pattern foo and the replacement text bar replaces all occurrences of 'foo' with 'bar'. In Python, this is done with re.sub.
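A minimal sketch of the replacement using re.sub, with a made-up input string. The second call is an optional variation, not part of the original example: it adds \b word boundaries so that only the standalone word is replaced.

```python
import re

text = "foo fighters eat food at the foo bar"

# Replace every occurrence of 'foo', including inside longer words.
replaced = re.sub(r"foo", "bar", text)

# Variation: replace 'foo' only when it stands alone as a word.
whole_words_only = re.sub(r"\bfoo\b", "bar", text)

print(replaced)
print(whole_words_only)
```

The first call also rewrites 'food' to 'bard', which shows why a plain literal pattern can match more than intended.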
Conclusion
Regular expressions are powerful tools for text processing and manipulation. Knowing the core syntax (metacharacters, character classes, quantifiers, and groups) makes validation, search, replacement, and splitting tasks much easier to express. Practice writing patterns against real text to become proficient and confident in their usage.