Python Regex Syntax
Description
Regex syntax is a set of special characters and rules used to define search patterns. These patterns help identify specific character sequences within text. Python uses the re module to work with regex.
Prerequisites
- Understanding of basic string operations
- Intro to regex (what it is and why it’s used)
- Familiarity with Python’s re module
Examples
Here's a simple program in Python:
✅ Basic Syntax Elements
import re
# . -> Matches any single character except newline
print(re.findall(r"a.c", "abc aac acc a-c")) # ['abc', 'aac', 'acc', 'a-c']
# ^ -> Matches the beginning of a string
print(re.findall(r"^Hello", "Hello World")) # ['Hello']
# $ -> Matches the end of a string (or just before a trailing newline)
print(re.findall(r"World$", "Hello World")) # ['World']
# * -> Matches 0 or more repetitions
print(re.findall(r"lo*l", "ll lol loool loooool")) # ['ll', 'lol', 'loool', 'loooool']
# + -> Matches 1 or more repetitions
print(re.findall(r"lo+l", "ll lol loool loooool")) # ['lol', 'loool', 'loooool']
# ? -> Matches 0 or 1 repetition (optional)
print(re.findall(r"lo?l", "ll lol loool loooool")) # ['ll', 'lol']
# {n} -> Matches exactly n repetitions; {m,n} -> Matches between m and n repetitions
print(re.findall(r"\d{3}", "My PIN is 123456")) # ['123', '456']
print(re.findall(r"\d{2,4}", "1234 12 12345")) # ['1234', '12', '1234']
✅ Character Classes
# \d -> Matches any decimal digit (0-9, plus other Unicode digits unless re.ASCII is set)
print(re.findall(r"\d", "abc123")) # ['1', '2', '3']
# \D -> Matches any non-digit
print(re.findall(r"\D", "abc123")) # ['a', 'b', 'c']
# \w -> Matches word characters (a-z, A-Z, 0-9, _, plus Unicode letters and digits unless re.ASCII is set)
print(re.findall(r"\w", "A_1@!")) # ['A', '_', '1']
# \W -> Matches non-word characters
print(re.findall(r"\W", "A_1@!")) # ['@', '!']
# \s -> Matches any whitespace character
print(re.findall(r"\s", "Hello World")) # [' ']
# \S -> Matches any non-whitespace
print(re.findall(r"\S", "Hello World")) # ['H', 'e', ...]
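In Python 3, the shorthand classes \d, \w, and \s match Unicode characters by default when the pattern is a str; passing the re.ASCII flag restricts them to their ASCII ranges. The sketch below illustrates the difference. The sample strings are made up for illustration and are written with escape sequences so the exact characters are unambiguous.

```python
import re

# "naïve café 42", written with escapes to avoid encoding ambiguity
text = "na\u00efve caf\u00e9 42"

# Default (Unicode) matching: accented letters count as word characters.
unicode_words = re.findall(r"\w+", text)
print(unicode_words)  # ['naïve', 'café', '42']

# re.ASCII restricts \w to [a-zA-Z0-9_], so accented letters split the words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
print(ascii_words)  # ['na', 've', 'caf', '42']

# The same applies to \d: U+0663 is the Arabic-Indic digit three.
digits = "7 \u0663"
unicode_digits = re.findall(r"\d", digits)
ascii_digits = re.findall(r"\d", digits, flags=re.ASCII)
print(len(unicode_digits), ascii_digits)  # 2 ['7']
```

If a pattern must accept only ASCII input, for example when validating identifiers, either pass re.ASCII or spell out the class explicitly as [A-Za-z0-9_].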
✅ Grouping and Alternation
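# () -> Grouping; with one capturing group, findall returns the group's text, not the whole match
print(re.findall(r"(ha)+", "hahaha")) # ['ha']
# (?:...) -> Non-capturing group; findall returns the whole match
print(re.findall(r"(?:ha)+", "hahaha")) # ['hahaha']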
# | -> Alternation (OR)
print(re.findall(r"cat|dog", "I have a cat and a dog")) # ['cat', 'dog']
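Groups change what re.findall returns, which is a common source of confusion. The following sketch compares a capturing group, a non-capturing group, and a match object on the same input, then shows how several groups produce tuples. The "width=80 height=24" string is a made-up sample.

```python
import re

text = "hahaha"

# One capturing group: findall returns the group's text, and a repeated
# group keeps only its last repetition.
captured = re.findall(r"(ha)+", text)
print(captured)  # ['ha']

# Non-capturing group: findall returns the whole match instead.
whole = re.findall(r"(?:ha)+", text)
print(whole)  # ['hahaha']

# A match object exposes both: group(0) is the whole match, group(1) the group.
match = re.search(r"(ha)+", text)
print(match.group(0), match.group(1))  # hahaha ha

# Several capturing groups: findall returns one tuple per match.
pairs = re.findall(r"(\w+)=(\d+)", "width=80 height=24")
print(pairs)  # [('width', '80'), ('height', '24')]
```

Use (?:...) when parentheses are needed only for repetition or alternation and the grouped text itself is not of interest.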
✅ Escaping Special Characters
# Use \ to escape special characters
print(re.findall(r"\$", "Price is $5")) # ['$']
print(re.findall(r"\.", "www.google.com")) # ['.', '.']
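When the text to match comes from a variable rather than being typed into the pattern, escaping each special character by hand is error-prone. The re.escape function does it automatically. A minimal sketch, using a made-up price string:

```python
import re

literal = "3.50 (USD)"
text = "Total: 3.50 (USD), not 3x50 (USD)"

# Unescaped, "." means "any character" and "(USD)" becomes a group,
# so the pattern looks for "USD" without parentheses and finds nothing.
unescaped = re.search(literal, text)
print(unescaped)  # None

# re.escape backslash-escapes the special characters so the string
# is matched literally.
escaped_matches = re.findall(re.escape(literal), text)
print(escaped_matches)  # ['3.50 (USD)']
```

Note that "3x50 (USD)" is not matched by the escaped pattern, because the escaped dot only matches a literal dot.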
Real-World Applications
Input validation (email, phone, password)
Data extraction (HTML scraping, log parsing)
Text cleaning for NLP
Finding and replacing sensitive info (credit card, SSNs)
Pattern detection in security systems
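Two of the applications above, input validation and masking sensitive information, can be sketched with re.fullmatch and re.sub. The patterns, the function names, and the sample data below are simplified assumptions for illustration (a US-style NNN-NNN-NNNN phone number and an NNN-NN-NNNN SSN); real-world formats vary and usually need stricter rules.

```python
import re

# Assumed formats for illustration only.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def is_valid_phone(value):
    """Return True only if the entire string has the assumed phone format."""
    return PHONE_PATTERN.fullmatch(value) is not None


def redact_ssns(text):
    """Replace anything shaped like an SSN with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED]", text)


print(is_valid_phone("555-123-4567"))   # True
print(is_valid_phone("555-123-45678"))  # False
print(redact_ssns("SSN 123-45-6789 on file"))  # SSN [REDACTED] on file
```

Using fullmatch for validation avoids accepting strings that merely contain a valid-looking fragment, which is what search or findall would do.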
Where Regex Can Be Applied
Web form data validation
NLP pre-processing pipelines
Log file monitoring and filtering
Custom search features in applications
Database querying with patterns (SQL-like use cases)
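Log filtering is a typical case where anchors, character classes, quantifiers, and groups work together. The sketch below assumes a hypothetical log format of "YYYY-MM-DD LEVEL message"; the pattern, the function name, and the sample lines are illustrative and would need adjusting for a real log file.

```python
import re

# Named groups make the extracted fields self-describing.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.+)$"
)


def error_messages(lines):
    """Collect the message text of every line whose level is ERROR."""
    messages = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == "ERROR":
            messages.append(match.group("message"))
    return messages


sample_log = [
    "2024-01-15 INFO Service started",
    "2024-01-15 ERROR Disk quota exceeded",
    "not a log line",
    "2024-01-16 ERROR Connection timed out",
]
print(error_messages(sample_log))
# ['Disk quota exceeded', 'Connection timed out']
```

Lines that do not fit the assumed format are skipped rather than raising an error, which is usually the safer default when monitoring logs with mixed content.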
Interview Questions
What do \d, \w, and \s stand for in regex?
What is the difference between *, +, and ? in regex?
What is the use of ^ and $ anchors?
How do you match a literal dot . using regex?
How do you capture groups in regex?
What's the difference between \w and [A-Za-z0-9_]?
How do you use alternation (|) in regex?
How can you match a phone number using regex?
What does {2,4} mean in regex?
Why is escaping important in regex?