Python Regex Syntax
Description
Regex syntax is a set of special characters and rules used to define search patterns. These patterns help identify specific character sequences within text. Python uses the re module to work with regex.
Prerequisites
- Understanding of basic string operations
- Intro to regex (what it is and why it’s used)
- Familiarity with Python’s re module
Examples
The snippets below use re.findall to demonstrate each syntax element. The comment after each call shows the list it returns.
✅ Basic Syntax Elements

import re

# . -> Matches any single character except newline
print(re.findall(r"a.c", "abc aac acc a-c")) # ['abc', 'aac', 'acc', 'a-c']

# ^ -> Matches the beginning of a string
print(re.findall(r"^Hello", "Hello World")) # ['Hello']

# $ -> Matches the end of a string
print(re.findall(r"World$", "Hello World")) # ['World']

# * -> Matches 0 or more repetitions
print(re.findall(r"lo*l", "ll lol loool loooool")) # ['ll', 'lol', 'loool', 'loooool']

# + -> Matches 1 or more repetitions
print(re.findall(r"lo+l", "ll lol loool loooool")) # ['lol', 'loool', 'loooool']

# ? -> Matches 0 or 1 repetition (optional)
print(re.findall(r"lo?l", "ll lol loool loooool")) # ['ll', 'lol']

# {} -> Matches an exact number or a range of repetitions
print(re.findall(r"\d{3}", "My PIN is 123456")) # ['123', '456']
print(re.findall(r"\d{2,4}", "1234 12 12345")) # ['1234', '12', '1234']

✅ Character Classes

# \d -> Matches any digit (0-9, plus other Unicode digits in str patterns)
print(re.findall(r"\d", "abc123")) # ['1', '2', '3']

# \D -> Matches any non-digit
print(re.findall(r"\D", "abc123")) # ['a', 'b', 'c']

# \w -> Matches word characters (a-z, A-Z, 0-9, _, plus Unicode letters and digits in str patterns)
print(re.findall(r"\w", "A_1@!")) # ['A', '_', '1']

# \W -> Matches non-word characters
print(re.findall(r"\W", "A_1@!")) # ['@', '!']

# \s -> Matches any whitespace character
print(re.findall(r"\s", "Hello World")) # [' ']

# \S -> Matches any non-whitespace character
print(re.findall(r"\S", "Hello World")) # ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd']

✅ Grouping and Alternation

# () -> Grouping; with a capturing group, findall returns the group's text, not the whole match
print(re.findall(r"(ha)+", "hahaha")) # ['ha']
# (?:...) -> Non-capturing group; findall returns the whole match
print(re.findall(r"(?:ha)+", "hahaha")) # ['hahaha']

# | -> Alternation (OR)
print(re.findall(r"cat|dog", "I have a cat and a dog")) # ['cat', 'dog']

✅ Escaping Special Characters

# Use \ to escape special characters
print(re.findall(r"\$", "Price is $5")) # ['$']
print(re.findall(r"\.", "www.google.com")) # ['.', '.']

Real-World Applications
Input validation (email, phone, password)
Data extraction (HTML scraping, log parsing)
Text cleaning for NLP
Finding and replacing sensitive info (credit card numbers, SSNs)
Pattern detection in security systems
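Two of the applications above, input validation and replacing sensitive info, can be sketched as follows. The patterns and the helper names is_valid_email and mask_ssns are assumptions made for illustration. Real validation rules are usually stricter and depend on the application.

```python
import re

# Hypothetical patterns for illustration only; real-world rules vary.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def is_valid_email(text):
    """Return True if the whole string looks like a simple email address."""
    # fullmatch requires the pattern to cover the entire string.
    return EMAIL_RE.fullmatch(text) is not None


def mask_ssns(text):
    """Replace anything shaped like NNN-NN-NNNN with a placeholder."""
    return SSN_RE.sub("***-**-****", text)


print(is_valid_email("user@example.com"))  # True
print(is_valid_email("not an email"))      # False
print(mask_ssns("SSN: 123-45-6789"))       # SSN: ***-**-****
```

Using fullmatch rather than search is a deliberate choice for validation: search would accept any string that merely contains a valid-looking address.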
Where Regex Syntax Can Be Applied
Web form data validation
NLP pre-processing pipelines
Log file monitoring and filtering
Custom search features in applications
Database querying with patterns (SQL-like use cases)
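As a rough sketch of log file filtering, the example below keeps only ERROR and WARNING lines and extracts the level and message with capturing groups. The log format, the sample lines, and the name filter_problems are all assumed for illustration and do not come from any particular logging system.

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <message>".
LOG_LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (ERROR|WARNING) (.*)$"
)

sample_log = [
    "2024-01-15 10:00:01 INFO Service started",
    "2024-01-15 10:00:05 ERROR Disk quota exceeded",
    "2024-01-15 10:00:09 WARNING Retrying connection",
]


def filter_problems(lines):
    """Return (level, message) pairs for ERROR and WARNING lines only."""
    results = []
    for line in lines:
        match = LOG_LINE_RE.match(line)
        if match:
            # Group 3 is the level, group 4 is the rest of the line.
            results.append((match.group(3), match.group(4)))
    return results


problems = filter_problems(sample_log)
print(problems)
# [('ERROR', 'Disk quota exceeded'), ('WARNING', 'Retrying connection')]
```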
Interview Questions
What do \d, \w, and \s stand for in regex?
What is the difference between *, +, and ? in regex?
What is the use of ^ and $ anchors?
How do you match a literal dot . using regex?
How do you capture groups in regex?
What's the difference between \w and [A-Za-z0-9_]?
How do you use alternation (|) in regex?
How can you match a phone number using regex?
What does {2,4} mean in regex?
Why is escaping important in regex?
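A few of these questions are easiest to answer by running code. The sketch below covers capturing groups, the difference between \w and [A-Za-z0-9_], and phone number matching. The NNN-NNN-NNNN phone format and the name PHONE_RE are assumptions chosen for illustration, since real phone formats vary widely.

```python
import re

# Capturing groups: groups() returns the text matched by each (...) in order.
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released on 2024-01-15")
print(date.groups())  # ('2024', '01', '15')
print(date.group(1))  # 2024

# \w is Unicode-aware for str patterns, so it matches letters such as U+00E9.
# The explicit class [A-Za-z0-9_] (or \w with re.ASCII) does not.
word = "caf\u00e9"  # "cafe" with an accented final letter
print(re.findall(r"\w+", word) == [word])      # True
print(re.findall(r"[A-Za-z0-9_]+", word))      # ['caf']
print(re.findall(r"\w+", word, re.ASCII))      # ['caf']

# Hypothetical phone format NNN-NNN-NNNN, used only as an example.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
phones = PHONE_RE.findall("Call 555-123-4567 or 555-987-6543")
print(phones)  # ['555-123-4567', '555-987-6543']
```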