Python Regex Syntax

Description

Regex (regular expression) syntax is a set of special characters and rules used to define search patterns. These patterns identify specific character sequences within text. Python supports regex through its built-in re module.

Prerequisites

  • Understanding of basic string operations
  • Intro to regex (what it is and why it’s used)
  • Familiarity with Python’s re module

Examples

The snippets below use re.findall, which returns every non-overlapping match as a list, to demonstrate each syntax element:

✅ Basic Syntax Elements
import re

# .  -> Matches any single character except newline
print(re.findall(r"a.c", "abc aac acc a-c"))  # ['abc', 'aac', 'acc', 'a-c']

# ^  -> Matches the beginning of a string
print(re.findall(r"^Hello", "Hello World"))  # ['Hello']

# $  -> Matches the end of a string
print(re.findall(r"World$", "Hello World"))  # ['World']

# *  -> Matches 0 or more repetitions
print(re.findall(r"lo*l", "ll lol loool loooool"))  # ['ll', 'lol', 'loool', 'loooool']

# +  -> Matches 1 or more repetitions
print(re.findall(r"lo+l", "ll lol loool loooool"))  # ['lol', 'loool', 'loooool']

# ?  -> Matches 0 or 1 repetition (optional)
print(re.findall(r"lo?l", "ll lol loool loooool"))  # ['ll', 'lol']

# {n} / {m,n} -> Matches exactly n repetitions, or between m and n repetitions
print(re.findall(r"\d{3}", "My PIN is 123456"))  # ['123', '456']
print(re.findall(r"\d{2,4}", "1234 12 12345"))   # ['1234', '12', '1234']
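The anchors ^ and $ behave differently on multi-line text. The following is a minimal sketch, using a made-up two-line string, of how the standard re.MULTILINE flag changes what they match:

```python
import re

# Sample text assumed for illustration: two lines separated by a newline.
text = "Hello World\nHello Python"

# Without flags, ^ anchors only at the very start of the string.
start_default = re.findall(r"^Hello", text)
print(start_default)    # ['Hello']

# With re.MULTILINE, ^ also matches immediately after each newline.
start_multiline = re.findall(r"^Hello", text, re.MULTILINE)
print(start_multiline)  # ['Hello', 'Hello']

# Without flags, $ matches only at the end of the string.
end_default = re.findall(r"\w+$", text)
print(end_default)      # ['Python']

# With re.MULTILINE, $ also matches just before each newline.
end_multiline = re.findall(r"\w+$", text, re.MULTILINE)
print(end_multiline)    # ['World', 'Python']
```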
✅ Character Classes
# \d  -> Matches any digit (0-9; for str patterns, also other Unicode decimal digits)
print(re.findall(r"\d", "abc123"))  # ['1', '2', '3']

# \D  -> Matches any non-digit
print(re.findall(r"\D", "abc123"))  # ['a', 'b', 'c']

# \w  -> Matches word characters (a-z, A-Z, 0-9, _; for str patterns, also other Unicode letters and digits)
print(re.findall(r"\w", "A_1@!"))  # ['A', '_', '1']

# \W  -> Matches non-word characters
print(re.findall(r"\W", "A_1@!"))  # ['@', '!']

# \s  -> Matches any whitespace character
print(re.findall(r"\s", "Hello World"))  # [' ']

# \S  -> Matches any non-whitespace
print(re.findall(r"\S", "Hello World"))  # ['H', 'e', ...]
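For str patterns, \w is Unicode-aware by default, so it is not identical to [A-Za-z0-9_]. The sketch below, using a made-up sample string, compares the default behaviour with the standard re.ASCII flag:

```python
import re

# Sample string assumed for illustration; \u00e9 is "é" and \u00ef is "ï".
sample = "caf\u00e9_1 na\u00efve"

# Default matching for str patterns: accented letters count as word characters.
unicode_words = re.findall(r"\w+", sample)
print(unicode_words)   # ['café_1', 'naïve']

# re.ASCII restricts \w to [a-zA-Z0-9_], so the accented letters split the words.
ascii_words = re.findall(r"\w+", sample, re.ASCII)
print(ascii_words)     # ['caf', '_1', 'na', 've']

# The explicit character class gives the same result as \w with re.ASCII.
explicit_words = re.findall(r"[A-Za-z0-9_]+", sample)
print(explicit_words)  # ['caf', '_1', 'na', 've']
```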
✅ Grouping and Alternation
# ()  -> Grouping (with one capturing group, findall returns the group's text, not the whole match)
print(re.findall(r"(ha)+", "hahaha"))  # ['ha']

# |  -> Alternation (OR)
print(re.findall(r"cat|dog", "I have a cat and a dog"))  # ['cat', 'dog']
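When a pattern contains a capturing group, re.findall returns what the group captured rather than the full match. The sketch below contrasts a capturing group, a non-capturing group (?:...), and re.search, which exposes both the full match and the group:

```python
import re

text = "hahaha"

# One capturing group: findall returns the group's last captured repetition.
captured = re.findall(r"(ha)+", text)
print(captured)  # ['ha']

# Non-capturing group (?:...): findall returns the whole match instead.
whole = re.findall(r"(?:ha)+", text)
print(whole)     # ['hahaha']

# A match object gives access to both the full match and the group.
match = re.search(r"(ha)+", text)
print(match.group(0), match.group(1))  # hahaha ha
```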
✅ Escaping Special Characters
# Use \ to escape special characters
print(re.findall(r"\$", "Price is $5"))  # ['$']
print(re.findall(r"\.", "www.google.com"))  # ['.', '.']
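When the text to search for comes from a variable, escaping each special character by hand is error-prone. A minimal sketch using the standard re.escape function follows; the literal and the sample sentence are made up for illustration:

```python
import re

# Literal text that happens to contain regex metacharacters ($ . ( )).
literal = "$5.00 (approx.)"
text = "Price is $5.00 (approx.) today, not $5x00"

# re.escape backslash-escapes the metacharacters so the text matches literally.
pattern = re.escape(literal)
matches = re.findall(pattern, text)
print(matches)  # ['$5.00 (approx.)']

# For comparison: an unescaped dot matches any character, including "x".
loose = re.findall(r"5.00", "5x00")
print(loose)    # ['5x00']
```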

      

Real-World Applications

  • Input validation (email, phone, password)
  • Data extraction (HTML scraping, log parsing)
  • Text cleaning for NLP
  • Finding and replacing sensitive info (credit card numbers, SSNs)
  • Pattern detection in security systems
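Two of the applications above, input validation and masking sensitive data, can be sketched as follows. The patterns and the helper names is_valid_email and redact_ssns are hypothetical and deliberately simplified; real email validation in particular needs a much more careful rule set.

```python
import re

# Hypothetical, simplified patterns for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def is_valid_email(value):
    """Return True if the whole string looks like an email address."""
    # fullmatch requires the pattern to cover the entire string.
    return EMAIL_PATTERN.fullmatch(value) is not None


def redact_ssns(text):
    """Replace anything shaped like an SSN (ddd-dd-dddd) with a mask."""
    return SSN_PATTERN.sub("***-**-****", text)


print(is_valid_email("user@example.com"))  # True
print(is_valid_email("user@example"))      # False
print(redact_ssns("SSN: 123-45-6789"))     # SSN: ***-**-****
```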

Where Regex Syntax Can Be Applied

  • Web form data validation
  • NLP pre-processing pipelines
  • Log file monitoring and filtering
  • Custom search features in applications
  • Database querying with patterns (SQL-like use cases)
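As one illustration of log file filtering, the sketch below keeps only WARNING and ERROR entries. The log format, the sample lines, and the LOG_LINE name are assumptions made up for this example, not a standard.

```python
import re

# Hypothetical log format assumed here: "<date> <time> <LEVEL> <message>".
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (ERROR|WARNING) (.+)$"
)

# Made-up sample lines for illustration.
log_lines = [
    "2024-01-15 10:02:11 INFO Service started",
    "2024-01-15 10:02:45 WARNING Disk usage at 91%",
    "2024-01-15 10:03:02 ERROR Connection refused",
]

alerts = []
for line in log_lines:
    match = LOG_LINE.match(line)
    if match:  # INFO lines do not match the ERROR|WARNING alternation
        date, time, level, message = match.groups()
        alerts.append((level, message))

print(alerts)
# [('WARNING', 'Disk usage at 91%'), ('ERROR', 'Connection refused')]
```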

Interview Questions

  • What do \d, \w, and \s stand for in regex?
  • What is the difference between *, +, and ? in regex?
  • What is the use of the ^ and $ anchors?
  • How do you match a literal dot (.) using regex?
  • How do you capture groups in regex?
  • What's the difference between \w and [A-Za-z0-9_]?
  • How do you use alternation (|) in regex?
  • How can you match a phone number using regex?
  • What does {2,4} mean in regex?
  • Why is escaping important in regex?
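As a starting point for the phone number question, here is one possible sketch. It assumes a 10-digit format with an optional parenthesized area code and optional space, dot, or hyphen separators; the PHONE name and sample text are made up, and real phone formats vary widely by country.

```python
import re

# Assumed format: (ddd) or ddd, then ddd, then dddd, with optional separators.
PHONE = re.compile(r"\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})")

# Made-up sample text for illustration.
text = "Call 555-123-4567 or (555) 987 6543."

# With several capturing groups, findall returns one tuple per match.
numbers = PHONE.findall(text)
print(numbers)  # [('555', '123', '4567'), ('555', '987', '6543')]
```

The same example also touches on capturing groups, escaping (\( and \)), and the {n} quantifier.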