A regular expression, usually shortened to regex, is a compact pattern that describes a set of strings so you can search, match, validate, or replace text. Instead of writing loops to check characters one by one, you write a short pattern that says, in effect, what valid text looks like, and let the regex engine do the matching. Most programming languages and text editors in 2026 support a similar core syntax, so the skill transfers widely, even though the details vary between flavors. Regex is powerful and concise, and also famous for producing patterns nobody can read a week later, so part of learning it is knowing when not to use it.
What a regular expression actually is
A regex is a string in a special mini-language. Plain characters match themselves, and a handful of special characters, called metacharacters, describe patterns. The pattern cat matches the literal text cat. The pattern c.t matches cat, cot, cut, and any other three-character string starting with c and ending with t, because . means any single character (in most engines, any character except a newline by default).
You hand the pattern and some input text to a regex engine, and it tells you whether and where the pattern matches.
# Match a simple US-style ZIP code of five digits
import re
re.fullmatch(r"\d{5}", "90210")  # returns a match object; returns None when the text does not match
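To make the "whether and where" part concrete, here is a minimal sketch using Python's standard re module. The sample sentence is made up for illustration.

```python
import re

# The dot stands for any single character, so c.t needs exactly
# one character between the c and the t.
pattern = re.compile(r"c.t")

text = "The cat sat on a cot."

# search() answers "whether": it returns a match object or None.
first = pattern.search(text)

# span() answers "where": the start and end offsets of the match.
print(first.group(), first.span())

# findall() lists every non-overlapping match in the text.
print(pattern.findall(text))

# "ct" has nothing between c and t, so the dot has nothing to consume.
print(pattern.search("ct"))
```

Most engines expose the same two questions under different names, so the idea carries over even where the function names do not.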
The core building blocks
| Element | Meaning | Example |
| --- | --- | --- |
| . | Any single character | a.c matches abc, axc |
| \d \w \s | A digit, word char, or whitespace | \d\d matches 42 |
| [abc] | A character class, any one listed | [aeiou] matches a vowel |
| * + ? | Zero+, one+, or optional | ab+ matches ab, abbb |
| {n,m} | Between n and m repeats | \d{3,5} matches 3 to 5 digits |
| ^ $ | Start and end anchors | ^cat$ matches only cat |
| (...) | A group you can capture or repeat | (ab)+ matches abab |
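The examples in the table can be checked directly. The sketch below assumes Python's re module; the helper name matches is made up for this illustration, and it uses re.fullmatch so each pattern must account for the whole input.

```python
import re

def matches(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# (pattern, input, expected) triples taken from the table above,
# plus a few near-misses to show where each element stops matching.
examples = [
    (r"a.c", "axc", True),
    (r"a.c", "ac", False),       # the dot needs one character
    (r"\d\d", "42", True),
    (r"[aeiou]", "e", True),
    (r"[aeiou]", "x", False),
    (r"ab+", "abbb", True),
    (r"ab+", "a", False),        # + needs at least one b
    (r"ab*", "a", True),         # * allows zero b's
    (r"colou?r", "color", True), # ? makes the u optional
    (r"\d{3,5}", "1234", True),
    (r"\d{3,5}", "12", False),   # too few digits
    (r"(ab)+", "abab", True),
]

for pattern, text, expected in examples:
    print(pattern, repr(text), matches(pattern, text) == expected)
```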
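These few pieces combine to describe surprisingly precise patterns. A pattern like ^\d{4}-\d{2}-\d{2}$ matches text shaped like a year-month-day date, anchored so nothing else can sneak in. It checks only the shape, though: 2026-99-99 also matches, so checking that the month and day are real still belongs in code.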
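As a minimal sketch, the date pattern can be paired with a standard-library parse, since the pattern alone only confirms the digits-and-hyphens shape. The function names looks_like_date and is_real_date are hypothetical, chosen for this example.

```python
import re
from datetime import datetime

# Shape check only: four digits, hyphen, two digits, hyphen, two digits.
DATE_SHAPE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def looks_like_date(text):
    """True when the text has the year-month-day shape."""
    # fullmatch requires the whole string to match, including any
    # trailing newline, which a bare $ would otherwise tolerate.
    return DATE_SHAPE.fullmatch(text) is not None

def is_real_date(text):
    """True when the text has the right shape and names a real calendar date."""
    if not looks_like_date(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True

for candidate in ["2026-01-15", "2026-99-99", "2026-02-30", "x2026-01-15"]:
    print(candidate, looks_like_date(candidate), is_real_date(candidate))
```

Splitting the work this way keeps the pattern short: the regex rejects obviously malformed input, and ordinary code handles the calendar rules.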
When regex is the right tool
| Task | Regex fit |
| --- | --- |
| Validate a format (ZIP, simple code) | Strong |
| Extract fields from a known line format | Strong |
| Find and replace across a codebase | Strong |
| Parse nested formats (HTML, JSON, code) | Poor, use a parser |
Regex shines on flat, predictable text. It struggles with anything recursive or deeply nested, because those structures are not regular in the formal sense, and the patterns that try to cope with them quickly become unreadable and hard to maintain. For HTML, JSON, or a programming language, a purpose-built parser is the correct tool; a regex will eventually break on a case you did not foresee.
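Two of the strong fits above, field extraction and find-and-replace, can be sketched in a few lines of Python. The log line format (date, level, message separated by single spaces) and the function name parse_line are assumptions made up for this example, not a standard format.

```python
import re

# Hypothetical line format: "<date> <LEVEL> <message>".
LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.+)$"
)

def parse_line(line):
    """Return the named fields as a dict, or None when the line does not fit."""
    match = LINE.match(line)
    return match.groupdict() if match else None

print(parse_line("2026-01-15 ERROR disk full"))
print(parse_line("not a log line"))

# Find and replace: \b is a word boundary, so only the whole word
# "colour" is rewritten and "colourful" is left alone.
print(re.sub(r"\bcolour\b", "color", "colour and colourful"))
```

Named groups cost a few extra characters but make the calling code read fields by name instead of by position.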
How to use regex well
- Start small and build up. Match the simplest version first, then add constraints one at a time.
- Test against real and edge-case input. Use an online tester or your editor to confirm matches and misses.
- Comment complex patterns. Many engines support a verbose mode with whitespace and comments; use it.
- Anchor when you mean the whole string. Without ^ and $, a pattern can match a fragment you did not intend.
- Prefer a named library for common needs. Email and URL validation are easy to get subtly wrong; a vetted validator beats a hand-rolled monster pattern.
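The commenting and anchoring advice can be combined in one sketch. It uses Python's re.VERBOSE flag, which ignores whitespace in the pattern and treats # as the start of a comment. The optional four-digit extension is added here purely as an illustration of building a pattern up one constraint at a time.

```python
import re

# Verbose mode: one piece of the pattern per line, each with a comment.
ZIP_CODE = re.compile(
    r"""
    ^            # start of the string
    \d{5}        # five-digit ZIP code
    (?:-\d{4})?  # optional four-digit extension (illustrative addition)
    $            # end of the string
    """,
    re.VERBOSE,
)

for candidate in ["90210", "90210-1234", "902101", "90210-12"]:
    print(candidate, ZIP_CODE.search(candidate) is not None)

# Without anchors, the same five-digit idea matches a fragment
# of a longer number, which is rarely what a validator wants.
fragment = re.search(r"\d{5}", "order 1234567")
print(fragment.group())
```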
What to skip
- Parsing HTML or nested structures with regex. It cannot handle arbitrary nesting reliably. Use a DOM parser.
- One giant unreadable pattern. A 200-character regex is a maintenance trap. Break the problem up or use code.
- Trusting a copied email regex blindly. Many floating around are wrong or too strict. Validate by sending a confirmation instead.
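- Catastrophic backtracking. In backtracking engines, patterns with nested or overlapping quantifiers, such as (a+)+, can take exponential time on malicious input and appear to hang. Keep quantifiers tight and test performance on long strings.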
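The backtracking risk is easiest to see by timing it. The sketch below assumes Python's re module, which is a backtracking engine. The input is kept deliberately short so the slow case still finishes quickly, and the actual timings will vary by machine and Python version. The helper name time_match is made up for this example.

```python
import re
import time

def time_match(pattern, text):
    """Run re.match once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = re.match(pattern, text)
    return result, time.perf_counter() - start

# A run of a's followed by a character that forces the match to fail.
text = "a" * 18 + "b"

# Nested quantifiers: the engine tries many ways to split the a's
# between the inner and outer + before giving up.
risky_result, risky_seconds = time_match(r"(a+)+$", text)

# A single quantifier describes the same strings without the nesting.
safe_result, safe_seconds = time_match(r"a+$", text)

print("nested quantifier:", risky_result, f"{risky_seconds:.4f}s")
print("single quantifier:", safe_result, f"{safe_seconds:.4f}s")
```

Each extra a roughly doubles the work for the nested version, which is why a pattern that looks fine on short test strings can stall on a long hostile one.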
FAQ
Is regex the same in every language?
The core syntax is shared, so skills transfer, but flavors differ in advanced features and escaping. Check your language documentation for the specifics.
What does the r before a string mean in Python?
It marks a raw string, so backslashes are not treated as escapes. This keeps regex patterns readable, since regex uses backslashes heavily.
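A short sketch of the difference, using only Python built-ins and the re module:

```python
import re

# In a normal string, "\n" is one character (a newline).
# In a raw string, r"\n" is two characters: a backslash and an n.
print(len("\n"), len(r"\n"))

# Without the r prefix, each backslash must be doubled to survive.
print("\\d{5}" == r"\d{5}")

# "\b" in a normal string is a backspace character, so this pattern
# looks for backspace, c, a, t, backspace and finds nothing.
print(re.search("\bcat\b", "a cat sat"))

# In a raw string, \b reaches the regex engine intact and means
# "word boundary", so the word cat is found.
print(re.search(r"\bcat\b", "a cat sat"))
```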
When should I not use a regex?
When the text has nested or recursive structure, like HTML or source code. Those need a real parser; a regex will eventually fail on a valid but unexpected case.
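As one illustration of the parser route, the sketch below collects link targets from nested HTML with Python's standard html.parser module instead of a pattern. The class name LinkCollector, the function name extract_links, and the sample markup are made up for this example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag, however deeply nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # dealt with quoting style and attribute order.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links

sample = """<div><p>See <a class="x" href="/docs">docs</a> and <a href='/faq'>FAQ</a>.</p></div>"""
print(extract_links(sample))
```

The parser copes with mixed quote styles, extra attributes, and nesting without any change to the code, which is exactly where a hand-written pattern tends to break.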
How do I learn regex faster?
Use an interactive tester that highlights matches as you type. Building patterns piece by piece against sample text teaches the syntax much faster than reading reference tables.
Where to go next
See what a string in programming is in 2026, what a syntax error is in 2026, and what a shell is in 2026.