Regular expressions are powerful text-matching patterns supported in nearly every mainstream programming language. This guide collects battle-tested patterns for common validation and extraction tasks and explains how each one works.
Key Takeaways
- Email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/ covers most valid addresses
- Use lookaheads (?=...) for password requirements without enforcing order
- Capture groups (pattern) let you extract and rearrange matched text
- Avoid nested quantifiers (a+)+ to prevent catastrophic backtracking
- Use non-greedy .*? when you want the shortest possible match
1. Email Validation Patterns
Email validation ranges from simple checks to attempts at full RFC 5322 compliance. Here are patterns for different use cases.
// Simple email pattern (covers most everyday addresses)
const emailBasic = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// More thorough pattern
const emailStrict = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Examples
emailBasic.test('user@example.com'); // true
emailBasic.test('user.name@sub.domain.com'); // true
emailBasic.test('invalid@'); // false
emailBasic.test('@nodomain.com'); // false

| Pattern Part | Meaning |
|---|---|
| ^[^\s@]+ | Start: one or more chars that aren't space or @ |
| @ | Literal @ symbol |
| [^\s@]+ | Domain: one or more chars that aren't space or @ |
| \. | Literal dot (escaped) |
| [^\s@]+$ | TLD: one or more chars to end |
No practical regex validates every address that RFC 5322 allows. For production, use a simple pattern to catch obvious errors, then verify the address by sending a confirmation email.
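As a minimal sketch of that first-pass check, the snippet below trims the input and applies the simple pattern. The helper name `plausibleEmail` is hypothetical, and a non-null result only means the address looks plausible, not that it can receive mail.

```javascript
// Same simple pattern as above: it catches obvious mistakes only.
const emailBasic = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper: returns the trimmed address if it looks plausible, otherwise null.
function plausibleEmail(input) {
  const value = String(input).trim();
  return emailBasic.test(value) ? value : null;
}

console.log(plausibleEmail('  user@example.com ')); // 'user@example.com'
console.log(plausibleEmail('invalid@'));            // null
console.log(plausibleEmail('a@b'));                 // null (no dot in the domain part)
```

The confirmation email remains the real verification step; the regex just avoids sending mail to input that is clearly malformed.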
2. URL Validation Patterns
URL patterns match web addresses with optional protocols, paths, and query strings.
// Basic URL with required protocol
const urlWithProtocol = /^https?:\/\/[^\s]+$/;
// URL with optional protocol
const urlOptionalProtocol = /^(https?:\/\/)?([\w.-]+)\.([a-z]{2,})(\/\S*)?$/i;
// Detailed URL pattern with groups
const urlDetailed = /^(https?:\/\/)?(www\.)?([\w-]+\.)+[\w-]+(\/[\w./?%&=-]*)?$/; // hyphen moved to the end of the class so it is unambiguously literal
// Extract domain from URL
const domainExtract = /^(?:https?:\/\/)?(?:www\.)?([^/]+)/;
const match = 'https://www.example.com/path'.match(domainExtract);
// match[1] = 'example.com'

| Pattern | Matches |
|---|---|
| https?:\/\/ | http:// or https:// |
| (www\.)? | Optional www. |
| [\w-]+ | Word chars and hyphens |
| (\/\S*)? | Optional path (non-whitespace) |
| [?#].* | Query string or hash |
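To illustrate how the groups in the optional-protocol pattern line up with the parts of a URL, here is a small sketch. The helper name `splitUrl` and the shape of the returned object are assumptions made for this example.

```javascript
// Same pattern as urlOptionalProtocol above.
const urlParts = /^(https?:\/\/)?([\w.-]+)\.([a-z]{2,})(\/\S*)?$/i;

// Hypothetical helper: returns the captured pieces, or null if the input does not match.
function splitUrl(input) {
  const m = urlParts.exec(input);
  if (m === null) return null;
  // Optional groups that did not participate are undefined, so default them to ''.
  const [, protocol = '', host, tld, path = ''] = m;
  return { protocol, host: `${host}.${tld}`, path };
}

console.log(splitUrl('https://www.example.com/path?q=1'));
// { protocol: 'https://', host: 'www.example.com', path: '/path?q=1' }
console.log(splitUrl('example.org'));
// { protocol: '', host: 'example.org', path: '' }
console.log(splitUrl('not a url')); // null
```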
3. Phone Number Patterns
Phone validation varies by country. Here are patterns for common formats.
// US phone (various formats)
const usPhone = /^(\+1)?[-. ]?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$/;
// International with country code
const intlPhone = /^\+[1-9]\d{1,14}$/; // E.164 format
// Indian mobile
const indiaPhone = /^(\+91)?[6-9]\d{9}$/;
// UK phone (+44 or a leading 0, followed by 9 or 10 digits)
const ukPhone = /^(\+44|0)\d{9,10}$/;
// Examples
usPhone.test('(555) 123-4567'); // true
usPhone.test('+1 555.123.4567'); // true
usPhone.test('555-123-4567'); // true
intlPhone.test('+14155551234'); // true

For international apps, accept the E.164 format (+[country][number]) and normalize input by stripping spaces, dashes, and parentheses before validation.
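The normalize-then-validate advice could look roughly like the sketch below. The helper name `normalizePhone` is hypothetical, and stripping dots as well as spaces, dashes, and parentheses is an assumption of this example.

```javascript
// Same E.164 pattern as intlPhone above.
const e164 = /^\+[1-9]\d{1,14}$/;

// Hypothetical helper: strips common separators, then validates against E.164.
// Returns the normalized number, or null if it is not valid E.164.
function normalizePhone(input) {
  const stripped = String(input).replace(/[\s\-().]/g, '');
  return e164.test(stripped) ? stripped : null;
}

console.log(normalizePhone('+1 (415) 555-1234')); // '+14155551234'
console.log(normalizePhone('+44 20 7946 0958'));  // '+442079460958'
console.log(normalizePhone('415-555-1234'));      // null (no country code)
```

Storing the normalized form keeps later comparisons and lookups simple, since every number ends up in the same shape.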
4. Password Strength Patterns
Password patterns use lookaheads to require specific character types without enforcing order.
// Minimum 8 chars, at least one letter and one number
const passBasic = /^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$/;
// Strong: 8+ chars, upper, lower, number, special
const passStrong = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
// Very strong: 12+ chars, all requirements
const passVeryStrong = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&#^()_+=\[\]{}|;:,.<>?])[A-Za-z\d@$!%*?&#^()_+=\[\]{}|;:,.<>?]{12,}$/;
// Check each requirement separately (better UX)
const hasLower = /[a-z]/;
const hasUpper = /[A-Z]/;
const hasDigit = /\d/;
const hasSpecial = /[@$!%*?&]/;
const hasLength = /.{8,}/;

| Lookahead | Requirement |
|---|---|
| (?=.*[a-z]) | At least one lowercase letter |
| (?=.*[A-Z]) | At least one uppercase letter |
| (?=.*\d) | At least one digit |
| (?=.*[@$!%*?&]) | At least one special character |
| {8,} | Minimum 8 characters total |
Lookaheads (?=...) check conditions without consuming characters. They can appear in any order because each one is evaluated at the same position: the start of the string, right after ^.
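Checking each requirement separately makes it possible to tell the user exactly what is missing. The following is a sketch under that assumption; the `requirements` list, its labels, and the `unmetRequirements` helper are illustrative names, not part of any library.

```javascript
// One small pattern per rule, mirroring the separate checks shown above.
const requirements = [
  { label: 'at least 8 characters', pattern: /.{8,}/ },
  { label: 'a lowercase letter', pattern: /[a-z]/ },
  { label: 'an uppercase letter', pattern: /[A-Z]/ },
  { label: 'a digit', pattern: /\d/ },
  { label: 'a special character', pattern: /[@$!%*?&]/ },
];

// Hypothetical helper: returns the labels of every rule the password fails.
function unmetRequirements(password) {
  return requirements
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.label);
}

console.log(unmetRequirements('abc'));
// ['at least 8 characters', 'an uppercase letter', 'a digit', 'a special character']
console.log(unmetRequirements('Str0ng!Pass')); // []
```

Unlike the single combined pattern, these checks do not restrict which characters are allowed, so a password containing other symbols can still pass them.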
5. Data Extraction Patterns
Use capture groups to extract specific data from text. These patterns find and isolate information.
// Extract all numbers from text
const numbers = text.match(/\d+/g);
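// Extract quoted strings
// Note: with the g flag, match() returns the full matches (quotes included), not the captured group
const quoted = text.match(/"([^"]*)"/g);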
// Extract hashtags
const hashtags = text.match(/#\w+/g);
// Extract HTML tags
const htmlTags = /<(\w+)[^>]*>/g;
// Extract date parts (MM/DD/YYYY)
const datePattern = /(\d{2})\/(\d{2})\/(\d{4})/;
const [, month, day, year] = '12/25/2024'.match(datePattern);
// Extract key=value pairs
const keyValue = /(\w+)=(\w+)/g;
const pairs = [...text.matchAll(keyValue)];
// pairs = [[full, key, value], ...]
// Extract IP addresses
const ipAddress = /\b(\d{1,3}\.){3}\d{1,3}\b/g;

Use matchAll() with the /g flag to get all matches with their capture groups. It returns an iterator you can spread into an array.
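A short sketch of the difference between match() and matchAll() when capture groups matter. The sample strings and variable names are made up for illustration.

```javascript
const text = 'host=localhost port=8080 mode=debug';
const keyValue = /(\w+)=(\w+)/g;

// Each matchAll() result is [fullMatch, group1, group2, ...].
const config = {};
for (const [, key, value] of text.matchAll(keyValue)) {
  config[key] = value;
}
console.log(config); // { host: 'localhost', port: '8080', mode: 'debug' }

const sentence = 'say "hi" and "bye"';
// match() with /g returns whole matches, so the quotes are included.
const withQuotes = sentence.match(/"([^"]*)"/g);
console.log(withQuotes); // ['"hi"', '"bye"']

// matchAll() keeps the groups, so index 1 holds the text between the quotes.
const inner = [...sentence.matchAll(/"([^"]*)"/g)].map((m) => m[1]);
console.log(inner); // ['hi', 'bye']
```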
6. Common Validation Patterns
Ready-to-use patterns for frequently validated data types.
// Credit card (basic - 13-19 digits)
const creditCard = /^\d{13,19}$/;
// Visa specifically
const visa = /^4\d{12}(\d{3})?$/;
// UUID v4
const uuidV4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;
// Hex color
const hexColor = /^#?([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$/;
// IPv4 address
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;
// Date YYYY-MM-DD
const isoDate = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;
// Time HH:MM (24-hour)
const time24 = /^([01]\d|2[0-3]):[0-5]\d$/;
// Username (alphanumeric, underscore, 3-16 chars)
const username = /^[a-zA-Z0-9_]{3,16}$/;
// Slug (URL-safe string)
const slug = /^[a-z0-9]+(-[a-z0-9]+)*$/;

| Type | Pattern | Example Match |
|---|---|---|
| UUID v4 | [0-9a-f]{8}-... | 550e8400-e29b-41d4-a716-446655440000 |
| Hex color | #?([0-9A-Fa-f]{3}\|[0-9A-Fa-f]{6}) | #FF5733, abc |
| ISO date | YYYY-MM-DD | 2024-12-25 |
| Time 24h | HH:MM | 14:30 |
| Slug | [a-z0-9]+(-[a-z0-9]+)* | my-blog-post |
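These patterns check the shape of the input, not its meaning. For example, the ISO date pattern accepts 2024-02-30. One possible way to add a calendar check is sketched below; it is the same pattern with a capture group added around the year, and the helper name `isRealIsoDate` is hypothetical.

```javascript
// Same shape check as isoDate above, with the year captured as well.
const isoDateParts = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: true only if the string is well-formed AND a real calendar date.
function isRealIsoDate(input) {
  const m = isoDateParts.exec(input);
  if (m === null) return false;
  const [year, month, day] = m.slice(1).map(Number);
  const date = new Date(0);
  date.setUTCFullYear(year, month - 1, day);
  // An impossible day (such as Feb 30) rolls over into the next month.
  return date.getUTCMonth() === month - 1 && date.getUTCDate() === day;
}

console.log(isRealIsoDate('2024-02-29')); // true  (leap year)
console.log(isRealIsoDate('2023-02-29')); // false (not a leap year)
console.log(isRealIsoDate('2024-02-30')); // false (passes the regex, fails the calendar)
console.log(isRealIsoDate('2024-13-01')); // false (fails the regex)
```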
7. Search and Replace Patterns
Use regex with replace() for powerful text transformations. Capture groups enable restructuring data.
// Remove extra whitespace
const cleaned = text.replace(/\s+/g, ' ').trim();
// Convert camelCase to kebab-case
const kebab = 'camelCase'.replace(/([a-z])([A-Z])/g, '$1-$2').toLowerCase();
// Result: 'camel-case'
// Mask credit card (show last 4)
const masked = '4111222233334444'.replace(/\d(?=\d{4})/g, '*');
// Result: '************4444'
// Format phone number
const formatted = '5551234567'.replace(/(\d{3})(\d{3})(\d{4})/, '($1) $2-$3');
// Result: '(555) 123-4567'
// Swap first and last name
const swapped = 'John Doe'.replace(/(\w+) (\w+)/, '$2, $1');
// Result: 'Doe, John'
// Remove HTML tags
const noHtml = html.replace(/<[^>]*>/g, '');
// Escape special regex characters
const escaped = text.replace(/[.*+?^${}()|\[\]\\]/g, '\\$&');

In replacement strings, $1, $2, etc. refer to capture groups. $& refers to the entire match. Use $` for the text before the match and $' for the text after it.
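The escaping pattern is most useful when a regex is built from text that should be matched literally. The sketch below wraps it in a helper and shows the special replacement tokens in action; `escapeRegExp` and `countLiteral` are illustrative names, not built-in functions.

```javascript
// Hypothetical helper around the escaping pattern shown above.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|\[\]\\]/g, '\\$&');
}

// Hypothetical helper: counts literal occurrences of needle in haystack.
function countLiteral(haystack, needle) {
  const re = new RegExp(escapeRegExp(needle), 'g');
  return (haystack.match(re) || []).length;
}

const sample = '1.5 + 1x5 + 1.5';
console.log(countLiteral(sample, '1.5'));               // 2
console.log(sample.match(new RegExp('1.5', 'g')).length); // 3 (unescaped dot also matches 'x')

// $` is the text before the match, $& the match itself, $' the text after it.
const demo = 'abc'.replace('b', "[$`|$&|$']");
console.log(demo); // 'a[a|b|c]c'
```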
Performance and Best Practices
Poorly written regex can cause performance issues. Follow these guidelines for efficient patterns.
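- Be specific: [0-9] states your intent more clearly than \d. The two are equivalent in JavaScript, but in some engines (Python 3 string patterns, .NET) \d also matches non-ASCII Unicode digits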
- Avoid nested quantifiers: (a+)+ can cause exponential backtracking
- Use non-capturing groups (?:...) when you need grouping but not the captured text
- Anchor patterns with ^ and $ when matching whole strings
- Prefer possessive quantifiers or atomic groups in languages that support them
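- Test with pathological inputs: aaaaaaaaaaaaaaaaaaaab against (a+)+$ (the trailing b makes the match fail, which forces the engine to try every way of splitting the run of a's)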
- Cache compiled regex: const pattern = /.../ outside loops
- Use includes() or indexOf() for simple substring checks; they are typically faster and clearer than a regex
// Bad: Nested quantifiers (catastrophic backtracking risk)
const bad = /(a+)+$/;
// Good: Simplified
const good = /a+$/;
// Bad: Regex for simple check
if (/error/.test(message)) { }
// Good: Use includes() for literal strings
if (message.includes('error')) { }
// Cache regex outside loops
const pattern = /\d+/g; // Compiled once
for (const item of items) {
  const matches = item.match(pattern);
}

ReDoS (Regular Expression Denial of Service) attacks exploit inefficient patterns. Always validate user-provided regex and, where your runtime supports it, set timeouts for regex operations in production.
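To see the effect on a small scale, the sketch below times both patterns against an input that cannot match. The helper name `timeTest` is hypothetical, the timings depend on the engine and machine, and the input is kept short on purpose: in a backtracking engine each extra a roughly doubles the work for the nested version.

```javascript
const bad = /(a+)+$/;  // nested quantifiers
const good = /a+$/;    // accepts the same strings without the nesting

// Hypothetical helper: runs test() once and reports the result and elapsed milliseconds.
function timeTest(pattern, input) {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

// 20 a's followed by a b: neither pattern can match, because the string does not end in a.
const input = 'a'.repeat(20) + 'b';

const goodResult = timeTest(good, input);
const badResult = timeTest(bad, input);
console.log('good:', goodResult.matched, goodResult.ms, 'ms');
console.log('bad: ', badResult.matched, badResult.ms, 'ms');
```

Both calls return false; only the time taken differs. Increase the repeat count with care, since the nested version can take seconds or longer with only a few more characters.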
Frequently Asked Questions
What is the difference between .* and .*? in regex?
.* is greedy: it matches as many characters as possible, then backtracks. .*? is lazy (non-greedy): it matches as few as possible. For '<div>text</div>', /<.*>/ matches the entire string, while /<.*?>/ matches just '<div>'.
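The difference is easy to check directly. This sketch runs both forms on the same string and also shows the negated character class used elsewhere in this guide as an alternative; the variable names are illustrative.

```javascript
const html = '<div>text</div>';

const greedy = html.match(/<.*>/)[0];    // grabs up to the LAST '>'
const lazy = html.match(/<.*?>/)[0];     // stops at the FIRST '>'
const negated = html.match(/<[^>]*>/)[0]; // same result as lazy, without backtracking past '>'
const allTags = html.match(/<.*?>/g);

console.log(greedy);  // '<div>text</div>'
console.log(lazy);    // '<div>'
console.log(negated); // '<div>'
console.log(allTags); // ['<div>', '</div>']
```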
How do I make a regex case-insensitive?
Add the i flag: /pattern/i in JavaScript, re.IGNORECASE in Python, or the inline (?i) modifier in engines that support it (such as PCRE, Python, and Java). This makes [a-z] match both uppercase and lowercase letters.
What does the g flag do in JavaScript regex?
The g (global) flag finds all matches instead of stopping at the first. With match(), it returns an array of all matches. With replace(), it replaces all occurrences. Without g, only the first match is found/replaced.
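A quick comparison of the same pattern with and without g, plus one related behavior worth knowing: a global regex keeps its position in lastIndex between test() calls. The sample text and variable names are made up for illustration.

```javascript
const text = 'a1 b22 c333';

const first = text.match(/\d+/);       // first match only, with index and groups
const all = text.match(/\d+/g);        // every match, as plain strings
const replacedOnce = text.replace(/\d+/, '#');
const replacedAll = text.replace(/\d+/g, '#');

console.log(first[0], first.index); // '1' 1
console.log(all);                   // ['1', '22', '333']
console.log(replacedOnce);          // 'a# b22 c333'
console.log(replacedAll);           // 'a# b# c#'

// A global regex is stateful: test() resumes from lastIndex.
const digit = /\d/g;
const firstCall = digit.test('1');  // true, lastIndex is now 1
const secondCall = digit.test('1'); // false, search started at index 1
console.log(firstCall, secondCall); // true false
```

For plain yes/no checks with test(), leaving off the g flag avoids that surprise.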
How do I match a literal dot or other special character?
Escape special characters with a backslash: \. for dot, \* for asterisk, \? for question mark, \[ for bracket, etc. Inside character classes [.], most special chars are literal except ^, -, ], and \.
What is a lookahead and when should I use it?
Lookahead (?=...) checks if a pattern follows without including it in the match. Negative lookahead (?!...) checks that a pattern does NOT follow. Use them for password validation, matching words not followed by certain text, or complex conditional matching.
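A small sketch of both forms on made-up price data. The word boundaries in the negative version are there so the engine cannot satisfy the lookahead by giving back a digit and matching only part of a number.

```javascript
const prices = '100 USD, 200 EUR, 300 USD';

// Positive lookahead: numbers followed by ' USD'; the currency is not part of the match.
const usd = prices.match(/\d+(?= USD)/g);

// Negative lookahead: whole numbers NOT followed by ' USD'.
const notUsd = prices.match(/\b\d+\b(?! USD)/g);

// Without the boundaries, backtracking yields partial numbers.
const partial = prices.match(/\d+(?! USD)/g);

console.log(usd);     // ['100', '300']
console.log(notUsd);  // ['200']
console.log(partial); // ['10', '200', '30']
```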