roadtocode4u/regex-guide-beginners

Regular Expressions (Regex) for Complete Beginners

What is Regex?

A regular expression (or "regex" for short) is a compact way to describe patterns in text. Think of it like the "find" feature in your word processor, but much more powerful!

Instead of searching for exact words, regex lets you search for patterns like "any 5-digit number" or "any email address."

Why Learn Regex?

Regex helps you:

  • Check if information (like emails or phone numbers) is typed correctly
  • Find specific parts in large text
  • Replace or change many pieces of text at once
  • Save time by automating text tasks

A Simple Example

Let's say you want to make rules for usernames in your app. You want usernames to:

  • Only use letters, numbers, underscores, and hyphens
  • Be between 3-16 characters long

Here's a regex pattern for that:

^[a-zA-Z0-9_-]{3,16}$

This might look confusing, but let's break it down:

  • ^ means "start of the text"
  • [a-zA-Z0-9_-] means "any letter, number, underscore, or hyphen"
  • {3,16} means "between 3 and 16 of these characters"
  • $ means "end of the text"

So username123, john_doe, and cool-name would all be accepted!
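As a minimal sketch, this is one way the username rule could be checked in Python with the standard `re` module. The helper name `is_valid_username` and the rejected sample values are illustrative assumptions, not part of the original example.

```python
import re

# The username pattern from the example above.
USERNAME_RE = re.compile(r"^[a-zA-Z0-9_-]{3,16}$")


def is_valid_username(name):
    """Return True if name is 3-16 letters, digits, underscores, or hyphens."""
    # fullmatch requires the whole string to match the pattern.
    return USERNAME_RE.fullmatch(name) is not None


for candidate in ["username123", "john_doe", "cool-name", "ab", "has space"]:
    print(candidate, is_valid_username(candidate))
```

The first three names are accepted; "ab" is too short and "has space" contains a character outside the allowed list.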

Regex Building Blocks

1. Simple Characters

Letters, numbers, and most symbols just match themselves:

  • cat matches the word "cat"
  • 42 matches the characters "42"

2. Special Characters

Some characters have special powers in regex. If you want to use these as normal characters, put a backslash (\) before them:

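  • . (dot) - Matches any single character (except a line break, by default)
  • *, +, ? - Repeat symbols
  • ^, $ - Position markers
  • \ - The escape character
  • [ ] builds a character list, ( ) groups part of a pattern, and { } sets a repeat count
  • | - Means "or"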
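To see why escaping matters, here is a small Python sketch. The sample strings are made up for illustration; `re.escape` is a standard-library helper that adds the backslashes for you.

```python
import re

# Unescaped, the dot matches any character, so "3x14" is accepted too.
unescaped_hit = re.fullmatch(r"3.14", "3x14") is not None

# Escaped, the dot only matches a literal dot.
escaped_miss = re.fullmatch(r"3\.14", "3x14") is not None
escaped_hit = re.fullmatch(r"3\.14", "3.14") is not None

# re.escape builds a pattern that matches the given text literally.
literal_pattern = re.escape("1+1=2?")
literal_hit = re.fullmatch(literal_pattern, "1+1=2?") is not None

print(unescaped_hit, escaped_miss, escaped_hit, literal_hit)
```

This prints `True False True True`: only the escaped pattern insists on a real dot.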

3. Character Groups

Square brackets [ ] let you match any ONE character from a list (often called a character class):

  • [aeiou] matches any vowel
  • [0-9] matches any digit
  • [a-zA-Z] matches any letter (upper or lowercase)

4. Shortcuts

Common patterns have shortcuts:

  • \d matches any digit (same as [0-9] for ordinary digits; some engines also accept digits from other scripts)
  • \w matches any "word character" (letters, numbers, underscore)
  • \s matches any space, tab, or line break
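The shortcuts can be tried out with `re.findall`, which returns every match in a string. This is only a sketch, and the sample sentence is invented for the demonstration. Note that in Python, \d and \w also match non-ASCII digits and letters unless the `re.ASCII` flag is passed.

```python
import re

text = "Order 66 shipped\tto unit_7 on day 3"

digits = re.findall(r"\d+", text)  # runs of digits
words = re.findall(r"\w+", text)   # runs of word characters
gaps = re.findall(r"\s", text)     # individual whitespace characters

print(digits)     # ['66', '7', '3']
print(words)      # ['Order', '66', 'shipped', 'to', 'unit_7', 'on', 'day', '3']
print(len(gaps))  # 7 (six spaces and one tab)
```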

5. Repeat Symbols

  • * means "zero or more" (can appear any number of times or not at all)
  • + means "one or more" (must appear at least once)
  • ? means "zero or one" (optional, appears once or not at all)
  • {n} means exactly n times
  • {n,m} means between n and m times
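A quick way to get a feel for the repeat symbols is to run several patterns against the same small set of strings. The helper `matching` and the sample list below are assumptions made for this sketch.

```python
import re

samples = ["ac", "abc", "abbc", "abbbc"]


def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]


print(matching(r"ab*c"))      # ['ac', 'abc', 'abbc', 'abbbc']
print(matching(r"ab+c"))      # ['abc', 'abbc', 'abbbc']
print(matching(r"ab?c"))      # ['ac', 'abc']
print(matching(r"ab{2}c"))    # ['abbc']
print(matching(r"ab{2,3}c"))  # ['abbc', 'abbbc']
```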

10+ Common Regex Patterns with Examples

1. Phone Number Validation

Pattern: ^\d{10}$
Example: 1234567890

Explanation:

  • ^ ensures the pattern starts at the beginning of the string
  • \d{10} matches exactly 10 digits (0-9)
  • $ ensures the pattern ends at the end of the string
  • This pattern works for simple 10-digit phone numbers without any separators

For more complex phone formats with separators:

Pattern: ^(\+\d{1,3}[ -])?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}$
Example: +1-123-456-7890 or (123) 456-7890

Explanation:

  • (\+\d{1,3}[ -])? optionally matches a country code with + followed by 1-3 digits and a space or hyphen
  • \(?\d{3}\)? matches 3 digits for the area code, optionally surrounded by parentheses (each parenthesis is optional on its own, so an unbalanced (123 is also accepted)
  • [ -]? optionally matches a space or hyphen as separators
  • \d{3}[ -]?\d{4} matches the remaining 7 digits with an optional separator in the middle
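Both phone patterns can be exercised side by side in Python. This is a rough sketch: `digits_only` is a hypothetical helper added here to show one common follow-up step (stripping separators before storing a number), and the rejected sample is made up.

```python
import re

SIMPLE_PHONE_RE = re.compile(r"^\d{10}$")
FLEXIBLE_PHONE_RE = re.compile(
    r"^(\+\d{1,3}[ -])?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}$"
)


def is_flexible_phone(text):
    """Return True if text fits the pattern that allows separators."""
    return FLEXIBLE_PHONE_RE.fullmatch(text) is not None


def digits_only(text):
    """Drop every non-digit character (hypothetical normalisation step)."""
    return re.sub(r"\D", "", text)


print(is_flexible_phone("+1-123-456-7890"))  # True
print(is_flexible_phone("(123) 456-7890"))   # True
print(is_flexible_phone("123-45-6789"))      # False
print(digits_only("(123) 456-7890"))         # 1234567890
```

After normalising with `digits_only`, the simpler 10-digit pattern can be applied to the result.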

2. Pincode/ZIP Code Validation

For 6-digit Indian PIN codes:

Pattern: ^[1-9][0-9]{5}$
Example: 400001

Explanation:

  • ^[1-9] ensures the PIN code starts with a digit from 1-9 (not 0)
  • [0-9]{5}$ ensures the remaining 5 characters are digits from 0-9
  • This pattern follows the Indian PIN code format which is always a 6-digit number not starting with 0
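A short Python sketch of the PIN code check follows. The function name `is_valid_pin` and the rejected inputs are illustrative choices, not part of the guide.

```python
import re

PIN_RE = re.compile(r"^[1-9][0-9]{5}$")


def is_valid_pin(code):
    """Return True for a 6-digit code that does not start with 0."""
    return PIN_RE.fullmatch(code) is not None


print(is_valid_pin("400001"))  # True
print(is_valid_pin("012345"))  # False: starts with 0
print(is_valid_pin("40000"))   # False: only 5 digits
```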

3. Email Address Validation

Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Examples: [email protected], [email protected]

Explanation:

  • ^[a-zA-Z0-9._%+-]+ matches one or more characters that could be letters, numbers, dots, underscores, percent signs, plus signs, or hyphens (valid username characters)
  • @ matches the @ symbol literally
  • [a-zA-Z0-9.-]+ matches one or more characters for the domain name (letters, numbers, dots, or hyphens)
  • \. matches a dot literally (escaped because dot is a special character in regex)
  • [a-zA-Z]{2,}$ matches at least 2 letters for the top-level domain (like com, org, etc.)

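The pattern accepts both addresses with special characters in the part before the @ (dots, plus signs, and so on) and simpler common formats. It is a practical approximation rather than a full check: for instance, it also accepts consecutive dots.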
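Here is a minimal Python sketch of the email check. All sample addresses are hypothetical ones on the reserved example.com and example.org domains, chosen only to show what the pattern accepts and rejects.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(address):
    """Return True if address has the shape described by the pattern."""
    return EMAIL_RE.fullmatch(address) is not None


print(looks_like_email("jane.doe+news@example.com"))  # True
print(looks_like_email("user@example.org"))           # True
print(looks_like_email("user@example"))               # False: no top-level domain
print(looks_like_email("user@@example.com"))          # False: two @ symbols
print(looks_like_email("a..b@example..com"))          # True, although it is not a usable address
```

The last line is a reminder that a regex like this only checks the general shape. Confirming that an address really exists usually means sending it a message.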

4. Password Strength Validation

For a password that requires at least 8 characters, one uppercase letter, one lowercase letter, one number, and one special character:

Pattern: ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Example: Password1!

Explanation:

  • ^ ensures the pattern starts at the beginning of the string
  • (?=.*[a-z]) is a positive lookahead that ensures there is at least one lowercase letter
  • (?=.*[A-Z]) ensures there is at least one uppercase letter
  • (?=.*\d) ensures there is at least one digit
  • (?=.*[@$!%*?&]) ensures there is at least one of the special characters
  • [A-Za-z\d@$!%*?&]{8,}$ ensures the password is at least 8 characters long and only contains allowed characters
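The password pattern can be tested in Python as sketched below. The `missing_requirements` helper is an assumption added for illustration: it reuses the same character lists one at a time so that a form could tell the user which rule failed, which a single combined regex cannot do.

```python
import re

PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

# One small pattern per rule, used only for friendlier feedback.
CHECKS = {
    "a lowercase letter": r"[a-z]",
    "an uppercase letter": r"[A-Z]",
    "a digit": r"\d",
    "a special character": r"[@$!%*?&]",
}


def is_strong_password(password):
    return PASSWORD_RE.fullmatch(password) is not None


def missing_requirements(password):
    """List the character rules that the password does not satisfy."""
    return [label for label, pattern in CHECKS.items()
            if not re.search(pattern, password)]


print(is_strong_password("Password1!"))  # True
print(is_strong_password("password1!"))  # False
print(missing_requirements("password"))
# ['an uppercase letter', 'a digit', 'a special character']
```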

5. URL Validation

Pattern: ^(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}(\/\S*)?$
Example: https://www.example.com/path

Explanation:

  • ^(https?:\/\/)? optionally matches http:// or https://
  • (www\.)? optionally matches "www."
  • [a-zA-Z0-9]+\. matches one or more alphanumeric characters followed by a dot (domain name); hyphens and extra subdomain levels such as blog.example.com are not accepted
  • [a-zA-Z]{2,} matches at least 2 letters (top-level domain)
  • (\/\S*)?$ optionally matches a slash followed by any non-whitespace characters (URL path)
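A small Python sketch shows what the URL pattern accepts and where it is stricter than real URLs. The sample URLs other than the guide's own example are invented for the demonstration.

```python
import re

URL_RE = re.compile(
    r"^(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}(\/\S*)?$"
)


def looks_like_url(text):
    return URL_RE.fullmatch(text) is not None


print(looks_like_url("https://www.example.com/path"))  # True
print(looks_like_url("example.com"))                   # True
print(looks_like_url("ftp://example.com"))             # False: only http and https
print(looks_like_url("https://blog.example.com"))      # False: extra subdomain level
print(looks_like_url("https://my-site.com"))           # False: hyphen in the name
```

The last two results are valid URLs in practice, so this pattern is best treated as a simple first filter.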

6. Date Validation (DD/MM/YYYY format)

Pattern: ^(0[1-9]|[12][0-9]|3[01])\/(0[1-9]|1[0-2])\/\d{4}$
Example: 25/12/2023

Explanation:

  • ^(0[1-9]|[12][0-9]|3[01]) matches valid day values from 01-31
    • 0[1-9] matches days 01-09
    • [12][0-9] matches days 10-29
    • 3[01] matches days 30-31
  • \/ matches a forward slash literally
  • (0[1-9]|1[0-2]) matches valid month values from 01-12
    • 0[1-9] matches months 01-09
    • 1[0-2] matches months 10-12
  • \/\d{4}$ matches a forward slash followed by exactly 4 digits for the year
  • The pattern checks the format only: impossible dates such as 31/02/2023 still match
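The date pattern checks the shape of the text but not the calendar. One possible approach, sketched here, is to pair the regex with Python's `datetime.strptime`, which rejects days that do not exist. The function names are hypothetical.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"^(0[1-9]|[12][0-9]|3[01])\/(0[1-9]|1[0-2])\/\d{4}$")


def has_date_shape(text):
    """True if text looks like DD/MM/YYYY according to the regex."""
    return DATE_RE.fullmatch(text) is not None


def is_real_date(text):
    """True if text has the right shape and names a real calendar day."""
    if not has_date_shape(text):
        return False
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False
    return True


print(has_date_shape("25/12/2023"))  # True
print(has_date_shape("31/02/2023"))  # True: the regex cannot count days per month
print(is_real_date("31/02/2023"))    # False
print(is_real_date("29/02/2024"))    # True: 2024 is a leap year
```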

7. IP Address Validation

Pattern: ^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Example: 192.168.1.1

Explanation:

  • The complex pattern (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) matches a valid octet in an IP address (0-255)
    • 25[0-5] matches 250-255
    • 2[0-4][0-9] matches 200-249
    • [01]?[0-9][0-9]? matches 0-199 (leading zeros such as 007 are also accepted)
  • \. matches a dot literally
  • The octet-plus-dot group repeats 3 times and is followed by a final octet, which covers all four octets of an IPv4 address
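The IP pattern can be tried in Python as sketched below. For comparison, the sketch also includes a helper built on the standard `ipaddress` module, which is an alternative this guide does not cover; both function names are assumptions.

```python
import ipaddress
import re

IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)


def is_ipv4_regex(text):
    return IPV4_RE.fullmatch(text) is not None


def is_ipv4_stdlib(text):
    """Same question answered by the standard library instead of a regex."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True


print(is_ipv4_regex("192.168.1.1"))  # True
print(is_ipv4_regex("256.1.1.1"))    # False: 256 is out of range
print(is_ipv4_regex("1.2.3"))        # False: only three octets
print(is_ipv4_regex("01.02.03.04"))  # True: leading zeros are accepted
print(is_ipv4_stdlib("192.168.1.1")) # True
```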

8. Username Validation

For a username with 3-16 characters, allowing letters, numbers, underscores, and hyphens:

Pattern: ^[a-zA-Z0-9_-]{3,16}$
Example: user_name123

Explanation:

  • ^[a-zA-Z0-9_-] ensures the username only contains letters, numbers, underscores, or hyphens
  • {3,16}$ ensures the username is between 3 and 16 characters long
  • These restrictions are common for usernames to ensure they're easy to type and remember

9. Credit Card Number Validation

Pattern: ^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9][0-9])[0-9]{12})$
Example: 4111111111111111 (Visa)

Explanation:

  • This pattern validates the most common credit card formats:
    • 4[0-9]{12}(?:[0-9]{3})? matches Visa cards (13 or 16 digits starting with 4)
    • 5[1-5][0-9]{14} matches MasterCard (16 digits starting with 51-55)
    • 3[47][0-9]{13} matches American Express (15 digits starting with 34 or 37)
    • 6(?:011|5[0-9][0-9])[0-9]{12} matches Discover cards (16 digits starting with 6011 or 65)
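The card pattern only checks the starting digits and the length. Real card numbers also carry a check digit, usually verified with the Luhn algorithm, which is not part of this guide. The sketch below pairs the regex with a hypothetical `passes_luhn` helper; the numbers used are widely published test numbers, not real cards.

```python
import re

CARD_RE = re.compile(
    r"^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}"
    r"|6(?:011|5[0-9][0-9])[0-9]{12})$"
)


def has_card_shape(number):
    return CARD_RE.fullmatch(number) is not None


def passes_luhn(number):
    """Luhn checksum: double every second digit, counting from the right."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(has_card_shape("4111111111111111"), passes_luhn("4111111111111111"))  # True True
print(has_card_shape("4111111111111112"), passes_luhn("4111111111111112"))  # True False
print(has_card_shape("1234567890123456"))                                   # False
```

The second number has a plausible shape but fails the checksum, which is why both checks are often used together.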

10. HTML Tag Validation

Pattern: ^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$
Example: <div class="example">Content</div>

Explanation:

  • ^<([a-z]+) matches the opening of an HTML tag and captures the tag name
  • ([^<]+)* matches the rest of the opening tag, such as its attributes (any characters other than <)
  • (?:>(.*)<\/\1>|\s+\/>)$ matches either:
    • A closing >, any content, and then a closing tag with the same name as the opening tag (\1 refers to the first captured group)
    • OR a self-closing tag like <img src="example.jpg" />
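The backreference \1 is the interesting part of this pattern, and a short Python sketch makes it visible. Keep in mind that this only handles a single simple tag: nested or real-world HTML is normally handled with a parser, and the nested repeat ([^<]+)* can become slow on long text that does not match. The mismatched sample is made up.

```python
import re

TAG_RE = re.compile(r"^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$")

paired = TAG_RE.fullmatch('<div class="example">Content</div>')
self_closing = TAG_RE.fullmatch('<img src="example.jpg" />')
mismatched = TAG_RE.fullmatch("<div>Content</span>")

print(paired.group(1))        # div (the captured tag name)
print(paired.group(3))        # Content (the text between the tags)
print(self_closing.group(1))  # img
print(mismatched)             # None: </span> does not repeat the captured name
```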

11. Hexadecimal Color Code Validation

Pattern: ^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
Example: #FF5733 or #F73

Explanation:

  • ^# matches the hash symbol at the start
  • ([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$ matches either:
    • Exactly 6 hexadecimal characters (0-9, A-F, a-f) for full hex color codes like #FF5733
    • OR exactly 3 hexadecimal characters for shorthand hex color codes like #F73
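Because the pattern captures the hex digits in a group, the match can be turned into red, green, and blue values. The `to_rgb` helper below is a hypothetical sketch of that idea; it expands the 3-character shorthand by doubling each character.

```python
import re

HEX_COLOR_RE = re.compile(r"^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$")


def to_rgb(color):
    """Return (red, green, blue) for a valid hex color, or None otherwise."""
    match = HEX_COLOR_RE.fullmatch(color)
    if match is None:
        return None
    digits = match.group(1)
    if len(digits) == 3:
        # Shorthand: "F73" means "FF7733".
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(to_rgb("#FF5733"))  # (255, 87, 51)
print(to_rgb("#F73"))     # (255, 119, 51)
print(to_rgb("#GGG"))     # None
```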

12. Time Validation (24-hour format)

Pattern: ^([01]?[0-9]|2[0-3]):[0-5][0-9]$
Example: 13:45

Explanation:

  • ^([01]?[0-9]|2[0-3]) matches valid hour values from 0-23
    • [01]?[0-9] matches hours 0-19, with or without a leading zero (9 or 09)
    • 2[0-3] matches hours 20-23
  • : matches the colon literally
  • [0-5][0-9]$ matches valid minute values from 00-59
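Once a time string passes the pattern, it is safe to split it and convert the parts to numbers. The `parse_time` function is a hypothetical helper written for this sketch.

```python
import re

TIME_RE = re.compile(r"^([01]?[0-9]|2[0-3]):[0-5][0-9]$")


def parse_time(text):
    """Return (hour, minute) for a valid 24-hour time, or None otherwise."""
    if TIME_RE.fullmatch(text) is None:
        return None
    hour_text, minute_text = text.split(":")
    return int(hour_text), int(minute_text)


print(parse_time("13:45"))  # (13, 45)
print(parse_time("9:05"))   # (9, 5)
print(parse_time("24:00"))  # None: hours stop at 23
print(parse_time("23:60"))  # None: minutes stop at 59
```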

Common Regex Special Characters

Character   Description                                      Example
.           Matches any single character                     a.c matches "abc", "adc", etc.
^           Matches start of a string                        ^hello matches strings that start with "hello"
$           Matches end of a string                          world$ matches strings that end with "world"
*           Matches 0 or more occurrences                    ab*c matches "ac", "abc", "abbc", etc.
+           Matches 1 or more occurrences                    ab+c matches "abc", "abbc", but not "ac"
?           Matches 0 or 1 occurrence                        ab?c matches "ac" and "abc" only
\           Escapes special characters                       a\.c matches "a.c" literally
\d          Matches any digit (0-9)                          \d{3} matches "123", "456", etc.
\w          Matches any word character (a-z, A-Z, 0-9, _)    \w+ matches "abc_123"
\s          Matches any whitespace character                 a\sb matches "a b"
[...]       Matches any one character in brackets            [abc] matches "a", "b", or "c"
[^...]      Matches any one character NOT in brackets        [^abc] matches any character except "a", "b", or "c"
{n}         Matches exactly n occurrences                    a{3} matches "aaa"
{n,}        Matches n or more occurrences                    a{2,} matches "aa", "aaa", etc.
{n,m}       Matches between n and m occurrences              a{1,3} matches "a", "aa", "aaa"
()          Groups expressions and remembers matched text    (ab)+ matches "ab", "abab", etc.
|           Acts like an OR operator                         cat|dog matches "cat" or "dog"

Testing Your Regex

You can test your regex patterns on these websites:

  1. RegExr: https://regexr.com/
  2. Regex101: https://regex101.com/

Tips for Beginners

  1. Start simple: Begin with basic patterns and gradually add complexity
  2. Test thoroughly: Always test your regex with various input strings
  3. Use anchors: ^ and $ ensure the entire string matches your pattern
  4. Be specific: Make your patterns as specific as possible to avoid false matches
  5. Use online tools: Regex testing websites help visualize how your pattern works
  6. Break it down: Complex patterns can be understood by breaking them into smaller parts
  7. Comment your regex: In code, comment complex regex to explain what it does
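As one way to follow tip 7 in Python, the `re.VERBOSE` flag lets a pattern be spread over several lines with comments inside it. The sketch below rewrites the username pattern from this guide in that style; whitespace and anything after # inside the pattern are ignored by the regex engine.

```python
import re

USERNAME_RE = re.compile(
    r"""
    ^                     # start of the text
    [a-zA-Z0-9_-]{3,16}   # 3 to 16 letters, digits, underscores, or hyphens
    $                     # end of the text
    """,
    re.VERBOSE,
)

print(USERNAME_RE.fullmatch("user_name123") is not None)  # True
print(USERNAME_RE.fullmatch("ab") is not None)            # False
```

In verbose mode a literal # or space in the pattern has to be escaped or placed inside square brackets, so patterns such as the hex color one need a small adjustment before being written this way.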

Remember, regex is powerful but can be complex. Take your time to understand each part of a pattern before using it in your projects.

Happy pattern matching!
