A Guide to Using Regular Expressions

Introduction

Regular expressions (often shortened to "regex") are a powerful tool for matching and processing text. A regular expression combines ordinary characters and special symbols into a pattern that can be used to find, replace, or extract specific text in strings.

In this post, we'll walk through the basic syntax of regular expressions, the most commonly used metacharacters, and some practical examples.

Basic syntax

Regular expressions consist of ordinary characters and special metacharacters. Here are some basic regular expression metacharacters:

  • .: Matches any single character (except newline).
  • *: Matches the preceding element zero or more times.
  • +: Matches the preceding element one or more times.
  • ?: Matches the preceding element zero or one time.
  • []: Matches any single character listed within the brackets.
  • ^: Matches the beginning of the string (or of each line in multiline mode).
  • $: Matches the end of the string (or of each line in multiline mode).
  • \: Escape character; placed before a metacharacter to match that character literally.
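
As a rough illustration of the list above, here is a minimal sketch using Python's standard re module. The sample strings and variable names are made up for this example and are not part of the original article.

```python
import re

# '.' matches any single character except a newline
dot = re.findall(r'c.t', 'cat cot c\nt cut')          # ['cat', 'cot', 'cut']

# '*' = zero or more, '+' = one or more, '?' = zero or one of the preceding element
star = re.findall(r'ab*', 'a ab abbb')                # ['a', 'ab', 'abbb']
plus = re.findall(r'ab+', 'a ab abbb')                # ['ab', 'abbb']
optional = re.findall(r'colou?r', 'color colour')     # ['color', 'colour']

# '[]' matches one character out of the listed set
char_set = re.findall(r'[aeiou]', 'regex')            # ['e', 'e']

# '^' and '$' anchor the pattern to the start and end of the string
all_digits = bool(re.search(r'^\d+$', '12345'))       # True
not_all_digits = bool(re.search(r'^\d+$', 'abc123'))  # False

# '\' turns the metacharacter '.' into a literal dot
literal_dots = re.findall(r'\.', 'a.b.c')             # ['.', '.']

print(dot, star, plus, optional, char_set, all_digits, not_all_digits, literal_dots)
```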

Common use cases

1. Validate an email address

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

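This regular expression checks whether a string has the common shape of an email address: a local part, an @ sign, a domain, and a top-level domain of at least two letters. It is a practical format check and does not cover every address that the email standards allow.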
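
A minimal sketch of how this pattern might be used in Python follows. The helper name is_valid_email and the sample addresses are assumptions chosen for illustration.

```python
import re

# The email pattern from above, compiled once so it can be reused
EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def is_valid_email(address):
    """Return True if the address matches the simple email pattern."""
    return EMAIL_PATTERN.match(address) is not None

samples = [
    "alice@example.com",                # matches
    "bob.smith+news@mail.example.org",  # matches
    "no-at-sign.example.com",           # no '@', does not match
    "carol@localhost",                  # no dot plus top-level domain, does not match
]
results = {address: is_valid_email(address) for address in samples}

for address, ok in results.items():
    print(f"{address}: {ok}")
```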

2. Extract a date

(\d{4})-(\d{2})-(\d{2})

This regular expression extracts a date in "YYYY-MM-DD" format from a string, capturing the year, month, and day as three separate groups.
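
The sketch below shows one way the capture groups could be read in Python. The sample sentence and variable names are invented for this example.

```python
import re

# The date pattern from above: three capture groups for year, month, and day
DATE_PATTERN = re.compile(r'(\d{4})-(\d{2})-(\d{2})')

log_line = "Order 1042 was shipped on 2023-04-20 and delivered on 2023-04-25."

# search() returns the first match; groups() returns the captured parts as strings
first = DATE_PATTERN.search(log_line)
year, month, day = first.groups()

# findall() returns one tuple of groups per match
all_dates = DATE_PATTERN.findall(log_line)

print(year, month, day)
print(all_dates)
```

Note that the pattern only checks the shape of the text. A string such as "2023-99-99" would also match, so a real date check needs an extra step such as datetime.strptime.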

3. Strip HTML tags

<[^>]+>

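This regular expression matches anything that looks like an HTML tag (a "<", one or more characters other than ">", then a ">"), so it can be used to remove tags from a string.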
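
As a small sketch, the pattern can be combined with re.sub to delete the tags. The HTML snippet here is a made-up example.

```python
import re

# The tag pattern from above
TAG_PATTERN = re.compile(r'<[^>]+>')

html = '<p>Hello, <a href="https://example.com">world</a>!</p>'

# Replace every tag with an empty string, keeping only the text between tags
plain_text = TAG_PATTERN.sub('', html)

print(plain_text)
```

This works for simple markup only. A regular expression is not an HTML parser, so input with ">" inside attribute values, comments, or script blocks is better handled with a dedicated parser.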

Python examples

1. Validate a mobile phone number

import re

def validate_phone_number(number):
    # Accept only strings made of exactly 11 digits
    pattern = r'^\d{11}$'
    return re.match(pattern, number) is not None

phone_number = "12345678901"
if validate_phone_number(phone_number):
    print(f"{phone_number} is a valid mobile phone number")
else:
    print(f"{phone_number} is not a valid mobile phone number")

2. Extract links

import re

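text = "Please visit my personal website at http://www.example.com or follow me on social media at https://www.weibo.com/user123"
# Note: [$-_] is a character range from '$' to '_', so it also covers '/', ':', digits, and uppercase letters
pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
urls = re.findall(pattern, text)

print("Extracted links:")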
for url in urls:
    print(url)

3. Split the string

import re

text = "apple,banana,cherry,orange"
pattern = r','
fruits = re.split(pattern, text)

print("Fruits after splitting:")
for fruit in fruits:
    print(fruit)

4. Replace text

import re

text = "Hello, my name is John. Nice to meet you, John!"
pattern = r'John'
replacement = "Alice"
new_text = re.sub(pattern, replacement, text)

print("Text after replacement:")
print(new_text)

5. Match multiple lines of text

import re

text = """
Title: Introduction to Programming
Date: 2023-04-20
Description: This course will cover the basics of programming using Python.
"""

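pattern = r'^Title: (.+)$\n^Date: (.+)$\n^Description: (.+)$'
# Use re.search rather than re.match: the text starts with a blank line,
# and re.match only tries to match at the very beginning of the string
matches = re.search(pattern, text, re.MULTILINE)

if matches:
    title = matches.group(1)
    date = matches.group(2)
    description = matches.group(3)
    print(f"Title: {title}\nDate: {date}\nDescription: {description}")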

These examples demonstrate the use of regular expressions in different scenarios, including validating, extracting, splitting, and replacing text.

Things to keep in mind

  • Greedy matching: By default, quantifiers are greedy and match as many characters as possible. To match as few characters as possible, use the non-greedy quantifiers *?, +?, and ??.
  • Escaping: Some characters, such as . and *, have a special meaning. To match one of these characters literally, put the escape character \ in front of it.
  • Regex APIs: Each programming language provides its own regular expression facilities, such as the re module in Python and the RegExp object in JavaScript.
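
The first two points can be seen in a short Python sketch. The sample strings are invented for illustration; re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: '.+' grabs as much as possible, so the match runs to the last '>'
greedy = re.findall(r'<.+>', html)

# Non-greedy: '.+?' stops at the first '>' it can, so each tag is matched separately
lazy = re.findall(r'<.+?>', html)

# Unescaped '.' matches any character, so '105' matches as well
unescaped = re.findall(r'1.5', '1.5 and 105')

# Escaped '\.' matches only a literal dot
escaped = re.findall(r'1\.5', '1.5 and 105')

# re.escape builds the escaped pattern from a plain string
auto_escaped = re.escape('1.5')

print(greedy)
print(lazy)
print(unescaped, escaped, auto_escaped)
```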

Conclusion

Questions and discussion are welcome!

Origin blog.csdn.net/Silver__Wolf/article/details/132141855