PHP Study Notes 10: Regular Expressions

Chapter 9 Regular Expressions

"Regular expression" describes one or more strings to be matched when searching the text body. This expression can be used as a character pattern compared with the text to be searched. You can use regular expressions to search for patterns in strings, replace text, and extract substrings.

9.1 The purpose of regular expressions

Regular expressions let you test a string against a pattern. For example:

  • Data validation: test an input string to see whether a phone number pattern or a credit card number pattern appears in it.
  • Text replacement: identify specific text in a document, then delete it completely or replace it with other text.
  • Substring extraction: find specific text in a document or input field and extract it from the string based on a pattern match.
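
As a rough illustration of these three uses, the following sketch validates, replaces, and extracts with PHP's preg_* functions. The sample strings and the deliberately simplified phone and e-mail patterns are assumptions made for this example, not production-grade validators.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$phone = '555-1234';
$text  = 'Contact: alice@example.com, phone 555-1234.';

// 1. Validation: does the whole string have the form NNN-NNNN?
$isPhone = preg_match('/^\d{3}-\d{4}$/', $phone) === 1;

// 2. Replacement: mask every digit in the text.
$masked = preg_replace('/\d/', '#', $text);

// 3. Extraction: pull out the first substring shaped like an e-mail address.
$email = preg_match('/[\w.]+@[\w.]+\.\w+/', $text, $m) === 1 ? $m[0] : null;

var_dump($isPhone);   // bool(true)
echo $masked, "\n";   // Contact: alice@example.com, phone ###-####.
echo $email, "\n";    // alice@example.com
```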

9.2 The syntax of regular expressions

Regular expressions are built much like arithmetic expressions: larger expressions are created from smaller ones by combining them with metacharacters and operators. Each component of a regular expression can be a single character, a character set, a character range, a choice between several characters, or any combination of these.

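A regular expression is constructed by placing the components of the expression between a pair of delimiters. In PHP, the delimiters are usually a pair of forward slash (/) characters, although other non-alphanumeric, non-backslash, non-whitespace characters such as # are also accepted. The following example uses slashes: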

/^(\d+)?\.\d+$/
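
To see what this pattern accepts, here is a minimal sketch that runs it against a few inputs. The sample strings and variable names are made up for illustration, and the # delimiter appears only to show that PHP accepts delimiters other than the slash.

```php
<?php
$pattern = '/^(\d+)?\.\d+$/';

// Hypothetical inputs. The pattern requires a decimal point followed by
// at least one digit; the integer part before the point is optional.
$samples = ['3.14', '.5', '42', '7.', 'a.5'];

$results = [];
foreach ($samples as $s) {
    $results[$s] = preg_match($pattern, $s) === 1;
    printf("%-5s %s\n", $s, $results[$s] ? 'match' : 'no match');
}

// The same pattern written with "#" as the delimiter behaves identically.
$alt = preg_match('#^(\d+)?\.\d+$#', '3.14') === 1;
```

Only 3.14 and .5 match: 42 has no decimal point, 7. has no digits after the point, and a.5 starts with a non-digit.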

9.2.1 Elements in regular expressions

The elements that make up regular expressions generally include ordinary characters, metacharacters, quantifiers, anchors, non-printing characters, and alternation.

  • Ordinary characters

The simplest regular expression is a single ordinary character that is compared against the search string. For example, the single-character regular expression A always matches the letter A, wherever it appears in the search string. You can combine several single characters to form a longer expression. For example, the regular expression /the/ matches the "the" in "the", "there", "other", and "over the lazy dog". No concatenation operator is needed; just type the characters one after another.
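
A small sketch of the /the/ example, assuming a made-up word list. It uses preg_grep(), which is covered later in these notes, to keep only the elements that contain the pattern.

```php
<?php
// Hypothetical word list for illustration.
$words = ['the', 'there', 'other', 'over the lazy dog', 'then', 'THE', 'tea'];

// /the/ matches any element that contains t, h, e in sequence.
// Matching is case-sensitive by default, so "THE" is not selected.
$hits = array_values(preg_grep('/the/', $words));

print_r($hits);
```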

  • Metacharacters

In addition to ordinary characters, regular expressions can contain "metacharacters", which are divided into single-character metacharacters and multi-character metacharacters. For example, the metacharacter \d matches a digit character. Ordinary characters are all printable and non-printable characters that are not explicitly designated as metacharacters, including all uppercase and lowercase letters, digits, punctuation marks, and some symbols.

[Table: Single-character metacharacters]

Most of these special characters lose their special meaning inside a bracket expression and stand for ordinary characters. To match one of them literally elsewhere, escape it by putting a backslash (\) in front of it. For example, to search for a literal "+", use the expression "\+".
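
The effect of escaping can be sketched as follows. The subject string is an assumption for the example; the three patterns show "+" as a quantifier, as an escaped literal, and as a literal inside a bracket expression.

```php
<?php
$subject = '1+1=2';

// Unescaped, "+" is a quantifier: /1+1/ means "one or more 1s, then a 1".
// The subject never contains two 1s in a row, so nothing matches.
$unescaped = preg_match('/1+1/', $subject);

// Escaped, "\+" matches a literal plus sign.
$escaped = preg_match('/1\+1/', $subject, $m);

// Inside a bracket expression "+" is already literal.
$inBrackets = preg_match('/1[+]1/', $subject);

var_dump($unescaped, $escaped, $inBrackets, $m[0]);
```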

In addition to the above single-character metacharacters, there are some multi-character metacharacters:

[Table: Multi-character metacharacters]

  • Non-printing characters

Non-printing characters are written as escape sequences, a backslash followed by an ordinary character, and match things such as line feeds, form feeds, and white space.

[Table: Non-printing characters]
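
A brief sketch of a few common non-printing escapes, assuming a made-up two-line string: \t matches a tab, \n a line feed, and \s any white-space character.

```php
<?php
// Hypothetical input containing real tab and line-feed characters.
$text = "name:\tAlice\nrole:\tadmin\n";

// Count the tabs and the line feeds.
$tabs  = preg_match_all('/\t/', $text);
$lines = preg_match_all('/\n/', $text);

// Collapse every run of white space into a single space.
$collapsed = preg_replace('/\s+/', ' ', trim($text));

var_dump($tabs, $lines);
echo $collapsed, "\n";
```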

  • Order of precedence

Regular expressions are evaluated much like arithmetic expressions: from left to right, following an order of precedence.

[Table: Regular expression precedence]

In addition, ordinary characters bind more tightly than the alternation operator (|). For example, "m|food" matches "m" or "food". To match "mood" or "food", use parentheses to create a subexpression: "(m|f)ood".
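
The difference that grouping makes can be checked with a short sketch. The grouped pattern (m|f)ood and the two test words are chosen here for illustration.

```php
<?php
$loose   = '/m|food/';    // alternatives: "m" or "food"
$grouped = '/(m|f)ood/';  // alternatives: "m" or "f", each followed by "ood"

$results = [];
foreach (['mood', 'food'] as $word) {
    preg_match($loose, $word, $a);
    preg_match($grouped, $word, $b);
    // Record the text matched by each pattern.
    $results[$word] = [$a[0], $b[0]];
}

print_r($results);
```

For "mood", the ungrouped pattern matches only the leading "m", while the grouped pattern matches the whole word.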

9.3 Using regular expressions

9.3.1 Matching and Finding

  1. preg_match() function

The preg_match() function searches a string for a match to a regular expression pattern. The syntax is as follows:

int preg_match(string $pattern, string $subject [, array &$matches [, int $flags =  0 [, int $offset = 0]]])

Description:

  • $pattern: the pattern to search for, such as '/^def/'.
  • $subject: the string to be searched.
  • $matches: optional; filled with the search results. $matches[0] contains the text matched by the complete pattern, $matches[1] contains the text matched by the first capture subgroup, and so on.
  • $flags: can be set to PREG_OFFSET_CAPTURE. If this flag is passed, the string offset of each match is returned along with the match. (Note: this changes the array stored in $matches. Each element becomes an array whose element 0 is the matched string and whose element 1 is the offset of that string within $subject.)
  • $offset: if set, the search starts at this byte offset in $subject instead of at the beginning.
  • Return value: preg_match() returns the number of times the pattern matched, which is 0 or 1 because preg_match() stops searching after the first match. It returns false if an error occurs.
<?php
$subject = "abcdefghijklmnfdef";
// Search for "def" starting at offset 8, so the "def" at offset 3 is skipped.
$pattern_1 = '/def/';
$num = preg_match($pattern_1, $subject, $matches_1, PREG_OFFSET_CAPTURE, 8);
var_dump($matches_1);
var_dump($num);
// The $ anchor only allows a match at the end of the subject.
$pattern_2 = '/def$/';
$num = preg_match($pattern_2, $subject, $matches_2, PREG_OFFSET_CAPTURE, 3);
var_dump($matches_2);
?>
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(3) "def"
    [1]=>
    int(15)
  }
}
int(1)
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(3) "def"
    [1]=>
    int(15)
  }
}
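
The example above uses no capture subgroups, so only $matches[0] is filled. As a minimal sketch of how $matches[1], $matches[2], and so on are populated, the following uses a hypothetical subject string and a date pattern invented for this illustration.

```php
<?php
// Hypothetical subject string.
$subject = 'Order #1234 shipped on 2020-06-13';
// Three capture subgroups: year, month, day.
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

$full = $year = $month = $day = null;
if (preg_match($pattern, $subject, $matches) === 1) {
    // $matches[0] is the full match; [1], [2], [3] are the subgroups.
    [$full, $year, $month, $day] = $matches;
}

var_dump($matches);
```
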
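  2. preg_match_all() function

The preg_match_all() function searches the subject for every match of the pattern, not just the first, and returns the number of full matches found. With PREG_PATTERN_ORDER (the default ordering), $matches[0] holds all full matches, $matches[1] holds all matches of the first capture subgroup, and so on.

<?php
$subject = "abcdefghijkdefabcedfdefxyzdef";
// Two capture subgroups; only "defabc" at offset 11 matches the whole pattern.
$pattern_1 = '/(def)(abc)/';
$num_1 = preg_match_all($pattern_1, $subject, $matches_1, PREG_PATTERN_ORDER);
var_dump($matches_1);
var_dump($num_1);
// Same pattern, but record offsets and start searching at offset 3.
$pattern_2 = '/(def)(abc)/';
$num_2 = preg_match_all($pattern_2, $subject, $matches_2, PREG_OFFSET_CAPTURE, 3);
var_dump($matches_2);
var_dump($num_2);
?>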
array(3) {
  [0]=>
  array(1) {
    [0]=>
    string(6) "defabc"
  }
  [1]=>
  array(1) {
    [0]=>
    string(3) "def"
  }
  [2]=>
  array(1) {
    [0]=>
    string(3) "abc"
  }
}
int(1)
array(3) {
  [0]=>
  array(1) {
    [0]=>
    array(2) {
      [0]=>
      string(6) "defabc"
      [1]=>
      int(11)
    }
  }
  [1]=>
  array(1) {
    [0]=>
    array(2) {
      [0]=>
      string(3) "def"
      [1]=>
      int(11)
    }
  }
  [2]=>
  array(1) {
    [0]=>
    array(2) {
      [0]=>
      string(3) "abc"
      [1]=>
      int(14)
    }
  }
}
int(1)
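
Both calls above use pattern ordering. PHP also provides the PREG_SET_ORDER flag, which groups the results by match instead. A short sketch of the difference, with a made-up subject that produces three matches:

```php
<?php
// Hypothetical subject: three letter-digit pairs.
$subject = 'a1 b2 c3';
$pattern = '/([a-z])(\d)/';

// Pattern order: one sub-array per subgroup.
preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);
// $byPattern[0] holds the full matches, [1] the letters, [2] the digits.

// Set order: one sub-array per match.
$count = preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
// $bySet[0] holds the full match and subgroups of the first match, and so on.

print_r($byPattern);
print_r($bySet);
```

Set order tends to be more convenient when each match is processed as a unit in a loop.
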
  3. preg_grep() function

The preg_grep() function returns the entries of an array that match the pattern. The syntax is as follows:

array preg_grep(string $pattern, array $input [, int $flags = 0])

Description:

  • $pattern: the pattern to be searched
  • $input: input array
  • $flags: if set to PREG_GREP_INVERT, the function returns the elements of the input array that do not match the given pattern.
<?php
$subject = ['abc', 'def', 'efg', 'hijk', 'abcdef', 'defabc'];
// Keep the elements that end in "def".
$pattern = '/def$/';
$grep_1 = preg_grep($pattern, $subject);
var_dump($grep_1);
// Invert the test: keep the elements that do not end in "def".
$grep_2 = preg_grep($pattern, $subject, PREG_GREP_INVERT);
var_dump($grep_2);
?>
array(2) {
  [1]=>
  string(3) "def"
  [4]=>
  string(6) "abcdef"
}
array(4) {
  [0]=>
  string(3) "abc"
  [2]=>
  string(3) "efg"
  [3]=>
  string(4) "hijk"
  [5]=>
  string(6) "defabc"
}
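
As the output shows, preg_grep() keeps the keys of the input array, so the result can have gaps in its indexes. If consecutive indexes are needed, one option is to pass the result through array_values(), as in this small sketch (variable names are illustrative):

```php
<?php
$subject = ['abc', 'def', 'efg', 'hijk', 'abcdef', 'defabc'];

// The matching entries keep their original keys, 1 and 4.
$kept = preg_grep('/def$/', $subject);

// array_values() re-indexes the result from 0.
$reindexed = array_values($kept);

var_dump(array_keys($kept));
var_dump($reindexed);
```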

9.3.2 Search and Replace

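  1. preg_replace() function

The preg_replace() function searches the subject for matches of the pattern and replaces them with the replacement. The pattern, replacement, and subject can each be a string or an array. When the subject is an array, every element is processed and an array is returned.

<?php
$str1 = 'Life is like a sea and only a strong--willed man can cross it.';
// Each pattern is replaced by the replacement at the same array position.
$pattern_1 = ['/like/', '/sea/', '/man/'];
$replacement_1 = ['likes', 'book', 'woman'];
echo preg_replace($pattern_1, $replacement_1, $str1);
echo "\n";
// With an array subject, each element is processed independently.
$arr = ['seek the truth from facts', 'There is a will, there is a way'];
$pattern_2 = ['/seek/', '/facts/'];
$replacement_2 = ['why', '?'];
$res = preg_replace($pattern_2, $replacement_2, $arr);
var_dump($res);
?>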
Life is likes a book and only a strong--willed woman can cross it.
array(2) {
  [0]=>
  string(20) "why the truth from ?"
  [1]=>
  string(31) "There is a will, there is a way"
}
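
The replacements above are fixed strings. The replacement can also refer to capture subgroups with $1, $2, and so on, and preg_replace() accepts optional limit and count arguments. A minimal sketch, with sample strings invented for illustration:

```php
<?php
// Reorder a date by referring to the three capture subgroups.
// Single quotes keep PHP from treating $3, $2, $1 as variables.
$date = '2020-06-13';
$swapped = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $date);

// The fourth argument limits the number of replacements;
// the fifth receives the number of replacements actually made.
$limited = preg_replace('/a/', 'A', 'banana', 1, $count);

echo $swapped, "\n";   // 13/06/2020
echo $limited, "\n";   // bAnana
echo $count, "\n";     // 1
```
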
  2. preg_filter() function

The preg_filter() function also performs a regular expression search and replace. It is equivalent to preg_replace(), except that it returns only the subjects in which a match was found.

<?php
$subject = array('1', 'a', '2', 'b', '3', 'A', 'B', '4');
$pattern = array('/\d/', '/[a-z]/', '/[1a]/');
$replace = array('A:$0', 'B:$0', 'C:$0');
echo "preg_filter returns\n";
print_r(preg_filter($pattern, $replace, $subject));
echo "preg_replace returns\n";
print_r(preg_replace($pattern, $replace, $subject));
?>
preg_filter returns
Array
(
    [0] => A:C:1
    [1] => B:C:a
    [2] => A:2
    [3] => B:b
    [4] => A:3
    [7] => A:4
)
preg_replace returns
Array
(
    [0] => A:C:1
    [1] => B:C:a
    [2] => A:2
    [3] => B:b
    [4] => A:3
    [5] => A
    [6] => B
    [7] => A:4
)
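
The same difference shows up with a single string subject: when nothing matches, preg_filter() returns null, whereas preg_replace() returns the subject unchanged. A short sketch with made-up inputs:

```php
<?php
// No digit in the subject: preg_filter() gives null,
// preg_replace() gives the original string back.
$filterMiss  = preg_filter('/\d/', '#', 'abc');
$replaceMiss = preg_replace('/\d/', '#', 'abc');

// When there is a match, both return the replaced string.
$filterHit = preg_filter('/\d/', '#', 'a1c');

var_dump($filterMiss, $replaceMiss, $filterHit);
```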

9.3.3 Segmentation and Escaping

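  1. preg_split() function

The preg_split() function splits a string by a regular expression and returns the pieces as an array.

<?php
$str = "Java, PHP, C#, Python";
// Split on runs of white space and commas.
$pattern = '/[\s, ]+/';
var_dump(preg_split($pattern, $str));
?>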
array(4) {
  [0]=>
  string(4) "Java"
  [1]=>
  string(3) "PHP"
  [2]=>
  string(2) "C#"
  [3]=>
  string(6) "Python"
}
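
preg_split() also takes an optional limit and flags. The sketch below, using an invented input with leading, doubled, and trailing separators, shows the PREG_SPLIT_NO_EMPTY flag and a limit of 2.

```php
<?php
// Hypothetical input with separators at both ends.
$str = ',Java, PHP,,C#, Python,';

// Separators at the start and end produce empty pieces.
$raw = preg_split('/[\s,]+/', $str);

// PREG_SPLIT_NO_EMPTY drops the empty pieces; -1 means "no limit".
$clean = preg_split('/[\s,]+/', $str, -1, PREG_SPLIT_NO_EMPTY);

// A limit of 2 splits once and leaves the rest of the string intact.
$firstTwo = preg_split('/[\s,]+/', 'Java, PHP, C#, Python', 2);

print_r($raw);
print_r($clean);
print_r($firstTwo);
```
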
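  2. preg_quote() function

The preg_quote() function puts a backslash in front of every character in a string that belongs to the regular expression syntax, so that the string can be used as literal text inside a pattern. The optional second argument names one extra character to escape, usually the pattern delimiter.

<?php
$keywords = '$40 for \a g3/400 * 10/x';
// Escape the special characters, plus "x" as the chosen delimiter.
$keywords = preg_quote($keywords, 'x');
echo $keywords;
echo "\n";
$textbody = "This book is *very* difficult to learn.";
$word = "*very*";
// Without preg_quote() the "*" characters would be treated as quantifiers.
// The replacement equals the match, so the text is printed unchanged.
$textbody = preg_replace("/" . preg_quote($word, '/') . "/", $word, $textbody);
echo $textbody;
?>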
\$40 for \\a g3/400 \* 10/\x
This book is *very* difficult to learn.
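
A typical use is searching for text that comes from outside the program. The following sketch assumes a hypothetical user-supplied search term containing several special characters, including the delimiter itself.

```php
<?php
// Hypothetical user input; "+", "=", "(", ")" and "/" all need escaping.
$userInput = '1+1=2 (a/b)';
$text = 'We know that 1+1=2 (a/b) holds.';

// Pass the delimiter as the second argument so "/" is escaped too.
$pattern = '/' . preg_quote($userInput, '/') . '/';

$matched = preg_match($pattern, $text, $m);

echo $pattern, "\n";
var_dump($matched);
```

Building the pattern from the raw input instead would let the special characters be interpreted as regular expression syntax, and the unescaped slash would end the pattern early.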

Origin blog.csdn.net/username666/article/details/106734935