1. Part 1, day 2: PHP regular expressions

Learning Objectives: matching characters

First, the regular expression

A regular expression is a pattern in a specific format, made up of ordinary text characters, metacharacters, and modifiers. It can be used to match, replace, and extract matching strings.

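Second, the modifiers:

1. i: ignore case.

2. g: global matching. PCRE in PHP has no g modifier; use preg_match_all() to find every match.

3. m: multi-line mode; ^ and $ also match at the start and end of each line.

4. s: single-line mode; in this mode the dot (.) also matches a newline character.

5. x: ignore whitespace in the pattern.

6. u: treat the pattern and the subject as UTF-8, so Chinese text is matched without being garbled.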
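As a rough illustration of the modifiers, the sketch below runs a few patterns with and without each one. The sample strings are made up for this example, and the file is assumed to be saved as UTF-8.

```php
<?php
// i: case-insensitive matching.
$caseless = preg_match('/php/i', 'Learning PHP');

// m: ^ and $ also match at the start and end of each line.
$multiline = preg_match_all('/^\w+$/m', "one\ntwo\nthree");

// s: the dot also matches a newline.
$withoutS = preg_match('/a.b/', "a\nb");
$withS    = preg_match('/a.b/s', "a\nb");

// x: whitespace inside the pattern is ignored, so it can be spaced out.
$extended = preg_match('/ \d{3} - \d{4} /x', '555-1234');

// u: pattern and subject are treated as UTF-8, so the dot consumes
// one whole character instead of one byte.
$bytes = preg_match('/^.$/', '中');
$chars = preg_match('/^.$/u', '中');
```

Here $withoutS and $bytes are 0, while $withS and $chars are 1; $multiline counts three lines.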

Third, the special characters:

1. ^ start anchor, e.g. ^m matches a string that starts with the letter m. Inside a bracket expression ^ means negation, e.g. [^m] matches any one character other than m. To match a literal ^, escape it as \^

2. $ end anchor, e.g. m$ matches a string that ends with the letter m. To match a literal $, escape it as \$

3. \ escape character. To match a literal \, escape it as \\. n matches the character n, while \n matches a newline

4. () mark the beginning and end of a subexpression; the text matched inside () is also captured and returned. To match literal parentheses, use \( and \)

5. [] mark the start and end of a bracket expression. To match literal brackets, use \[ and \]

6. {} mark the start and end of a quantifier expression. To match literal braces, use \{ and \}

7. . matches any single character except the newline \n. To match a literal dot, use \.

8. | matches either of the two alternatives around it, e.g. m|n matches a string that contains m or n. To match a literal |, use \|

9. * matches the preceding subexpression zero or more times. To match a literal *, use \*

10. + matches the preceding subexpression one or more times. To match a literal +, use \+

11. ? matches the preceding subexpression zero or one time. To match a literal ?, use \?
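The special characters above can be tried out with a short sketch like the following. The sample strings are invented for illustration; preg_quote() is PHP's built-in helper for escaping special characters.

```php
<?php
// ^ and $ anchor the pattern to the start and end of the string.
$startsWithM = preg_match('/^m/', 'match');
$endsWithM   = preg_match('/m$/', 'bottom');

// Inside brackets, ^ negates the class: any character other than m.
$nonM = preg_match('/[^m]/', 'mmm');

// | matches either alternative.
$either = preg_match('/m|n/', 'sun');

// An unescaped dot matches any character; \. matches only a literal dot.
$anyChar    = preg_match('/^a.c$/', 'abc');
$literalDot = preg_match('/^a\.c$/', 'abc');

// preg_quote() escapes the special characters in a string, so the
// string can be embedded in a pattern and matched literally.
$quoted  = preg_quote('1+1=2?');
$literal = preg_match('/^' . $quoted . '$/', '1+1=2?');
```

$nonM and $literalDot are 0 because 'mmm' contains nothing but m and 'abc' contains no dot; the other results are 1.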

Fourth, the quantifiers:

1. {n}: n is a non-negative integer; matches exactly n times

2. {n,}: n is a non-negative integer; matches at least n times

3. {n,m}: n and m are both non-negative integers; matches at least n times and at most m times

4. *, +, ?: shorthand for {0,}, {1,}, and {0,1}
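A small sketch of the quantifiers follows. The patterns are anchored with ^ and $ so that the count applies to the whole string; the inputs are made up for illustration.

```php
<?php
// {n}: exactly n times.
$exact   = preg_match('/^\d{3}$/', '123');
$tooMany = preg_match('/^\d{3}$/', '1234');

// {n,}: at least n times.
$atLeast = preg_match('/^\d{2,}$/', '12345');

// {n,m}: at least n and at most m times.
$between = preg_match('/^\d{2,4}$/', '12345');

// *, + and ? behave like {0,}, {1,} and {0,1}.
$star     = preg_match('/^ab*$/', 'a');
$plus     = preg_match('/^ab+$/', 'a');
$optional = preg_match('/^colou?r$/', 'color');
```

$tooMany, $between, and $plus are 0; the rest are 1.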

Fifth, the anchors (position matchers)

1. \b matches a word boundary, i.e. the position between a word and a space. For example, m\b matches the m in tim but not the m in html

2. \B matches a non-word boundary. For example, m\B does not match the m in tim but does match the m in html

3. ^ and $
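The tim and html cases can be checked directly. This is only a sketch; the last line uses an invented sentence to show \b on both sides of a word.

```php
<?php
// m\b: an m that is followed by a word boundary.
$timB  = preg_match('/m\b/', 'tim');
$htmlB = preg_match('/m\b/', 'html');

// m\B: an m that is NOT followed by a word boundary.
$timNotB  = preg_match('/m\B/', 'tim');
$htmlNotB = preg_match('/m\B/', 'html');

// \b on both sides matches cat only as a whole word,
// so the cat inside concat is skipped.
$wholeWords = preg_match_all('/\bcat\b/', 'cat concat cat.');
```

$timB and $htmlNotB are 1, $htmlB and $timNotB are 0, and $wholeWords is 2.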

Sixth, the character class [], the hyphen (-), and the dot (.)

1. The character class []: matches exactly one of the characters inside the brackets

2. The hyphen (-): to match any one of the digits 1 through 6, use [1-6]; [1-6] is called a character range (character cluster)

3. The dot (.): matches any single character except a newline
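As a quick sketch of these three forms, with inputs chosen only for illustration:

```php
<?php
// [aeiou] matches exactly one character from the set per match.
$vowels = preg_match_all('/[aeiou]/', 'regular');

// [1-6] matches one digit in the range 1 to 6.
$inRange    = preg_match('/^[1-6]$/', '5');
$outOfRange = preg_match('/^[1-6]$/', '7');

// The dot matches any single character except a newline.
$dotLetter  = preg_match('/^.$/', 'x');
$dotNewline = preg_match('/^.$/', "\n");
```

$vowels is 3 (e, u, a); $outOfRange and $dotNewline are 0.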

Seventh, grouping and capturing

1. Parentheses ()

They change the scope of an operator, e.g. (thir|four)th matches the words thirth or fourth; without the parentheses, thir|fourth matches thir or fourth.

They also group a subexpression so that a quantifier applies to the whole group, e.g. (\.[0-9]{1,3}){3} repeats the group (\.[0-9]{1,3}) three times.

2. Back-references: referring to the text captured by an earlier group

For example, \b\w+\b\s\w+\b matches any two words separated by one whitespace character. To match a repeated word such as zery zery, write \b(\w+)\b\s\1\b, where \1 refers to the text captured by the earlier parenthesized group (\w+). Groups are numbered 1, 2, 3, ... from left to right; put \ in front of the number when referring to one. Index 0 refers to the entire match.

3. (?<name>exp) gives the group a custom name; refer to it with \k<name>, e.g. \b(?<char>\w+)\b\s\k<char>\b

4. (?:exp) matches exp but does not capture the matched text and does not assign a group number to this group, so it cannot be used in back-references
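The grouping forms above can be sketched in PHP as follows. The third argument of preg_match() receives the captured groups; the address 192.168.0.1 and the other inputs are only sample data.

```php
<?php
// Parentheses limit the alternation, so th follows either alternative.
preg_match('/(thir|four)th/', 'thirth', $grouped);
// Without them the alternatives are thir and fourth.
preg_match('/thir|fourth/', 'thirth', $ungrouped);

// A quantifier after a group repeats the whole group.
$dotted = preg_match('/^[0-9]{1,3}(\.[0-9]{1,3}){3}$/', '192.168.0.1');

// \1 must repeat exactly what group 1 captured.
$repeated  = preg_match('/\b(\w+)\b\s\1\b/', 'zery zery', $numbered);
$different = preg_match('/\b(\w+)\b\s\1\b/', 'zery good');

// A named group is referenced with \k<name> and read back by name.
preg_match('/\b(?<char>\w+)\b\s\k<char>\b/', 'zery zery', $named);

// A non-capturing group leaves only the full match in the result.
preg_match('/(?:thir|four)th/', 'fourth', $uncaptured);
```

After running this, $grouped[0] is thirth with $grouped[1] set to thir, $ungrouped[0] is just thir, and $uncaptured holds a single element.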

The following are zero-width assertions: they test what comes before or after a position but consume no characters, so the text matched by exp is not part of the result.

5. (?=exp) matches the position before exp. For example, with the string how are you doing, the regex (?<txt>.+(?=ing)) takes all the characters before ing and defines a capturing group named txt; the value of txt is how are you do

6. (?<=exp) matches the position after exp. For example, with the string how are you doing, the regex (?<txt>(?<=how).+) takes all the characters after how and defines a capturing group named txt; the value of txt is " are you doing" (including the leading space)

7. (?!exp) matches a position that is not followed by exp. For example, with the string 123abc, the regex \d{3}(?!\d) matches three digits that are not followed by another digit: 123

8. (?<!exp) matches a position that is not preceded by exp. For example, with the string abc123, the regex (?<!\d)\d{3} matches three digits that are not preceded by another digit: 123
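The four assertions can be checked with a sketch like this one. The first two patterns are the ones from the list; the inputs 1234abc and abc1234 are variations chosen here so that the assertion actually changes the result.

```php
<?php
$subject = 'how are you doing';

// (?=ing): the match must be followed by ing, which is not consumed.
preg_match('/(?<txt>.+(?=ing))/', $subject, $ahead);

// (?<=how): the match must be preceded by how, which is not consumed.
preg_match('/(?<txt>(?<=how).+)/', $subject, $behind);

// (?!\d): three digits that are not followed by another digit.
// 123 is followed by 4, so the match moves on to 234.
preg_match('/\d{3}(?!\d)/', '1234abc', $notFollowed);

// (?<!\d): three digits that are not preceded by a digit.
preg_match('/(?<!\d)\d{3}/', 'abc1234', $notPreceded);
```

$ahead['txt'] is how are you do, $behind['txt'] starts with the space after how, $notFollowed[0] is 234, and $notPreceded[0] is 123.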

Eighth, common patterns

1. x|y matches x or y

2. [xyz] matches any one of the characters x, y, z

3. [^xyz] matches any one character other than x, y, z

4. [a-z] matches any lowercase letter a to z

5. [A-Z] matches any uppercase letter A to Z

6. [0-9] or \d matches any digit 0 to 9

7. \D is equivalent to [^0-9]; it matches any non-digit character

8. \w matches a letter, digit, or underscore. It is equivalent to [a-zA-Z0-9_]

9. \W matches any character that is not a letter, digit, or underscore. It is equivalent to [^a-zA-Z0-9_]

10. \s matches any whitespace character, including spaces, tabs, and form feeds. For ASCII text it is equivalent to [ \f\n\r\t\x0B]

11. \S matches any non-whitespace character. For ASCII text it is equivalent to [^ \f\n\r\t\x0B]

12. \f form feed (page break)

13. \n newline

14. \r carriage return

15. \t tab

16. \v vertical tab; note that in PCRE \v matches any vertical whitespace character, which includes the vertical tab (\x0B)
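A few of these shorthand classes in use, as a sketch with made-up input strings:

```php
<?php
// \D: delete every non-digit, keeping only the digits.
$digitsOnly = preg_replace('/\D/', '', 'tel: 555-1234');

// \w+: runs of letters, digits and underscores.
// The hyphen is not a word character, so user-2 splits in two.
preg_match_all('/\w+/', 'user_1, user-2', $words);

// \s+: collapse any run of whitespace into a single space.
$collapsed = preg_replace('/\s+/', ' ', "a \t\n b");

// [^xyz]: count the characters that are not x, y or z.
$others = preg_match_all('/[^xyz]/', 'xaybz');
```

$digitsOnly is 5551234, $words[0] holds user_1, user, and 2, $collapsed is "a b", and $others is 2.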

Ninth, how to use regular expressions (PCRE) in PHP

<?php
  // Argument 1: the regular expression, wrapped in delimiters (here / and /). Argument 2: the string to search.
  // Stops after the first match: returns 1 on a match, 0 on no match, or false on error.
  preg_match('/\d+/u', '123');
  // Same arguments, but finds every match: returns the number of full matches (possibly 0), or false on error.
  preg_match_all('/\d+/u', '123');
  // Argument 1: the regular expression. Argument 2: the replacement string. Argument 3: the string to search.
  // Argument 4: the maximum number of replacements; the default -1 means no limit.
  preg_replace('/\d+/u', '123', '456789aa', -1);
?>
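The calls above discard their results. A slightly fuller sketch, using the optional $matches and $count reference arguments that these functions accept, might look like this; the subject string abc123def456 is invented for the example.

```php
<?php
// preg_match() fills $matches with the first match only.
$found = preg_match('/\d+/u', 'abc123def456', $matches);

// preg_match_all() collects every match in $all[0].
$total = preg_match_all('/\d+/u', 'abc123def456', $all);

// With a limit of 1, preg_replace() replaces only the first match;
// the fifth argument receives the number of replacements made.
$result = preg_replace('/\d+/u', '#', 'abc123def456', 1, $count);
```

$matches[0] is 123, $all[0] holds 123 and 456, and $result is abc#def456 with $count equal to 1.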

Origin www.cnblogs.com/ldwtry/p/12185565.html