PHP, Linux, and JavaScript regular expressions

PHP

Regular expressions contain three kinds of elements: quantifiers, metacharacters, and modifiers.

Quantifiers
Syntax Description
+ Matches one or more of the preceding element
* Matches zero or more of the preceding element
? Matches zero or one of the preceding element
. Matches any single character except newline (by default)
{x} Matches exactly x of the preceding element
{x,y} Matches x to y of the preceding element
{x,} Matches at least x of the preceding element
$ Matches the end of the line or string
^ Matches the beginning of the line or string
| Matches either the expression on its left or the one on its right
() Groups characters or defines a capture group; the captured text can be referenced with \1, \2

Metacharacters
Syntax Description
[a-z] Matches any lowercase letter a to z
[A-Z] Matches any uppercase letter A to Z
[0-9] Matches any digit 0 to 9
[abc] Matches any one of the lowercase letters a, b, c
[^abc] Matches any character other than the lowercase letters a, b, c
[a-zA-Z0-9_] Matches any letter, digit, or underscore
\w Matches any letter, digit, or underscore (same as above)
\W Matches any character that is not a letter, digit, or underscore
\d Matches any digit, same as [0-9]
\D Matches any non-digit, same as [^0-9]
\s Matches any whitespace character
\S Matches any non-whitespace character
\b Matches at a word boundary
\B Matches at a position that is not a word boundary
\ Escapes a special character in the regular expression

Modifiers
Syntax Description
i Case-insensitive matching
m Multi-line mode: ^ and $ match at the start and end of each line
x Ignore whitespace in the regular expression

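* preg_filter - Perform a regular expression search and replace
* preg_grep - Return the array entries that match the pattern
* preg_last_error - Return the error code of the last PCRE regex execution
* preg_match_all - Perform a global regular expression match
* preg_match - Perform a regular expression match
* preg_quote - Quote regular expression characters
* preg_replace_callback - Perform a regular expression search and replace using a callback
* preg_replace - Perform a regular expression search and replace
* preg_split - Split a string by a regular expression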

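// Capture day, month, and year as three groups of digits
$pattern = '/(\d+)\/(\d+)\/(\d+)/';
$string = '26/06/2014';
// Reorder the groups to year/month/day: prints 2014/06/26
echo preg_replace($pattern, '$3/$2/$1', $string);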

JavaScript

Ways to create a regular expression:
      1. With the RegExp constructor:
           var pattern = new RegExp('box');
           var pattern = new RegExp('box','ig');
      2. With a regular expression literal:
           var pattern = /box/;
           var pattern = /box/ig;

Optional parameters for pattern modifiers
Parameter meaning
i ignore case
g global matching
m multi-line matching
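
The effect of the three modifiers can be seen in a small sketch. It assumes a runtime that provides console.log (for example Node.js), and the two-line sample text is made up for illustration.

```javascript
// Hypothetical sample text with two lines.
const text = "Box one\nbox two";

// No modifier: case-sensitive, returns only the first match.
const plain = text.match(/box/);
// i: ignore case, still only the first match.
const ignoreCase = text.match(/box/i);
// g: return every match instead of only the first.
const globalMatches = text.match(/box/ig);
// m: ^ matches at the start of every line, not only at the start of the string.
const lineStarts = text.match(/^box/igm);
const withoutM = text.match(/^box/ig);

console.log(plain[0]);      // "box" (the lowercase one on the second line)
console.log(ignoreCase[0]); // "Box"
console.log(globalMatches); // [ 'Box', 'box' ]
console.log(lineStarts);    // [ 'Box', 'box' ]
console.log(withoutM);      // [ 'Box' ]
```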

Methods of the RegExp object
Method Function
test Tests whether the pattern matches a string; returns true or false
exec Searches a string for a match and returns an array of results, or null if there is no match
Example 1:
var pattern = /box/i;
var str = "This is a Box !";
alert(pattern.test(str)); // true
Example 2:
var pattern = /box/i;
var str = "This is a Box!";
alert(pattern.exec(str)); // Box

String objects also provide four methods for using regular expressions.

Method Meaning
match(pattern) Returns the substrings that match pattern, or null
replace(pattern, replacement) Replaces the text matched by pattern with replacement
search(pattern) Returns the position of the first match of pattern in the string, or -1 if there is none
split(pattern) Returns the array produced by splitting the string at each match of pattern
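
A short sketch of the four string methods, assuming console.log is available; the sample strings are invented for illustration.

```javascript
// Hypothetical sample string.
const str = "This is a Box, that is a box too";

// match: with the g modifier, returns all matching substrings (or null).
const matches = str.match(/box/ig);
// replace: returns a new string; the original string is unchanged.
const replaced = str.replace(/box/ig, "tom");
// search: index of the first match, or -1 when nothing matches.
const position = str.search(/box/i);
const missing = str.search(/xyz/);
// split: cut the string wherever the pattern matches.
const parts = "a, b,c ,d".split(/\s*,\s*/);

console.log(matches);  // [ 'Box', 'box' ]
console.log(replaced); // "This is a tom, that is a tom too"
console.log(position); // 10
console.log(missing);  // -1
console.log(parts);    // [ 'a', 'b', 'c', 'd' ]
```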

Static properties of the RegExp object (legacy features, deprecated in current JavaScript engines)
Property Short name Meaning
input $_ The string that was last matched against
lastMatch $& The last matched string
lastParen $+ The last matched substring in parentheses
leftContext $` The substring before the last match
multiline $* Boolean value specifying whether all expressions use multi-line mode (non-standard)
rightContext $' The substring after the last match

Character class: single characters and digits
Metacharacter/metasymbol Matches
. Matches any character except newline
[a-z0-9] Matches any character in the set enclosed in brackets
[^a-z0-9] Matches any character not in the set enclosed in brackets
\d Matches digits
\D Matches non-digits, same as [^0-9]
\w Matches letters, digits, and the underscore (_)
\W Matches any character that is not a letter, digit, or underscore

Character class: whitespace characters
Metacharacter/metasymbol Matches
\0 Matches the null character
[\b] Matches the backspace character (outside a character class, \b is a word boundary)
\f Matches the form feed character
\n Matches the newline character
\r Matches the carriage return character
\t Matches the tab character
\s Matches whitespace characters: spaces, tabs, and newlines
\S Matches non-whitespace characters
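
The character classes above can be tried out on one string. This is a minimal sketch with a made-up input; it assumes console.log is available.

```javascript
// Hypothetical sample input containing a space and a tab.
const sample = "Room 42\tis_free";

const digits = sample.match(/\d+/g);                // runs of digits
const words = sample.match(/\w+/g);                 // runs of letters, digits, underscore
const nonWord = sample.match(/\W/g);                // characters that \w does not match
const pieces = sample.split(/\s+/);                 // split on any whitespace
const notLowerOrDigit = sample.match(/[^a-z0-9]/g); // negated set

console.log(digits);          // [ '42' ]
console.log(words);           // [ 'Room', '42', 'is_free' ]
console.log(nonWord.length);  // 2 (the space and the tab)
console.log(pieces);          // [ 'Room', '42', 'is_free' ]
console.log(notLowerOrDigit); // [ 'R', ' ', '\t', '_' ]
```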

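Character class: anchors
Metacharacter/metasymbol Matches
^ Matches at the start of the line (the start of the string unless the m modifier is set)
$ Matches at the end of the line (the end of the string unless the m modifier is set)
\A Matches only at the beginning of the string (Perl/PCRE syntax; not supported by JavaScript)
\b Matches a word boundary; inside [] it does not mean a word boundary
\B Matches a non-word boundary
\G Matches the starting position of the current search (Perl/PCRE syntax; not supported by JavaScript)
\Z Matches the end of the string or before a newline at the end of the string (Perl/PCRE syntax; not supported by JavaScript)
\z Matches only the end of the string (Perl/PCRE syntax; not supported by JavaScript)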
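
A small sketch of the anchors that JavaScript itself provides (^, $, \b, \B), using an invented two-line string and assuming console.log is available.

```javascript
// Hypothetical two-line sample text.
const lines = "cool\ncoolant is cool";

const startOfString = /^cool/.test(lines);   // ^ without m: start of the string
const endOfString = /cool$/.test(lines);     // $ without m: end of the string
const wholeLines = lines.match(/^cool$/gm);  // with m: lines that are exactly "cool"
const wholeWords = lines.match(/\bcool\b/g); // "cool" as a complete word
const insideWord = lines.match(/cool\B/g);   // "cool" followed by another word character

console.log(startOfString, endOfString); // true true
console.log(wholeLines);                 // [ 'cool' ]
console.log(wholeWords);                 // [ 'cool', 'cool' ]
console.log(insideWord);                 // [ 'cool' ] (the one inside "coolant")
```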

Character class: repetition
Metacharacter/metasymbol Matches
x? Matches 0 or 1 x
x* Matches 0 or more x
x+ Matches at least one x
(xyz)+ Matches at least one (xyz)
x{m,n} Matches at least m and at most n x's
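
The repetition operators can be compared side by side. The sketch below assumes console.log is available; the patterns and inputs are illustrative only.

```javascript
// Each entry: [pattern, input]. Inputs are made-up sample strings.
const cases = [
  [/go?d/, "gd"],        // ? : zero or one o
  [/go*d/, "goood"],     // * : zero or more o
  [/go+d/, "gd"],        // + : at least one o, so "gd" does not match
  [/(ab)+/, "ababab"],   // + applied to a whole group
  [/o{2,3}/, "goooood"], // {m,n} : at least 2, at most 3 (greedy)
];

// Collect the matched text, or null when the pattern does not match.
const results = cases.map(([pattern, input]) => {
  const m = input.match(pattern);
  return m ? m[0] : null;
});

console.log(results); // [ 'gd', 'goood', null, 'ababab', 'ooo' ]
```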

Character class: alternation
Metacharacter/metasymbol Matches
this|where|logo Matches any one of this, where, or logo

Character class: capture groups
Metacharacter/metasymbol Matches
(string) Grouping for backreferences
\1 or $1 The contents of the first group (\1 inside the pattern, $1 in the replacement string)
\2 or $2 The contents of the second group
\3 or $3 The contents of the third group
var pattern = /(\d+)\/(\d+)\/(\d+)/;
var str = '26/06/2014';
var result = str.replace(pattern, '$3/$2/$1');
alert(result); // 2014/06/26
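
The replace example uses $1 in the replacement string; the \1 form is used inside the pattern itself. A minimal sketch, with a made-up sentence and assuming console.log is available:

```javascript
// \1 inside the pattern must repeat exactly what group 1 captured,
// so this pattern finds a word that is typed twice in a row.
const doubled = /\b(\w+) \1\b/;
const sentence = "this is is a test";

const found = sentence.match(doubled);
// $1 inside the replacement string inserts what group 1 captured.
const fixed = sentence.replace(doubled, "$1");

console.log(found[0]); // "is is"
console.log(found[1]); // "is"
console.log(fixed);    // "this is a test"
```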

Linux

Character class Meaning
[:alnum:] English uppercase and lowercase letters and digits, i.e. 0-9, A-Z, a-z
[:alpha:] Any English uppercase or lowercase letter, i.e. A-Z, a-z
[:lower:] Lowercase letters, i.e. a-z
[:upper:] Uppercase letters, i.e. A-Z
[:digit:] Digits, i.e. 0-9
[:xdigit:] Hexadecimal digits, i.e. 0-9, A-F, a-f
[:blank:] The space and tab characters
[:graph:] All characters other than the space and tab
[:space:] Any character that produces blank space, including the space, tab, CR, etc.
[:cntrl:] Control characters, including CR, LF, Tab, Del, etc.
[:print:] Any printable character
[:punct:] Punctuation, i.e. " ' ? ! ; : # $

Basic regular expression syntax (RE syntax): If a string is represented by a regular expression, any character in it is called an RE character.

Special characters are:
Supported by basic regular expression syntax: ^ $ . * \ [ ] " '
Supported by extended regular expression syntax: ^ $ . * \ [ ] " ' + ? | ( )

Basic RE characters: meaning and examples

^word
Meaning: The string (word) being searched for is at the beginning of the line
Example: Find the lines that start with # and list the line numbers
grep -n '^#' regular_express.txt

word$
Meaning: The string (word) being searched for is at the end of the line
Example: Print the lines ending with ! and list the line numbers
grep -n '!$' regular_express.txt

.
Meaning: Exactly one arbitrary character (except newline; in awk it can also match a newline)
Example: The string searched for can be (eve), (eae), (eee), etc.; there must be one character between e and e, so (ee) does not match!
grep -n 'e.e' regular_express.txt

\
Meaning: Escape character; it removes the special meaning of a special symbol (and, for some characters such as \{ in basic syntax, gives an ordinary character a special meaning)
Example: Find the lines containing the single quote '
grep -n \' regular_express.txt

*
Meaning: Repeat the previous character zero to infinitely many times
Example: Find strings containing (es), (ess), (esss), etc. Note that because * can mean zero repetitions, es also matches. In addition, because * repeats the "previous RE character", it must follow an RE character! For example, "any characters" is .*
grep -n 'ess*' regular_express.txt

[list]
Meaning: Match one character out of the RE characters listed in the set (excluding newline; in awk a newline can be included). Note that special characters such as . and * become ordinary characters inside square brackets (except [ and ] themselves).
Example: Find the lines containing (gl) or (gd). Note that [] stands for a single character to be searched for; for example, "a[afl]y" means the search string can be aay, afy, or aly
grep -n 'g[ld]' regular_express.txt

[n1-n2]
Meaning: Match one character out of a range of RE characters in the set
Example: Find the lines containing any digit. Note that the minus sign - has a special meaning inside []: it stands for all consecutive characters between the two characters (which depends on the encoding order)
grep -n '[0-9]' regular_express.txt

[^list]
Meaning: Reverse selection; match one character that is not in the list
Example: The search string can be (oog) or (ood) but not (oot)
grep -n 'oo[^t]' regular_express.txt
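
The same basic RE characters can be tried out without a file. The sketch below is a rough JavaScript analogue of grep -n; the grepN helper and the sample lines are hypothetical (they are not the contents of regular_express.txt), and console.log is assumed to be available.

```javascript
// Hypothetical stand-in for a few lines of a text file.
const fileLines = [
  "# a comment line",
  "I like the eve of a good day!",
  "glad to taste the food",
  "the number 42",
  "see you soon",
];

// Return "lineNumber:line" for every matching line, similar to grep -n.
function grepN(pattern, lines) {
  return lines
    .map((line, i) => ({ n: i + 1, line }))
    .filter(({ line }) => pattern.test(line))
    .map(({ n, line }) => `${n}:${line}`);
}

console.log(grepN(/^#/, fileLines));     // line 1: starts with #
console.log(grepN(/!$/, fileLines));     // line 2: ends with !
console.log(grepN(/e.e/, fileLines));    // line 2: e, any character, e
console.log(grepN(/g[ld]/, fileLines));  // line 3: "gl" or "gd"
console.log(grepN(/[0-9]/, fileLines));  // line 4: contains a digit
console.log(grepN(/oo[^t]/, fileLines)); // lines 2, 3, 5: "oo" not followed by t
```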
\{n\}
\{n,\}
\{n,m\}
Meaning: n to m consecutive occurrences of the previous RE character; \{n\} means exactly n consecutive occurrences, and \{n,\} means n or more.
Tools that support extended regular expressions use the form without backslashes, and n, m must be integers between 0 and 255:
{n}
{n,}
{n,m}
Note: This is essentially extended regular expression syntax
Example: Find strings in which 2 to 3 o's exist between g and g
grep -n 'go\{2,3\}g' regular_express.txt

4, To support extended regular expression syntax: add -E to grep (or put \ before the extended symbols), and add -r to sed (or put \ before the extended symbols).


awk and perl support extended regular expressions natively (so in awk, to match one of these characters as an ordinary character, quote it, for example [(]).

Extended RE characters: meaning and examples

+
Meaning: Repeat the previous RE character one or more times
Example: o+ stands for one or more o

?
Meaning: Zero or one of the previous RE character
Example: o? stands for empty or o

|
Meaning: Match either of the alternatives ("or")
Example: ABC|DEF means ABC or DEF; A(BC|DE)F means ABCF or ADEF
Example: Remove blank lines and lines beginning with #
grep -Env '^$|^#' regular_express.txt

()
Meaning: Find a "group" string; an extension of []
Example: Find the two strings glad or good; because g and d are repeated, la and oo can be listed inside () in the form of "or"
grep -En 'g(la|oo)d' regular_express.txt

()+
Meaning: Repeat the previous "group" one or more times

{n}
{n,}
{n,m}
Meaning: Same as in ordinary regular expressions, but this form without backslashes is only supported in extended regular expressions, that is, in awk, grep -E, and sed -r.
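
JavaScript's own syntax is close to the extended form (no backslash before +, ?, |, (), or {}), so the extended characters can be checked with a short sketch. The input list and the check helper are made up for illustration, and console.log is assumed to be available.

```javascript
// Hypothetical inputs, one "line" per entry.
const inputs = ["glad", "good", "gd", "god", "goood", "ABCF", "ADEF", "ABCDEF", "", "# note"];
const check = (pattern) => inputs.filter((s) => pattern.test(s));

const onePlus = check(/^go+d$/);        // + : one or more o
const optional = check(/^go?d$/);       // ? : zero or one o
const either = check(/^g(la|oo)d$/);    // () with | : glad or good
const grouped = check(/^A(BC|DE)F$/);   // ABCF or ADEF
const twoOrThree = check(/^go{2,3}d$/); // {n,m} : 2 to 3 o
// Analogue of grep -Env '^$|^#': keep lines that are neither blank nor comments.
const kept = inputs.filter((s) => !/^$|^#/.test(s));

console.log(onePlus);     // [ 'good', 'god', 'goood' ]
console.log(optional);    // [ 'gd', 'god' ]
console.log(either);      // [ 'glad', 'good' ]
console.log(grouped);     // [ 'ABCF', 'ADEF' ]
console.log(twoOrThree);  // [ 'good', 'goood' ]
console.log(kept.length); // 8
```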
5, Metacharacters

Metacharacters: These are Perl-style regular expressions; only some text-processing tools support them, not all.
They are shorthand equivalents of a [character set].

Metacharacters: meaning and examples

\b
Meaning: Word boundary
Example: \bcool\b matches cool but not coolant. This special character cannot be followed by quantifiers such as * ? +

\B
Meaning: Non-word boundary
Example: cool\B matches coolant but not cool. This special character cannot be followed by quantifiers such as * ? +

\d
Meaning: A single digit character
Example: b\db matches b2b, but not bcb

\D
Meaning: A single non-digit character
Example: b\Db matches bcb, but not b2b

\w
Meaning: A single word character (letter, digit, or _)
Example: \w matches 1 or a, etc., but not %, etc.

\W
Meaning: A single non-word character

\n
Meaning: Newline

\s
Meaning: A single whitespace character: form feed, tab, line feed, carriage return, or space, i.e. [\f\t\n\r ]

\S
Meaning: A single non-whitespace character

\r
Meaning: Carriage return
6, Several useful regular expressions

Matches words in regular text:
\b[[:alpha:]]+\b
or
(^| )["({[]*book[]})"?,.:;!'s]*( |$)
Matches empty lines: ^$
Matches empty lines and lines containing only spaces: ^ *$
Matches the entire line: ^.*$
Matches one or more spaces: a space followed by " *" (space, space, *)
Matches a string containing any combination of a, b, c before s: [abc]*s
Matches formatted dollar amounts: \$[ 0-9]*\.[0-9][0-9]
Matches email addresses: [A-Za-z0-9.]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}
Matches an HTTP URL: http://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}
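
Some of these patterns carry over to JavaScript unchanged apart from escaping the slashes; POSIX classes such as [[:alpha:]] are not available there, so they are left out. The sketch below is only an illustration with invented sample strings, assuming console.log is available; these simple patterns do not validate every legal email address or URL.

```javascript
const blankLine = /^ *$/;                                   // empty or spaces only
const dollars = /\$[ 0-9]*\.[0-9][0-9]/;                    // formatted dollar amount
const email = /[A-Za-z0-9.]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}/; // simple email address
const httpUrl = /http:\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}/;   // simple HTTP URL

// Hypothetical sample inputs.
const amount = "Total: $ 12.50 due".match(dollars)[0];
const address = "mail bob.smith@example.com now".match(email)[0];
const url = "see http://www.example.org/page".match(httpUrl)[0];

console.log(blankLine.test("   ")); // true
console.log(blankLine.test(" x "));  // false
console.log(amount);                // "$ 12.50"
console.log(address);               // "bob.smith@example.com"
console.log(url);                   // "http://www.example.org"
```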

Linux wildcards
Shell wildcards
Note that although wildcards look similar to regular expressions, they are parsed by the bash interpreter, while regular expressions are parsed by tools with a regular expression engine (such as awk, grep, sed, etc.); the two are completely different.

Wildcard characters

* Represents zero or more arbitrary characters
? Represents exactly one arbitrary character
[ ] For example [abcd]: one character, which is a, b, c, or d
[-] For example [0-9]: one digit between 0 and 9
[^] For example [^abc]: one character that is not a, b, or c

example:

  [root@linux ~]# ls test*   # The * means any number of characters (including none) may follow
  [root@linux ~]# ls test?   # The ? means exactly one character "must" follow
  [root@linux ~]# ls test???   # The ??? means exactly three characters "must" follow
  [root@linux ~]# cp test[1-5] /tmp   # Copy test1, test2, test3, test4, test5 to /tmp if they exist
  [root@linux ~]# cp test[!1-5] /tmp   # Copy every test? file other than test1, test2, test3, test4, test5 to /tmp
  [root@linux ~]# cd /lib/modules/`uname -r`/kernel/drivers   # The shell first runs uname -r, substitutes its output into the directory path, and then runs cd
  = cd /lib/modules/$(uname -r)/kernel   # The backquote (`) form can also be replaced by $()
  [root@linux ~]# cp *[A-Z]* /tmp   # Files whose names contain an uppercase letter
  [root@linux ~]# ls -lda /etc/*[35]*   # Files whose names contain the digit 3 or 5
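
One way to see how wildcards differ from regular expressions is to translate one into the other. The globToRegExp helper below is hypothetical and handles only *, ?, [...] and [!...] / [^...]; it is a sketch rather than a reimplementation of bash's matcher (for instance, it ignores bash's special treatment of file names that start with a dot). The file names are invented, and console.log is assumed to be available.

```javascript
// Hypothetical helper: translate a shell wildcard into an anchored RegExp.
function globToRegExp(glob) {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      source += ".*"; // wildcard *: zero or more arbitrary characters
    } else if (ch === "?") {
      source += ".";  // wildcard ?: exactly one arbitrary character
    } else if (ch === "[") {
      const close = glob.indexOf("]", i + 1);
      if (close === -1) {
        source += "\\["; // an unmatched [ is treated as a literal character
        continue;
      }
      let body = glob.slice(i + 1, close);
      if (body[0] === "!" || body[0] === "^") {
        body = "^" + body.slice(1); // shell negation becomes regex negation
      }
      source += "[" + body.replace(/\\/g, "\\\\") + "]";
      i = close; // skip past the closing ]
    } else {
      // Escape characters that are special in a regex but ordinary in a wildcard.
      source += ch.replace(/[.+^${}()|\\\]]/g, "\\$&");
    }
  }
  return new RegExp("^" + source + "$");
}

// Hypothetical file names.
const names = ["test", "test1", "test7", "testabc", "Test"];
const match = (glob) => names.filter((n) => globToRegExp(glob).test(n));

console.log(match("test*"));      // [ 'test', 'test1', 'test7', 'testabc' ]
console.log(match("test?"));      // [ 'test1', 'test7' ]
console.log(match("test???"));    // [ 'testabc' ]
console.log(match("test[1-5]"));  // [ 'test1' ]
console.log(match("test[!1-5]")); // [ 'test7' ]
```

Note how the wildcard * becomes .* and ? becomes . in the regular expression: the same symbols mean different things in the two notations.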

A regular expression is basically a "notation" for processing strings line by line. It can only be used in tools that support it (such as vi, grep, awk, sed). The relationship between regular expressions and shell wildcards is like the relationship between local variables and global variables: when a command supports regular expressions, the wildcard concept does not apply to its patterns; otherwise, wildcards are used.
