1. Overview
- A regular expression (regexp) is a set of syntax rules for describing the features of a text string. It is used to check whether a string matches a given feature, which makes advanced text search, replacement, and extraction possible. For example, to find strings with certain features in a large body of text, you write those features down according to regular expression syntax to form a pattern that the computer can recognize; the program then matches the text against this pattern.
- Regular expressions have a long history and are widely used in computer software. Operating systems (UNIX, Linux, etc.), programming languages (C, C++, Java, PHP, Python, JavaScript, etc.), and server software (Apache, Nginx) all make use of regular expressions.
2. Getting started with the syntax
Background
POSIX and Perl syntax. The grep command supports both POSIX and Perl regular expression syntax. The default is POSIX BRE syntax; the -E option switches to POSIX ERE syntax, and the -P option switches to Perl syntax (the -E and -P options cannot be used together). The commands below are explained mainly in terms of Perl syntax.
1. The grep command
grep can run a regular expression search on content from standard input, from a pipe, or from a text file.
1) Standard input
[root@localhost ~]$ grep --color 'hello'
In the example above, the --color option highlights the matched content, and the argument hello is the regular expression pattern to match. After running the command, type hello world (or any other text) at the cursor and press Enter to submit it to grep. The result is shown below: the first line is the typed input and the second is the matching line printed by grep.
hello world
hello world
2) Pipe input
A pipe is a communication mechanism supported by Linux that feeds the output of one program into another program as its input. The pipe symbol is |, and it is placed between the commands it connects, as in the following example.
[root@centos test]# cat word.txt
hello world
[root@centos test]# cat word.txt |grep --color llo
hello world
3) File input
The second argument to grep is optional and names a file whose contents are matched against the pattern. The following example reads the file word.txt.
[root@centos test]# grep wor word.txt
hello world
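The three input modes above differ only in where the lines come from; the matching step is the same. As a rough sketch in Python (an assumption, since the article itself only uses grep), the hypothetical helper grep_lines below keeps every line in which the pattern is found anywhere, which is how grep treats a pattern by default. Python's re module uses Perl-style syntax for the constructs covered in this article.

```python
import re
from typing import Iterable, List


def grep_lines(pattern: str, lines: Iterable[str]) -> List[str]:
    """Return the lines that contain a match for pattern (hypothetical helper)."""
    regex = re.compile(pattern)
    # re.search looks for the pattern anywhere in the line, as grep does.
    return [line.rstrip("\n") for line in lines if regex.search(line)]


# The lines could equally come from sys.stdin or from an open file object.
sample = ["hello world\n", "goodbye\n"]
print(grep_lines("llo", sample))  # ['hello world']
```

Passing an open file (for example open("word.txt")) or sys.stdin instead of the list would correspond to the file and standard input modes.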
2. Metacharacters, literal characters, and escape characters
A complete regular expression is made up of two kinds of characters: metacharacters and literal characters. A metacharacter is a character with a special meaning, such as "^", "$", ".", and "*"; a literal character is ordinary text, such as a letter or a digit. Regular expressions define many metacharacters for building complex matches. To match one of these characters itself, put the escape character "\" in front of it, as in "\^". The "\" character is itself a metacharacter and is escaped as "\\".
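As a small illustration of escaping, here is a sketch in Python (an assumption; the article's own examples use grep). The name find_meta is hypothetical. It builds a character class from the four escaped metacharacters and also uses re.escape, which adds the backslashes automatically when a whole string should be taken literally.

```python
import re

# Each metacharacter is preceded by "\" so that it is matched literally.
META = re.compile(r"[\^\$\\\.]")


def find_meta(text: str) -> list:
    """Return every ^, $, \\ or . found in text (hypothetical helper)."""
    return META.findall(text)


text = "12$fd^34.45\\"      # this Python literal ends with one backslash
print(find_meta(text))       # ['$', '^', '.', '\\']

# re.escape escapes the "." so that it no longer matches any character.
literal = re.escape("34.45")
print(bool(re.search(literal, "12$fd^34.45")))  # True
print(bool(re.search(literal, "34x45")))        # False
```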
1) Using the escape character
[root@centos test]# grep --color '[\^\$\\\.]'
12$fd^34.45\
12$fd^34.45\
2) Grouping
Regular expressions also support grouping (also called subpatterns or subexpressions), written with parentheses "()". The part of the pattern enclosed in the parentheses is the subpattern. The following example matches aa occurring twice in a row.
[root@centos test]# grep -P --color '(aa){2}'
1234aaaadfg
1234aaaadfg
3) Matching a single quotation mark, using \x27
[root@centos test]# grep --color -P '\x27'
asss'dd
asss'dd
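The grouping and \x27 examples can be reproduced outside grep. The following is a minimal sketch in Python (an assumption; the variable names are illustrative only), whose re module accepts the same two constructs.

```python
import re

# (aa){2}: the group "aa" must appear twice in a row, that is, "aaaa".
group_twice = re.compile(r"(aa){2}")
m = group_twice.search("1234aaaadfg")
print(m.group(0))  # aaaa
print(m.group(1))  # aa (a repeated group keeps only its last repetition)

# Three a's are not enough for two repetitions of "aa".
print(group_twice.search("1234aaadfg"))  # None

# \x27 is the hexadecimal escape for the single quotation mark.
quote = re.compile(r"\x27")
print(quote.search("asss'dd").start())  # 4
```

Writing the quotation mark as \x27 avoids having to nest a single quote inside a single-quoted shell argument.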
3. Syntax rules
1. Anchors
1) Anchors
Note: ^$ can be used to match an empty line.
Anchor   Explanation                        Example   Matches
^        Matches the start of the string    ^hello    helloworld
$        Matches the end of the string      world$    helloworld
2) Alternation, |
[root@centos test]# grep -P --color 'java|PHP'
who is the best language java or PHP
who is the best language java or PHP
3) Character ranges
Example   Explanation                                        Match result
[abc]     Matches any of the characters a, b, c              abcdef (matches a, b, c)
[^abc]    Matches any character that is not a, b, or c       abcdef (matches d, e, f)
[a-z]     Matches any letter in the range a to z             abcdef (matches every letter)
4) The dot character and quantifiers
Character   Explanation                                        Example    Result
.           Matches any single character                       a.d        Can match add, asd, afd
?           Matches the preceding character 0 or 1 times       he?p       Can match hp, hep
+           Matches the preceding character one or more times  w+w        Can match www, wwww
*           Matches the preceding character zero or more times go*gle     Can match gogle, google, gooogle ...
{n}         Matches the preceding character exactly n times    go{2}gle   Can only match google
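The rules in this section can be checked directly. The sketch below uses Python's re module (an assumption; the article itself works with grep -P), and the helper name full is hypothetical. Anchors, alternation, and character classes are shown with search and findall, while the dot and the quantifiers are checked against whole words with fullmatch so that it is clear which words a pattern accepts.

```python
import re

# Anchors: ^ pins the match to the start of the string, $ to the end.
print(bool(re.search(r"^hello", "helloworld")))  # True
print(bool(re.search(r"^world", "helloworld")))  # False
print(bool(re.search(r"world$", "helloworld")))  # True
print(bool(re.search(r"^$", "")))                # True: an empty line

# Alternation: either branch may match.
print(re.findall(r"java|PHP", "who is the best language java or PHP"))
# ['java', 'PHP']

# Character classes, applied to the sample text abcdef.
print(re.findall(r"[abc]", "abcdef"))   # ['a', 'b', 'c']
print(re.findall(r"[^abc]", "abcdef"))  # ['d', 'e', 'f']
print(re.findall(r"[a-z]", "abcdef"))   # ['a', 'b', 'c', 'd', 'e', 'f']


def full(pattern: str, text: str) -> bool:
    """True if the whole text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# Dot and quantifiers, checked against whole words.
print([w for w in ("add", "asd", "ad") if full(r"a.d", w)])               # ['add', 'asd']
print([w for w in ("hp", "hep", "heep") if full(r"he?p", w)])             # ['hp', 'hep']
print([w for w in ("w", "ww", "www") if full(r"w+w", w)])                 # ['ww', 'www']
print([w for w in ("ggle", "gogle", "google") if full(r"go*gle", w)])     # ['ggle', 'gogle', 'google']
print([w for w in ("gogle", "google", "gooogle") if full(r"go{2}gle", w)])  # ['google']
```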
4. Applications
1) Validating a file name extension
Allow only files with the html, css, or jpg extension.
^.*?\.(html|css|jpg)$
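A minimal sketch of how this pattern might be applied, written in Python (an assumption; the function name has_allowed_extension is hypothetical). It uses fullmatch so that the whole file name has to fit the pattern.

```python
import re

# The pattern from the text: any prefix, a literal dot, then an allowed extension.
EXT = re.compile(r"^.*?\.(html|css|jpg)$")


def has_allowed_extension(name: str) -> bool:
    """True if name ends in .html, .css or .jpg (hypothetical helper)."""
    return EXT.fullmatch(name) is not None


for name in ("index.html", "style.css", "photo.jpg", "photo.jpg.exe", "photo.JPG"):
    print(name, has_allowed_extension(name))
# index.html True
# style.css True
# photo.jpg True
# photo.jpg.exe False
# photo.JPG False
```

The match is case-sensitive, so photo.JPG is rejected; passing re.IGNORECASE to re.compile would be one way to accept it.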
2) Validating an IP address
^(([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))\.){3}([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))$
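The IP address pattern can be read as one octet expression, ([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5])), repeated three times with a trailing dot and then once more without it. The sketch below, in Python (an assumption; OCTET, IPV4 and is_ipv4 are hypothetical names), builds the pattern from that single octet with balanced parentheses and tries it on a few addresses.

```python
import re

# One octet: 0-99 without a leading zero, 100-199, 200-249, or 250-255.
OCTET = r"([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))"

# Three "octet." groups followed by a final octet.
# re.ASCII restricts \d to the digits 0-9.
IPV4 = re.compile(r"^(" + OCTET + r"\.){3}" + OCTET + r"$", re.ASCII)


def is_ipv4(text: str) -> bool:
    """True if text is a dotted IPv4 address (hypothetical helper)."""
    return IPV4.fullmatch(text) is not None


for addr in ("192.168.1.1", "255.255.255.255", "0.0.0.0", "256.1.1.1", "01.2.3.4", "1.2.3"):
    print(addr, is_ipv4(addr))
# 192.168.1.1 True
# 255.255.255.255 True
# 0.0.0.0 True
# 256.1.1.1 False
# 01.2.3.4 False
# 1.2.3 False
```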
3) Validating a date format
^[1-9]\d{3}-([1-9]|1[0-2])-([1-9]|[1-2]\d|3[01])$
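This pattern checks only the shape of the string: a four-digit year that does not start with 0, a month from 1 to 12, and a day from 1 to 31, all without leading zeros. It does not know how many days a month has. The sketch below, in Python (an assumption; both function names are hypothetical), applies the pattern and then, as an optional extra step that is not part of the original pattern, lets datetime.date reject dates that do not exist.

```python
import re
from datetime import date

DATE = re.compile(r"^[1-9]\d{3}-([1-9]|1[0-2])-([1-9]|[1-2]\d|3[01])$", re.ASCII)


def is_date_format(text: str) -> bool:
    """True if text has the form YYYY-M-D accepted by the pattern."""
    return DATE.fullmatch(text) is not None


def is_real_date(text: str) -> bool:
    """Format check plus a calendar check (extra step, not in the pattern)."""
    if not is_date_format(text):
        return False
    year, month, day = (int(part) for part in text.split("-"))
    try:
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True


print(is_date_format("2024-12-31"))  # True
print(is_date_format("2024-01-05"))  # False: leading zeros are not accepted
print(is_date_format("2024-13-1"))   # False: month out of range
print(is_date_format("2024-2-30"))   # True: the shape is fine
print(is_real_date("2024-2-30"))     # False: February has no 30th day
```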