Getting Started with Perl regular expressions

1. Overview

  • A regular expression (regex, regexp) is a set of syntax rules for describing the features of a text string. It is used to check whether strings match those features, which enables advanced text search, replacement, and extraction. For example, to find the strings with certain features in a large body of text, you write those features down in regular expression syntax to form a pattern, and the program then matches the text against that pattern.
  • Regular expressions have a long history and are widely used in computer software. Operating systems (UNIX, Linux, etc.), programming languages (C, C++, Java, PHP, Python, JavaScript, etc.), and server software (Apache, Nginx) all use regular expressions.

2. Basic syntax

Background

POSIX and Perl syntax. The grep command supports both POSIX and Perl regular expression syntax. The default is POSIX BRE (basic regular expression) syntax; the -E option switches to POSIX ERE (extended regular expression) syntax, and the -P option switches to Perl syntax (the -E and -P options cannot be used together). The commands below use grep to explain the basics of Perl syntax.
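The difference between the three syntaxes shows up with a quantifier such as +. The following is a minimal sketch, assuming GNU grep built with PCRE support for -P; the sample text and variable names are made up for illustration.

```shell
# Sample input invented for illustration.
text='aaa'

# BRE (default): + is an ordinary character, so 'a+' looks for a literal "a+".
# grep exits with status 1 when nothing matches, hence the "|| true".
bre=$(printf '%s\n' "$text" | grep -c 'a+' || true)

# ERE (-E): + means "one or more of the preceding item".
ere=$(printf '%s\n' "$text" | grep -E -c 'a+' || true)

# Perl syntax (-P): + behaves as in ERE.
pcre=$(printf '%s\n' "$text" | grep -P -c 'a+' || true)

# -c prints the number of matching lines.
echo "BRE: $bre  ERE: $ere  PCRE: $pcre"   # BRE: 0  ERE: 1  PCRE: 1
```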

1. The grep command

grep can run a regular expression search on content from standard input, from a pipe, or from a text file.

1) Standard Input

[root@localhost ~]$ grep --color 'hello'

In the example above, the --color option highlights the matched content, and the argument hello is the regular expression pattern to match. After running this command, type hello world or any other text at the cursor and press Enter to submit it to grep. The first line below is the typed input and the second is the matching line that grep prints.

hello world

hello world

2) Pipe input

Linux supports a communication mechanism called a pipe, which passes the output of one program to another program as its input. The pipe symbol is |, and it connects the command before it to the command after it, as in the following example.

[root@centos test]# cat word.txt 
hello world
[root@centos test]# cat word.txt | grep --color llo
hello world

3) File input

The second argument of the grep command is optional. It names a file whose contents are read and matched against the pattern. The following example reads the file word.txt.

[root@centos test]# grep wor word.txt 
hello world
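The three input modes can be compared side by side in a script. This is a minimal sketch: the temporary file, its contents, and the variable names are invented for illustration, and standard input is redirected from the file rather than typed at the keyboard.

```shell
# Create a throwaway file (name and contents are illustrative).
tmp=$(mktemp)
printf 'hello world\ngoodbye world\n' > "$tmp"

# 1) Standard input, redirected from the file.
from_stdin=$(grep 'hello' < "$tmp")

# 2) Pipe: the output of cat becomes the input of grep.
from_pipe=$(cat "$tmp" | grep 'hello')

# 3) File argument: grep opens the file itself, so cat is not needed.
from_file=$(grep 'hello' "$tmp")

echo "$from_stdin"
echo "$from_pipe"
echo "$from_file"
rm -f "$tmp"
```

All three commands print the same line, hello world. When the input is already a file, the third form avoids starting an extra process.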

 

2. Metacharacters, literal characters, and escape characters

A complete regular expression consists of two kinds of characters: metacharacters and literal characters. Metacharacters are characters with a special meaning, such as "^", "$", ".", and "*". Literal characters are ordinary text, such as letters and digits. Regular expressions define a number of metacharacters for complex matching; to match one of these characters itself, put the escape character "\" in front of it, as in "\^". The "\" character is itself a metacharacter and is escaped as "\\".

1) Using the escape character

[root@centos test]# grep --color '[\^\$\\\.]'
12$fd^34.45\
12$fd^34.45\
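The effect of escaping is easiest to see with the dot. A minimal sketch with made-up sample lines; -c counts matching lines, and -F (fixed strings) is shown as an alternative that treats the whole pattern as literal text.

```shell
# Two sample lines invented for illustration.
sample() { printf '3.14\n3x14\n'; }

# Unescaped: "." is a metacharacter and matches any single character.
unescaped=$(sample | grep -c '3.14')

# Escaped: "\." matches only a literal dot.
escaped=$(sample | grep -c '3\.14')

# -F disables metacharacters entirely, so no escaping is needed.
fixed=$(sample | grep -F -c '3.14')

echo "unescaped: $unescaped  escaped: $escaped  fixed: $fixed"
# unescaped: 2  escaped: 1  fixed: 1
```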

2) Grouping

Regular expressions also support grouping (also called subpatterns or subexpressions), written with parentheses "()". Groups can be nested inside one another. The following example matches aa occurring twice in a row.

[root@centos test]# grep -P --color '(aa){2}'
1234aaaadfg
1234aaaadfg
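To see what the parentheses change, compare the grouped pattern with the same pattern without them. A minimal sketch, assuming GNU grep with PCRE support; -o prints only the matched part, and the variable names are illustrative.

```shell
# With the group, {2} repeats the whole "aa", so four a's are required.
grouped=$(printf 'aaaa\n' | grep -P -o '(aa){2}')

# Without the group, {2} applies only to the last "a": one a, then two more.
ungrouped=$(printf 'aaaa\n' | grep -P -o 'aa{2}')

echo "grouped: $grouped"       # grouped: aaaa
echo "ungrouped: $ungrouped"   # ungrouped: aaa
```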

3) Matching a single quotation mark with \x27

[root@centos test]# grep --color -P '\x27'
asss'dd
asss'dd
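The \x27 escape is useful because the pattern itself is wrapped in single quotes on the command line. A minimal sketch, assuming GNU grep with PCRE support; the second command is an alternative that passes the quote inside double quotes instead.

```shell
input="asss'dd"

# \x27 is the hexadecimal escape for the single quotation mark (ASCII 0x27).
hex_match=$(printf '%s\n' "$input" | grep -P -o '\x27')

# Alternative: wrap the pattern in double quotes so the shell passes ' through.
plain_match=$(printf '%s\n' "$input" | grep -o "'")

echo "$hex_match"
echo "$plain_match"
```

Both commands print a single quotation mark.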

 

3. Syntax rules

1. Anchors

1) Anchors
Anchor  Explanation                      Example  Match result
^       Matches the start of the string  ^hello   helloworld
$       Matches the end of the string    world$   helloworld
Note: the pattern ^$ matches a blank line.
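The anchors can be tried on a few lines at once. A minimal sketch with invented sample lines (the third line is empty); -c counts the lines that match.

```shell
# Sample lines invented for illustration; the third line is empty.
sample() { printf 'helloworld\nworldhello\n\nhello\n'; }

starts=$(sample | grep -c '^hello')   # lines that start with "hello"
ends=$(sample | grep -c 'world$')     # lines that end with "world"
blank=$(sample | grep -c '^$')        # empty lines

echo "starts: $starts  ends: $ends  blank: $blank"
# starts: 2  ends: 1  blank: 1
```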

2) Alternation, |

[root@centos test]# grep -P --color 'java|PHP'
who is the best language java or PHP
who is the best language java or PHP
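With --color the whole line is printed and the matches are highlighted. To list the matches themselves, -o can be used. A minimal sketch, assuming GNU grep with PCRE support; the variable names are illustrative.

```shell
line='who is the best language java or PHP'

# -o prints each match on its own line instead of the whole line.
matches=$(printf '%s\n' "$line" | grep -P -o 'java|PHP')

# Matching is case-sensitive by default; -i ignores case.
ignore_case=$(printf '%s\n' "$line" | grep -P -i -o 'php')

echo "$matches"       # prints java, then PHP
echo "$ignore_case"   # prints PHP
```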

3) Character classes and ranges

Example  Explanation                                    Characters matched in abcdef
[abc]    Matches any one of the characters a, b, c      a, b, c
[^abc]   Matches any one character other than a, b, c   d, e, f
[a-z]    Matches any one letter in the range a to z     a, b, c, d, e, f
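The rows above can be checked with -o, which prints every matched character on its own line. In this minimal sketch, tr joins them back into one string; the sample strings and variable names are illustrative.

```shell
# Each grep -o prints one matched character per line; tr -d '\n' joins them.
in_set=$(printf 'abcdef\n' | grep -o '[abc]' | tr -d '\n')
not_in_set=$(printf 'abcdef\n' | grep -o '[^abc]' | tr -d '\n')
in_range=$(printf 'abc123\n' | grep -o '[a-z]' | tr -d '\n')

echo "$in_set"       # abc
echo "$not_in_set"   # def
echo "$in_range"     # abc
```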

4) The dot character and quantifiers

Character  Explanation                                          Example   Result
.          Matches any single character (except a newline)      a.d       Can match add, asd, afd
?          Matches the preceding character zero or one time     he?p      Can match hp, hep
+          Matches the preceding character one or more times    w+w       Can match ww, www, wwww, ...
*          Matches the preceding character zero or more times   go*gle    Can match ggle, gogle, google, gooogle, ...
{n}        Matches the preceding character exactly n times      go{2}gle  Can match only google
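The quantifiers can be compared on one word list. A minimal sketch, assuming GNU grep with PCRE support; the word list is invented, and each pattern is anchored with ^ and $ so that the whole line must match.

```shell
# Word list invented for illustration.
words() { printf 'ggle\ngogle\ngoogle\ngooogle\n'; }

zero_or_one=$(words | grep -P -c '^go?gle$')    # ggle, gogle
one_or_more=$(words | grep -P -c '^go+gle$')    # gogle, google, gooogle
zero_or_more=$(words | grep -P -c '^go*gle$')   # all four words
exactly_two=$(words | grep -P -c '^go{2}gle$')  # google only

echo "?: $zero_or_one  +: $one_or_more  *: $zero_or_more  {2}: $exactly_two"
# ?: 2  +: 3  *: 4  {2}: 1
```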

 

 

4. Applications

1) Validating a file name extension

Allow only file names with the extension html, css, or jpg.

^.*?\.(html|css|jpg)$
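A minimal sketch of applying this pattern to a list of file names, assuming GNU grep with PCRE support (the lazy quantifier .*? is Perl syntax). The file names and variable names are invented for illustration.

```shell
# File names invented for illustration.
names() { printf 'index.html\nstyle.css\nphoto.jpg\nscript.js\nnotes.html.bak\n'; }

# The $ anchor requires the extension to be at the very end of the name,
# so notes.html.bak is rejected.
allowed=$(names | grep -P '^.*?\.(html|css|jpg)$')

echo "$allowed"   # prints index.html, style.css, photo.jpg
```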

2) Validating an IP address

^(([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))\.){3}([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))$
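A minimal sketch of checking a few addresses, assuming GNU grep with PCRE support. It uses a form of the pattern with balanced parentheses, in which each octet is 0 to 99, 100 to 199, or 200 to 255. The sample addresses and variable names are invented for illustration.

```shell
# Each octet: [1-9]?\d (0-99), 1\d{2} (100-199), 2([0-4]\d|5[0-5]) (200-255).
ip_re='^(([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))\.){3}([1-9]?\d|1\d{2}|2([0-4]\d|5[0-5]))$'

# Sample addresses invented for illustration.
addrs() { printf '192.168.1.1\n255.255.255.255\n256.1.1.1\n1.2.3\n01.2.3.4\n'; }

valid=$(addrs | grep -P "$ip_re")

echo "$valid"   # prints 192.168.1.1 and 255.255.255.255
```

The pattern rejects 256.1.1.1 (octet out of range), 1.2.3 (only three octets), and 01.2.3.4 (leading zero).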

3) Validating a date format

^[1-9]\d{3}-([1-9]|1[0-2])-([1-9]|[1-2]\d|3[01])$
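A minimal sketch of what this pattern accepts, assuming GNU grep with PCRE support; the sample dates and variable names are invented for illustration.

```shell
date_re='^[1-9]\d{3}-([1-9]|1[0-2])-([1-9]|[1-2]\d|3[01])$'

# Sample dates invented for illustration.
dates() { printf '2019-6-18\n2019-12-31\n2019-06-18\n2019-13-1\n2019-2-31\n'; }

accepted=$(dates | grep -P "$date_re")

echo "$accepted"   # prints 2019-6-18, 2019-12-31, 2019-2-31
```

The pattern expects months and days without a leading zero, so 2019-06-18 is rejected. It checks the format only, not the calendar, which is why 2019-2-31 is accepted.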


Origin blog.csdn.net/ly853602/article/details/93028716