Definition
JavaScript provides two ways to define a regular expression. For example, to define one that matches strings of the form <%XXX%>:
1. Constructor
var reg=new RegExp('<%[^%>]+%>','g');
2. Literal
var reg=/<%[^%>]+%>/g;
Both forms accept flags:
- g: global, full-text search; by default the search stops at the first match
- i: ignore case; by default matching is case sensitive
- m: multiline; changes the meaning of ^ and $ so that they match at the start and end of any line, not just at the start and end of the entire string
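As a minimal sketch of the two definition styles and the three flags, the snippet below builds the same pattern both ways and tries each flag on small sample strings. The sample strings and variable names are made up for illustration.

```javascript
// The same pattern, defined with the constructor and with a literal.
const fromConstructor = new RegExp('<%[^%>]+%>', 'g');
const fromLiteral = /<%[^%>]+%>/g;
const samePattern = fromConstructor.source === fromLiteral.source; // true

const template = '<%a%> and <%b%>'; // hypothetical sample input

// g: match() returns every match with the flag, only the first without it.
const allTags = template.match(fromLiteral);       // ['<%a%>', '<%b%>']
const firstTag = template.match(/<%[^%>]+%>/)[0];  // '<%a%>'

// i: ignore case.
const caseSensitive = /byron/.test('BYRON');    // false
const caseInsensitive = /byron/i.test('BYRON'); // true

// m: ^ and $ also match at the start and end of each line.
const twoLines = 'first\nsecond';
const withoutM = /^second$/.test(twoLines); // false
const withM = /^second$/m.test(twoLines);   // true

console.log(samePattern, allTags, firstTag, caseInsensitive, withM);
```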
Metacharacters
One important reason regular expressions put people off is that there are so many escape sequences and so many ways to combine them. The metacharacters themselves (characters that have a special meaning in a regular expression and can qualify the character that precedes them) are few, however.
Metacharacters: ( [ { \ ^ $ | ) * + ? .
A metacharacter does not have a single fixed meaning; it can mean different things in different combinations. Let's look at them by category.
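Because these characters are special, matching one of them literally requires a backslash in front of it. A small sketch with made-up inputs:

```javascript
// Unescaped, + is a quantifier: "one or more 1s, then another 1".
const unescapedPlus = /1+1/.test('1+1');  // false, the input contains no "11"
const escapedPlus = /1\+1/.test('1+1');   // true

// Unescaped, . matches almost any character; escaped, only a literal dot.
const looseDot = /a.c/.test('abc');       // true
const literalDot = /a\.c/.test('abc');    // false

// In the constructor form the pattern is a string, so the backslash
// itself has to be doubled.
const fromString = new RegExp('1\\+1').test('1+1'); // true

console.log(unescapedPlus, escapedPlus, looseDot, literalDot, fromString);
```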
Predefined special characters
Character | Meaning |
\t | Horizontal tab |
\r | Carriage return |
\n | Newline |
\f | Form feed |
\cX | Control character corresponding to X (Ctrl + X) |
\v | Vertical tab |
\0 | Null character |
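A quick sketch of a few of these escapes in use; the test strings are invented for illustration.

```javascript
const hasTab = /\t/.test('a\tb');              // true
const noTab = /\t/.test('a b');                // false, a space is not a tab
const hasNewline = /\n/.test('line1\nline2');  // true

// \cJ is Ctrl + J, the line feed character (code 10), the same as \n.
const ctrlJ = /\cJ/.test('\n');                // true

const hasVerticalTab = /\v/.test('a\x0Bb');    // true
const hasNull = /\0/.test('a\0b');             // true

console.log(hasTab, noTab, hasNewline, ctrlJ, hasVerticalTab, hasNull);
```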
Character Classes
In general, one character in a regular expression (an escape sequence counts as one character) corresponds to one character in the string. For example, the expression ab\t means: the character a, followed by b, followed by a horizontal tab.
We can, however, use the metacharacters [] to build a simple class. A class stands for anything that has certain features; it is a general reference rather than one specific character. The expression [abc], for example, puts the characters a, b, and c into one class, and the expression matches any one of those characters.
The metacharacters [] create a class. We can also use the metacharacter ^ to create a negated class, meaning anything that does not belong to the class: the expression [^abc] matches any character that is not a, b, or c.
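The following sketch shows a class and its negated counterpart on a few sample strings chosen for illustration.

```javascript
// [abc] matches a single character that is a, b, or c.
const inClass = /[abc]/.test('b');         // true
const notInClass = /[abc]/.test('d');      // false

// [^abc] matches a single character that is none of a, b, c.
const negatedHit = /[^abc]/.test('abcd');  // true, because of the d
const negatedMiss = /[^abc]/.test('abc');  // false

// Each use of the class consumes exactly one character.
const masked = 'cabbage'.replace(/[abc]/g, '*'); // '*****ge'

console.log(inClass, notInClass, negatedHit, negatedMiss, masked);
```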
Range classes
Following the description above, an expression that matches a single digit looks like this:
[0123456789]
Doing the same for letters would be tedious. Regular expressions therefore provide range classes: x-y, two characters joined by a hyphen, stands for any character from x to y. The interval is closed, so it includes x and y themselves. Matching a lowercase letter then becomes simple:
[a-z]
What if we want to match all letters? Ranges can be written one after another inside [], so we can write [a-zA-Z].
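A short sketch of range classes on sample inputs. The last line goes slightly beyond the text: it shows one common way to include a literal hyphen in a class, by placing it last.

```javascript
const isDigit = /[0-9]/.test('7');          // true
const lowerOnly = /[a-z]/.test('Q');        // false
const anyLetter = /[a-zA-Z]/.test('Q');     // true

// The interval is closed: both end points are included.
const endpoints = 'a-m-z'.match(/[a-z]/g);  // ['a', 'm', 'z']

// A hyphen placed last in the class is taken literally.
const withHyphen = 'a-z'.match(/[a-z-]/g);  // ['a', '-', 'z']

console.log(isDigit, lowerOnly, anyLetter, endpoints, withHyphen);
```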
Predefined classes
We have just built several classes by hand to represent digits, letters, and so on, but writing them out is tedious. Regular expressions provide several predefined classes for matching common kinds of characters.
Character | Equivalent class | Meaning |
. | [^\n\r] | Any character except carriage return and newline |
\d | [0-9] | Digit |
\D | [^0-9] | Non-digit |
\s | [ \t\n\x0B\f\r] | Whitespace |
\S | [^ \t\n\x0B\f\r] | Non-whitespace |
\w | [a-zA-Z_0-9] | Word character (letter, digit, underscore) |
\W | [^a-zA-Z_0-9] | Non-word character |
With these predefined classes, some regular expressions become very convenient to write. For example, to match a string of the form ab + digit + any character, we can write ab\d. (the trailing dot is part of the expression).
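A minimal sketch of the predefined classes, using invented sample strings:

```javascript
// ab, then a digit, then any single character.
const pattern = /ab\d./;
const matches = pattern.test('ab1x');    // true
const noDigit = pattern.test('abcx');    // false, c is not a digit
const tooShort = pattern.test('ab1');    // false, nothing left for the dot

const digits = 'a1 b2'.match(/\d/g);           // ['1', '2']
const noSpaces = 'a1 b2'.replace(/\s/g, '');   // 'a1b2'
const wordChars = 'a_1-b'.match(/\w/g);        // ['a', '_', '1', 'b']
const nonWordChars = 'a_1-b'.match(/\W/g);     // ['-']

console.log(matches, noDigit, tooShort, digits, noSpaces, wordChars, nonWordChars);
```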
Boundaries
Regular expressions also provide several commonly used boundary-matching characters.
Character | Meaning |
^ | Begins with xx |
$ | Ends with xx |
\b | Word boundary: the position between a word character [a-zA-Z_0-9] and anything outside that class |
\B | Non-word boundary |
Here is a sloppy regular expression for matching email addresses (do not imitate it; parentheses are covered later): \w+@\w+\.(com)$
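The sketch below tries each boundary character on small inputs and then shows why the email pattern above is sloppy. The addresses are hypothetical.

```javascript
const startsWithHi = /^Hi/.test('Hi Byron');       // true
const hiNotAtStart = /^Hi/.test('Oh, Hi');         // false
const endsWithCom = /com$/.test('example.com');    // true

// \b: cat must stand alone as a word.
const wholeWord = /\bcat\b/.test('a cat sat');     // true
const insideWord = /\bcat\b/.test('concatenate');  // false

// \B: cat must have word characters on both sides.
const embedded = /\Bcat\B/.test('concatenate');    // true

const email = /\w+@\w+\.(com)$/;
const comAddress = email.test('byron@example.com');   // true
const orgAddress = email.test('byron@example.org');   // false
// There is no ^ anchor, so junk before the address is still accepted.
const leadingJunk = email.test('!!!byron@example.com'); // true

console.log(startsWithHi, wholeWord, embedded, comAddress, leadingJunk);
```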
Quantifiers
The methods introduced so far match one character at a time. If we want to match a string of 20 consecutive digits, do we have to write it like this?
\d\d\d\d...
To avoid this, regular expressions provide quantifiers.
Character | Meaning |
? | Appears zero times or once (at most once) |
+ | Appears one or more times (at least once) |
* | Appears zero or more times (any number of times) |
{n} | Appears exactly n times |
{n,m} | Appears n to m times |
{n,} | Appears at least n times |
Let's look at a few examples that use quantifiers.
\w+\b Byron matches a word + a boundary + a space + Byron
(/\w+\b Byron/).test('Hi Byron'); //true
(/\w+\b Byron/).test('Welcome Byron'); //true
(/\w+\b Byron/).test('HiByron'); //false
\d+\.\d{1,3} matches a number with one to three decimal places
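A sketch of each quantifier on sample strings. The ^ and $ anchors are an addition here, so that the whole string has to match; without them the decimal pattern would still find a match inside a longer number such as 3.1415.

```javascript
const twentyDigits = /^\d{20}$/.test('1'.repeat(20)); // true
const threeDigits = /^\d{20}$/.test('123');           // false

const decimal = /^\d+\.\d{1,3}$/;
const twoPlaces = decimal.test('3.14');     // true
const fourPlaces = decimal.test('3.1415');  // false, four decimal places
const noPlaces = decimal.test('3.');        // false, at least one is required

const optionalU = /^colou?r$/.test('color');  // true, ? allows zero u
const starZero = /^ab*$/.test('a');           // true, * allows zero b
const plusZero = /^ab+$/.test('a');           // false, + needs at least one b
const atLeastTwo = /^\d{2,}$/.test('7');      // false, only one digit

console.log(twentyDigits, twoPlaces, fourPlaces, optionalU, starZero, plusZero);
```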
Greedy and non-greedy mode
After reading about quantifiers, a thoughtful reader may wonder about the matching rules. Take the quantifier {3,5}: if the item appears ten times in a row, does each match take three or five of them, given that 3, 4, and 5 all satisfy the condition? By default, quantifiers match as many as possible, which is what we usually call greedy mode.
'123456789'.match(/\d{3,5}/g); //["12345", "6789"]
Since there is a greedy mode, there is also a non-greedy mode, which makes the regular expression match as little as possible: once a match succeeds, it does not try to continue. The approach is simple: add ? after the quantifier.
'123456789'.match(/\d{3,5}?/g); //["123", "456", "789"]
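The difference matters most when a pattern has a closing delimiter. As a sketch, here are the <%...%> tags from the first section matched with .+ instead of a character class; the sample string is made up.

```javascript
const text = '<%a%> and <%b%>';

// Greedy: .+ runs to the end and backs up to the LAST %>.
const greedy = text.match(/<%.+%>/)[0];    // '<%a%> and <%b%>'

// Non-greedy: .+? stops at the FIRST %> it can reach.
const lazy = text.match(/<%.+?%>/)[0];     // '<%a%>'
const lazyAll = text.match(/<%.+?%>/g);    // ['<%a%>', '<%b%>']

console.log(greedy, lazy, lazyAll);
```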
Grouping
Sometimes we want a quantifier to apply to several characters rather than to a single one, as in the examples above. Suppose we want to match the string Byron repeated 20 times. If we write Byron{20}, it matches Byro followed by n repeated 20 times. How do we treat Byron as a whole? Parentheses () do this; we call it grouping.
(Byron){20}
What if we want to match either Byron or Casper, 20 times in total? Use the | character to get an "or" effect.
(Byron|Casper){20}
The figure for this expression shows a label #1. What is it? When a regular expression uses grouping, the text matched by each group is stored. By default the groups are numbered, and the content captured by each group can be retrieved by its number. This is useful when we want to operate on one specific part of a match.
(Byron).(ok)
If groups are nested, the outer group is numbered first.
((^|%>)[^\t]*)
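A sketch of how the numbered groups can be read back in JavaScript. The input strings and the nested pattern ((By)ron) are hypothetical and chosen only to make the numbering visible.

```javascript
// exec() returns the whole match at index 0 and the groups from index 1 on.
const m = /(Byron).(ok)/.exec('Byron-ok');
const whole = m[0];   // 'Byron-ok'
const first = m[1];   // 'Byron'
const second = m[2];  // 'ok'

// In replace(), $1 and $2 refer to the same groups.
const swapped = 'Byron-ok'.replace(/(Byron).(ok)/, '$2-$1'); // 'ok-Byron'

// Nested groups are numbered by their opening parenthesis, outer first.
const nested = /((By)ron)/.exec('Byron');
const outer = nested[1];  // 'Byron'
const inner = nested[2];  // 'By'

console.log(whole, first, second, swapped, outer, inner);
```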
Sometimes we do not want to capture a group. Adding ?: at the start of the group is enough. This does not mean that the group's content stops being part of the regular expression; it only means that the group is not given a number.
(?:Byron).(ok)
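Comparing the two forms side by side shows the effect on numbering; the input string is made up.

```javascript
const capturing = /(Byron).(ok)/.exec('Byron-ok');
const capturingCount = capturing.length;       // 3: whole match + 2 groups

const nonCapturing = /(?:Byron).(ok)/.exec('Byron-ok');
const nonCapturingCount = nonCapturing.length; // 2: whole match + 1 group
const firstGroup = nonCapturing[1];            // 'ok' is now group 1

// The non-capturing group still takes part in matching and in quantifiers.
const repeated = /^(?:Byron){2}$/.test('ByronByron'); // true

console.log(capturingCount, nonCapturingCount, firstGroup, repeated);
```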
C# and some other languages also let you give groups a name. JavaScript did not support this for a long time; ES2018 added named capture groups with the syntax (?<name>...).
Lookahead
Expression | Meaning |
exp1(?=exp2) | Matches exp1 only if it is followed by exp2 |
exp1(?!exp2) | Matches exp1 only if it is not followed by exp2 |
That is rather abstract, so let's look at an example:
good(?=Byron)
(/good(?=Byron)/).exec('goodByron123'); //['good']
(/good(?=Byron)/).exec('goodCasper123'); //null
(/bad(?=Byron)/).exec('goodCasper123'); //null
As the example shows, the expression exp1(?=exp2) matches exp1, but only when what follows it is exp2; both conditions must hold. Note that exp2 is not part of the returned match. exp1(?!exp2) works the same way with the condition reversed:
good(?!Byron)
(/good(?!Byron)/).exec('goodByron123'); //null
(/good(?!Byron)/).exec('goodCasper123'); //['good']
(/bad(?!Byron)/).exec('goodCasper123'); //null
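Because a lookahead checks what follows without consuming it, it can be used to find positions rather than text. As one possible illustration, the hypothetical helper addThousandsSeparators below combines \B, grouping, a quantifier, and both kinds of lookahead to insert commas into a string of digits.

```javascript
// Hypothetical helper, not from the original article.
// Matches every position that:
//   \B               is not at a word boundary (so not before the first digit),
//   (?=(\d{3})+ ...) is followed by one or more complete groups of three digits,
//   (?!\d)           with no further digit after those groups.
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

const million = addThousandsSeparators('1234567'); // '1,234,567'
const thousand = addThousandsSeparators('1000');   // '1,000'
const small = addThousandsSeparators('123');       // '123'

console.log(million, thousand, small);
```

The pattern matches empty positions only, so replace() inserts the commas without removing any digits.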