RegExp review

Even things you think you know well fade if you don't use them for a long time, so I'm writing them down...


regular expression

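Regex basics

  • Escape symbol -> \
  • Escape sequence -> \ + the character being escaped
  • Quotation mark problem
var str = "I am an "awesome" programmer"; // SyntaxError: awesome is parsed as an identifier, and an identifier next to a string needs a plus sign
var str = "I am an " + awesome + " programmer"; // ReferenceError: the variable awesome is not defined

var str = "I am an \"awesome\" programmer";
// 'I am an "awesome" programmer'

var str = "I am an \\awesome\\ programmer";
// 'I am an \awesome\ programmer'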
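  • \n, \r, \t
  1. \n newline
var str = "I am an\nawesome\nprogrammer";
console.log(str) // 1. The browser console shows the line breaks

document.write(str) // 2. No line breaks appear in the document
// I am an awesome programmer  (the newlines are displayed as spaces)
  • In editing environments such as editors and consoles, \n produces a line break.
  • HTML collapses whitespace in normal text: a newline is treated like any other whitespace character and rendered as a single space. A visible line break needs a <br> element (or preformatted text such as <pre>).

Windows -> carriage return + line feed, \r\n
Mac -> carriage return, \r (classic Mac OS; macOS uses \n)
Linux -> line feed, \n

  2. \t tab character (its displayed width depends on the environment, commonly four or eight spaces)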
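  • Line breaks inside a string literal
var str = "I am an
            awesome
            programmer" // SyntaxError: a quoted string literal cannot span lines

// A backslash at the end of the line continues the literal on the next line.
// The line break itself is dropped, but the indentation of the next line stays in the string.
var str = "I am an\
            awesome\
            programmer"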
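Because Windows, classic Mac OS, and Linux end lines differently, text from mixed sources is often split with one pattern that accepts all three endings. A minimal sketch, assuming a made-up sample string (`mixed` and `lines` are illustrative names, not from the original notes):

```javascript
// Sample text with Windows (\r\n), classic Mac (\r), and Unix (\n) line endings.
var mixed = 'one\r\ntwo\rthree\nfour';

// \r\n is listed first so a Windows ending is not split into two separate breaks.
var lines = mixed.split(/\r\n|\r|\n/);
console.log(lines); // ['one', 'two', 'three', 'four']
```

The order of the alternatives matters: with \r first, "\r\n" would be treated as two separators and leave an empty string between them.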

Literals, constructors, flag parameters (i, g, m)

var reg = new RegExp('Test','i'); 
var str = 'this is Test'
reg.test(str) //true

var reg = new RegExp('test','i'); // i: ignore case
var str = 'this is Test,i am test2'
str.match(reg) //['Test', index: 8, input: 'this is Test,i am test2', groups: undefined]
// Only the first Test is matched

// Match all occurrences
var reg = new RegExp('test','ig'); // g: global matching
var str = 'this is Test,i am test2'
str.match(reg) //['Test', 'test']

var reg = new RegExp('^test','igm'); // m: multi-line matching
var str = 'this is Test,\ntest is now'
str.match(reg) //['test']


// Literal syntax: no quotation marks needed
var reg = /test/ 
var str = 'this is test'
reg.test(str) //true
// Ignore case
var reg = /test/i 
var str = 'this is Test'
reg.test(str) //true
  • Should a regular expression be written as a literal or created with new RegExp?
    Literals are recommended. If the pattern has to be built from a variable, the RegExp constructor is the only option.

  • Comparison

var reg = /test/;
var newReg = new RegExp('test');
// reg and newReg are two different objects

var reg = /test/;
reg.a = 1;
var newReg = RegExp('test');
console.log(newReg.a) // undefined
// Again two different objects

var reg = /test/;
var newReg = RegExp(reg); // Without new, the same object is returned: newReg === reg

var reg = /test/;
var newReg = new RegExp(reg);
// Two different objects
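Returning to the point that a pattern containing a variable needs the constructor: a literal cannot interpolate a value, so the pattern is assembled as a string, and every backslash has to be doubled inside that string. A small sketch, where `keyword` and `dynamicReg` are illustrative names chosen for this example:

```javascript
var keyword = 'test'; // a value that is only known at run time

// '\\b' and '\\d' in the string become \b and \d in the pattern: /\btest\d/i
var dynamicReg = new RegExp('\\b' + keyword + '\\d', 'i');

console.log(dynamicReg.source);                          // \btest\d
console.log(dynamicReg.test('this is Test,i am test2')); // true ('test2' matches)
```

If `keyword` could contain characters such as `.` or `*`, they would be interpreted as regex syntax, so they would need escaping first.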

Character sets []

[xyz]

A character set. Matches any one of the characters inside the square brackets. A hyphen (-) specifies a range.

For example, [abcd] and [a-d] are the same. Both match the 'b' in "brisket" and the 'c' in "city".

var str = "0154eer878we232";
var reg = /[12345678][12345678][12345678]/g;
str.match(reg) //(3) ['154', '878', '232']

var str = "015284eer878we232";
var reg = /[12345678][12345678][12345678]/g;
str.match(reg) //(3)  ['152', '878', '232']
// Characters that have already been matched are not matched again
  • [0-9]
  • [A-Z], [A-z] (a range cannot run from lowercase to uppercase: Invalid regular expression: /[0-9a-Z]/: Range out of order in character class)
  • [a-z]
  • [0-9A-Z]
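One caveat about the [A-z] range listed above: it is valid, but it covers more than letters, because a few punctuation characters sit between 'Z' and 'a' in the character table. A quick check (the variable names `looseLetters` and `strictLetters` are made up for this sketch):

```javascript
// [A-z] spans character codes 65 to 122, which also includes [ \ ] ^ _ `
var looseLetters = /^[A-z]+$/;
var strictLetters = /^[A-Za-z]+$/;

console.log(looseLetters.test('snake_case'));  // true: '_' lies between 'Z' and 'a'
console.log(strictLetters.test('snake_case')); // false
console.log(strictLetters.test('camelCase'));  // true
```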

[^xyz]

A negated character set. It matches any character that is not inside the square brackets. You can use a hyphen (-) to specify a range of characters. Any normal character will work here.

For example, [^abc] and [^a-c] are the same. They match the 'r' in "brisket" and the 'h' in "chop".
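The two examples from the text, plus a typical use of a negated set for stripping unwanted characters, can be checked like this (the phone-like sample string is invented for illustration):

```javascript
var notAtoC = /[^a-c]/;

var firstInBrisket = 'brisket'.match(notAtoC)[0];
var firstInChop = 'chop'.match(notAtoC)[0];
console.log(firstInBrisket); // 'r'
console.log(firstInChop);    // 'h'

// Remove every character that is not a digit.
var digits = 'tel: 555-0199'.replace(/[^0-9]/g, '');
console.log(digits); // '5550199'
```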

x|y

Matches 'x' or 'y'.

For example, /green|red/ matches 'green' in "green apple" and 'red' in "red apple"

var reg = /green|red/g
var str = 'green apple'
console.log(str.match(reg))//['green']
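A point worth checking with | is its low precedence: it splits the whole pattern, anchors included, unless the alternatives are wrapped in parentheses. A sketch with two hypothetical patterns (`ungrouped` and `grouped` are names invented here):

```javascript
// Reads as: "starts with green" OR "ends with red apple"
var ungrouped = /^green|red apple$/;
// Reads as: exactly "green apple" or "red apple"
var grouped = /^(?:green|red) apple$/;

console.log(ungrouped.test('green banana')); // true
console.log(grouped.test('green banana'));   // false
console.log(grouped.test('red apple'));      // true
```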

Metacharacters

Each of these metacharacters matches a single character (\b and \B are the exception: they match a position)

\w, \W

  • \w -> [0-9A-Za-z_]
  • \W -> [^\w]
var reg = /\wab/g
var str = '2345abc-avc'
str.match(reg) //['5ab']

var reg = /\Wab/g
var str = '2345abc-ab'
str.match(reg) //['-ab']

var reg = /[\W][\w][\w]/g
var str = '2345abc-ab'
str.match(reg) //['-ab']
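A short check of what \w does and does not cover: the underscore counts as a word character, the hyphen does not. The pattern name `isWord` is an assumption made for this sketch:

```javascript
var underscoreIsWord = /\w/.test('_'); // true
var hyphenIsWord = /\w/.test('-');     // false
var hyphenIsNonWord = /\W/.test('-');  // true

// Accept only strings made entirely of word characters.
var isWord = /^\w+$/;
console.log(isWord.test('user_name1')); // true
console.log(isWord.test('user-name'));  // false
```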

\d, \D

  • \d -> [0-9]
  • \D -> [^\d]
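As an illustration of the pair, \d can pull the numeric parts out of a string and \D can strip everything else. The date string below is just sample data:

```javascript
var date = '2022-05-12';

var parts = date.match(/\d+/g);           // runs of digits
var digitsOnly = date.replace(/\D/g, ''); // drop every non-digit

console.log(parts);      // ['2022', '05', '12']
console.log(digitsOnly); // '20220512'
```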

\s, \S

\s matches a single whitespace character, including space, tab, form feed, and line feed
\s -> [ \f\n\r\t\v] plus other Unicode whitespace
\S -> matches a single non-whitespace character, [^\s]

  1. For example, /\s\w*/ matches ' bar' in "foo bar."
  2. For example, /\S\w*/ matches 'foo' in "foo bar."
var reg = /\sab/g
var str = '23\nab'
console.log(str.match(reg)) // ['\nab']
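A common use of \s is splitting text on any run of whitespace, whatever mix of spaces, tabs, and newlines it contains. A small sketch with an invented sample string:

```javascript
var messy = 'a  b\tc\nd';

// \s+ treats any run of whitespace as one separator.
var tokens = messy.split(/\s+/);
console.log(tokens); // ['a', 'b', 'c', 'd']

// \S finds nothing in a string that is whitespace only.
var hasContent = /\S/.test(' \t\n');
console.log(hasContent); // false
```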

\b, \B

\b -> matches a word boundary; \B -> matches a position that is not a word boundary

var reg = /\bthis\b/g
var str = 'this is a test'
console.log(str.match(reg))//['this']

var reg = /\Bhis\b/g
var str = 'this is a test'
console.log(str.match(reg))//['his']
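In practice \b is what makes a whole-word replacement possible. The sketch below compares a replacement with and without boundaries; the sample sentence is invented:

```javascript
var text = 'cat concat cat';

var wholeWord = text.replace(/\bcat\b/g, 'dog'); // only standalone "cat"
var anywhere = text.replace(/cat/g, 'dog');      // every occurrence

console.log(wholeWord); // 'dog concat dog'
console.log(anywhere);  // 'dog condog dog'
```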

.

(the dot) By default, matches any single character except line terminators such as \n.

var reg = /.n/g
var str = 'nay, an apple is on the tree'
console.log(str.match(reg))//(2) ['an', 'on']
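The "except newlines" part is easy to verify. When a pattern has to cross line breaks, one workaround that uses only the sets described earlier is [\s\S], which matches any character at all:

```javascript
var dotPlain = /a.b/.test('a-b');          // true
var dotNewline = /a.b/.test('a\nb');       // false: the dot skips \n
var anyNewline = /a[\s\S]b/.test('a\nb');  // true: [\s\S] matches everything

console.log(dotPlain, dotNewline, anyNewline); // true false true
```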

Quantifiers

  • No going back: characters consumed by one match are not matched again
  • Greedy mode

+

Matches the previous expression 1 or more times. Equivalent to {1,}.

var reg = /\w+/g
var str = '1@candy#23'
str.match(reg)// ['1', 'candy', '23']

*

Matches the previous expression 0 or more times. Equivalent to {0,}

var reg = /\w*/g
var str = '1@candy#23'
str.match(reg)// (6) ['1', '', 'candy', '', '23', '']

var reg = /\d*/g
var str = 'abcde'
str.match(reg)//(6) ['', '', '', '', '', '']
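// The string is matched from left to right; once something has matched, matching does not go back
// Greedy principle: match as many characters as possible, never fewer
// 1a2b3c4d5e6
// An empty match is found at each of the 6 cursor positions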

var reg = /\w*/g
var str = 'abcde'
str.match(reg) //(2) ['abcde', '']

var reg = /\d*/g
var str = 'sdf12'
str.match(reg) //(5) ['', '', '', '12', '']
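// \d* matches zero or more consecutive digits, greedily: when it can match 2 digits it will not stop at 1.
// The result is ['', '', '', '12', '']
// Because \d* may match zero digits, it can also match the empty string ''
// 1st '': the empty string before s
// 2nd '': the empty string before d
// 3rd '': the empty string before f
// 4th '12': both digits are taken, since greedy matching does not stop at 1
// 5th '': the empty string at the end of the string
  • Because * means zero or more times, it produces one more match than +: an empty string at the end

The greedy match comes first, then the empty string is matched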

?

Matches the previous expression 0 or 1 times. Equivalent to {0,1}.

var reg = /\d?/g
var str = 'abcde'
str.match(reg) //(6) ['', '', '', '', '', '']

var reg = /\w?/g
var str = 'abcde'
str.match(reg) //(6) ['a', 'b', 'c', 'd', 'e', '']

If ? immediately follows any of the quantifiers *, +, ?, or {}, it makes that quantifier non-greedy (matching as few characters as possible), the opposite of the default greedy mode (matching as many characters as possible).
For example, /\d+/ applied to "123abc" matches "123", while /\d+?/ matches only "1".

var reg = /\d+/g
var str = '123abc'
str.match(reg) //['123']

var reg = /\d+?/g
var str = '123abc'
str.match(reg) // ['1', '2', '3']

var reg = /\d?/g
var str = '123abc'
str.match(reg) //['1', '2', '3', '', '', '', '']
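Used as a plain quantifier, ? is most often seen marking a single optional character. A minimal sketch with an invented sample string (`spelling` is a hypothetical name):

```javascript
// The 'u' may appear zero times or once.
var spelling = /colou?r/g;

var found = 'color colour colr'.match(spelling);
console.log(found); // ['color', 'colour']
```

'colr' is not matched because only the 'u' is optional; the second 'o' is still required.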

{n,m}

n and m are both non-negative integers with n <= m. Matches the preceding item at least n times and at most m times. If n is 0, the item is optional.

var reg = /\w{1,2}/g
var str = 'abcdefg'
str.match(reg)//(4) ['ab', 'cd', 'ef', 'g']
  • Note: spaces in a regular expression are literal spaces, so write {1,2} rather than {1, 2}

{n,}

n is a positive integer that matches the previous character appearing at least n times.

{n}

n is a positive integer, matching the previous character exactly n times.
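The three brace forms side by side, each anchored so the whole string must fit. The pattern names are invented for this sketch:

```javascript
var exactlyFour = /^\d{4}$/;   // {n}: exactly 4 digits
var atLeastTwo = /^\d{2,}$/;   // {n,}: 2 or more digits
var twoToFour = /^\d{2,4}$/;   // {n,m}: between 2 and 4 digits

console.log(exactlyFour.test('2022'));  // true
console.log(exactlyFour.test('20220')); // false
console.log(atLeastTwo.test('20220'));  // true
console.log(twoToFour.test('20220'));   // false
console.log(twoToFour.test('202'));     // true
```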

^

Matches the start of input. If the multiline flag is set to true, then the position immediately following a newline character is also matched.

For example, /^A/ does not match the 'A' in "an A", but it does match the 'A' in "An E".

$

Matches the end of input. If the multiline flag is set to true, then the position before a newline character is also matched.

For example, /t$/ does not match the 't' in "eater", but it does match the 't' in "eat".

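// Match a string that starts with abc and ends with abc
var reg = /^abc\w*abc$/g
var str = 'abcdefgabc'
str.match(reg)//['abcdefgabc']

var reg = /^abc[\w]*abc$/g
var str = 'abcdefgabc'
str.match(reg)//['abcdefgabc']

var reg = /[^abc][abc$]/g
var str = 'abcdefgabc'
str.match(reg)//['ga']
// Inside square brackets, ^ negates the set and $ is a literal dollar sign, so neither acts as an anchor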
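The effect of the multiline flag on ^ and $ described above can be seen by running the same patterns with and without m. The two-line sample string is made up for this sketch:

```javascript
var text = 'first line\nsecond line';

var startNoM = text.match(/^\w+/g);   // ^ only at the start of the input
var startWithM = text.match(/^\w+/gm); // ^ also after each newline
var endWithM = text.match(/\w+$/gm);   // $ also before each newline

console.log(startNoM);   // ['first']
console.log(startWithM); // ['first', 'second']
console.log(endWithM);   // ['line', 'line']
```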

Subexpressions (x) and backreferences

Matches 'x' and remembers the match. The parentheses are called capturing parentheses.

  • '(foo)' and '(bar)' in the pattern /(foo) (bar) \1 \2/ match and remember the first two words in the string "foo bar foo bar".

    var str = 'foo bar foo bar'
    var reg =  /(foo) (bar) \1 \2/
    str.match(reg)
    //(3) ['foo bar foo bar', 'foo', 'bar', index: 0, input: 'foo bar foo bar', groups: undefined]
    //=====================
    var str = 'abcabc'
    var reg = /bc/
    str.match(reg) //['bc', index: 1, input: 'abcabc', groups: undefined]
    // Add parentheses to capture the subexpressions
    var str = 'abcabc'
    var reg = /(b)(c)/
    str.match(reg) //(3) ['bc', 'b', 'c', index: 1, input: 'abcabc', groups: undefined]
    // Do not capture the subexpression b
    var str = 'abcabc'
    var reg = /(?:b)(c)/
    str.match(reg)// (2) ['bc', 'c', index: 1, input: 'abcabc', groups: undefined]

  • \1 and \2 in the pattern stand for the first and second substrings matched by the capturing parentheses, namely foo and bar, and they match the last two words of the original string. This is a backreference.

  • Note that \1, \2, ..., \n are used in the matching part of a regular expression.

  • In the replacement part, use the syntax $1, $2, ..., $n instead, for example 'bar foo'.replace(/(...) (...)/, '$2 $1'). $& represents the entire matched substring.

var reg = /(foo) (bar) \1 \2/g
var str = 'foo bar foo bar'
str.match(reg)//['foo bar foo bar']
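The difference between \n inside the pattern and $n inside the replacement string is easiest to see side by side. A short sketch (the variable names are illustrative):

```javascript
// $1 and $2 in the replacement refer to the captured groups.
var swapped = 'bar foo'.replace(/(...) (...)/, '$2 $1');
console.log(swapped); // 'foo bar'

// $& stands for the whole matched substring.
var wrapped = 'abc'.replace(/b/, '[$&]');
console.log(wrapped); // 'a[b]c'

// \1 inside the pattern requires the same text to appear again.
var doubled = /(\w)\1/.exec('hello')[0];
console.log(doubled); // 'll'
```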

Lookahead

x(?=y) lookahead assertion

Matches 'x' only if 'x' is followed by 'y'. This is called a lookahead assertion.

A lookahead assertion checks that what comes next matches a specific pattern, without including it in the match.

x(?!y) negative lookahead

Matches 'x' only if 'x' is not followed by 'y'. This is called a negative lookahead.

A negative lookahead checks that a specific pattern does not come next.
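Both assertions can be tried on one string. In the sketch below (sample string and variable names invented), the matched text is 'foo' in each case; only the position differs, because the lookahead itself is never part of the match:

```javascript
var sample = 'foobar foobaz';

var followedByBar = /foo(?=bar)/.exec(sample);
var notFollowedByBar = /foo(?!bar)/.exec(sample);

console.log(followedByBar[0], followedByBar.index);       // 'foo' 0
console.log(notFollowedByBar[0], notFollowedByBar.index); // 'foo' 7
```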

(?:x) non-capturing parentheses

Matches 'x' but does not remember the match. Such parentheses are called non-capturing parentheses.

RegExp properties

  1. RegExp.prototype.global
    Whether the regular expression is tested against all possible matches in a string, or only the first one.

  2. RegExp.prototype.ignoreCase
    Whether to ignore case when matching text.

  3. RegExp.prototype.multiline
    Whether to perform multi-line searches.

  4. RegExp.prototype.source
    The text of the regular expression.

  5. lastIndex (a property of each RegExp instance, not a static property)
    The index at which the next match starts. It is used when the g flag is set.
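The properties can be read directly from a regex object. The sketch below also shows a consequence of lastIndex that is easy to trip over: calling test() twice on the same global regex gives different answers, because the second search resumes where the first one stopped. The variable names are assumptions for this example:

```javascript
var re = /test/gi;
console.log(re.global, re.ignoreCase, re.multiline); // true true false
console.log(re.source);                              // test

var firstTest = re.test('a test');  // true
var indexAfterFirst = re.lastIndex; // 6, just past the match
var secondTest = re.test('a test'); // false, the search resumed at index 6
var indexAfterSecond = re.lastIndex; // 0, reset after the failed search

console.log(firstTest, indexAfterFirst, secondTest, indexAfterSecond);
```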

RegExp methods

test()

The test() method performs a search to see if the regular expression matches the specified string. Returns true or false.

let str = 'hello world!';
let result = /^hello/.test(str);
console.log(result);
// true

exec()

  • Iterating over matches

The exec() method performs a search match within a specified string. Returns a result array or null.

var myRe = /ab*/g; // Without g, exec always returns the first match and never null, so the loop below would never end
var str = 'abbcdefabh';
var myArray;
while ((myArray = myRe.exec(str)) !== null) {
  // Once exec has returned null, the next call starts matching from the beginning again
  var msg = 'Found ' + myArray[0] + '. ';
  msg += 'Next match starts at ' + myRe.lastIndex; // lastIndex is writable
  console.log(msg);
}
// Found abb. Next match starts at 3   (myArray: ['abb', index: 0, input: 'abbcdefabh', groups: undefined])
// Found ab. Next match starts at 9    (myArray: ['ab', index: 7, input: 'abbcdefabh', groups: undefined])

match exec
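A minimal sketch of how the two differ when the g flag is set: match() returns all matched substrings at once but drops the capture groups, while exec() returns one match per call with its groups and index. The pattern and input are invented for illustration:

```javascript
var pairRe = /(\d)([a-z])/g;
var input = '1a2b';

// match() with g: only the full matches, no groups.
var viaMatch = input.match(pairRe);
console.log(viaMatch); // ['1a', '2b']

// exec() in a loop: each result carries its groups and index.
var viaExec = [];
var m;
while ((m = pairRe.exec(input)) !== null) {
  viaExec.push([m[0], m[1], m[2], m.index]);
}
console.log(viaExec); // [['1a', '1', 'a', 0], ['2b', '2', 'b', 2]]
```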

Greedy mode and non-greedy mode

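var str = "abc{{def}}abc{{xyz}}";
var reg = /{{.*}}/g
str.match(reg)  //['{{def}}abc{{xyz}}']
// The intent was to match {{def}} and {{xyz}} separately
// Quantifiers are greedy by default, as in other regex flavors

var reg = /{{.*?}}/g // ? turns the greedy quantifier into a non-greedy one
str.match(reg)   // ['{{def}}', '{{xyz}}']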

var str = 'aaaaaa';
var reg = /\w?/g // Greedy mode: ? matches zero or one time, and greedy mode prefers one
str.match(reg) //(7) ['a', 'a', 'a', 'a', 'a', 'a', '']

var reg = /\w??/g  // ?? is non-greedy: it prefers zero times
str.match(reg) //(7) ['', '', '', '', '', '', '']
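The same greedy versus non-greedy contrast shows up whenever a pattern has an opening and a closing delimiter. A sketch using an invented HTML-like string:

```javascript
var html = '<b>bold</b>';

var greedy = html.match(/<.+>/)[0];   // runs to the last '>'
var lazy = html.match(/<.+?>/)[0];    // stops at the first '>'
var allTags = html.match(/<.+?>/g);

console.log(greedy);  // '<b>bold</b>'
console.log(lazy);    // '<b>'
console.log(allTags); // ['<b>', '</b>']
```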


Origin blog.csdn.net/uotail/article/details/124718922