Understanding regular expressions in JavaScript

Foreword

This article is about 4,089 words and takes roughly 12 minutes to read.

Summary: an explanation of the features of regular expressions, based on JavaScript's regular expressions and combined with my own thinking.

Main text

I believe many people's first reaction on seeing a regular expression is bewilderment: to a beginner, a regular expression is just a meaningless, confusing string of characters. Yet regular expressions are a very useful feature, and JavaScript, PHP, Java and Python all support them. Regular expressions have practically evolved into a small language of their own. Although they are part of a programming language, they are not as easy to grasp as concepts like variables, functions and objects. Many people's understanding stops at simple matching, and when real work requires more, they rely entirely on copying a solution from the Internet. Admittedly, with so many open source communities, copying does solve most problems at work, but a programmer with ambition should not rely solely on Ctrl C + Ctrl V to program. This article explains regular expressions based on JavaScript's regular expressions, combining my own thinking with some excellent articles from the community.

Using regular expressions in JavaScript

First, a brief introduction. There are two ways to create a regular expression in JavaScript:

  1. Constructor: use the built-in RegExp constructor;
  2. Literal: wrap the pattern in a pair of slashes (/.../);

Using the constructor:

var regexConst = new RegExp('abc');

Using the literal:

var regexLiteral = /abc/;
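
One practical difference between the two forms is escaping. In the constructor the pattern is an ordinary string, so each backslash has to be doubled. A minimal sketch (all variable names here are illustrative and not from the original article):

```javascript
// The literal form takes the pattern exactly as written.
const literalDigits = /\d+/;

// The constructor takes a string, so the backslash itself must be escaped.
const constructedDigits = new RegExp('\\d+');

// The constructor is handy when part of the pattern is only known at runtime.
const word = 'abc'; // hypothetical runtime value
const dynamic = new RegExp(word + '\\d');

console.log(literalDigits.source === constructedDigits.source); // true
console.log(dynamic.test('abc1')); // true
console.log(dynamic.test('abcd')); // false
```

The literal is usually easier to read; the constructor is the natural choice when the pattern has to be assembled from variables.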

Matching

A regular expression object in JavaScript has two main methods, test and exec:

The test() method takes one parameter, the string to match against the regular expression, and returns true or false. For example:

var regex = /hello/;
var str = 'hello world';
var result = regex.test(str);
console.log(result);
// returns true

The exec() method searches a specified string for a match. It returns a result array, or null.

var regex = /hello/;
var str = 'hello world';
var result = regex.exec(str);
console.log(result);
// returns [ 'hello', index: 0, input: 'hello world', groups: undefined ]
// returns null when the match fails
// 'hello': the matched text
// index: the position where the match starts
// input: the original string
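
When the regex carries the g flag (flags are described in the next section), exec() becomes stateful: each call resumes from the regex's lastIndex property. A small sketch of the usual loop, with illustrative variable names:

```javascript
const regex = /o/g;
const str = 'hello world';
const positions = [];
let match;

// With the g flag, exec() resumes from regex.lastIndex on every call
// and returns null once no further match exists.
while ((match = regex.exec(str)) !== null) {
  positions.push(match.index);
}

console.log(positions); // [ 4, 7 ]
```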

The examples below all use the test() method.

Flags

Flags are parameters that control how the search is performed. There are 6 flags:

Flag  Description
g     Global search.
i     Case-insensitive search.
m     Multi-line search.
s     Allows . to match newline characters.
u     Matches using Unicode code points.
y     Performs a "sticky" search that matches only from the current position (lastIndex) in the target string.

Literal syntax:

var re = /pattern/flags;

The constructor syntax:

var re = new RegExp("pattern", "flags");

An example:

var reg1 = /abc/gi;
var reg2 = new RegExp("abc", "gi");
var str = 'ABC';
console.log(reg1.test(str)); // true
console.log(reg2.test(str)); // true
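
The less familiar flags are easier to see in action. The sketch below contrasts a few patterns with and without the m, u and y flags; the sample strings are made up for illustration:

```javascript
// m: ^ and $ also match at line breaks, not just at the ends of the string.
console.log(/^b/.test('a\nb'));  // false
console.log(/^b/m.test('a\nb')); // true

// u: the input is read as Unicode code points instead of UTF-16 code units.
const emoji = '\u{1F600}'; // one code point, two UTF-16 code units
console.log(/^.$/.test(emoji));  // false
console.log(/^.$/u.test(emoji)); // true

// y: match only at lastIndex, without scanning ahead.
const sticky = /foo/y;
console.log(sticky.test('barfoo')); // false, there is no "foo" at index 0
sticky.lastIndex = 3;
console.log(sticky.test('barfoo')); // true
```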

Thinking about regular expressions

A regular expression is a pattern for matching strings.

Remember that regular expressions operate on strings, so programming languages that have a string type generally have regular expressions too.

A string is made up of two parts: content and positions.

For example, a string:

'hello World';

Its content is:

'h', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'

Each individual character is part of the string's content. A position, on the other hand, means:

A position is the gap between adjacent characters (the start and end of the string count too), i.e. where the arrows point in the original figure.
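
One way to make positions visible is to replace every position with a marker. An empty pattern matches at each position, so a global replace drops a marker into every gap. This is only an illustration of the idea, not something taken from the original figure:

```javascript
// The empty group (?:) matches at every position in the string,
// so a global replace inserts the marker at each one.
const marked = 'hello'.replace(/(?:)/g, '|');

console.log(marked); // |h|e|l|l|o|
```

A 5-character string therefore has 6 positions.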

Matching content is more involved than matching positions. Let's first look at the simple way of matching:

Simple match

The simplest match is to match a literal string in full:

 var regex = /hello/;
 console.log(regex.test('hello world'));
 // true

Complicated match

Regular expressions have many special characters for matching strings. They address two questions: where to match (matching by position) and what to match (matching by content). Let's first look at the special characters that determine what to match:

Matching content

Simple special characters

The simple special characters for matching content are as follows:

  • [xyz]: A character set. Matches any one of the characters in the square brackets, for example:

    var regex = /[bt]ear/;
    console.log(regex.test('tear'));
    // returns true
    console.log(regex.test('bear'));
    // return true
    console.log(regex.test('fear'));
    // return false

    Note: most special characters (such as . * + ? $) lose their special meaning inside a character set (in square brackets). The exceptions are ^ at the start, - between two characters, ] and \.

  • [^xyz]: Also a character set, but the opposite of the one above: it matches any character that is not in the brackets. For example:

    var regex = /[^bt]ear/;
    console.log(regex.test('tear'));
    // returns false
    console.log(regex.test('bear'));
    // return false
    console.log(regex.test('fear'));
    // return true

    For the three most common kinds of characters (lowercase letters, uppercase letters and digits), there are also shorthand forms:

  • \d: Equivalent to [0-9]. Matches a digit.

  • \D: Equivalent to [^0-9]. Matches a non-digit character.

  • \w: Equivalent to [a-zA-Z0-9_]. Matches a digit, lowercase letter, uppercase letter or underscore.

  • \W: Equivalent to [^A-Za-z0-9_]. Matches any character that is not a digit, letter or underscore.

  • [a-z]: If we want to match any letter, the clumsy way is to write every letter inside the square brackets, but that is inelegant, hard to read, and makes it easy to miss a letter. A simpler way is to specify a range of characters: [a-h] matches any letter from a to h. Besides lowercase letters, ranges also work for uppercase letters and digits: [0-9] matches any digit from 0 to 9, and [A-Z] matches any uppercase letter from A to Z. For example:

    var regex = /[a-z0-9A-Z]ear/;
    console.log(regex.test('fear'));
    // returns true
    console.log(regex.test('tear'));
    // returns true
    console.log(regex.test('1ear'));
    // returns true
    console.log(regex.test('Tear'));
    // returns true
  • x|y: Matches x or y. For example:

    var regex = /(green|red) apple/;
    console.log(regex.test('green apple'));
    // true
    console.log(regex.test('red apple'));
    // true
    console.log(regex.test('blue apple'));
    // false
  • .: Matches any single character except a newline; with the s flag it also matches a newline. For example:

    var regex = /.n/ ;
    console.log(regex.test('an'));
    // true
    console.log(regex.test('no'));
    // false
    console.log(regex.test('on'));
    // true
    console.log(regex.test('\nn'));
    // false
    console.log(/.n/s.test('\nn')); // note the s flag on this regex
    // true
  • \: Escapes a special character. For example, to match square brackets, escape them with \; to match a \ itself, escape it with another \. For example:

    var regex = /\[\]/;
    console.log(regex.test('[]')); // true
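
The shorthand classes \d, \w and \W above have no example of their own, and escaping becomes tedious when the text to match comes from a variable. The sketch below covers both. Note that escapeRegExp is a hypothetical helper written for this illustration, not a built-in function, and that $& in the replacement string stands for the matched text:

```javascript
// Shorthand classes match exactly one character each.
console.log(/\d/.test('room 7')); // true
console.log(/\d/.test('room'));   // false
console.log(/\w/.test('_'));      // true, underscore counts as a word character
console.log(/\W/.test('abc_1'));  // false, every character is a word character
console.log(/\W/.test('a b'));    // true, the space is a non-word character

// Hypothetical helper: put a backslash in front of every special character
// so a plain string can be embedded in a pattern and matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const literal = new RegExp(escapeRegExp('[a.b]'));
console.log(literal.source);        // \[a\.b\]
console.log(literal.test('[a.b]')); // true
console.log(literal.test('a'));     // false
```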

Each of the special characters above matches just one character of the target string, but in many scenarios we need to match several in a row, for example any number of a's. The special characters above cannot do that, so some of the content-matching special characters exist to solve exactly this problem:

  • {n}: Matches the character before the braces exactly n times. For example:

    var regex = /go{2}d/;
    console.log(regex.test('good'));
    // true
    console.log(regex.test('god'));
    // false

    This is easy to understand: the regex above is equivalent to /good/.

  • {n,}: Matches the character before the braces at least n times. For example:

    var regex = /go{2,}d/;
    console.log(regex.test('good'));
    // true
    console.log(regex.test('goood'));
    // true
    console.log(regex.test('gooood'));
    // true
  • {n,m}: Matches the character before the braces at least n times and at most m times. For example:

    var regex = /go{1,2}d/;
    console.log(regex.test('god'));
    // true
    console.log(regex.test('good'));
    // true
    console.log(regex.test('goood'));
    // false

For convenience, there are also three shorthand forms for the most commonly used rules:

  • *: Equivalent to {0,}. The preceding character appears at least 0 times, that is, any number of times.
  • +: Equivalent to {1,}. The preceding character appears at least once.
  • ?: Equivalent to {0,1}. The preceding character either does not appear or appears exactly once.
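
The three shorthand quantifiers can be compared side by side. In this sketch the ^ and $ anchors (covered later under matching positions) are used so that the whole word has to match; the variable names are illustrative:

```javascript
const star = /^go*d$/;     // zero or more "o"
const plus = /^go+d$/;     // one or more "o"
const optional = /^go?d$/; // zero or one "o"

for (const word of ['gd', 'god', 'good']) {
  console.log(word, star.test(word), plus.test(word), optional.test(word));
}
// gd true false true
// god true true true
// good true true false
```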

The content matching above is enough for ordinary characters, but not for special symbols such as line breaks, form feeds and carriage returns. Regular expressions therefore also provide special characters dedicated to matching these symbols:

  • \s: Matches a whitespace character, including spaces, tabs, form feeds and line breaks. For example:

    var reg = /\s/;
    console.log(reg.test(' ')); // true
  • \S: Matches a non-whitespace character.

  • \t: Matches a horizontal tab.

  • \n: Matches a newline.

  • \f: Matches a form feed.

  • \r: Matches a carriage return.

  • \v: Matches a vertical tab.

  • \0: Matches the NULL (U+0000) character.

  • [\b]: Matches a backspace.

  • \cX: When X is a letter from A to Z, matches the corresponding control character in the string (for example, \cM matches Ctrl-M, the carriage return).
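
A few of these escapes checked against strings that contain the corresponding characters. This is a quick sketch; the test strings are made up for illustration:

```javascript
console.log(/\t/.test('a\tb'));         // true, horizontal tab
console.log(/\n/.test('line1\nline2')); // true, newline
console.log(/\s/.test('a\tb'));         // true, \s covers tabs as well
console.log(/\S/.test(' \n\t'));        // false, the string is all whitespace
console.log(/\0/.test('a\0b'));         // true, NULL character
console.log(/[\b]/.test('a\bb'));       // true, backspace (U+0008)
console.log(/\cM/.test('\r'));          // true, Ctrl-M is the carriage return
```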

Advanced content matching

  • (x): Matches x and remembers the match; the parentheses are called a capturing group. Parentheses also support sub-expressions: you can write a whole regex inside the parentheses and then match it as a unit. There is also a special form \n. Unlike the newline \n above, the n here is a variable standing for a number: \1, \2, ... refer back to the text captured by the corresponding group. For example:

    console.log(/(foo)(bar)\1\2/.test('foobarfoobar')); // true
    console.log(/(\d)([a-z])\1\2/.test('1a1a')); // true
    console.log(/(\d)([a-z])\1\2/.test('1a2a')); // false
    console.log(/(\d){2}/.test('12')); // true

    In the replacement part of a replace call, you use syntax such as $1, $2, ..., $n to refer to the captured groups, for example 'bar foo'.replace(/(...) (...)/, '$2 $1'). $& represents the entire matched substring.

  • (?:x): Matches x but does not remember the match. This is called a non-capturing group. Here \1 does not work as a backreference; it is treated as an ordinary character escape (without the u flag, the character U+0001). For example:

    var regex = /(?:foo)bar\1/;
    console.log(regex.test('foobarfoo'));
    // false
    console.log(regex.test('foobar'));
    // false
    console.log(regex.test('foobar\x01')); // '\x01' is the character U+0001
    // true
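
The replacement syntax for capturing groups can be tried directly. The sketch below shows $1 and $2, then $&, and finally that a non-capturing group is skipped when groups are numbered; the input strings are illustrative:

```javascript
// $1 and $2 refer to the first and second capturing groups.
console.log('bar foo'.replace(/(...) (...)/, '$2 $1')); // foo bar

// $& is the whole matched substring.
console.log('a-b-c'.replace(/-/g, '[$&]')); // a[-]b[-]c

// A non-capturing group gets no number, so $1 is the group after it.
console.log('2024-05'.replace(/(?:\d{4})-(\d{2})/, 'month $1')); // month 05
```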

Matching position

Again, a position here is one of the spots marked by arrows in the earlier figure.

  • ^: Matches the beginning of the string, i.e. the position of the first arrow in the earlier figure. Note that this ^ and the ^ in [^xy] have entirely different meanings. An example of ^:

    var regex = /^g/;
    console.log(regex.test('good'));
    // true
    console.log(regex.test('bad'));
    // false
    console.log(regex.test('tag'));
    // false

    In other words, the regex above matches strings that begin with the letter g.

  • $: Matches the end of the string. For example:

    var regex = /\.com$/;
    console.log(regex.test('test@testmail.com'));
    // true
    console.log(regex.test('test@testmail'));
    // false

    In other words, the regex above matches strings that end with .com.

  • \b: Matches a word boundary. Note that what is matched is a position: a word boundary is a position where a "word" character is not preceded or not followed by another "word" character. Put differently, it is a position with a word character on exactly one side: the two sides are neither both word characters nor both non-word characters (the start and end of the string count as non-word). For example:

    console.log(/\bm/.test('moon')); // true: matches the 'm' in "moon"; left of \b is the empty string, right is 'm'
    console.log(/oo\b/.test('moon')); // false: does not match the 'oo' in "moon"; left of \b is 'oo', right is 'n', both word characters
    console.log(/oon\b/.test('moon')); // true: matches the 'oon' in "moon"; left of \b is 'oon', right is the empty string
    console.log(/n\b/.test('moon   ')); // true: matches the 'n' in "moon"; left of \b is 'n', right is a space
    console.log(/\bm/.test('   moon')); // true: matches the 'm' in "moon"; left of \b is a space, right is 'm'
    console.log(/\b/.test('  ')); // false: cannot match; each side of \b is a space or the empty string, so no position has a word character on exactly one side

    If this is hard to grasp, looking at \B may make it clearer.

  • \B: Matches a non-word boundary, the opposite of \b: a position whose two sides are both word characters or both non-word characters. For example:

    console.log(/\B../.test('moon')); // true: matches the 'oo' in 'moon'; left of \B is 'm', right is 'o'
    console.log(/\B./.test('  ')); // true: matches a ' ' in '  '; left of \B is the empty string, right is a space
  • x(?!y): Matches x only when x is not followed by y. This is called a negative lookahead. For example:

    var regex = /Red(?!Apple)/;
    console.log(regex.test('RedOrange')); // true
  • (?<!y)x: Matches x only when x is not preceded by y. This is called a negative lookbehind. For example:

    var regex = /(?<!Red)Apple/;
    console.log(regex.test('GreenApple')); // true
  • x(?=y): Matches x only when x is followed by y. This is called a positive lookahead. For example:

    var regex = /Red(?=Apple)/;
    console.log(regex.test('RedApple')); // true
  • (?<=y)x: Matches x only when x is preceded by y. This is called a positive lookbehind. For example:

    var regex = /(?<=Red)Apple/;
    console.log(regex.test('RedApple')); // true
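
The four assertions above only show the matching case. The sketch below adds the failing cases and shows that an assertion checks a position without consuming text. It assumes an engine with lookbehind support (ES2018 or later):

```javascript
// Each assertion fails when its condition is not met.
console.log(/Red(?!Apple)/.test('RedApple'));    // false
console.log(/(?<!Red)Apple/.test('RedApple'));   // false
console.log(/Red(?=Apple)/.test('RedOrange'));   // false
console.log(/(?<=Red)Apple/.test('GreenApple')); // false

// Assertions are zero-width: the asserted text is not part of the match.
console.log('RedApple'.match(/Red(?=Apple)/)[0]);  // Red
console.log('RedApple'.match(/(?<=Red)Apple/)[0]); // Apple
```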

Methods in JS that can use regular expressions

Method  Description
RegExp.prototype.exec  A RegExp method that searches a string for a match. It returns an array (or null when there is no match).
RegExp.prototype.test  A RegExp method that tests for a match in a string. It returns true or false.
String.prototype.match  A String method that searches a string for a match. It returns an array, or null when there is no match.
String.prototype.matchAll  A String method that searches a string for all matches. It returns an iterator.
String.prototype.search  A String method that tests for a match in a string. It returns the index of the match, or -1 on failure.
String.prototype.replace  A String method that searches a string for a match and replaces the matched substring with a replacement string.
String.prototype.split  A String method that splits a string by a regular expression or a fixed string and stores the resulting substrings in an array.
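
The String methods in the table can be compared on a single input. A short sketch with a made-up sample string:

```javascript
const text = 'one1two22three';

console.log(text.match(/\d+/)[0]);  // 1 (without g, only the first match)
console.log(text.match(/\d+/g));    // [ '1', '22' ]

// matchAll requires the g flag and yields one result array per match.
const indexes = [...text.matchAll(/\d+/g)].map((m) => m.index);
console.log(indexes);               // [ 3, 7 ]

console.log(text.search(/\d/));          // 3
console.log(text.replace(/\d+/g, '-'));  // one-two-three
console.log(text.split(/\d+/));          // [ 'one', 'two', 'three' ]
```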

Exercises

  • Match any 10-digit number:
var regex = /^\d{10}$/;
console.log(regex.test('9995484545'));
// true

Analysis of the regex above:

  1. We want the match to span the entire string, with nothing else in it, so we use ^ and $ to pin the beginning and end;
  2. \d matches a digit, equivalent to [0-9];
  3. {10} applies to the \d expression, i.e. \d is repeated 10 times;
  • Match dates in the format DD-MM-YYYY or DD-MM-YY:
var regex = /^(\d{1,2}-){2}\d{2}(\d{2})?$/;
console.log(regex.test('10-01-1990'));
// true
console.log(regex.test('2-01-90'));
// true
console.log(regex.test('10-01-190'));
// false

Analysis of the regex above:

  1. Again we use ^ and $ to pin the beginning and end;
  2. \d{1,2} matches one or two digits;
  3. - matches a hyphen and has no special meaning;
  4. () wraps a sub-expression, also known as a capturing group;
  5. {2} matches the sub-expression above twice;
  6. \d{2} matches two digits;
  7. (\d{2})? is a sub-expression matching two digits, and the ? makes it match once or not at all;
  • Convert camelCase names to underscore-separated names:
var reg = /(\B[A-Z])/g;
'oneTwoThree'.replace(reg, '_$1').toLowerCase();
// 'one_two_three'

Analysis of the regex above:

  1. \B prevents a capitalized first letter from being converted as well;
  2. ([A-Z]) captures each uppercase letter in a capturing group;
  3. In replace, the $n syntax refers to the group captured earlier;
  4. toLowerCase is called to convert the letters to lowercase;
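
The last exercise can be wrapped in a small reusable function. In this sketch camelToSnake is a hypothetical name, and $& is used in place of a capturing group, which gives the same result here:

```javascript
// Hypothetical helper wrapping the exercise above.
function camelToSnake(name) {
  // \B skips an uppercase letter at the very start of the name;
  // $& re-inserts the matched letter after the underscore.
  return name.replace(/\B[A-Z]/g, '_$&').toLowerCase();
}

console.log(camelToSnake('oneTwoThree'));   // one_two_three
console.log(camelToSnake('OneTwo'));        // one_two
console.log(camelToSnake('already_snake')); // already_snake
```

One limitation of this simple pattern: runs of capitals are split letter by letter, so a name like parseHTML becomes parse_h_t_m_l.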

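Conclusion

The many rules of regular expressions are hard to remember. I hope this article helps you remember these special characters better, and that you will go on to write some awesome regular expressions. Let's encourage each other. Finally, here is a link for practicing regular expressions: https://www.hackerrank.com/domains/regex and a tool website: https://regex101.com/

That's all.


My ability is limited and my skill is average. Corrections are welcome and much appreciated.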


Origin www.cnblogs.com/jztan/p/12358144.html