Regex lookaround assertions: positive and negative, lookahead and lookbehind, to match exactly what I need

Foreword

Regular expressions are an indispensable tool in day-to-day data processing, and when you work with scripting languages they are a must-have skill. You may not like using them, because writing one is often fiddly and error-prone, but regex is really flexible. Today we will learn the zero-width assertion feature of regex, which can make our patterns more precise.

Scenes

Simple

  • In a regular expression, you may want to give a captured group a name. I introduced this technique before, and it is very convenient to use afterwards. Here it counts as the simple, entry-level case.
 
 

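java

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Name the digit group "c" with (?<c>...), then read it back by name.
Pattern compile = Pattern.compile("(.*)\\*\\{(?<c>\\d+)\\}");
Matcher matcher = compile.matcher("zxhtom*{3}");
if (matcher.find()) {
    System.out.println(matcher.group("c")); // prints 3
}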

  • We only need to add ?<name> right after the opening parenthesis of the group. Then we can fetch the matched content by the name c, without caring about the group's position in the pattern.

Advanced

I believe most people stay at the entry level. At least, I have been handling day-to-day development scenarios with entry-level techniques. But regex offers more than those features, and understanding the advanced techniques lets us use regular expressions gracefully.

Lookaround

  • In regular expressions this feature is called an assertion. It can be understood like a Java assertion: an assertion here consumes no characters and only decides whether matching may continue. Assertions subdivide into positive and negative, lookahead and lookbehind.

  • I drew a picture that roughly shows what lookahead and lookbehind mean, and what positive and negative mean.
  • As shown above, the h in our pattern matches the position marked in red. The ?=tom that follows means that the string tom must come right after it, and the final matched content is just h, because the assertion does not occupy any characters.
  • So what exactly are positive and negative, and what are lookahead and lookbehind? The format of an assertion is (?[direction][updown][pattern])
  • The parts enclosed in square brackets are variables.
  • direction selects lookahead or lookbehind
  • updown selects positive or negative
  • pattern is the expression to check. It can be any ordinary regex
Concept | Explanation | Syntax
Positive | the content must match | updown is the = sign
Negative | negation, the content must not match | updown is the ! sign
Lookahead | checks from left to right (the text that follows) | direction is empty
Lookbehind | checks from right to left (the text that precedes) | direction is the < sign
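The four combinations in the table can be sketched in Java as follows. This is a minimal illustration, not part of the original article: the class name AssertionForms and the helper firstMatch are made up for the example, and the input zxhtom is borrowed from the picture description above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AssertionForms {
    // Hypothetical helper: start index of the first match, or -1 if there is none.
    static int firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "zxhtom";
        // Positive lookahead: an h that is followed by tom
        System.out.println(firstMatch("h(?=tom)", input));  // 2
        // Negative lookahead: an h that is NOT followed by tom
        System.out.println(firstMatch("h(?!tom)", input));  // -1
        // Positive lookbehind: an h that is preceded by zx
        System.out.println(firstMatch("(?<=zx)h", input));  // 2
        // Negative lookbehind: an h that is NOT preceded by zx
        System.out.println(firstMatch("(?<!zx)h", input));  // -1
    }
}
```

In every case the match itself is only the single character h; the assertion just accepts or rejects the position.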
  • OK, now we know the concepts. So how do we match only the first h in hzxh?

Lookahead

  • First, we match with a lookahead. Lookahead reads from left to right, so we need to look at what comes after the h. What comes after it is zxh, so we can write h(?=zxh), which finds the h at the beginning of hzxh.
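A small Java check of this lookahead, as a sketch: the class LookaheadDemo and the helper matchStarts are names invented for the example. It compares a plain h, which matches both occurrences in hzxh, with h(?=zxh), which keeps only the first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Hypothetical helper: start indexes of all matches of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // A plain h matches both occurrences.
        System.out.println(matchStarts("h", "hzxh"));        // [0, 3]
        // Only the first h is followed by zxh.
        System.out.println(matchStarts("h(?=zxh)", "hzxh")); // [0]
    }
}
```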

Lookbehind

  • With the lookahead version done, let's see how to do the same with a lookbehind. Lookbehind reads from right to left, so what is to the left of our h? Right: nothing. Regex has a special symbol for this, \b, which marks a word boundary.
  • We can also combine this with the negation we discussed: "nothing to the left" is the negation of "some word character to the left", which we can express as (?<!\w).
  • So either \bh or (?<!\w)h works.
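Both lookbehind-style variants can be tried in Java like this. It is only a sketch under the assumption that the input is hzxh; the class LookbehindDemo and the helper matchStarts are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // Hypothetical helper: start indexes of all matches of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Word boundary: the h at index 3 has the word character x on its left.
        System.out.println(matchStarts("\\bh", "hzxh"));       // [0]
        // Negative lookbehind: no word character directly to the left of h.
        System.out.println(matchStarts("(?<!\\w)h", "hzxh"));  // [0]
    }
}
```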

Summary

  • Positive and negative are easy to understand: they are simply "must match" and its negation.

  • Lookahead and lookbehind are mainly about direction. For example, given the pattern (?<=zxh)123, how will it match? From the < we know it is a lookbehind, so the assertion checks the text to the left of the match. That is, it matches the 123 in zxh123.

  • Also, assertions take up no space, and this must be clear. (?<=\w)zxh can only ever match zxh. If I want to match a zxh inside a word and also have the match contain characters to its left, an assertion cannot do that: it is just a marker and does not take part in the actual match.

  • Assertions can also be combined. Several assertions in a row are normally in an AND relationship, and we can use the regex alternation | to get an OR effect.
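The points in this summary can be checked with a short Java sketch. The class ZeroWidthDemo, the helper firstMatchText, and the sample inputs azxh, abc1 and abc are assumptions made for the illustration. The AND case stacks two lookaheads at the same position, and the OR case puts an alternation inside one assertion.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZeroWidthDemo {
    // Hypothetical helper: text of the first match, or null if there is none.
    static String firstMatchText(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Lookbehind checks the left side; only 123 is returned.
        System.out.println(firstMatchText("(?<=zxh)123", "zxh123")); // 123
        // The asserted \w is not part of the match.
        System.out.println(firstMatchText("(?<=\\w)zxh", "azxh"));   // zxh
        // To include the left character, consume it instead of asserting it.
        System.out.println(firstMatchText("\\wzxh", "azxh"));        // azxh
        // AND: both lookaheads must succeed (a digit and a lowercase letter).
        String both = "^(?=.*\\d)(?=.*[a-z]).+$";
        System.out.println(firstMatchText(both, "abc1"));            // abc1
        System.out.println(firstMatchText(both, "abc"));             // null
        // OR: the h may be followed by tom or by zxh.
        System.out.println(firstMatchText("h(?=tom|zxh)", "hzxh"));  // h
    }
}
```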

Specific case

  • Regular expressions differ slightly between languages. Next, let's look at how regular expressions are used in the languages I commonly work with.

Linux

  • I personally think regular expressions on Linux are a cut-down version. The main places where they are used on Linux are vim, grep, and sed.
  • grep is really a search tool, so it uses regular expressions only for matching. sed operates on text, so its use of regular expressions is much more complete. That leaves the use of regular expressions in vim.

echo "<title>hello</title>sdf<title>nihao</title>" | sed -n '/title.\?/p'

  • Above is a simple regular-expression search; grep and sed queries work the same way. Note, however, that key symbols such as ( ) and + need to be escaped with \ before they act as regex operators.

echo "hello i am zxhtom" | sed 's/\(.*\)zxhtom/\1d/'
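For comparison, the same substitution can be sketched in Java, where the group needs no escaping and the backreference is written $1 instead of \1. The class SedLikeReplace and the method replaceSuffix are illustrative names, and replaceFirst is used because sed's s command without the g flag replaces only the first match on a line.

```java
public class SedLikeReplace {
    // Hypothetical helper mirroring sed 's/\(.*\)zxhtom/\1d/'.
    static String replaceSuffix(String input) {
        // $1 is the text captured by (.*); the literal d follows it.
        return input.replaceFirst("(.*)zxhtom", "$1d");
    }

    public static void main(String[] args) {
        System.out.println(replaceSuffix("hello i am zxhtom")); // hello i am d
    }
}
```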

  • What also needs attention in the shell is character replacement; finding matches otherwise works mostly the same. In the shell, use =~ to perform regular-expression matching.
 
 

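shell

# The dots in 1.2.5 are escaped so they match literal dots only.
if [[ "${version}" =~ (.*)(1\.2\.5\.[0-9]{4}[01]{1}[0-9]{1}[0-3]{1}[0-9]{1}.*) ]]; then
    # BASH_REMATCH[1] is the prefix, BASH_REMATCH[2] is the version part
    pre=${BASH_REMATCH[1]}
    version=${BASH_REMATCH[2]}
    echo "${version}"
fi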

  • Above, we capture the pieces we need while matching, and then splice them together in the shell to achieve the effect of character replacement.
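The same capture-then-splice idea can be sketched in Java with numbered groups, which play the role of BASH_REMATCH. The class VersionSplit, the method split, and the sample input app-1.2.5.20230615 are assumptions for the example; find() is used because =~ in the shell also searches anywhere in the string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionSplit {
    // Same pattern as the shell example, with every dot in 1.2.5 escaped.
    private static final Pattern VERSION =
            Pattern.compile("(.*)(1\\.2\\.5\\.[0-9]{4}[01][0-9][0-3][0-9].*)");

    // Hypothetical helper: returns {prefix, version}, or null when nothing matches.
    static String[] split(String input) {
        Matcher m = VERSION.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("app-1.2.5.20230615");
        System.out.println(parts[0]); // app-
        System.out.println(parts[1]); // 1.2.5.20230615
    }
}
```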




Origin blog.csdn.net/BASK2312/article/details/131272090