Commonly used regular expressions in Python

Matching Nginx access logs with re.compile

# How the re.compile rule works: the pattern must be written field by field, from the start of the line to the end. If an earlier field is wrong, every field after it fails to match. Fields normally end at a space or a double quotation mark. For a quoted field, start right after the opening " and match any number of non-quote characters up to the closing ": "(?P<request>[^"]*)"

Example 2:

Color correspondence table for the match:
formula: 123, 123, 123, 1
1, 2, and 3 are each represented by a color.

{"level":"info","message":"Sendno: 1974497337","timestamp":"2017-01-18 08:00:08:545"}
reg = re.compile(r'\{"level":(?P<remote_ip>[^.]*),"message":(?P<messagea>[^.]*),"timestamp":(?P<time>[^.]*)\}')

########################################
a='{"level":"info","message":"::ffff:10.42.99.232 - - [18/Jan/2017:00:00:05 +0000] \"POST /api/v1/push HTTP/1.1\" 200 84 \"-\" \"-\"\n","timestamp":"2017-01-18 08:00:05:998"}'
reg = re.compile(r'\{"level":"(?P<remote_ip>[^.]*)","message":"(?P<messagea>.*?)","timestamp":"(?P<time>[^.]*)"}', re.S)  # lazy .*? with re.S lets the message contain dots and newlines

reg = re.compile(r'\{(?P<remote_ip>[^,]*),(?P<messagea>[^,]*),(?P<time>[^,]*)') # Generic catch-all: treat everything between commas as one field
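The JSON-style patterns above can be tried out with a small helper. This is only a sketch: the group names `level`, `message` and `time` and the function `parse_log_line` are illustrative choices, not names from the original post, and `[^"]*` is used for the quoted fields instead of `[^.]*`.

```python
import re

# Three quoted fields; the message uses a lazy match so it stops at the
# first '","timestamp":"' that follows it. re.S lets "." match newlines.
LOG_RE = re.compile(
    r'\{"level":"(?P<level>[^"]*)","message":"(?P<message>.*?)","timestamp":"(?P<time>[^"]*)"\}',
    re.S,
)

sample = '{"level":"info","message":"Sendno: 1974497337","timestamp":"2017-01-18 08:00:08:545"}'


def parse_log_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


fields = parse_log_line(sample)
print(fields)
```

Returning None for lines that do not match keeps the caller from hitting an AttributeError on a failed match.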


import re

# Sample access-log line: IP, date, request, status, size, referrer, user agent.
line = '192.168.0.1 25/Oct/2012:14:46:34 "GET /api HTTP/1.1" 200 44 "http://abc.com/search" "Mozilla/5.0"'
# The pattern is split over two adjacent raw strings; Python joins them into one.
reg = re.compile(r'^(?P<remote_ip>[^ ]*) (?P<date>[^ ]*) "(?P<request>[^"]*)" '
                 r'(?P<status>[^ ]*) (?P<size>[^ ]*) "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"')
regMatch = reg.match(line)
linebits = regMatch.groupdict()  # dictionary of named groups
print(linebits)
for k, v in linebits.items():
    print(k + ": " + v)
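When many log lines are processed, some will not follow the expected format, and `match` returns None for them. The following is a minimal sketch of one way to handle that, assuming the same access-log pattern as above; the second sample line and the names `ACCESS_RE`, `parsed` and `skipped` are made up for illustration.

```python
import re

ACCESS_RE = re.compile(
    r'^(?P<remote_ip>[^ ]*) (?P<date>[^ ]*) "(?P<request>[^"]*)" '
    r'(?P<status>[^ ]*) (?P<size>[^ ]*) "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

lines = [
    '192.168.0.1 25/Oct/2012:14:46:34 "GET /api HTTP/1.1" 200 44 "http://abc.com/search" "Mozilla/5.0"',
    'a line that does not follow the format',  # hypothetical malformed entry
]

parsed, skipped = [], []
for line in lines:
    m = ACCESS_RE.match(line)
    if m is None:
        # Keep malformed lines aside instead of crashing on m.groupdict().
        skipped.append(line)
    else:
        parsed.append(m.groupdict())

print(len(parsed), "parsed,", len(skipped), "skipped")
```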

 

Common methods of Python's re module

re.match # match only at the beginning of the string

re.search # scan the whole string and return the first match
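The difference between the two is easiest to see on the same input. A small sketch, using a made-up request string:

```python
import re

text = "GET /api HTTP/1.1"

m1 = re.match(r"/api", text)   # None: "/api" is not at the start of the string
m2 = re.search(r"/api", text)  # found further along, at index 4
m3 = re.match(r"GET", text)    # matches, because "GET" is at position 0

print(m1)
print(m2.span())
print(m3.group())
```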

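import re

val = "1-2*((60-30+(-40.0/5)*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2)"

def f1(x):
    # Search for a leading run of operators; the group keeps the last one matched.
    # The hyphen goes first in the class: written as [+-/*] it would define a range from + to /.
    b = re.search(r"([-+*/])*", x)
    # b.groups() returns the captured values, b.span() the (start, end) position of the match.
    print(b.groups(), b.span(), b)

# while True:
# Split once around the first innermost pair of parentheses.
b = re.split(r"\(([^()]+)\)", val, maxsplit=1)
if len(b) == 3:
    before, content, after = b
    r = f1(content)  # f1 only prints, so r is None here
    new_r = before + str(r) + after
    val = new_r

Result: ('-',) (0, 1) <_sre.SRE_Match object; span=(0, 1), match='-'>
(On Python 3.7 and later the object is shown as <re.Match object; span=(0, 1), match='-'>.)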

 

re.findall # find all matches and return them as a list

re.finditer # find all matching substrings and return them as an iterator of match objects, in order from left to right. If there is no match, the iterator is empty. Use a for loop to read the matched values.
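A short sketch contrasting the two calls on a made-up string of numbers: `findall` gives the matched text directly, while `finditer` gives match objects that also carry positions.

```python
import re

text = "200 44 404 512"

# findall returns the matched strings as a list.
all_numbers = re.findall(r"\d+", text)

# finditer yields match objects, so the position of each hit is available too.
positions = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]

# With no match, finditer yields nothing (an empty iterator, not None).
no_hits = list(re.finditer(r"[a-z]+", text))

print(all_numbers)
print(positions)
print(no_hits)
```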

re.split # split the string at each match of the pattern

val = "hello alex bcd"
re.split(r"(a\w+)", val) # With a capturing group, the matched text is kept in the result: ['hello ', 'alex', ' bcd']
re.split(r"a\w+", val) # Without parentheses, the matched text is dropped and the string is split into 2 parts: ['hello ', ' bcd']
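The same behaviour as a runnable sketch, with one extra call showing the optional `maxsplit` argument (the whitespace pattern in the last call is an illustrative addition):

```python
import re

val = "hello alex bcd"

with_group = re.split(r"(a\w+)", val)        # delimiter text is kept
without_group = re.split(r"a\w+", val)       # delimiter text is dropped
limited = re.split(r"\s+", val, maxsplit=1)  # stop after the first split

print(with_group)
print(without_group)
print(limited)
```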

re.sub # replace every match of a pattern with a replacement string

f = re.sub(r"\d+", "112", 'dsdfgda3434fgfg')
print(f)
Output: dsdfgda112fgfg
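Two variations that often come up with re.sub are limiting the number of replacements and reusing the matched text in the replacement. A minimal sketch, with an input string extended from the one above so that it contains two runs of digits:

```python
import re

text = "dsdfgda3434fgfg99"

replaced_all = re.sub(r"\d+", "112", text)            # every run of digits
replaced_once = re.sub(r"\d+", "112", text, count=1)  # only the first run
wrapped = re.sub(r"(\d+)", r"<\1>", text)             # \1 inserts what group 1 matched

print(replaced_all)
print(replaced_once)
print(wrapped)
```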

print(r.group()) # print the whole match

print(r.groups()) # print the captured groups, i.e. the parts matched inside () parentheses

print(r.groupdict()) # put the named groups into a dictionary; name a group by writing (?P<key1>...) inside the parentheses

r = re.match(r"(?P<n1>h)(\w+)", "hello alex bcd")
print(r.groupdict())
{'n1': 'h'}
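The three accessors can be compared side by side on one match object. A small sketch, where the input string and the group name `first` are chosen only for illustration:

```python
import re

m = re.match(r"(?P<first>h)(\w+)", "hello alex bcd")

whole = m.group()      # the entire matched text
first = m.group(1)     # a single group, by number
groups = m.groups()    # all groups, named or not, as a tuple
named = m.groupdict()  # only the named groups, as a dict

print(whole, first, groups, named)
```

Note that the unnamed second group appears in `groups()` but not in `groupdict()`.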

 

Metacharacter   Description
.       matches any character (except a newline, unless re.S is set)
|       alternation (logical OR)
[ ]     matches any one character from the set inside
[^ ]    negated character set
-       defines a range (inside a character set)
\       escapes the next character (usually an ordinary character becomes special, and a special one becomes ordinary)
*       matches the preceding character or subexpression 0 or more times
*?      lazy version of *
+       matches the preceding character or subexpression 1 or more times
+?      lazy version of +
?       matches the preceding character or subexpression 0 or 1 times
{n}     matches the preceding character or subexpression exactly n times
{m,n}   matches the preceding character or subexpression at least m times and at most n times
{n,}    matches the preceding character or subexpression at least n times
{n,}?   lazy version of {n,}
^       matches the beginning of the string
\A      matches the beginning of the string
$       matches the end of the string
[\b]    backspace character
\c      matches a control character (in other regex flavors; not supported by Python's re module)
\d      matches any digit
\D      matches any character that is not a digit
\t      matches a tab
\w      matches any letter, digit, or underscore
\W      matches any character that is not a letter, digit, or underscore
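The greedy and lazy quantifiers in the table behave quite differently on the same input. A brief sketch with a made-up HTML fragment:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.*>", html)     # * takes as much as possible
lazy = re.findall(r"<.*?>", html)      # *? takes as little as possible
exact = re.findall(r"\d{2}", "12345")  # {n} takes exactly n repetitions

print(greedy)
print(lazy)
print(exact)
```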

 

An intermediate-level article covering subexpressions, lookahead and lookbehind, and backreferences: http://www.cnblogs.com/chuxiuhong/p/5907484.html

 

1. Expressions for validating numbers

 

  • Numbers: ^[0-9]*$

  • n-digit number: ^\d{n}$

  • At least n digits: ^\d{n,}$

  • m to n digits: ^\d{m,n}$

  • Zero, or a number that does not start with zero: ^(0|[1-9][0-9]*)$

  • Numbers with a non-zero leading digit and up to two decimal places: ^[1-9][0-9]*(\.[0-9]{1,2})?$

  • Positive or negative numbers with 1-2 decimal places: ^(\-)?\d+(\.\d{1,2})?$

  • Positive, negative, and decimal numbers: ^(\-|\+)?\d+(\.\d+)?$

  • Positive real numbers with two decimal places: ^[0-9]+(\.[0-9]{2})?$

  • Positive real numbers with 1 to 3 decimal places: ^[0-9]+(\.[0-9]{1,3})?$

  • Non-zero positive integer: ^[1-9]\d*$ or ^([1-9][0-9]*){1,3}$ or ^\+?[1-9][0-9]*$

  • Non-zero negative integer: ^-[1-9][0-9]*$ or ^-[1-9]\d*$

  • Non-negative integer: ^\d+$ or ^([1-9]\d*|0)$

  • Non-positive integer: ^(-[1-9]\d*|0)$ or ^((-\d+)|(0+))$

  • Non-negative floating point number: ^\d+(\.\d+)?$ or ^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$

  • Non-positive floating point number: ^((-\d+(\.\d+)?)|(0+(\.0+)?))$ or ^((-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0)$

  • Positive float: ^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ or ^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$

  • Negative float: ^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ or ^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$

  • Floating point numbers: ^(-?\d+)(\.\d+)?$ or ^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$
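In Python, numeric checks like these can be written with re.fullmatch, which requires the whole string to match and so makes the ^ and $ anchors unnecessary. The sketch below uses a handful of patterns in the spirit of the list above; the dictionary keys and the `check` helper are illustrative names, not part of the original post.

```python
import re

# Anchors are omitted because re.fullmatch already requires a full match.
NUMBER_PATTERNS = {
    "non_negative_int": r"\d+",
    "positive_int": r"[1-9]\d*",
    "signed_decimal": r"[-+]?\d+(\.\d+)?",
    "up_to_two_decimals": r"-?\d+(\.\d{1,2})?",
}


def check(kind, text):
    """Return True if the whole of text matches the named pattern."""
    return re.fullmatch(NUMBER_PATTERNS[kind], text) is not None


print(check("positive_int", "42"))
print(check("positive_int", "042"))
print(check("signed_decimal", "-3.14"))
print(check("up_to_two_decimals", "1.234"))

# "$" also matches just before a trailing newline; fullmatch does not allow that.
anchored = re.match(r"^\d+$", "42\n")
strict = re.fullmatch(r"\d+", "42\n")
```

One reason to prefer fullmatch for validation is the last pair of lines: an anchored pattern used with re.match accepts input that ends in a newline, which is rarely what a form check wants.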

 

2. Expressions for validating characters

 

  • Chinese characters: ^[\u4e00-\u9fa5]{0,}$

  • English and numeric: ^[A-Za-z0-9]+$ or ^[A-Za-z0-9]{4,40}$

  • All characters of length 3-20: ^.{3,20}$

  • A string consisting of 26 English letters: ^[A-Za-z]+$

  • A string consisting of 26 uppercase English letters: ^[A-Z]+$

  • A string consisting of 26 lowercase English letters: ^[a-z]+$

  • A string consisting of numbers and 26 English letters: ^[A-Za-z0-9]+$

  • A string consisting of numbers, 26 English letters or underscores: ^\w+$ or ^\w{3,20}$

  • Chinese, English, numbers including underscore: ^[\u4E00-\u9FA5A-Za-z0-9_]+$

  • Chinese, English, numbers but not including underscores and other symbols: ^[\u4E00-\u9FA5A-Za-z0-9]+$ or ^[\u4E00-\u9FA5A-Za-z0-9]{2,20}$

  • Strings that contain none of the characters %&',;=?$\": [^%&',;=?$\x22]+

  • Strings that contain no ~ (and no double quotation mark): [^~\x22]+
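A point worth checking in Python 3 is that \w is Unicode-aware by default, so it also accepts Chinese characters unless the re.ASCII flag is set. The following sketch tests two of the character patterns above; the constant names and sample strings are made up for illustration.

```python
import re

CHINESE_ONLY = re.compile(r"[\u4e00-\u9fa5]+")
WORD_3_20_ASCII = re.compile(r"\w{3,20}", re.ASCII)  # letters, digits, underscore only
WORD_3_20_UNICODE = re.compile(r"\w{3,20}")          # default: any Unicode word character

chinese_ok = CHINESE_ONLY.fullmatch("正则表达式") is not None
mixed_ok = CHINESE_ONLY.fullmatch("regex正则") is not None

ascii_name_ok = WORD_3_20_ASCII.fullmatch("user_01") is not None
chinese_name_ascii = WORD_3_20_ASCII.fullmatch("用户名") is not None
chinese_name_unicode = WORD_3_20_UNICODE.fullmatch("用户名") is not None

print(chinese_ok, mixed_ok)
print(ascii_name_ok, chinese_name_ascii, chinese_name_unicode)
```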

 

3. Expressions for special requirements

 

  • Email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$

  • Domain name: [a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+\.?

  • Internet URL: [a-zA-Z]+://[^\s]* or ^http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$

  • Mobile number: ^(13[0-9]|14[57]|15[0-35-9]|18[0-35-9])\d{8}$

  • Phone numbers ("XXX-XXXXXXX", "XXXX-XXXXXXXX", "XXX-XXXXXXXX", "XXXXXXX" and "XXXXXXXX"): ^(\(\d{3,4}\)|\d{3,4}-)?\d{7,8}$

  • Domestic phone numbers (0511-4405222, 021-87888822): \d{3}-\d{8}|\d{4}-\d{7}

  • ID number (15 or 18 digits): ^(\d{15}|\d{18})$

  • Short ID number (digits, optionally ending with the letter x): ^([0-9]){7,18}(x|X)?$ or ^(\d{8,18}|[0-9x]{8,18}|[0-9X]{8,18})$

  • Whether an account name is valid (starts with a letter, 5-16 characters, letters, digits and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$

  • Password (starts with a letter, 6 to 18 characters long, only letters, digits and underscores): ^[a-zA-Z]\w{5,17}$

  • Strong password (must contain a combination of uppercase letters, lowercase letters and digits, no special characters, length 8-10): ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,10}$ (as written, .{8,10} still accepts special characters; use [A-Za-z0-9]{8,10} instead to forbid them)

  • Date format: ^\d{4}-\d{1,2}-\d{1,2}

  • 12 months of a year (01~09 and 1~12): ^(0?[1-9]|1[0-2])$

  • 31 days of a month (01~09 and 1~31): ^((0?[1-9])|((1|2)[0-9])|30|31)$

  • Money input format:

    • There are four representations of money we can accept: "10000.00" and "10,000.00", and "10000" and "10,000" without the cents. Start with: ^[1-9][0-9]*$

    • This matches any number that does not start with 0. However, it also means that a single "0" does not pass, so we use the following form: ^(0|[1-9][0-9]*)$

    • A 0, or a number that does not start with 0. We can also allow a minus sign at the beginning: ^(0|-?[1-9][0-9]*)$

    • This matches a 0, or a possibly negative number that does not start with 0. Let the user start with 0 after all, and drop the minus sign too, because money can hardly be negative. What we add next is an optional fractional part: ^[0-9]+(\.[0-9]+)?$

    • Note that there must be at least 1 digit after the decimal point, so "10." does not pass, but "10" and "10.2" do: ^[0-9]+(\.[0-9]{2})?$

    • This requires exactly two digits after the decimal point. If you think that is too strict, you can do this: ^[0-9]+(\.[0-9]{1,2})?$

    • This allows the user to write only one decimal place. Now it is time to consider commas in numbers; we can do this: ^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?$

    • That is 1 to 3 digits followed by any number of "comma + 3 digits" groups. To make the commas optional rather than required: ^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$

    • Note: this is the final result. Don't forget that "+" can be replaced with "*" if you think the empty string is acceptable (strange, why would it be?). Finally, take care with the backslash when passing the pattern to a function (in Python, write the pattern as a raw string); this is where mistakes usually happen.

  • XML file name: ^([a-zA-Z]+-?)+[a-zA-Z0-9]+\.[xX][mM][lL]$

  • Regular expression for Chinese characters: [\u4e00-\u9fa5]

  • Double-byte characters: [^\x00-\xff] (includes Chinese characters; can be used to compute the length of a string, counting a double-byte character as 2 and an ASCII character as 1)

  • Regular expression for blank lines: \n\s*\r (can be used to delete blank lines)

  • Regular expression for HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? /> (the versions circulating online are poor; this one only works for part of the cases and still cannot handle complex nested tags)

  • Regular expression for leading and trailing whitespace: ^\s*|\s*$ or (^\s*)|(\s*$) (can be used to delete whitespace characters, including spaces, tabs, form feeds and so on, at the beginning and end of a line; a very useful expression)

  • Tencent QQ number: [1-9][0-9]{4,} (Tencent QQ number starts from 10000)

  • China Postal Code: [1-9]\d{5}(?!\d) (China Postal Code is 6 digits)

  • IP address: \d+\.\d+\.\d+\.\d+ (useful when extracting IP address)

  • IP address: ((?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)) (Provided by @Feilongsanshao, thanks for sharing)
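A few of the expressions in this section can be checked in Python as follows. This is a sketch under some assumptions: the patterns are written as raw strings and used with fullmatch instead of ^ and $, the strong-password pattern uses [A-Za-z0-9]{8,10} so that special characters are rejected, and the names `MONEY`, `IPV4`, `STRONG_PW` and `is_valid` are illustrative.

```python
import re

# Final money pattern from the walk-through: optional thousands separators,
# optional fractional part of one or two digits.
MONEY = re.compile(r"([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?")

# Four dot-separated octets, each limited to 0-255.
IPV4 = re.compile(r"((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)")

# Lookaheads require a digit, a lowercase and an uppercase letter somewhere;
# the character class then restricts the whole string to 8-10 letters or digits.
STRONG_PW = re.compile(r"(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[A-Za-z0-9]{8,10}")


def is_valid(pattern, text):
    """Return True if the entire text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


for sample in ["10,000.00", "10000", "10,00", "10."]:
    print("money", sample, is_valid(MONEY, sample))

for sample in ["192.168.0.1", "256.1.1.1"]:
    print("ipv4", sample, is_valid(IPV4, sample))

for sample in ["Passw0rd1", "password1", "Passw0rd!"]:
    print("password", sample, is_valid(STRONG_PW, sample))
```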

Original source: https://www.cnblogs.com/cp-miao/p/5567115.html
