re.S usage

The role of re.S:

Without re.S, the dot (.) matches any character except a newline, so a pattern such as .*? can only match within a single line; if nothing matches on one line, the search starts again on the next line. With the re.S flag (an alias of re.DOTALL), the dot also matches newlines, so the regular expression treats the string as a whole and a match can span several lines. This is used frequently in crawler projects.

Example:

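import re

# The text between "a" and "item" spans several lines.
a = """This is
a
webspider
item!
maoyanmovierank"""

b = re.findall('a(.*?)item', a)        # "." stops at each newline
c = re.findall('a(.*?)item', a, re.S)  # "." also matches newlines
print('b:', b)
print('c:', c)

Output result:

b: []
c: ['\nwebspider\n']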

Here re.S makes the "." in the regular expression 'a(.*?)item' match every character, including line breaks. Without the flag, "." matches every character except the newline \n itself (a carriage return \r is still matched).
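Since the flag is mostly useful in crawler code, here is a minimal sketch of that use. The HTML fragment, its tag names, and the variable names are made up for illustration and do not come from any real page.

```python
import re

# Hypothetical HTML fragment (invented for this example).
html = """<div class="movie">
  <p class="name">Example Movie</p>
</div>"""

pattern = r'<div class="movie">(.*?)</div>'

# Without re.S the opening and closing tags are on different lines,
# so "." cannot bridge them and nothing is found.
without_flag = re.findall(pattern, html)

# With re.S the lazy group can cross the line breaks.
with_flag = re.findall(pattern, html, re.S)

print(without_flag)
print(with_flag)
```

The first call finds nothing, while the second captures the whole inner block, newlines and indentation included, so the result usually needs a strip() or a second, narrower pattern.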

The following are some modifiers of the re module:

Regular expressions can take optional flag modifiers that control how matching is performed. Multiple flags can be combined with bitwise OR (|). For example,
re.I | re.M sets both the I and M flags.
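A small sketch of combining flags with |. The sample text and variable names are assumptions chosen only to make the difference visible.

```python
import re

text = "first line\nSECOND line"

# re.I alone: case is ignored, but ^ still matches only at the
# very start of the string, so "SECOND" is not found.
only_i = re.findall(r'^second', text, re.I)

# re.I | re.M: case is ignored and ^ also matches at the start
# of every line.
i_and_m = re.findall(r'^second', text, re.I | re.M)

print(only_i)
print(i_and_m)
```

Only the combined call finds the word, because it needs both behaviors at once.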

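  1. re.I ignores case when matching
  2. re.L makes \w, \W, \b, \B and case-insensitive matching depend on the current locale (in Python 3 it can only be used with bytes patterns)
  3. re.M is multi-line mode: ^ and $ match at the start and end of every line, not only of the whole string
  4. re.S makes . match any character including the newline character (without it, . does not match a newline)
  5. re.U makes the special character sets \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character properties database (this is already the default for str patterns in Python 3)
  6. re.X ignores whitespace in the pattern and comments after #, which makes long patterns more readable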
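The flags above can be tried one at a time. This is only an illustrative sketch; the sample text and variable names are assumptions, and re.L and re.U are left out because their effect depends on the locale and pattern type.

```python
import re

text = "Hello\nworld"

# re.I: "hello" matches "Hello".
case_insensitive = re.findall(r'hello', text, re.I)

# re.M: ^ and $ match around each line.
per_line = re.findall(r'^\w+$', text, re.M)

# re.S: "." matches the newline between the two words.
across_lines = re.findall(r'Hello.world', text, re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    (\w+)   # first word
    \n      # the line break between the words
    (\w+)   # second word
""", re.X)
pairs = verbose.findall(text)

print(case_insensitive)
print(per_line)
print(across_lines)
print(pairs)
```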

Origin blog.csdn.net/m0_46202060/article/details/109262415