Python basics: permutations, combinations, and regular expressions

Password cracking

Permutations

Concept: Taking m (m <= n) elements from n elements and arranging them in a row in a particular order is called a permutation (arrangement) of m elements taken from n elements. In particular, when m = n, the arrangement is called a full permutation.

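'''
Requirement: given 1 2 3 4,
take 3 of the numbers and arrange them in order
'''
# Requirement: list every ordered arrangement of 3 numbers taken from the 4 numbers [1,2,3,4]
import itertools
myList = list(itertools.permutations([1,2,3,4], 3))
print(myList)
print(len(myList))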

'''
Pattern summary:
n   m   count
4   3   24
4   2   12
4   1   4
Number of possible permutations: n!/(n-m)!
'''
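
The formula in the summary can be checked against what itertools actually enumerates. The following is a minimal sketch; the helper name permutation_count is an assumption introduced here for illustration and is not part of the original code.

```python
import itertools
import math


def permutation_count(n, m):
    """Ordered selections of m items out of n: n! / (n - m)!"""
    return math.factorial(n) // math.factorial(n - m)


items = [1, 2, 3, 4]
for m in (3, 2, 1):
    enumerated = len(list(itertools.permutations(items, m)))
    # Print m, the enumerated count, and the count given by the formula
    print(m, enumerated, permutation_count(len(items), m))
```
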
Combinations

Concept: Taking any n (n <= m) elements from m different elements and grouping them together, without regard to order, is called a combination of n elements taken from m different elements.

import itertools
'''
How many ways are there to choose 4 numbers from [1,2,3,4,5]?
'''
myList = list(itertools.combinations([1,2,3,4,5],4))
print(myList)
print(len(myList))
'''
Pattern summary:
m   n   count
5   5   1
5   4   5
5   3   10
5   2   10
Number of possible combinations: m!/(n! x (m-n)!)
'''
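
As with permutations, the combination formula can be compared with the number of tuples itertools produces. This is a small sketch; combination_count is a hypothetical helper name, and it keeps the article's convention of choosing n elements out of m.

```python
import itertools
import math


def combination_count(m, n):
    """Unordered selections of n items out of m: m! / (n! * (m - n)!)"""
    return math.factorial(m) // (math.factorial(n) * math.factorial(m - n))


items = [1, 2, 3, 4, 5]
for n in (5, 4, 3, 2):
    enumerated = len(list(itertools.combinations(items, n)))
    # Print n, the enumerated count, and the count given by the formula
    print(n, enumerated, combination_count(len(items), n))
```
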
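Permutations and combinations
import itertools
myList = list(itertools.product("0123456789QWERTYUIOPASDFGHJKLZXCVBNMqwertyuiopasdfghjklzxcvbnm",repeat=6))
# You can try this, but the computer may freeze
# Multithreading does not help either: there is not enough memory, so nothing you do will work
print(len(myList))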
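
The size of that list can be worked out without building it, which shows why it does not fit in memory. The sketch below assumes the same 62-character alphabet (digits plus upper- and lower-case letters, here in a different order) and uses itertools.islice to look at only the first few candidates; the names ALPHABET, LENGTH, and first_three are illustrative.

```python
import itertools
import string

ALPHABET = string.digits + string.ascii_letters  # 10 digits + 52 letters = 62 characters
LENGTH = 6

# Each of the 6 positions can hold any of the 62 characters
total_candidates = len(ALPHABET) ** LENGTH
print(total_candidates)

# A generator expression yields candidates one at a time instead of storing them all
candidates = ("".join(chars) for chars in itertools.product(ALPHABET, repeat=LENGTH))
first_three = list(itertools.islice(candidates, 3))
print(first_three)
```

With tens of billions of candidates, a fully materialized list would need far more memory than a typical machine has, whereas the generator only ever holds one candidate at a time.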

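Note: Whenever passwords are involved, they are usually encrypted or hashed. Commonly mentioned algorithms include MD5 (a hash function), RSA, and DES.

Brute-force cracking

This approach costs the attacker far more than it costs the target ("wound the enemy by a thousand, lose ten thousand yourself").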

import time
import itertools

password = ("".join(x) for x in itertools.product("0123456789QWERTYUIOPASDFGHJKLZXCVBNMqwertyuiopasdfghjklzxcvbnm",repeat=6))
#print(len(myList))
while True:
    try:
        str1 = next(password)
        time.sleep(0.5)
        print(str1)
    except StopIteration as e:
        break

Regular Expressions

Common requirements
Validating a QQ number

Requirement: design a function that takes a QQ number as a string and determines whether it is valid.

'''
Analysis:
1. All characters are digits
2. Length: 4 to 11 digits
3. The first digit cannot be 0
'''

def checkQQ(str1):
    # Assume the input is valid, then look for a condition that overturns the assumption
    result = True
    try:
        # int() raises ValueError if str1 cannot be parsed as an integer
        int(str1)
        if 4 <= len(str1) <= 11:
            # The first digit must not be 0
            if str1[0] == '0':
                result = False
        else:
            result = False
    except ValueError:
        result = False
    return result

print(checkQQ("123284u3t95"))
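
The same three rules can be written as a single regular expression, which is the motivation for the rest of this article. This is a sketch, not the author's code: check_qq and QQ_PATTERN are names introduced here. It is also slightly stricter than the int()-based version, because int() accepts inputs such as "+1234" that are not purely digits.

```python
import re

# First digit 1-9, followed by 3 to 10 more digits: 4 to 11 digits in total
QQ_PATTERN = re.compile(r"[1-9][0-9]{3,10}")


def check_qq(text):
    """Return True if text consists of 4 to 11 digits and does not start with 0."""
    # fullmatch requires the whole string to match, not just a prefix
    return QQ_PATTERN.fullmatch(text) is not None


print(check_qq("123284u3t95"))
print(check_qq("12345678"))
```
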
Regular expression overview

A regular expression (Regular Expression) uses a single string to describe and match a series of strings that follow a given syntactic rule.

A search pattern can be used for text search and text replacement.

Regular expressions are search patterns formed by a sequence of characters.

When you search for data in text, you can use a search pattern to describe what you are looking for.

A regular expression can be a single simple character or a more complex pattern.

Regular expressions can be used for all text search and text replacement operations.

In Python, regular expressions are available through the built-in re module, which programmers can call directly to perform regular expression matching. A regular expression is compiled into a series of bytecodes, which are then executed by a matching engine written in C.

Module Introduction

Python has included the re module since version 1.5; it provides Perl-style regular expression patterns.

The re module gives the Python language full regular expression functionality.

The re module also provides module-level functions that do the same work as the methods of a compiled pattern object; these functions take a pattern string as their first argument.
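
A small sketch of that equivalence: the module-level re.findall takes the pattern string as its first argument, while the compiled pattern's findall method already knows its pattern. The sample text and variable names are illustrative.

```python
import re

text = "room 101, floor 7"

# Compiled pattern object: the pattern is fixed, so the method only needs the text
pattern = re.compile(r"\d+")
via_method = pattern.findall(text)

# Module-level function: the pattern string comes first, then the text
via_function = re.findall(r"\d+", text)

print(via_method)
print(via_function)
```
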

Regular expression metacharacters
import re
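# Matching single characters and digits
r'''
.            matches any character except a newline
[0123456789] [] is a character set; matches any one character listed inside the brackets
[good]       matches any one of the characters in good
[a-z]        matches any lowercase letter
[A-Z]        matches any uppercase letter
[0-9]        matches any digit
[0-9a-zA-Z]  matches any digit or letter
[0-9a-zA-Z_] matches any digit, letter, or underscore
[^good]      matches any character other than the letters in good; a ^ inside brackets is called a caret and negates the set
[^0-9]       matches any non-digit character
\d           matches a digit; for ASCII text, same as [0-9]
\D           matches a non-digit character; for ASCII text, same as [^0-9]
\w           matches a digit, letter, or underscore; for ASCII text, same as [0-9a-zA-Z_]
\W           matches anything other than a digit, letter, or underscore; for ASCII text, same as [^0-9a-zA-Z_]
\s           matches any whitespace character (space, carriage return, newline, tab, form feed, vertical tab); same as [ \f\n\r\t\v]
\S           matches any non-whitespace character; same as [^ \f\n\r\t\v]
'''
print(re.findall(r"\d","you are good1 man"))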

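r'''
^   matches at the start of a line; not the same meaning as ^ inside []
$   matches at the end of a line
\A  matches the start of the string; unlike ^, \A only matches at the start of the whole string and does not match at the start of other lines even in re.M mode
\Z  matches the end of the string; unlike $, \Z only matches at the end of the whole string and does not match at the end of other lines even in re.M mode

\b  matches a word boundary, i.e. the position between a word character and a non-word character
    r'er\b' matches never but not nerve

\B  matches a non-word boundary
'''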
print(re.search("^good","you are a good man"))
print(re.search("man$","you are a good man"))

print(re.search("^good","you are a good man",re.M))
print(re.search(r"\Agood","you are a good man",re.M))
print(re.search("man$","you are a good man",re.M))
print(re.search(r"man\Z","you are a good man",re.M))

print(re.search(r"er\b","never"))
print(re.search(r"er\b","nerve"))

print(re.search(r"er\B","never"))
print(re.search(r"er\B","nerve"))
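
The difference between ^ and \A (and between $ and \Z) only shows up when the string has more than one line. The following sketch uses a made-up three-line string to illustrate it.

```python
import re

text = "you are a good man\ngood luck to the man\nsee you"

# Without re.M, ^ only matches at the very start of the string
no_flag = re.search(r"^good", text)
# With re.M, ^ also matches right after each newline
with_m = re.search(r"^good", text, re.M)
# \A ignores re.M and still only matches at the start of the whole string
with_a = re.search(r"\Agood", text, re.M)

# $ with re.M matches at the end of every line; \Z only at the end of the string
line_ends = re.findall(r"man$", text, re.M)
string_end = re.search(r"man\Z", text, re.M)

print(no_flag, with_m, with_a)
print(line_ends, string_end)
```
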
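'''
Note: x and y below stand for ordinary characters, and n and m are non-negative integers; they are not regular expression metacharacters
(xyz)   matches xyz inside the parentheses (treated as a single unit)
x?      matches 0 or 1 x
x*      matches 0 or more x (.* matches 0 or more of any character except a newline)
x+      matches at least one x
x{n}    matches exactly n x (n is a non-negative integer)
x{n,}   matches at least n x
x{n,m}  matches at least n and at most m x; note that n <= m
x|y     | means "or"; matches x or y
'''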

print(re.findall(r"a?","aaa"))  # greedy: takes one a whenever one is available, otherwise matches the empty string
print(re.findall(r"a*","aaabaa"))  # greedy: matches as many as possible

print(re.findall(r"a+","aaabaaaa"))  # greedy: matches as many as possible
print(re.findall(r"a{3}","aaabaaaa"))
print(re.findall(r"a{3,}","aaabaaaa"))  # greedy: matches as many as possible
print(re.findall(r"a{3,6}","aaabaaaa"))
print(re.findall(r"(a|A)n","anaabaaaAn"))

Requirement: extract every "you ... man" phrase

str1 = "you are a good man,you are a nice man ,you are a great man,you are a..."
print(re.findall(r"you.*?man",str1))

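'''
*?  +?  ??  minimal (non-greedy) matching: quantifiers normally match as much as possible, and a trailing ? makes them match as little as possible
(?:x)       like (xyz), but does not create a group
'''
# Comments: /* part1 */ /* part2 */
print(re.findall(r"/\*.*?\*/",r"/* part1 */ /* part2 */"))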
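
To make the greedy versus non-greedy distinction and the effect of (?:...) concrete, here is a short sketch. The HTML-like sample string is made up for illustration.

```python
import re

markup = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last > in the string, so there is a single long match
greedy = re.findall(r"<.+>", markup)
# Non-greedy: .+? stops at the first > it can, so each tag matches separately
lazy = re.findall(r"<.+?>", markup)

# With a capturing group, findall returns only the group contents
capturing = re.findall(r"(a|A)n", "anaabaaaAn")
# With a non-capturing group, findall returns the whole match
non_capturing = re.findall(r"(?:a|A)n", "anaabaaaAn")

print(greedy)
print(lazy)
print(capturing, non_capturing)
```
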

Commonly used re module functions

compile()

Compiles a regular expression pattern and returns a pattern object. (Frequently used regular expressions can be compiled into pattern objects once and reused, which can improve efficiency somewhat.)

Syntax:

re.compile(pattern,flags=0)

pattern: the expression string to compile

flags: compilation flags that modify how the regular expression matches, such as whether matching is case-sensitive or multi-line.
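
As a sketch of how flags change matching, the example below combines re.I (ignore case) and re.M (multi-line) with the | operator. The sample text is made up for illustration.

```python
import re

text = "Python is fun\nI like PYTHON\npython rocks"

# No flags: ^ matches only at the start of the string and the match is case-sensitive
plain = re.compile(r"^python")
# re.I ignores case; re.M lets ^ match at the start of every line
flagged = re.compile(r"^python", re.I | re.M)

print(plain.findall(text))
print(flagged.findall(text))
```
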

import re
tt = "Tina is a good girl, she is cool, clever, and so on..."
rr = re.compile(r'\w*oo\w*')
print(rr.findall(tt))   # find all words that contain 'oo'

# Output:
#['good', 'cool']
match()

Determines whether the regular expression matches at the beginning of the string.

Note: This is not an exact (whole-string) match. If the string still has characters left over when the pattern ends, the match is still considered successful.

For an exact match, add the boundary symbol "$" at the end of the expression.

Syntax:

re.match(pattern,string,flags=0)

import re
print(re.match("com","comww.rnfregcoomn").group())
print(re.match("com",'Comwww.runcomoob',re.I).group())
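
The note about exact matching can be illustrated as follows. Besides appending "$", Python 3.4 and later also offer re.fullmatch, which the original text does not mention; it is shown here as an alternative.

```python
import re

# match succeeds because the pattern matches a prefix of the string
prefix = re.match(r"com", "comww.rnfregcoomn")
# Adding $ requires the string to end where the pattern ends, so this fails
anchored = re.match(r"com$", "comww.rnfregcoomn")
# fullmatch requires the entire string to match the pattern
full = re.fullmatch(r"com", "com")
# match never looks past the start, so "com" later in the string is not found
not_at_start = re.match(r"com", "www.com")

print(prefix, anchored, full, not_at_start)
```
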
search()

Syntax:

re.search(pattern,string,flags=0)

The re.search function scans the string for a match to the pattern. It returns as soon as it finds the first match; if no match is found, it returns None.

import re
print(re.search(r'\dcom','www.4comrunoob.5com').group())
# Output:
#4com
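
Because re.search returns None when nothing matches, calling .group() directly on the result raises AttributeError in that case. A minimal sketch of a guard follows; first_match is a hypothetical helper name.

```python
import re


def first_match(pattern, text):
    """Return the first matching substring, or None if the pattern is not found."""
    match = re.search(pattern, text)
    # Only call group() when a match object was actually returned
    return match.group() if match else None


print(first_match(r"\dcom", "www.4comrunoob.5com"))
print(first_match(r"\dcom", "www.runoob.com"))
```
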
findall()

re.findall scans the whole string and returns all matching substrings as a list.

Syntax:

re.findall(pattern,string,flags=0)

import re
p = re.compile(r"\d+")
print(p.findall('o1n2m3k4'))
# Output:
#['1', '2', '3', '4']
import re
tt = "Tina is a good girl, she is cool, clever, and so on..."
rr = re.compile(r'\w*oo\w*')
print(rr.findall(tt))
print(re.findall(r'(\w)*oo(\w)',tt))  # () denotes a subexpression (group)
# Output:
#['good', 'cool']
#[('g', 'd'), ('c', 'l')]
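
When a pattern contains groups, findall returns the group contents instead of the whole match, and a group that is repeated with * keeps only its last repetition. The sketch below uses the made-up word "schools" to show the difference between (\w)* and (\w*).

```python
import re

word = "schools"

# No groups: the whole match is returned
whole = re.findall(r"\w*oo\w*", word)
# Repeated group (\w)*: only the last character it matched is kept
last_only = re.findall(r"(\w)*oo(\w)", word)
# Quantifier inside the group (\w*): the full prefix is kept
full_prefix = re.findall(r"(\w*)oo(\w)", word)

print(whole)
print(last_only)
print(full_prefix)
```
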
finditer()

Searches the string and returns an iterator that yields each match result (a Match object) in order. It finds all substrings that the regular expression matches and returns them as an iterator.

Syntax:

re.finditer(pattern,string,flags=0)

import re
matches = re.finditer(r'\d+','12 drumm44ers drumming, 11 ... 10 ...')
for i in matches:
    #print(i)
    #print(i.group())
    print(i.span())
# Output with all three print calls enabled begins as follows:
# <_sre.SRE_Match object; span=(0, 2), match='12'>
# 12
# (0, 2)
# <_sre.SRE_Match object; span=(8, 10), match='44'>
# 44
# ...
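
A compact sketch of working with the Match objects that finditer yields, collecting each matched text together with its span. The sample string and variable names are illustrative.

```python
import re

text = "a1bb22ccc333"

# Each item yielded by finditer is a Match object with group() and span() methods
found = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

for value, (start, end) in found:
    # The span is a half-open range, so text[start:end] is the matched text
    print(value, start, end, text[start:end])
```
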
split()

Splits the string at each substring that matches the pattern and returns the pieces as a list.

re.split can be used to split a string.

Syntax:

re.split(pattern,string[,maxsplit])

maxsplit specifies the maximum number of splits; if it is not given, the string is split at every match.

import re
print(re.split(r'\d+','one1two2three3four4five5'))
# Output:
#['one', 'two', 'three', 'four', 'five', '']
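
A short sketch of maxsplit, plus one behavior the text above does not mention: if the pattern contains a capturing group, the matched separators are included in the result.

```python
import re

text = "one1two2three3four4five5"

# Stop after two splits; the remainder is returned unsplit as the last item
limited = re.split(r"\d+", text, maxsplit=2)
# A capturing group keeps the separators in the returned list
with_separators = re.split(r"(\d+)", "one1two2")

print(limited)
print(with_separators)
```
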
sub()

Replaces every substring in the string that matches the regular expression with a replacement string and returns the resulting string.

Syntax:

re.sub(pattern,repl,string,count)

First argument: the pattern to match. Second argument: the replacement string.

Third argument: the string to search. Fourth argument: the maximum number of replacements.

import re
text = "Bob is a handsome boy, he is cool, clever, and so on..."
print(re.sub(r'\s+', '-', text))
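# Output:
#Bob-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...
# The second argument is the replacement string; in this example it is '-'

# The fourth argument is the number of replacements. It defaults to 0, which means every match is replaced.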
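
A sketch of the count argument, together with one feature not covered above: repl may also be a function that receives each Match object and returns the replacement text. The sample sentences are made up.

```python
import re

text = "Bob is a handsome boy"

# Replace only the first two runs of whitespace
first_two = re.sub(r"\s+", "-", text, count=2)

# repl as a function: double every number found in the string
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples and 10 pears")

print(first_two)
print(doubled)
```
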

Note:

1. Differences among re.match(), re.search(), and re.findall()

re.match only matches at the beginning of the string. re.search scans the entire string and returns the first match. re.findall scans the entire string and returns all matches.
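
The three functions can be compared side by side on one input. This is a small sketch with a made-up string.

```python
import re

text = "www.4com.5com"
pattern = r"\dcom"

at_start = re.match(pattern, text)      # only tries position 0
first = re.search(pattern, text)        # first match anywhere in the string
everything = re.findall(pattern, text)  # list of all matches

print(at_start)
print(first.group())
print(everything)
```
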

Origin blog.csdn.net/qq_29074261/article/details/80104013