Chapter 6 Modules
6.1 Module definitions
- Module definition
  - A .py file written by the programmer that provides one area of functionality
- Package definition
  - A folder that holds multiple .py files
  - Importing a package does not make the modules inside it available by default
  - Importing a package executes the contents of its __init__.py file
  - Differences between Python 2 and Python 3
    - Python 2: the folder must contain an __init__.py file
    - Python 3: the __init__.py file is not required
    - Recommendation: always add this file when writing code, whether for Python 2 or Python 3
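The package behaviour described above can be checked with a small experiment. This is only a sketch: the package name `demo_pkg`, its `tools` module, and the temporary folder are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package (hypothetical names) in a temporary folder
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, '__init__.py'), 'w', encoding='utf-8') as f:
    f.write("INIT_RAN = True\n")
with open(os.path.join(pkg_dir, 'tools.py'), 'w', encoding='utf-8') as f:
    f.write("def hello():\n    return 'hello from tools'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg                              # runs demo_pkg/__init__.py
init_ran = demo_pkg.INIT_RAN                 # value set by __init__.py
tools_loaded = hasattr(demo_pkg, 'tools')    # the submodule is not loaded yet

from demo_pkg import tools                   # now the submodule is imported
message = tools.hello()

print(init_ran, tools_loaded, message)
```

Importing the package alone only runs `__init__.py`; the `tools` module becomes usable once it is imported explicitly.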
6.2 Module categories (libraries)
6.2.1 Built-in modules
- Functionality provided by Python itself
- Can be used directly after importing the module
6.2.1.1 random
- Random number module
- randint: get a random integer
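import random  # import the module
v = random.randint(1, 10)  # random integer between the start and end values, both inclusive
# Example: generate a random verification code
import random
def get_random_code(length=6):
    data = []
    for i in range(length):
        v = random.randint(65, 90)  # code points of 'A' to 'Z'
        data.append(chr(v))
    return ''.join(data)
code = get_random_code()
print(code)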
- uniform: generate a random floating-point number
- choice: pick one item
  - Application: verification codes, lottery draws
- sample: pick several items
  - Application: drawing several winners for one prize
- shuffle: shuffle a sequence in place
  - Application: card-shuffling algorithms
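The four functions listed above can be tried out as follows. This is a minimal sketch: the `names` list and the seed value are made up for illustration, and the seed is fixed only so that repeated runs behave the same way.

```python
import random

random.seed(42)  # assumption: fixed seed so repeated runs are reproducible

names = ['alex', 'eric', 'tony', 'lily', 'anna']  # hypothetical candidates

price = random.uniform(1, 10)       # random float between 1 and 10
winner = random.choice(names)       # pick one item
winners = random.sample(names, 3)   # pick three distinct items

cards = list(range(1, 11))
random.shuffle(cards)               # shuffles the list in place and returns None

print(price, winner, winners, cards)
```

Note that `shuffle` changes the list itself, whereas `sample` returns a new list and leaves the original untouched.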
6.2.1.2 hashlib
- Digest algorithm module
  - Verifying passwords stored as ciphertext
  - Checking file consistency
- md5
# Compute the MD5 digest of a given string
import hashlib  # import the module
def get_md5(data):  # MD5 digest function
    obj = hashlib.md5()
    obj.update(data.encode('utf-8'))
    result = obj.hexdigest()
    return result
val = get_md5('123')
print(val)
# Adding a salt
import hashlib
def get_md5(data):
    obj = hashlib.md5("sidrsdxff123ad".encode('utf-8'))  # salt
    obj.update(data.encode('utf-8'))
    result = obj.hexdigest()
    return result
val = get_md5('123')
print(val)
# Application: user registration + user login
import hashlib
USER_LIST = []
def get_md5(data):  # salted MD5 digest function
    obj = hashlib.md5("12:;idrsicxwersdfsaersdfs123ad".encode('utf-8'))  # salt
    obj.update(data.encode('utf-8'))
    result = obj.hexdigest()
    return result
def register():  # user registration
    print('**************Register**************')
    while True:
        user = input('Please enter a username: ')
        if user == 'N':  # entering N ends registration
            return
        pwd = input('Please enter a password: ')
        temp = {'username': user, 'password': get_md5(pwd)}
        USER_LIST.append(temp)
def login():  # user login
    print('**************Login**************')
    user = input('Please enter a username: ')
    pwd = input('Please enter a password: ')
    for item in USER_LIST:
        if item['username'] == user and item['password'] == get_md5(pwd):
            return True
    return False
register()
result = login()
if result:
    print('Login succeeded')
else:
    print('Login failed')
- sha1
import hashlib
sha1 = hashlib.sha1('salt'.encode())  # the initial data acts as the salt
sha1.update(b'str')
print(sha1.hexdigest())
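The file consistency check mentioned at the start of this section could look roughly like this. It is a sketch under assumptions: the helper name `file_md5`, the chunk size, and the two demo files are all made up for illustration.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=1024):
    """Return the MD5 hex digest of a file, reading it chunk by chunk."""
    obj = hashlib.md5()
    with open(path, mode='rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:          # an empty bytes object means end of file
                break
            obj.update(chunk)
    return obj.hexdigest()

# Hypothetical demo files with identical content
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, 'a.bin')
path_b = os.path.join(tmp_dir, 'b.bin')
for p in (path_a, path_b):
    with open(p, mode='wb') as f:
        f.write(b'same content' * 1000)

same = file_md5(path_a) == file_md5(path_b)
print(same)  # True
```

Calling `update` repeatedly gives the same digest as hashing the whole content at once, so large files never need to be loaded into memory in one piece.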
6.2.1.3 getpass
- Only works when run in a terminal
- getpass.getpass: the password is not displayed while it is typed
import getpass  # import the module
pwd = getpass.getpass('Please enter the password: ')
if pwd == '123':
    print('Password correct')
6.2.1.4 time [commonly used]
- Time module
time.time: timestamp (number of seconds elapsed since 1970)
# https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login?loginicon=true&uuid=4ZwIFHM6iw==&tip=1&r=-781028520&_=1555559189206
time.sleep: pause for the given number of seconds
time.timezone: offset of the local (non-DST) time zone from UTC, in seconds
Example
# Measure how long a function takes to run
import time
def wrapper(func):
    def inner():
        start_time = time.time()
        v = func()
        end_time = time.time()
        print(end_time - start_time)
        return v
    return inner
@wrapper
def func1():
    time.sleep(2)
    print(123)
func1()
6.2.1.5 datetime
- Date and time module
datetime.now(): current local time
datetime.utcnow(): current UTC time
from datetime import datetime, timezone, timedelta
# Get the time as a datetime object
# Current local time
v1 = datetime.now()
# Current time in UTC+7
tz = timezone(timedelta(hours=7))
v2 = datetime.now(tz)
# Current UTC time (utcnow is deprecated since Python 3.12; datetime.now(timezone.utc) is the timezone-aware alternative)
v3 = datetime.utcnow()
print(v3)
Conversion
import time
from datetime import datetime, timedelta
# 1. Converting between datetime and strings
# datetime to string: strftime
v1 = datetime.now()
val = v1.strftime("%Y-%m-%d %H:%M:%S")
# string to datetime: strptime
v1 = datetime.strptime('2011-11-11', '%Y-%m-%d')
# 2. Adding and subtracting time
v1 = datetime.strptime('2011-11-11', '%Y-%m-%d')
v2 = v1 - timedelta(days=140)
# convert back to a string
date = v2.strftime('%Y-%m-%d')
# 3. Converting between timestamps and datetime
# timestamp to datetime: fromtimestamp
ctime = time.time()
v1 = datetime.fromtimestamp(ctime)
# datetime to timestamp: timestamp
v1 = datetime.now()
val = v1.timestamp()
6.2.1.6 sys
- Data related to the Python interpreter
sys.getrefcount: get the reference count of an object
sys.getrecursionlimit: the maximum recursion depth Python allows by default
sys.stdout.write: write output
Supplement: \n: newline, \t: tab, \r: return to the start of the current line
import time
for i in range(1, 101):
    msg = "%s%%\r" % i
    print(msg, end='')
    time.sleep(0.05)
Example: a progress bar while reading a file
import os
# 1. Get the file size in bytes
file_size = os.stat('20190409_192149.mp4').st_size
# 2. Read the file a little at a time
read_size = 0
with open('20190409_192149.mp4', mode='rb') as f1, open('a.mp4', mode='wb') as f2:
    while read_size < file_size:
        chunk = f1.read(1024)  # read at most 1024 bytes each time
        f2.write(chunk)
        read_size += len(chunk)
        val = int(read_size / file_size * 100)
        print('%s%%\r' % val, end='')
sys.argv: get the arguments passed in when the user runs the script
- Example: the user passes the path of a directory to delete when running the script, and the script deletes that directory
# Command line with an argument:
# C:\Python36\python36.exe D:/code/s21day14/7.模块传参.py D:/test
# Command line without an argument (sys.argv[1] would then raise IndexError):
# C:\Python36\python36.exe D:/code/s21day14/7.模块传参.py
import sys
import shutil
# Get the arguments passed in when the script is run.
# sys.argv = ['D:/code/s21day14/7.模块传参.py', 'D:/test']
path = sys.argv[1]
# Delete the directory
shutil.rmtree(path)
sys.exit(0): terminate the program; 0 means normal termination
sys.path: when importing a module, Python searches the paths listed in sys.path
- Add a directory: sys.path.append('directory')
import sys
sys.path.append('D:\\goodboy')
sys.modules: stores every module the current program has loaded; used when reflecting on the contents of the current file
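A short sketch of how sys.modules can be combined with getattr for reflection. The choice of the `math` module and its `sqrt` function is only an illustration; any loaded module and member name would do.

```python
import math
import sys

module_object = sys.modules['math']       # the already-loaded math module
func = getattr(module_object, 'sqrt')     # look up a member by its name as a string
result = func(16)

print(module_object is math)    # True
print(result)                   # 4.0
print(sys.getrecursionlimit())  # the current recursion limit
print(sys.getrefcount(module_object))  # reference count of the module object
```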
6.2.1.7 os [commonly used]
- Data related to the operating system
os.path.exists(path): returns True if the path exists, False if it does not
os.stat('file path').st_size / os.path.getsize: get the size of a file
os.path.abspath(): get the absolute path of a file
import os
v1 = os.path.abspath(__file__)  # absolute path of the running script
print(v1)
os.path.dirname(): get the parent directory of a path
import os
v = r"D:\code\s21day14\20190409_192149.mp4"
print(os.path.dirname(v))
Supplement: escaping
v1 = r"D:\code\s21day14\n1.mp4"  # recommended: the r prefix keeps the backslashes literal
v2 = "D:\\code\\s21day14\\n1.mp4"
Joining paths: os.path.join
import os
path = r"D:\code\s21day14"  # user/index/inx/fasd/
v = 'n.txt'
result = os.path.join(path, v)
print(result)
os.listdir: list everything in a directory [first level only]
import os
result = os.listdir(r'D:\code\s21day14')
for path in result:
    print(path)
os.walk: list all files at every level
import os
result = os.walk(r'D:\code\s21day14')
for a, b, c in result:
    # a: the directory being visited; b: the folders in it; c: the files in it
    for item in c:
        path = os.path.join(a, item)
        print(path)
os.mkdir: create a directory; it can only create a single level (rarely used)
os.makedirs: create a directory together with its subdirectories (recommended)
# Write content into a given file
import os
file_path = r'db\xx\xo\xxxxx.txt'
file_folder = os.path.dirname(file_path)
if not os.path.exists(file_folder):
    os.makedirs(file_folder)
with open(file_path, mode='w', encoding='utf-8') as f:
    f.write('asdf')
os.rename: rename
# Rename db to sb
import os
os.rename('db', 'sb')
os.path.isdir: check whether a path is a folder
os.path.isfile: check whether a path is a file
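Several of the functions above can be exercised together without touching real data. This is a sketch that works in a temporary directory; the folder names `db` and `xx` and the file name `n.txt` are only examples.

```python
import os
import tempfile

base = tempfile.mkdtemp()                       # hypothetical working directory
file_folder = os.path.join(base, 'db', 'xx')    # nested folders that do not exist yet
file_path = os.path.join(file_folder, 'n.txt')

if not os.path.exists(file_folder):
    os.makedirs(file_folder)                    # creates db and db/xx in one call

with open(file_path, mode='w', encoding='utf-8') as f:
    f.write('asdf')

print(os.path.isdir(file_folder))   # True
print(os.path.isfile(file_path))    # True
print(os.path.getsize(file_path))   # 4
print(os.listdir(base))             # ['db']

# Collect every file at every level below base
found = []
for current_dir, folders, files in os.walk(base):
    for name in files:
        found.append(os.path.join(current_dir, name))
print(found)
```

`os.listdir` only shows the first level (`db`), while `os.walk` reaches the file two levels down.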
6.2.1.8 shutil
- Uses: deleting, renaming, compressing, decompressing, etc.
shutil.rmtree(path): delete a directory
# Delete a directory
import shutil
path = 'D:/test'  # example path of the directory to delete
shutil.rmtree(path)
shutil.move: rename
# Rename
import shutil
shutil.move('test', 'ttt')
shutil.make_archive: compress files
# Compress a folder
import shutil
shutil.make_archive('zzh', 'zip', r'D:\code\s21day16\lizhong')
shutil.unpack_archive: decompress a file
# Decompress a file
import shutil
shutil.unpack_archive('zzh.zip', extract_dir=r'D:\code\xxxxxx\xxxx', format='zip')
Example
import os
import shutil
from datetime import datetime
ctime = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
# 1. Compress the lizhongwei folder into a zip file
# 2. Put it in the code directory (which does not exist by default)
# 3. Unpack the file into D:\x1
if not os.path.exists('code'):
    os.makedirs('code')
shutil.make_archive(os.path.join('code', ctime), 'zip', r'D:\code\s21day16\lizhongwei')
file_path = os.path.join('code', ctime) + '.zip'
shutil.unpack_archive(file_path, r'D:\x1', 'zip')
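The same compress / unpack / delete cycle can be tried safely in a temporary directory. This is a sketch only; the names `src`, `backup`, `out` and `a.txt` are made up for illustration.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                 # hypothetical working directory
src = os.path.join(base, 'src')
os.makedirs(src)
with open(os.path.join(src, 'a.txt'), 'w', encoding='utf-8') as f:
    f.write('hello')

# make_archive returns the path of the archive it created
archive = shutil.make_archive(os.path.join(base, 'backup'), 'zip', src)

out = os.path.join(base, 'out')
shutil.unpack_archive(archive, out, 'zip')
unpacked = os.listdir(out)

shutil.rmtree(src)                        # remove the original folder and its contents

print(archive)
print(unpacked)   # ['a.txt']
```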
6.2.1.9 json
- json is a specially formatted string that looks like Python lists / dicts / strings / numbers, possibly nested
- Serialization: converting a Python value into a json-format string
- Deserialization: converting a json-format string into a Python data type
- json format requirements: it is essentially a string
  - It contains only int / float / str / list / dict
  - By convention the outermost layer is a list / dict (what remains once the string's outer quotes are removed)
  - Inside json, strings must use double quotes
  - Dictionary keys can only be str
  - Several objects dumped into one file cannot be loaded back one after another
json.dumps(): serialization
- json only supports serializing dict / list / tuple / str / int / float / True / False / None
- If a dict or list contains Chinese and you want it to stay readable after serialization, pass ensure_ascii=False
import json
v = {'k1': 'alex', 'k2': '李杰'}
val = json.dumps(v, ensure_ascii=False)  # ensure_ascii=False keeps the Chinese characters as they are
print(val)  # {"k1": "alex", "k2": "李杰"}
json.loads(): deserialization
import json
# Serialization: convert a Python value into a json-format string
v = [12, 3, 4, {'k1': 'v1'}, True, 'asdf']
v1 = json.dumps(v)
print(v1)
# Deserialization: convert a json-format string into a Python data type
v2 = '["alex",123]'
print(type(v2))
v3 = json.loads(v2)
print(v3, type(v3))
json.dump: serialize a value and write it to an open file
import json
v = {'k1': 'alex', 'k2': '李杰'}
f = open('x.txt', mode='w', encoding='utf-8')
val = json.dump(v, f)
print(val)  # None: json.dump writes to the file and returns nothing
f.close()
json.load: read the contents of an open file and deserialize them
import json
f = open('x.txt', mode='r', encoding='utf-8')
data = json.load(f)
f.close()
print(data, type(data))
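The format rules listed earlier in this section can be observed directly. This is a small sketch with made-up sample values: a tuple comes back as a list, a non-str key becomes a str, and a string that uses single quotes is rejected.

```python
import json

# A tuple value and an int key, to see what json does with them
text = json.dumps({'k1': (1, 2), 1: 'one', 'ok': True, 'none': None})
print(text)   # {"k1": [1, 2], "1": "one", "ok": true, "none": null}

data = json.loads(text)
print(data)   # {'k1': [1, 2], '1': 'one', 'ok': True, 'none': None}

# Strings inside json must use double quotes
try:
    json.loads("['alex', 123]")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)   # False
```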
6.2.1.10 pickle
- The difference between pickle and json
  - json
    - Advantage: common to all languages
    - Disadvantage: can only serialize basic data types such as list / dict
  - pickle
    - Advantages: can serialize almost everything in Python (socket objects are an exception); supports loading several times in a row
    - Disadvantage: the serialized content is only understood by Python
pickle.dumps: serialization
- The serialized result is not human-readable
pickle.loads: deserialization
import pickle
# Serialization
v = {1, 2, 3, 4}
val = pickle.dumps(v)
print(val)
# Deserialization
data = pickle.loads(val)
print(data, type(data))
pickle.dump: write to a file (note: mode='wb')
pickle.load: read from a file (note: mode='rb')
import pickle
# Write to a file
v = {1, 2, 3, 4}
f = open('x.txt', mode='wb')
val = pickle.dump(v, f)
f.close()
# Read from the file
f = open('x.txt', mode='rb')
data = pickle.load(f)
f.close()
print(data)
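Loading several times in a row, mentioned above as an advantage of pickle, can be sketched as follows. An in-memory `io.BytesIO` buffer stands in for a file opened in binary mode; that substitution is an assumption made to keep the example self-contained.

```python
import io
import pickle

buffer = io.BytesIO()                  # stands in for a file opened with mode='wb'
pickle.dump({1, 2, 3}, buffer)         # first object
pickle.dump(['alex', 123], buffer)     # second object, written right after the first

buffer.seek(0)                         # go back to the start, as if reopening with mode='rb'
first = pickle.load(buffer)            # each load reads exactly one object
second = pickle.load(buffer)

print(first)    # {1, 2, 3}
print(second)   # ['alex', 123]
```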
6.2.1.11 copy
- Copy module
copy.copy: shallow copy
copy.deepcopy: deep copy
import copy
v1 = [1, 2, 3]
v2 = copy.copy(v1)      # shallow copy
v3 = copy.deepcopy(v1)  # deep copy
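With a flat list the two kinds of copy behave the same; the difference only shows up with nested data. A small sketch with made-up values:

```python
import copy

v1 = [1, 2, [3, 4]]
v2 = copy.copy(v1)       # shallow copy: new outer list, shared inner list
v3 = copy.deepcopy(v1)   # deep copy: the inner list is copied as well

v1[2].append(5)          # change the nested list
v1.append(6)             # change the outer list

print(v1)   # [1, 2, [3, 4, 5], 6]
print(v2)   # [1, 2, [3, 4, 5]]
print(v3)   # [1, 2, [3, 4]]
```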
6.2.1.12 importlib
importlib.import_module: import a module given its name as a string
# Example one:
import importlib
# Import a module using a string.
redis = importlib.import_module('utils.redis')
# Use a string to look up a member of the object (module).
getattr(redis, 'func')()
# Example two:
import importlib
middleware_classes = [
    'utils.redis.Redis',
    # 'utils.mysql.MySQL',
    'utils.mongo.Mongo'
]
for path in middleware_classes:
    module_path, class_name = path.rsplit('.', maxsplit=1)
    module_object = importlib.import_module(module_path)  # same as: from utils import redis
    cls = getattr(module_object, class_name)
    obj = cls()
    obj.connect()
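The examples above rely on a project-specific `utils` package. The same technique can be tried with the standard library alone; the dotted path `collections.OrderedDict` is chosen here purely for illustration.

```python
import importlib

path = 'collections.OrderedDict'                         # dotted path as a plain string
module_path, class_name = path.rsplit('.', maxsplit=1)   # 'collections', 'OrderedDict'

module_object = importlib.import_module(module_path)     # same as: import collections
cls = getattr(module_object, class_name)                 # look up the class by name
obj = cls(a=1)

print(module_path, class_name)   # collections OrderedDict
print(obj['a'])                  # 1
```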
6.2.1.13 logging
Log module: logging
- Logs for users: transaction-record style logs, such as bank statements
- Logs for programmers:
  - For statistics
  - For troubleshooting and debugging
  - For recording errors, so the code can be optimized
The core of log handling: Logger / FileHandler / Formatter
Two ways to configure:
basicConfig
- Advantage: easy to use
- Disadvantages: cannot set the file encoding directly (before Python 3.9), and with filename= alone cannot output to a file and the screen at the same time
logger objects
- Advantages: solves the encoding problem and can output to a file and the screen at the same time
- Disadvantage: more complex
- Example:
import logging
# Create a logger object
logger = logging.getLogger()
# Create a file handler
fh = logging.FileHandler('log.log', encoding='utf-8')
# Create a screen (stream) handler
sh = logging.StreamHandler()
# Attach the file handler to the logger
logger.addHandler(fh)
# Attach the screen handler to the logger
logger.addHandler(sh)
# Create a format
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
# Set the format on the file handler
fh.setFormatter(formatter)
# Set the format on the screen handler
sh.setFormatter(formatter)
# Log through the logger object
logger.warning('message')
Log levels
CRITICAL = 50  # crash
FATAL = CRITICAL
ERROR = 40  # error
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0
Recommended way to handle logs
import logging
file_handler = logging.FileHandler(filename='x1.log', mode='a', encoding='utf-8')
logging.basicConfig(
    format='%(asctime)s - %(name)s - %(levelname)s -%(module)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S %p',
    handlers=[file_handler],
    level=logging.ERROR
)
logging.error('hello')
Recommended way to handle logs + log rotation
import time
import logging
from logging import handlers
# file_handler = logging.FileHandler(filename='x1.log', mode='a', encoding='utf-8')
file_handler = handlers.TimedRotatingFileHandler(filename='x3.log', when='s', interval=5, encoding='utf-8')
logging.basicConfig(
    format='%(asctime)s - %(name)s - %(levelname)s -%(module)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S %p',
    handlers=[file_handler],
    level=logging.ERROR
)
for i in range(1, 100000):
    time.sleep(1)
    logging.error(str(i))
Note:
# To keep the exception's stack trace when logging
import logging
import requests
logging.basicConfig(
    filename='wf.log',
    format='%(asctime)s - %(name)s - %(levelname)s -%(module)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S %p',
    level=logging.ERROR
)
try:
    requests.get('http://www.xxx.com')
except Exception as e:
    msg = str(e)  # calls e.__str__()
    logging.error(msg, exc_info=True)
6.2.1.14 collections
OrderedDict: ordered dictionary
from collections import OrderedDict
odic = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
print(odic)
for k in odic:
    print(k, odic[k])
defaultdict: dictionary with default values
deque: double-ended queue
namedtuple: tuple whose fields have names
# Creates a class with no methods whose attribute values cannot be modified
from collections import namedtuple  # named tuple
Course = namedtuple('Course', ['name', 'price', 'teacher'])
python = Course('python', 19800, 'alex')
print(python)
print(python.name)
print(python.price)
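defaultdict and deque are listed above without an example. A minimal sketch with made-up data:

```python
from collections import defaultdict, deque

# defaultdict: a missing key gets a default value (here an empty list) automatically
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)
print(dict(groups))   # {'a': ['apple', 'avocado'], 'b': ['banana']}

# deque: items can be added and removed at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)     # deque([0, 1, 2, 3])
queue.append(4)         # deque([0, 1, 2, 3, 4])
left = queue.popleft()  # 0
right = queue.pop()     # 4
print(left, right, list(queue))   # 0 4 [1, 2, 3]
```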
6.2.1.15 re
Regular expressions
- Definition
  - A regular expression is a rule for matching strings
  - The re module is only a tool for working with regular expressions in Python; it is not the regular expression syntax itself
- Why do we need regular expressions?
  - Matching strings
    - A person's phone number
    - A person's ID card number
    - A machine's IP address
  - Form validation
    - Check that the information entered by the user is valid
    - Bank card numbers
  - Web crawlers
    - Extract links and other important data from a page's source code
- Regular expression rules
  - Rule one: an ordinary character matches that same character in the string
  - Rule two: a character set [character1character2] stands for one character position; if the character at that position is any member of the set, it matches
    - Ranges can be used inside a character set
    - Every range follows ASCII order and must be written from small to large
    - Common ranges: [0-9] [a-z] [A-Z]
- Metacharacters
  - \d: any digit
    - \ is the escape character; escaping d lets it match any digit from 0 to 9
  - \w: uppercase and lowercase letters, digits and underscores
  - \s: whitespace: spaces, line breaks and tabs
  - \t: tab
  - \n: newline
  - \D: any non-digit
  - \W: any character except digits, letters and underscores
  - \S: any non-whitespace character
  - .: any character except a newline
  - [] character set: matches any one character listed inside the brackets
  - [^] negated character set: matches any one character not listed inside the brackets
  - ^: the start of the string
  - $: the end of the string
  - |: or
    - Note: if the two alternatives overlap, put the longer one first and the shorter one after it
  - (): grouping; marks part of the pattern as a group, which can also narrow the scope of |
  - Special cases:
    - [\d], [0-9] and \d are equivalent: each matches one digit
    - [\d\D], [\w\W] and [\s\S] match any character at all
- Quantifiers
  - {n}: exactly n times
  - {n,}: at least n times
  - {n,m}: at least n times and at most m times
  - ?: 0 or 1 time; the item is optional but appears at most once, such as a decimal point
  - +: 1 or more times
  - *: 0 or more times; the item is optional and may repeat, such as the digits after a decimal point
  - Exercises:
    - Match any number written with two decimal places
    - Match an integer or a decimal
- Greedy matching
  - By default matching is greedy: it always matches as much as possible within the range the quantifier allows
- Non-greedy matching: lazy matching
  - Always matches as little of the string as possible while still satisfying the conditions
  - Format: metacharacter quantifier ? x
    - The metacharacter matches within the quantifier's range and stops as soon as x is encountered
    - Example: .*?x matches any content as few times as possible and stops at the first x
- Escaping:
  - The escape character of regular expressions happens to be the escape character of Python strings as well
  - The two kinds of escaping are unrelated to each other, and they can easily conflict
  - To avoid the conflict, treat the pattern that works in a regex testing tool as the reference
  - Then simply put an r in front of both the pattern and the string to be matched
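The quantifier, greedy / lazy and escaping rules above can be checked with a few short calls. The sample strings are made up, and the two patterns for the exercises are only one possible answer.

```python
import re

# Exercise: a number written with exactly two decimal places
two_places = re.fullmatch(r'\d+\.\d{2}', '3.14')
print(two_places is not None)   # True

# Exercise: an integer or a decimal (the fractional part is optional)
int_or_decimal = re.findall(r'\d+(?:\.\d+)?', '1.234+2')
print(int_or_decimal)           # ['1.234', '2']

# Greedy: .* takes as much as possible before the last >
greedy = re.findall(r'<.*>', '<a><b>')
print(greedy)                   # ['<a><b>']

# Lazy: .*? stops at the first >
lazy = re.findall(r'<.*?>', '<a><b>')
print(lazy)                     # ['<a>', '<b>']

# Escaping: without r, Python turns \n into one newline character
print(len('\n'), len(r'\n'))    # 1 2
```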
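Email rules
There must be content before the @, and it may contain only letters (upper or lower case), digits, underscores (_), hyphens (-) and dots (.)
There must be content between the @ and the last dot (.), and it may contain only letters (upper or lower case), digits, dots (.) and hyphens (-); two dots may not be adjacent
There must be content after the last dot (.), and it may contain only letters (upper or lower case) and digits, with a length of at least 2 and at most 6 characters
Regular expression for email validation:
^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z0-9]{2,6}$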
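The pattern above can be tried against a few addresses. This is a sketch: the helper name `is_valid_email` and the sample addresses are made up for illustration.

```python
import re

EMAIL_PATTERN = re.compile(
    r'^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z0-9]{2,6}$'
)

def is_valid_email(text):
    """Return True if text satisfies the email rules described above."""
    return EMAIL_PATTERN.search(text) is not None

samples = [
    'alex_83@mail.example.com',   # valid
    'alex@example..com',          # two adjacent dots
    'alex@example.c',             # last part shorter than 2 characters
    'example.com',                # no @
]
results = [is_valid_email(s) for s in samples]
print(results)   # [True, False, False, False]
```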
The re module
re.findall: matches every item in the string that fits the rule and returns them as a list; returns an empty list if nothing matches
import re
ret = re.findall(r'\d+', 'alex83')
print(ret)  # ['83']
# findall matches every item in the string that fits the rule
# and returns them as a list
# if nothing matches, it returns an empty list
re.search: returns an object if there is a match, whose value is read with group; returns None if there is no match, in which case group cannot be used
import re
ret = re.search(r'\d+', 'alex83')
print(ret)  # an object if there is a match, None otherwise
if ret:
    print(ret.group())
    # an object implements group, so the value can be read
    # None has no group method, so calling it would raise an error
# search scans the string from start to end and takes the first item that fits the rule
# if there is a match, it returns an object; read the value with group
# if there is no match, it returns None and group cannot be used
re.match: match = search + ^ at the start of the pattern
import re
ret = re.match(r'\d', 'alex83')  # equivalent to re.search(r'^\d', 'alex83')
print(ret)  # None, because the string does not start with a digit
# match checks the string from the very first character
# if it fits the rule, an object is returned; read the value with group
# if it does not fit, None is returned
re.finditer: returns an iterator; when there are many matches this saves memory (lower space complexity)
import re
ret = re.finditer(r'\d', 'safhl02urhefy023908' * 20000000)  # ret is an iterator
for i in ret:  # every item produced is a match object
    print(i.group())  # read the value with group
re.compile: when the same regular expression is used many times, compiling it once reduces the overhead
import re
ret = re.compile(r'\d+')
r1 = ret.search('alex83')
r2 = ret.findall('wusir74')
r3 = ret.finditer('taibai40')
for i in r3:
    print(i.group())
re.split: split a string using a regular expression
import re
ret = re.split(r'\d(\d)', 'alex83wusir74taibai')  # the content of groups is kept in the result by default
print(ret)  # ['alex', '3', 'wusir', '4', 'taibai']
re.sub / re.subn: replace using a regular expression
import re
ret = re.sub(r'\d', 'D', 'alex83wusir74taibai', count=1)
print(ret)  # 'alexD3wusir74taibai'
ret = re.subn(r'\d', 'D', 'alex83wusir74taibai')
print(ret)  # ('alexDDwusirDDtaibai', 4)
- Groups and the re module
Reading values from groups
import re
s1 = '<h1>wahaha</h1>'
ret = re.search(r'<(\w+)>(.*?)</\w+>', s1)
print(ret)
print(ret.group(0))  # the argument of group defaults to 0: the whole match
print(ret.group(1))  # content of the first group
print(ret.group(2))  # content of the second group
Named groups: (?P<name>regex)
import re
s1 = '<h1>wahaha</h1>'
ret = re.search(r'<(?P<tag>\w+)>(?P<cont>.*?)</\w+>', s1)
print(ret)
print(ret.group('tag'))   # content of the group named tag
print(ret.group('cont'))  # content of the group named cont
Group references: (?P=name) means the content here must be exactly the same as what the named group matched earlier
import re
# Method one:
s = '<h1>wahaha</h1>'
ret = re.search(r'<(?P<tag>\w+)>.*?</(?P=tag)>', s)
print(ret.group('tag'))  # 'h1'
# Method two:
s = '<h1>wahaha</h1>'
ret = re.search(r'<(\w+)>.*?</\1>', s)
print(ret.group(1))  # 'h1'
Groups and findall: by default findall gives priority to the content of groups; to cancel this priority use a non-capturing group (?:regex)
import re
ret = re.findall(r'\d(\d)', 'aa1alex83')
# when the pattern contains a group, findall returns the group's content first
print(ret)
# Cancel the group priority:
ret = re.findall(r'\d+(?:\.\d+)?', '1.234+2')
print(ret)
Sometimes the content we want is contained inside content we do not want. In that case the only option is to match the unwanted content as well and then remove it from the result
import re
ret = re.findall(r"\d+\.\d+|(\d+)", "1-2*(60+(-40.35/5)-(-4*3))")
print(ret)  # ['1', '2', '60', '', '5', '4', '3']
ret.remove('')
print(ret)  # ['1', '2', '60', '5', '4', '3']
Web crawler examples
# Method one:
import re
import json
import requests
def parser_page(par, content):
    res = par.finditer(content)
    for i in res:
        yield {'id': i.group('id'),
               'title': i.group('title'),
               'score': i.group('score'),
               'com_num': i.group('comment_num')}
def get_page(url):
    ret = requests.get(url)
    return ret.text
pattern = r'<div class="item">.*?<em class="">(?P<id>\d+)</em>.*?<span class="title">(?P<title>.*?)</span>.*?' \
          r'<span class="rating_num".*?>(?P<score>.*?)</span>.*?<span>(?P<comment_num>.*?)人评价</span>'
par = re.compile(pattern, flags=re.S)
num = 0
with open('movie_info', mode='w', encoding='utf-8') as f:
    for i in range(10):
        content = get_page('https://movie.douban.com/top250?start=%s&filter=' % num)
        g = parser_page(par, content)
        for dic in g:
            f.write('%s\n' % json.dumps(dic, ensure_ascii=False))
        num += 25
# Method two: advanced
import re
import json
import requests
def parser_page(par, content):
    res = par.finditer(content)
    for i in res:
        yield {'id': i.group('id'),
               'title': i.group('title'),
               'score': i.group('score'),
               'com_num': i.group('comment_num')}
def get_page(url):
    ret = requests.get(url)
    return ret.text
def write_file(file_name):
    with open(file_name, mode='w', encoding='utf-8') as f:
        while True:
            dic = yield  # receives each record passed in with send()
            f.write('%s\n' % json.dumps(dic, ensure_ascii=False))
pattern = r'<div class="item">.*?<em class="">(?P<id>\d+)</em>.*?<span class="title">(?P<title>.*?)</span>.*?' \
          r'<span class="rating_num".*?>(?P<score>.*?)</span>.*?<span>(?P<comment_num>.*?)人评价</span>'
par = re.compile(pattern, flags=re.S)
num = 0
f = write_file('move2')
next(f)  # advance the generator to its first yield
for i in range(10):
    content = get_page('https://movie.douban.com/top250?start=%s&filter=' % num)
    g = parser_page(par, content)
    for dic in g:
        f.send(dic)
    num += 25
f.close()
6.2.2 Third-party modules
6.2.2.1 Basics
Must be downloaded and installed before they can be used
Installation:
- The pip package management tool
# Add the directory containing pip.exe to the PATH environment variable.
pip install <name of the module to install>
# e.g. pip install xlrd
- Installing from source
# Download the source package (compressed file) -> unpack it -> open a cmd window and change into the directory: cd C:\Python37\Lib\site-packages
# Run: python3 setup.py build
# Run: python3 setup.py install
Installation path: C:\Python37\Lib\site-packages
6.2.2.2 Popular third-party modules
- requests
- xlrd
6.2.3 Custom modules
Write your own xx.py
def f1():
    print('f1')
def f2():
    print('f2')
Call it in yy.py
# Use the functions of the custom module
import xx
xx.f1()
xx.f2()
Run
python yy.py
6.3 Calling modules
Note: do not give your own files or folders the same name as a module you want to import; otherwise the one in the current directory is found first
6.3.1 Absolute imports
1. Basic importing and calling of modules
- Importing the file XXX.py
- Method one
# Import the module; this loads every value in the module into memory.
import XXX
# Call a function in the module
XXX.func()
- Method two
# Import func and show from XXX.py
from XXX import func, show
# Import every value from XXX.py
from XXX import *
# Call a function from the module
func()
- Method three
# If names clash, give an alias when importing
# Import func from XXX.py under the alias f
from XXX import func as f
# Call the function from the module
f()
Summary
- Import: import module; call: module.function()
- Import: from module import function; call: function()
- Import: from module import function as alias; call: alias()
- Key points:
  - as: gives an alias
  - *: stands for everything
Supplement
Importing a module several times does not reload it
import jd  # first import: everything in jd is loaded once
import jd  # already loaded, so it is not loaded again
print(456)
To force a reload
import importlib
import jd
importlib.reload(jd)
print(456)
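The caching and reload behaviour can be observed with a throwaway module. This is a sketch under assumptions: the module name `demo_jd` and the temporary folder are made up, and the module body creates a fresh object each time it runs so that a re-execution is easy to detect.

```python
import importlib
import os
import sys
import tempfile

# Assumption: a throwaway module named demo_jd, written to a temporary folder
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, 'demo_jd.py'), 'w', encoding='utf-8') as f:
    f.write('TOKEN = object()  # a new object every time the module body runs\n')

sys.path.append(tmp_dir)
importlib.invalidate_caches()

import demo_jd                       # first import: the module body runs
first_token = demo_jd.TOKEN

import demo_jd                       # second import: served from sys.modules, body does not run
same_after_reimport = demo_jd.TOKEN is first_token

importlib.reload(demo_jd)            # forces the module body to run again
same_after_reload = demo_jd.TOKEN is first_token

print(same_after_reimport)   # True
print(same_after_reload)     # False
```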
2. Importing and calling .py files inside a folder
- The file XXX.py is placed in the folder YYY
- Method one
# Import the module
import YYY.XXX
# Call a function in the module
YYY.XXX.func()
- Method two
# Import the module
from YYY import XXX
# Call a function in the module
XXX.func()
- Method three
# Import the function
from YYY.XXX import func
# Call the function
func()
- Summary
  - When the module is in the same directory as the .py file being executed and many of its functions are needed, the recommended form is:
    - Import: import module; call: module.function()
  - Recommended in other cases:
    - Import: from package import module; call: module.function()
    - Import: from package.module import function; call: function()
6.3.2 Relative imports (not recommended)
# Only valid inside a package; they fail in a file that is run directly as a script
from . import xxx
from .. import xxx