day11(Alex python)

开始学习模块的内容。

模块的内容有点多，分为两天，整理到一起。

=============================================================================================

第一天内容

模块：用一砣代码实现了某个功能的代码集合。

类似于函数式编程和面向过程编程，函数式编程则完成一个功能，其他代码用来调用即可，提供了代码的重用性和代码间的耦合。而对于一个复杂的功能来，可能需要多个函数才能完成（函数又可以在不同的.py文件中），n个 .py 文件组成的代码集合就称为模块。

如：os 是系统相关的模块；file是文件操作相关的模块

模块分为三种：

自定义模块
内置标准模块（又称标准库）
开源模块

自定义模块和开源模块的使用参考 http://www.cnblogs.com/wupeiqi/articles/4963027.html

这里主要说明标准库模块。

1. time和datetime

在Python中，通常有这几种方式来表示时间：1）时间戳 2）格式化的时间字符串 3）元组（struct_time）共九个元素。由于Python的time模块实现主要调用C库，所以各个平台可能有所不同。

UTC（Coordinated Universal Time，世界协调时）亦即格林威治天文时间，世界标准时间。在中国为UTC+8。DST（Daylight Saving Time）即夏令时。

时间戳（timestamp）的方式：通常来说，时间戳表示的是从1970年1月1日00:00:00开始按秒计算的偏移量。我们运行“type(time.time())”，返回的是float类型。返回时间戳方式的函数主要有time()，clock()等。

元组（struct_time）方式：struct_time元组共有9个元素，返回struct_time的函数主要有gmtime()，localtime()，strptime()。下面列出这种方式元组中的几个元素：

# -*- coding:utf-8 -*-
# Author: Agent Xu

import time

print(time.timezone) #当前时间与UTC的时间差
#-28800
print(time.altzone)
#-32400
print(time.time()) #返回时间戳
#1536220317.0408027
print(time.clock()) #返回处理时间
#3.004624417440883e-07

time.sleep(2) #暂停2秒

print(time.gmtime()) #当前时间的UTC格式
#time.struct_time(tm_year=2018, tm_mon=9, tm_mday=6, tm_hour=7, tm_min=51, tm_sec=59, tm_wday=3, tm_yday=249, tm_isdst=0)
print(time.localtime())  #当前时间的UTC格式+8小时，东八区
#time.struct_time(tm_year=2018, tm_mon=9, tm_mday=6, tm_hour=15, tm_min=51, tm_sec=59, tm_wday=3, tm_yday=249, tm_isdst=0)

#时间格式
print(time.asctime()) #参数为元组
#Thu Sep  6 15:51:59 2018
print(time.ctime()) #参数为时间戳
#Thu Sep  6 15:51:59 2018

x = time.localtime()
print(time.mktime(x)) #转换为时间戳
#1536220319.0
print(time.strftime('%Y-%m-%d %H:%M:%S',x))
#2018-09-06 15:51:59
print(time.strptime('2018-09-06 15:49:06','%Y-%m-%d %H:%M:%S'))#以后面的格式为准进行赋值操作
#time.struct_time(tm_year=2018, tm_mon=9, tm_mday=6, tm_hour=15, tm_min=49, tm_sec=6, tm_wday=3, tm_yday=249, tm_isdst=-1)
#可以赋值再调用时间元组的对象 ：x.tm_year = 2018

格式参照

%a    本地（locale）简化星期名称    
%A    本地完整星期名称    
%b    本地简化月份名称    
%B    本地完整月份名称    
%c    本地相应的日期和时间表示    
%d    一个月中的第几天（01 - 31）    
%H    一天中的第几个小时（24小时制，00 - 23）    
%I    第几个小时（12小时制，01 - 12）    
%j    一年中的第几天（001 - 366）    
%m    月份（01 - 12）    
%M    分钟数（00 - 59）    
%p    本地am或者pm的相应符    一    
%S    秒（01 - 61）    二    
%U    一年中的星期数。（00 - 53星期天是一个星期的开始。）第一个星期天之前的所有天数都放在第0周。    三    
%w    一个星期中的第几天（0 - 6，0是星期天）    三    
%W    和%U基本相同，不同的是%W以星期一为一个星期的开始。    
%x    本地相应日期    
%X    本地相应时间    
%y    去掉世纪的年份（00 - 99）    
%Y    完整的年份    
%Z    时区的名字（如果不存在为空字符）    
%%    ‘%’字符

时间转换关系：

import datetime

print(datetime.datetime.now()) #获取当前时间
#2018-09-06 16:25:54.265802
print(datetime.date.fromtimestamp(time.time()))  # 时间戳直接转成日期格式
#2018-09-06
print(datetime.datetime.now() + datetime.timedelta(3)) #当前时间+3天
#2018-09-09 16:25:54.265802
print(datetime.datetime.now() - datetime.timedelta(3)) #当前时间-3天
#2018-09-03 16:25:54.265802
print(datetime.datetime.now() + datetime.timedelta(hours=3)) #当前时间+3小时
#2018-09-06 19:25:54.265802
print(datetime.datetime.now() + datetime.timedelta(minutes=30)) #当前时间+30分
#2018-09-06 16:55:54.265802

2. random模块

# -*- coding:utf-8 -*-
# Author: Agent Xu

import random

#random.random()用于生成一个0到1的随机符点数: 0 <= n < 1.0
print(random.random())#0.024380841289870503
#可以指定区间
print(random.uniform(1, 10))#9.766959041710198

#random.randint()的函数原型为：random.randint(a, b)，用于生成一个指定范围内的整数。
# 其中参数a是下限，参数b是上限，生成的随机数n: a <= n <= b
print(random.randint(1,7))#4

#random.choice从序列中获取一个随机元素。
# 其函数原型为：random.choice(sequence)。参数sequence表示一个有序类型。
# 这里要说明一下：sequence在python不是一种特定的类型，而是泛指一系列的类型。
# list, tuple, 字符串都属于sequence。有关sequence可以查看python手册数据模型这一章。
print(random.choice('agentxu'))#n

# 下面是使用choice的一些例子：
print(random.choice("学习Python"))#h
print(random.choice(["JGood","is","a","handsome","boy"]))  #is
print(random.choice(("Tuple","List","Dict")))   #Dict

#多个字符中选取特定数量的字符：
print(random.sample('abcdefghij',3))#['e', 'h', 'c']

#洗牌#
items = [1,2,3,4,5,6,7]
print(items) #[1, 2, 3, 4, 5, 6, 7]
random.shuffle(items)
print(items) #[7, 6, 5, 3, 2, 1, 4]

生成随机验证码：

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0,4)
    if current != i:
        temp = chr(random.randint(65,90))
    else:
        temp = random.randint(0,9)
    checkcode += str(temp)
print (checkcode)

3. os模块

提供对操作系统进行调用的接口

os.getcwd() 获取当前工作目录，即当前python脚本工作的目录路径
os.chdir("dirname")  改变当前脚本工作目录；相当于shell下cd
os.curdir  返回当前目录: ('.')
os.pardir  获取当前目录的父目录字符串名：('..')
os.makedirs('dirname1/dirname2')    可生成多层递归目录
os.removedirs('dirname1')    若目录为空，则删除，并递归到上一级目录，如若也为空，则删除，依此类推
os.mkdir('dirname')    生成单级目录；相当于shell中mkdir dirname
os.rmdir('dirname')    删除单级空目录，若目录不为空则无法删除，报错；相当于shell中rmdir dirname
os.listdir('dirname')    列出指定目录下的所有文件和子目录，包括隐藏文件，并以列表方式打印
os.remove()  删除一个文件
os.rename("oldname","newname")  重命名文件/目录
os.stat('path/filename')  获取文件/目录信息
os.sep    输出操作系统特定的路径分隔符，win下为"\\",Linux下为"/"
os.linesep    输出当前平台使用的行终止符，win下为"\t\n",Linux下为"\n"
os.pathsep    输出用于分割文件路径的字符串
os.name    输出字符串指示当前使用平台。win->'nt'; Linux->'posix'
os.system("bash command")  运行shell命令，直接显示
os.environ  获取系统环境变量
os.path.abspath(path)  返回path规范化的绝对路径
os.path.split(path)  将path分割成目录和文件名二元组返回
os.path.dirname(path)  返回path的目录。其实就是os.path.split(path)的第一个元素
os.path.basename(path)  返回path最后的文件名。如何path以／或\结尾，那么就会返回空值。即os.path.split(path)的第二个元素
os.path.exists(path)  如果path存在，返回True；如果path不存在，返回False
os.path.isabs(path)  如果path是绝对路径，返回True
os.path.isfile(path)  如果path是一个存在的文件，返回True。否则返回False
os.path.isdir(path)  如果path是一个存在的目录，则返回True。否则返回False
os.path.join(path1[, path2[, ...]])  将多个路径组合后返回，第一个绝对路径之前的参数将被忽略
os.path.getatime(path)  返回path所指向的文件或者目录的最后存取时间
os.path.getmtime(path)  返回path所指向的文件或者目录的最后修改时间

更多点击这里

这里建议格式：函数（r“......”）加一个‘r’，清晰且不容易出错。

=============================================================================================

第二天内容

4. sys模块

sys.argv           命令行参数List，第一个元素是程序本身路径
sys.exit(n)        退出程序，正常退出时exit(0)
sys.version        获取Python解释程序的版本信息
sys.maxint         最大的Int值
sys.path           返回模块的搜索路径，初始化时使用PYTHONPATH环境变量的值
sys.platform       返回操作系统平台名称
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]

5. shutil模块

参考http://www.cnblogs.com/wupeiqi/articles/4963027.html

高级的文件、文件夹、压缩包处理模块

6. json和pickle模块

上一篇day10已经进行详细说明：https://blog.csdn.net/xq920831/article/details/82413894

7. XML处理模块

xml是实现不同语言或程序之间进行数据交换的协议，跟json差不多，但json使用起来更简单，不过，古时候，在json还没诞生的黑暗年代，大家只能选择用xml呀，至今很多传统公司如金融行业的很多系统的接口还主要是xml。

xml的格式如下，就是通过<>节点来区别数据结构的:

<?xml version="1.0"?>

<data>

<country name="Liechtenstein">

<rank updated="yes">2</rank>

<year>2008</year>

<gdppc>141100</gdppc>

<neighbor name="Austria" direction="E"/>

<neighbor name="Switzerland" direction="W"/>

</country>

<country name="Singapore">

<rank updated="yes">5</rank>

<year>2011</year>

<gdppc>59900</gdppc>

<neighbor name="Malaysia" direction="N"/>

</country>

<country name="Panama">

<rank updated="yes">69</rank>

<year>2011</year>

<gdppc>13600</gdppc>

<neighbor name="Costa Rica" direction="W"/>

<neighbor name="Colombia" direction="E"/>

</country>

</data>

xml协议在各个语言里的都是支持的，在python中可以用以下模块操作xml 　　

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")

root = tree.getroot()

print(root.tag)

#遍历xml文档

for child in root:

print(child.tag, child.attrib)

for i in child:

print(i.tag,i.text)

#只遍历year 节点

for node in root.iter('year'):

print(node.tag,node.text)

修改和删除xml文档内容

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")

root = tree.getroot()

#修改

for node in root.iter('year'):

new_year = int(node.text) + 1

node.text = str(new_year)

node.set("updated","yes")

tree.write("xmltest.xml")

#删除node

for country in root.findall('country'):

rank = int(country.find('rank').text)

if rank > 50:

root.remove(country)

tree.write('output.xml')

自己创建xml文档

import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")

name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})

age = ET.SubElement(name,"age",attrib={"checked":"no"})

sex = ET.SubElement(name,"sex")

sex.text = '33'

name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})

age = ET.SubElement(name2,"age")

age.text = '19'

et = ET.ElementTree(new_xml) #生成文档对象

et.write("test.xml", encoding="utf-8",xml_declaration=True)

ET.dump(new_xml) #打印生成的格式

8. PyYAML

https://yq.aliyun.com/articles/509500

配置文件。。。需要用时查询。

9. ConfigParser

用于生成和修改常见配置文档，当前模块的名称在 python 3.x 版本中变更为 configparser。

来看一个好多软件的常见文档格式如下

[DEFAULT]

ServerAliveInterval = 45

Compression = yes

CompressionLevel = 9

ForwardX11 = yes

[bitbucket.org]

User = hg

[topsecret.server.com]

Port = 50022

ForwardX11 = no

如果想用python生成一个这样的文档怎么做呢？

import configparser

config = configparser.ConfigParser()

config["DEFAULT"] = {'ServerAliveInterval': '45',

'Compression': 'yes',

'CompressionLevel': '9'}

config['bitbucket.org'] = {}

config['bitbucket.org']['User'] = 'hg'

config['topsecret.server.com'] = {}

topsecret = config['topsecret.server.com']

topsecret['Host Port'] = '50022' # mutates the parser

topsecret['ForwardX11'] = 'no' # same here

config['DEFAULT']['ForwardX11'] = 'yes'

with open('example.ini', 'w') as configfile:

config.write(configfile)

写完了还可以再读出来哈。

>>> import configparser

>>> config = configparser.ConfigParser()

>>> config.sections()

[]

>>> config.read('example.ini')

['example.ini']

>>> config.sections()

['bitbucket.org', 'topsecret.server.com']

>>> 'bitbucket.org' in config

True

>>> 'bytebong.com' in config

False

>>> config['bitbucket.org']['User']

'hg'

>>> config['DEFAULT']['Compression']

'yes'

>>> topsecret = config['topsecret.server.com']

>>> topsecret['ForwardX11']

'no'

>>> topsecret['Port']

'50022'

>>> for key in config['bitbucket.org']: print(key)

...

user

compressionlevel

serveraliveinterval

compression

forwardx11

>>> config['bitbucket.org']['ForwardX11']

'yes'

configparser增删改查语法

[section1]

k1 = v1

k2:v2

[section2]

k1 = v1

import ConfigParser

config = ConfigParser.ConfigParser()

config.read('i.cfg')

# ########## 读 ##########

#secs = config.sections()

#print secs

#options = config.options('group2')

#print options

#item_list = config.items('group2')

#print item_list

#val = config.get('group1','key')

#val = config.getint('group1','key')

# ########## 改写 ##########

#sec = config.remove_section('group1')

#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')

#sec = config.add_section('wupeiqi')

#config.write(open('i.cfg', "w"))

#config.set('group2','k1',11111)

#config.write(open('i.cfg', "w"))

#config.remove_option('group2','age')

#config.write(open('i.cfg', "w"))

10. hashlib模块

映射加密，用于加密相关的操作，3.x里代替了md5模块和sha模块，主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ，MD5 算法。

import hashlib

m = hashlib.md5()

m.update(b"Hello")

m.update(b"It's me")

print(m.digest())

m.update(b"It's been a long time since last time we ...")

print(m.digest()) #2进制格式hash

print(len(m.hexdigest())) #16进制格式hash

'''

def digest(self, *args, **kwargs): # real signature unknown

""" Return the digest value as a string of binary data. """

pass

def hexdigest(self, *args, **kwargs): # real signature unknown

""" Return the digest value as a string of hexadecimal digits. """

pass

'''

import hashlib

# ######## md5 ########

hash = hashlib.md5()

hash.update('admin')

print(hash.hexdigest())

# ######## sha1 ########

hash = hashlib.sha1()

hash.update('admin')

print(hash.hexdigest())

# ######## sha256 ########

hash = hashlib.sha256()

hash.update('admin')

print(hash.hexdigest())

# ######## sha384 ########

hash = hashlib.sha384()

hash.update('admin')

print(hash.hexdigest())

# ######## sha512 ########

hash = hashlib.sha512()

hash.update('admin')

print(hash.hexdigest())

还不够吊？python 还有一个 hmac 模块，它内部对我们创建 key 和内容再进行处理然后再加密

散列消息鉴别码，简称HMAC，是一种基于消息鉴别码MAC（Message Authentication Code）的鉴别机制。使用HMAC时,消息通讯的双方，通过验证消息中加入的鉴别密钥K来鉴别消息的真伪；

一般用于网络通信中消息加密，前提是双方先要约定好key,就像接头暗号一样，然后消息发送把用key把消息加密，接收方用key ＋消息明文再加密，拿加密后的值跟发送者的相对比是否相等，这样就能验证消息的真实性，及发送者的合法性了。

import hmac

h = hmac.new(b'天王盖地虎', b'宝塔镇河妖')

print h.hexdigest()

更多关于md5,sha1,sha256等介绍的文章看这里https://www.tbs-certificates.co.uk/FAQ/en/sha256.html

上述代码只是为了方便日后查询直接粘贴过来的，如有需求还得亲测实践一下。

11. re模块（比较重要）

常用作正则表达式符号，主要用于动态模糊匹配字符串。

'.' 默认匹配除\n之外的任意一个字符，若指定flag DOTALL,则匹配任意字符，包括换行

'^' 匹配字符开头，若指定flags MULTILINE,这种也可以匹配上(r"^a","\nabc\neee",flags=re.MULTILINE)

'$' 匹配字符结尾，或e.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()也可以

'*' 匹配*号前的字符0次或多次，re.findall("ab*","cabb3abcbbac") 结果为['abb', 'ab', 'a']

'+' 匹配前一个字符1次或多次，re.findall("ab+","ab+cd+abb+bba") 结果['ab', 'abb']

'?' 匹配前一个字符1次或0次

'{m}' 匹配前一个字符m次

'{n,m}' 匹配前一个字符n到m次，re.findall("ab{1,3}","abb abc abbcbbb") 结果'abb', 'ab', 'abb']

'|' 匹配|左或|右的字符，re.search("abc|ABC","ABCBabcCD").group() 结果'ABC'

'(...)' 分组匹配，re.search("(abc){2}a(123|456)c", "abcabca456c").group() 结果 abcabca456c

'\A' 只从字符开头匹配，re.search("\Aabc","alexabc") 是匹配不到的

'\Z' 匹配字符结尾，同$

'\d' 匹配数字0-9

'\D' 匹配非数字

'\w' 匹配[A-Za-z0-9]

'\W' 匹配非[A-Za-z0-9]

's' 匹配空白字符、\t、\n、\r , re.search("\s+","ab\tc1\n3").group() 结果 '\t'

'(?P<name>...)' 分组匹配 re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict("city") 结果{'province': '3714', 'city': '81', 'birthday': '1993'}

单独需要小写字母[a-z]，需要大小写字母[a-zA-Z]　　

最常用的匹配语法

re.match 从头开始匹配

re.search 匹配包含

re.findall 把所有匹配到的字符放到以列表中的元素返回

re.splitall 以匹配到的字符当做列表分隔符

re.sub 匹配字符并替换

反斜杠的困扰
与大多数编程语言相同，正则表达式里使用"\"作为转义字符，这就可能造成反斜杠困扰。假如你需要匹配文本中的字符"\"，那么使用编程语言表示的正则表达式里将需要4个反斜杠"\\\\"：前两个和后两个分别用于在编程语言里转义成反斜杠，转换成两个反斜杠后再在正则表达式里转义成一个反斜杠。Python里的原生字符串很好地解决了这个问题，这个例子中的正则表达式可以使用r"\\"表示。同样，匹配一个数字的"\\d"可以写成r"\d"。有了原生字符串，你再也不用担心是不是漏写了反斜杠，写出来的表达式也更直观。

仅需轻轻知道的几个匹配模式

re.I(re.IGNORECASE): 忽略大小写（括号内是完整写法，下同）

M(MULTILINE): 多行模式，改变'^'和'$'的行为（参见上图）

S(DOTALL): 点任意匹配模式，改变'.'的行为

内容很多，语句也很多，而且还有一些模块没有写在这，遇到了再查询吧！

1. time和datetime

2. random模块

3. os模块

4. sys模块

5. shutil模块

6. json和pickle模块

7. XML处理模块

8. PyYAML

9. ConfigParser

10. hashlib模块

11. re模块（比较重要）

猜你喜欢