Python third-party modules

Pillow

    PIL (Python Imaging Library) is the de facto standard image processing library for Python. It is powerful, yet its API is simple and easy to use. Because PIL itself only supports Python up to 2.7, the compatible fork Pillow was created; it supports the latest Python 3.x and adds many new features, so we can install and use Pillow directly.

Manipulate images

The most common image scaling operation requires only three or four lines of code:

from PIL import Image

# Open a jpg image file, note the current path:
im = Image.open('test.jpg')
# Get image dimensions:
w, h = im.size
print('Original image size: %sx%s' % (w, h))
# Zoom to 50%:
im.thumbnail((w//2, h//2))
print('Resize image to: %sx%s' % (w//2, h//2))
# Save the scaled image in jpeg format:
im.save('thumbnail.jpg', 'jpeg')
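Note that thumbnail() keeps the aspect ratio and never enlarges the image, so the result fits inside the requested box rather than always matching it exactly. The arithmetic can be sketched in plain Python. The helper name fit_within is hypothetical, and Pillow's own rounding may differ by a pixel:

```python
def fit_within(size, box):
    """Return the largest size that fits inside box while keeping the aspect ratio.

    Simplified sketch: the image is never enlarged, and each side is
    rounded to the nearest pixel (at least 1).
    """
    w, h = size
    max_w, max_h = box
    scale = min(max_w / w, max_h / h, 1.0)
    return max(1, round(w * scale)), max(1, round(h * scale))


print(fit_within((800, 600), (400, 300)))   # box has the same aspect ratio
print(fit_within((800, 600), (400, 400)))   # square box, non-square image
print(fit_within((200, 100), (400, 400)))   # already small enough, unchanged
```

In the scaling example above the box is exactly half the original in both directions, so the printed size and the real thumbnail size agree.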
Other operations such as cropping, rotating, filtering, drawing text, and working with palettes are all available.
For example, a blur effect also takes just a few lines of code:
from PIL import Image, ImageFilter

# Open a jpg image file, note the current path:
im = Image.open('test.jpg')
# Apply the blur filter:
im2 = im.filter(ImageFilter.BLUR)
im2.save('blur.jpg', 'jpeg')


PIL's ImageDraw module provides a set of drawing methods that let us draw directly on an image. For example, to generate a CAPTCHA image of random letters:

from PIL import Image, ImageDraw, ImageFont, ImageFilter

import random

# random letters:
def rndChar():
    return chr(random.randint(65, 90))

# random color 1:
def rndColor():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))

# random color 2:
def rndColor2():
    return (random.randint(32, 127), random.randint(32, 127), random.randint(32, 127))

# 240 x 60:
width = 60 * 4
height = 60
image = Image.new('RGB', (width, height), (255, 255, 255))
# Create the Font object (arial.ttf must be available on your system):
font = ImageFont.truetype('arial.ttf', 36)
# Create the Draw object:
draw = ImageDraw.Draw(image)
# Fill each pixel:
for x in range(width):
    for y in range(height):
        draw.point((x, y), fill=rndColor())
# Output text:
for t in range(4):
    draw.text((60 * t + 10, 10), rndChar(), font=font, fill=rndColor2())
# Apply blur:
image = image.filter(ImageFilter.BLUR)
image.save('code.jpg', 'jpeg')
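The script above draws four random letters but does not keep them, whereas a verification flow would normally also need the expected answer. One possible approach, sketched here with only the standard library, is to generate the text first and then draw it character by character. The helper name random_code is hypothetical:

```python
import random
import string


def random_code(length=4, rng=random):
    """Return `length` random uppercase letters, like calling rndChar() repeatedly."""
    return ''.join(rng.choice(string.ascii_uppercase) for _ in range(length))


# chr(65) is 'A' and chr(90) is 'Z', so randint(65, 90) covers exactly A-Z.
print(chr(65), chr(90))

code = random_code()
print(code)  # keep this value to compare with the user's input later
```

The random module is not cryptographically secure, so this is only an illustration of the idea.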

requests

    Python's built-in urllib module can access network resources, but it is cumbersome to use and lacks many useful advanced features.

    A better option is requests, a third-party Python library that makes working with URL resources especially convenient.

To access a page via GET, it only takes a few lines of code:

>>> import requests
>>> r = requests.get('https://www.douban.com/') # Douban homepage
>>> r.status_code
200
>>> r.text
'<!DOCTYPE HTML>\n<html>\n<head>\n<meta name="description" content="Provide book, movie, music album recommendations, reviews and...'
For URLs with parameters, pass a dict as the params argument:
>>> r = requests.get('https://www.douban.com/search', params={'q': 'python', 'cat': '1001'})
>>> r.url # The actual requested URL
'https://www.douban.com/search?q=python&cat=1001'
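To see how a dict turns into the query string shown in r.url, the same URL can be built by hand with the standard library. This is a sketch of the encoding step only, and it sends no request:

```python
from urllib.parse import urlencode

base = 'https://www.douban.com/search'
params = {'q': 'python', 'cat': '1001'}

query = urlencode(params)          # percent-encodes keys and values, joins with '&'
url = f'{base}?{query}'
print(url)

# Values that are not URL-safe are escaped:
escaped = urlencode({'q': 'python 3 & pillow'})
print(escaped)
```

Passing params as a dict avoids having to do this escaping yourself.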
requests automatically detects the response encoding, which you can check with the encoding attribute:
>>> r.encoding
'utf-8'
Whether the response is text or binary, the content attribute returns the body as a bytes object:
>>> r.content
b'<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n...'
Another convenience of requests is that certain response types, such as JSON, can be parsed directly:
>>> r = requests.get('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20weather.forecast%20where%20woeid%20%3D%202151330&format=json')
>>> r.json()
{'query': {'count': 1, 'created': '2017-11-17T07:14:12Z', ...
To send HTTP headers, pass a dict as the headers argument:
>>> r = requests.get('https://www.douban.com/', headers={'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit'})
>>> r.text
'<!DOCTYPE html>\n<html>\n<head>\n<meta charset="UTF-8">\n <title>豆瓣(手机版)</title>...'
To send a POST request, change get() to post() and pass the data argument as the body of the POST request:
>>> r = requests.post('https://accounts.douban.com/login', data={'form_email': '[email protected]', 'form_password': '123456'})
By default, requests encodes POST data as application/x-www-form-urlencoded. To send JSON instead, pass the json argument:
params = {'key': 'value'}
r = requests.post(url, json=params) # serialized to JSON automatically
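The difference between the two body formats is easy to see by producing both from the same dict. The sketch below uses only the standard library and a hypothetical payload; it illustrates the two encodings, not the exact bytes requests puts on the wire:

```python
import json
from urllib.parse import urlencode

payload = {'key': 'value', 'n': 1}

form_body = urlencode(payload)     # form-style body, as with data=
json_body = json.dumps(payload)    # JSON body, as with json=

print('application/x-www-form-urlencoded:', form_body)
print('application/json:', json_body)
```

Form encoding flattens every value to a string, while JSON keeps types such as numbers and nested structures.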
Similarly, uploading files requires a more complex encoding, but requests reduces it to the files argument:
>>> upload_files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=upload_files)

When opening the file to upload, be sure to use the binary mode 'rb', so that the length of the bytes read equals the length of the file.
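The reason binary mode matters can be shown with a small temporary file: in text mode the content is decoded into characters, so its length no longer matches the size on disk. The sample content below is made up for illustration:

```python
import os
import tempfile

# Write a small file containing a non-ASCII character (hypothetical sample data).
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as f:
    f.write('café\n'.encode('utf-8'))
    path = f.name

with open(path, 'rb') as f:                    # binary mode: raw bytes
    raw = f.read()
with open(path, 'r', encoding='utf-8') as f:   # text mode: decoded characters
    text = f.read()

file_size = os.path.getsize(path)
os.remove(path)

print(len(raw), len(text), file_size)
```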

Replace post() with put(), delete(), and so on to send PUT or DELETE requests.

Besides making the response body easy to get, requests also makes it simple to get other information about the HTTP response. For example, to get the response headers:

>>> r.headers
{'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Content-Encoding': 'gzip', ...}
>>> r.headers['Content-Type']
'text/html; charset=utf-8'
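A header value such as Content-Type often carries parameters, like the charset here. If you need them separately, one possible way is to let the standard library's email.message parser split the value. The helper name parse_content_type is hypothetical:

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg['Content-Type'] = value
    return msg.get_content_type(), msg.get_content_charset()


print(parse_content_type('text/html; charset=utf-8'))
print(parse_content_type('application/json'))
```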
requests handles cookies specially, so we can get a specific cookie without parsing the cookie header ourselves:
>>> r.cookies['ts']
'example_cookie_12345'
To send cookies with a request, pass a dict as the cookies argument:
>>> cs = {'token': '12345', 'status': 'working'}
>>> r = requests.get(url, cookies=cs)
Finally, to specify a timeout, pass in the timeout parameter in seconds:
>>> r = requests.get(url, timeout=2.5) # timeout after 2.5 seconds

chardet  

    String encoding has always been a troublesome problem, especially when dealing with non-standard third-party web pages. Python provides two data types, the Unicode str and bytes, and converts between them with the encode() and decode() methods, but it is hard to decode() bytes without knowing their encoding.

    To convert bytes of unknown encoding to str, you first have to "guess" the encoding. This is where the third-party library chardet comes in handy: it detects encodings and is simple to use.

When we have some bytes, we can detect their encoding. With chardet, detection takes only one line of code:

>>> import chardet
>>> chardet.detect(b'Hello, world!')
{'encoding': 'ascii', 'confidence': 1.0, 'language': ''}

The detected encoding is ascii. Note the confidence field, which gives the probability that the detection is correct: 1.0 (i.e. 100%).

Let's try to detect GBK-encoded Chinese:

>>> data = '离离原上草,一岁一枯荣'.encode('gbk')
>>> chardet.detect(data)
{'encoding': 'GB2312', 'confidence': 0.7407407407407407, 'language': 'Chinese'}
Detect UTF-8 encoding:
>>> data = '离离原上草,一岁一枯荣'.encode('utf-8')
>>> chardet.detect(data)
{'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
Detect Japanese text:
>>> data = '最新の主要ニュース'.encode('euc-jp')
>>> chardet.detect(data)
{'encoding': 'EUC-JP', 'confidence': 0.99, 'language': 'Japanese'}

Detecting an encoding with chardet is simple. Once you have the encoding, convert the bytes to str for convenient further processing.
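The final step, turning the detection result into a str, might look like the sketch below. The helper decode_detected and the 0.5 confidence threshold are assumptions for illustration, and the detection dict is a hard-coded stand-in for the output of chardet.detect() so that the example runs without chardet installed:

```python
def decode_detected(data, detection, min_confidence=0.5):
    """Decode bytes using a chardet-style result dict.

    `detection` is expected to look like the results shown above:
    {'encoding': ..., 'confidence': ..., 'language': ...}.
    Falls back to UTF-8 with replacement characters when the guess is
    missing or below min_confidence (an arbitrary threshold).
    """
    encoding = detection.get('encoding')
    if encoding and detection.get('confidence', 0.0) >= min_confidence:
        try:
            return data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            pass  # wrong guess, or a codec name Python does not know
    return data.decode('utf-8', errors='replace')


data = '离离原上草'.encode('gbk')
# Hard-coded stand-in for chardet.detect(data).
detection = {'encoding': 'GB2312', 'confidence': 0.74, 'language': 'Chinese'}
print(decode_detected(data, detection))
```

Checking the confidence before decoding is a design choice: a low-confidence guess can decode without error and still produce garbage.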

psutil

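    To get system information in Python, use the third-party module psutil. As the name suggests, psutil = process and system utilities. It can implement system monitoring in a line or two of code, and it is cross-platform, supporting Linux/UNIX/OSX/Windows, which makes it an indispensable module for system administrators and operations engineers.

Get CPU information

    Let's first get information about the CPU: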

>>> import psutil
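>>> psutil.cpu_count() # number of logical CPUs
4
>>> psutil.cpu_count(logical=False) # number of physical CPU cores
2
# 2 means dual-core with hyper-threading; 4 would mean 4 cores without hyper-threading
    Get the CPU user/system/idle times: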
>>> psutil.cpu_times()
scputimes(user=10963.31, nice=0.0, system=5138.67, idle=356102.45)
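These values are cumulative seconds, so on their own they are hard to read. A quick way to interpret them is to convert each field into a share of the total. The sketch below uses a namedtuple stand-in with the numbers shown above, so it runs without psutil:

```python
from collections import namedtuple

# Stand-in for the psutil result above, so the sketch runs without psutil.
scputimes = namedtuple('scputimes', 'user nice system idle')
times = scputimes(user=10963.31, nice=0.0, system=5138.67, idle=356102.45)

total = sum(times)
shares = {name: value / total * 100 for name, value in times._asdict().items()}

for name, pct in shares.items():
    print(f'{name:>6}: {pct:5.1f}%')
```

Because the counters are cumulative, these shares are averages over the whole period they cover, not the current load.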
    Next, show CPU usage like the top command does, refreshing once per second, 10 times in total:
>>> for x in range(10):
...     psutil.cpu_percent(interval=1, percpu=True)
... 
[14.0, 4.0, 4.0, 4.0]
[12.0, 3.0, 4.0, 3.0]
[8.0, 4.0, 3.0, 4.0]
[12.0, 3.0, 3.0, 3.0]
[18.8, 5.1, 5.9, 5.0]
[10.9, 5.0, 4.0, 3.0]
[12.0, 5.0, 4.0, 5.0]
[15.0, 5.0, 4.0, 4.0]
[19.0, 5.0, 5.0, 4.0]
[9.0, 3.0, 2.0, 3.0]

Get memory information

    Use psutil to get physical memory and swap memory information, respectively:

>>> psutil.virtual_memory()
svmem(total=8589934592, available=2866520064, percent=66.6, used=7201386496, free=216178688, active=3342192640, inactive=2650341376, wired=1208852480)
>>> psutil.swap_memory()
sswap(total=1073741824, used=150732800, free=923009024, percent=14.0, sin=10705981440, sout=40353792)

    The values are integers in bytes. As you can see, total memory is 8589934592 = 8 GB, of which 7201386496 = 6.7 GB is used, and usage is 66.6%.

    The swap size is 1073741824 = 1 GB.
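The conversions above can be reproduced with a small formatter. format_bytes is a hypothetical helper that uses binary units, matching the text's convention that 1 GB = 1024^3 bytes. The last line recomputes the percentage as (total - available) / total, which is how the psutil documentation describes the percent field:

```python
def format_bytes(n):
    """Format a byte count using binary units (1 KB = 1024 bytes here)."""
    units = ['B', 'KB', 'MB', 'GB', 'TB']
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f'{value:.1f} {unit}'
        value /= 1024


total, available, used = 8589934592, 2866520064, 7201386496
percent = round((total - available) / total * 100, 1)

print(format_bytes(total))        # total memory
print(format_bytes(used))         # used memory
print(format_bytes(1073741824))   # swap size
print(percent)
```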

Get disk information

    psutil can get disk partitions, disk usage, and disk I/O information:

>>> psutil.disk_partitions() # disk partition information
[sdiskpart(device='/dev/disk1', mountpoint='/', fstype='hfs', opts='rw,local,rootfs,dovolfs,journaled,multilabel')]
>>> psutil.disk_usage('/') # disk usage
sdiskusage(total=998982549504, used=390880133120, free=607840272384, percent=39.1)
>>> psutil.disk_io_counters() # disk I/O
sdiskio(read_count=988513, write_count=274457, read_bytes=14856830464, write_bytes=17509420032, read_time=2228966, write_time=1618405)

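    As you can see, disk '/' has a total capacity of 998982549504 = 930 GB, of which 39.1% is used. The file system is HFS; in opts, rw means readable and writable, and journaled means journaling is supported.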
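The opts field is a plain comma-separated string, so checking for a particular mount option only needs a split. A minimal sketch using the value shown above:

```python
opts = 'rw,local,rootfs,dovolfs,journaled,multilabel'
flags = set(opts.split(','))

print('writable:', 'rw' in flags)
print('journaled:', 'journaled' in flags)
print('read-only:', 'ro' in flags)
```

Testing membership in the split set avoids false matches that a substring test on the raw string could produce.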

Get network information

psutil can get network interface and network connection information:

>>> psutil.net_io_counters() # number of bytes/packets sent and received
snetio(bytes_sent=3885744870, bytes_recv=10357676702, packets_sent=10613069, packets_recv=10423357, errin=0, errout=0, dropin=0, dropout=0)
>>> psutil.net_if_addrs() # network interface information
{
  'lo0': [snic(family=<AddressFamily.AF_INET: 2>, address='127.0.0.1', netmask='255.0.0.0'), ...],
  'en1': [snic(family=<AddressFamily.AF_INET: 2>, address='10.0.1.80', netmask='255.255.255.0'), ...],
  'en0': [...],
  'en2': [...],
  'bridge0': [...]
}
>>> psutil.net_if_stats() # network interface status
{
  'lo0': snicstats(isup=True, duplex=<NicDuplex.NIC_DUPLEX_UNKNOWN: 0>, speed=0, mtu=16384),
  'en0': snicstats(isup=True, duplex=<NicDuplex.NIC_DUPLEX_UNKNOWN: 0>, speed=0, mtu=1500),
  'en1': snicstats(...),
  'en2': snicstats(...),
  'bridge0': snicstats(...)
}

   To get information about current network connections, use net_connections():

>>> psutil.net_connections()
Traceback (most recent call last):
  ...
PermissionError: [Errno 1] Operation not permitted

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  ...
psutil.AccessDenied: psutil.AccessDenied (pid=3847)
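    You may get an AccessDenied error. The reason is that psutil also has to go through system interfaces to get this information, and getting network connection information requires root privileges. In that case, exit the Python interactive environment and restart it with sudo: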
$ sudo python3
Password: ******
Python 3.6.3 ... on darwin
Type "help", ... for more information.
>>> import psutil
>>> psutil.net_connections()
[
    sconn(fd=83, family=<AddressFamily.AF_INET6: 30>, type=1, laddr=addr(ip='::127.0.0.1', port=62911), raddr=addr(ip='::127.0.0.1', port=3306), status='ESTABLISHED', pid=3725),
    sconn(fd=84, family=<AddressFamily.AF_INET6: 30>, type=1, laddr=addr(ip='::127.0.0.1', port=62905), raddr=addr(ip='::127.0.0.1', port=3306), status='ESTABLISHED', pid=3725),
    sconn(fd=93, family=<AddressFamily.AF_INET6: 30>, type=1, laddr=addr(ip='::', port=8080), raddr=(), status='LISTEN', pid=3725),
    sconn(fd=103, family=<AddressFamily.AF_INET6: 30>, type=1, laddr=addr(ip='::127.0.0.1', port=62918), raddr=addr(ip='::127.0.0.1', port=3306), status='ESTABLISHED', pid=3725),
    sconn(fd=105, family=<AddressFamily.AF_INET6: 30>, type=1, ..., pid=3725),
    sconn(fd=106, family=<AddressFamily.AF_INET6: 30>, type=1, ..., pid=3725),
    sconn(fd=107, family=<AddressFamily.AF_INET6: 30>, type=1, ..., pid=3725),
    ...
    sconn(fd=27, family=<AddressFamily.AF_INET: 2>, type=2, ..., pid=1)
]

Get process information

    Detailed information about all processes can be obtained through psutil:
>>> psutil.pids() # all process IDs
[3865, 3864, 3863, 3856, 3855, 3853, 3776, ..., 45, 44, 1, 0]
>>> p = psutil.Process(3776) # Get the specified process ID=3776, which is actually the current Python interactive environment
>>> p.name() # process name
'python3.6'
>>> p.exe() # Process exe path
'/Users/michael/anaconda3/bin/python3.6'
>>> p.cwd() # Process working directory
'/Users/michael'
>>> p.cmdline() # The command line where the process starts
['python3']
>>> p.ppid() # parent process ID
3765
>>> p.parent() # parent process
<psutil.Process(pid=3765, name='bash') at 4503144040>
>>> p.children() # list of child processes
[]
>>> p.status() # process status
'running'
>>> p.username() # Process username
'michael'
>>> p.create_time() # Process creation time
1511052731.120333
>>> p.terminal() # process terminal
'/dev/ttys002'
>>> p.cpu_times() # CPU time used by the process
pcputimes(user=0.081150144, system=0.053269812, children_user=0.0, children_system=0.0)
>>> p.memory_info() # memory used by the process
pmem(rss=8310784, vms=2481725440, pfaults=3207, pageins=18)
>>> p.open_files() # files opened by the process
[]
>>> p.connections() # network connections of the process
[]
>>> p.num_threads() # number of threads in the process
1
>>> p.threads() # information about all threads
[pthread(id=1, user_time=0.090318, system_time=0.062736)]
>>> p.environ() # process environment variables
{'SHELL': '/bin/bash', 'PATH': '/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:...', 'PWD': '/Users/michael', 'LANG': 'zh_CN.UTF-8', ...}
>>> p.terminate() # terminate the process
Terminated: 15 <-- the process terminated itself
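The value returned by p.create_time() above is a Unix timestamp, that is, seconds since the epoch, so it usually needs converting before it is shown to a person. A minimal sketch using the standard library and the value from the session above:

```python
from datetime import datetime, timezone

created = 1511052731.120333   # value returned by p.create_time() above

created_utc = datetime.fromtimestamp(created, tz=timezone.utc)
print(created_utc.isoformat())

age = datetime.now(timezone.utc) - created_utc   # how long ago the process started
print(age.days, 'days ago')
```

Passing tz=timezone.utc makes the result independent of the machine's local time zone; omit it to get local time instead.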

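As with network connections, getting information about a process owned by root requires root privileges, so the Python interactive environment or the .py file must be started with sudo.

psutil also provides a test() function that mimics the output of the ps command: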

$ sudo python3
Password: ******
Python 3.6.3 ... on darwin
Type "help", ... for more information.
>>> import psutil
>>> psutil.test()
USER         PID %MEM     VSZ     RSS TTY           START    TIME  COMMAND
root           0 24.0 74270628 2016380 ?             Nov18   40:51  kernel_task
root           1  0.1 2494140    9484 ?             Nov18   01:39  launchd
root          44  0.4 2519872   36404 ?             Nov18   02:02  UserEventAgent
root          45    ? 2474032    1516 ?             Nov18   00:14  syslogd
root          47  0.1 2504768    8912 ?             Nov18   00:03  kextd
root          48  0.1 2505544    4720 ?             Nov18   00:19  fseventsd
_appleeven    52  0.1 2499748    5024 ?             Nov18   00:00  appleeventsd
root          53  0.1 2500592    6132 ?             Nov18   00:02  configd
...

