Python Notes: Quick Reference

http://yanghuangblog.com/index.php/archives/7/

Resources

Online book

https://www.py4e.com/html3/

docs

https://docs.python.org/3/

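Dash lookup

Syntax differences

Sample code and reference code


A function can return multiple values

If some of them are not needed, use _ as a placeholder, for example: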

parameters_values, _ = dictionary_to_vector(parameters)
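A minimal sketch of the same idea with a self-contained function. The name min_max is made up for illustration and is not from the original notes; the function returns a tuple, and the caller unpacks it, using _ for the part it does not need.

```python
def min_max(values):
    """Return the smallest and the largest item as a tuple (hypothetical helper)."""
    return min(values), max(values)

# Unpack both return values.
lowest, highest = min_max([3, 1, 4, 1, 5])

# Keep only the second value; _ marks the one we ignore.
_, only_highest = min_max([3, 1, 4, 1, 5])

print(lowest, highest, only_highest)
```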

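Line continuation

A Python statement normally fits on one line. When it does not, there are two ways to continue it:

1. End the line with the continuation character "\" (conventionally preceded by a space):

test = 'item_one' \
       'item_two' \
       'tem_three'

Result: 'item_oneitem_twotem_three'

2. Wrap the expression in brackets; inside (), {} or [] no continuation character is needed:

test2 = ('csdn '
         'cssdn')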

reserved words

The reserved words in the language where humans talk to Python include the following:

from import as

def return

def thing():
    pass  # please implement this
    return

pass

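pass is generally used as a placeholder, or to create a stub; it performs no operation.

During design, pass is also often used as a TODO marker for code that still has to be implemented.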

if elif else

is is not in

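What is the difference between is and == in Python?

is compares identity (whether the two ids are the same); == compares values.

In Python, everything is an object. (This is important enough to repeat three times.)

Every object has three attributes: id, type and value.

id is the object's identity (in CPython, its memory address); the built-in function id() returns it.

type is the object's type; the built-in function type() returns it.

value is the object's value.

Further notes:

is only compares identities, so it is usually cheaper than ==, which may call __eq__. The two are not interchangeable, though: use is for identity checks such as x is None, and == whenever values are being compared.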
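A small sketch of the difference, using two equal lists. The variable names are illustrative only. Lists are used because two separately built lists are always distinct objects, whereas small ints and short strings may be shared by the interpreter.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal value, but a separate object
c = a           # a second name for the same object

same_value = (a == b)    # compares contents
same_object = (a is b)   # compares identity
alias = (c is a)         # c and a refer to one object

x = None
is_none = x is None      # the idiomatic identity check

print(same_value, same_object, alias, is_none)
```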

in is used in logical tests and returns True or False.

and or

Combine several conditions.

while for in continue break

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

for i in range(10):
    pass

for i in [5,4,3,2,1]:
    pass

for countdown in 5, 4, 3, 2, 1, "hey!":
    print(countdown)

while True:
    pass

try except

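This differs quite a lot from other languages.

By default, an unhandled exception prints a traceback and stops the program.

With try plus a block, as soon as one statement in the block fails, the rest of the block is skipped and execution jumps straight to the except clause.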
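A minimal sketch of that control flow. The function name to_int and its default value are assumptions made for illustration: when int() raises, the remaining line in the try block is skipped and the except clause runs.

```python
def to_int(text, default=-1):
    """Convert text to int, falling back to a default (hypothetical helper)."""
    steps = []
    try:
        value = int(text)        # raises ValueError for non-numeric text
        steps.append("parsed")   # skipped when the line above fails
    except ValueError:
        steps.append("failed")
        value = default
    return value, steps

print(to_int("42"))
print(to_int("abc"))
```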

class

del

There is a way to remove an item from a list given its index instead of its value: the del statement.

>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]

>>> del a

with as

Without the with statement, the code looks like this:

file = open("/tmp/foo.txt")
data = file.read()
file.close()

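There are two problems here. First, you may forget to close the file handle; second, if reading the file raises an exception, nothing deals with it. Here is a sturdier version that closes the file even when an exception occurs: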

file = open("/tmp/foo.txt")
try:
    data = file.read()
finally:
    file.close()

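This code works, but it is verbose. This is where with comes in: besides the more elegant syntax, with also cleans up properly when the block raises an exception. Here is the with version: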

with open("/tmp/foo.txt") as file:
    data = file.read()

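How it works:

The object that the with expression evaluates to must have an __enter__() method and an __exit__() method.

After the expression following with is evaluated, __enter__() is called on the resulting object,
and its return value is assigned to the variable after as.
When the block under with has finished, __exit__() is called on that same object.

If the block raises an exception, __exit__() still runs, and the exception's
type, value and stack trace are passed to it. In the referenced article's example,
this is how the ZeroDivisionError raised in the block gets printed.
When writing a library, cleanup such as releasing resources and closing files can go in __exit__().

Reference: https://www.cnblogs.com/DswCnblog/p/6126588.html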
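A minimal sketch of the protocol described above. The class name Sample and its events list are made up for illustration; the block deliberately divides by zero so that __exit__() receives the exception details.

```python
class Sample:
    """Toy context manager that records what happens (illustrative only)."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self              # this is what "as" binds to

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on normal exit and on exceptions alike.
        self.events.append(("exit", exc_type))
        return False             # False means: do not swallow the exception


sample = Sample()
caught = False
try:
    with sample as s:
        s.events.append("body")
        1 / 0                    # raises ZeroDivisionError inside the block
except ZeroDivisionError:
    caught = True

print(sample.events, caught)
```

Returning True from __exit__() would suppress the exception instead; returning False lets it propagate to the caller, as here.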

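Keywords still to be organized

global yield

assert
raise
finally lambda nonlocal

Operators and other rules

Operators

** is exponentiation (power)

/ returns a float; // is floor division (an int when both operands are ints)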

>>> minute = 59
>>> minute/60
0.9833333333333333

>>> minute = 59
>>> minute//60
0 

+ concatenates strings

>>> first = '100'
>>> second = '150'
>>> print(first + second)
100150

tab

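Python 3 rejects indentation that mixes tabs and spaces inconsistently, so use spaces everywhere and set up your IDE accordingly.

Use spaces, not tabs.

Sublime Text 3 user settings: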

{
	"color_scheme": "Packages/Color Scheme - Default/Solarized (Light).tmTheme",
	"ignored_packages":
	[
		"Vintage"
	],

	"tab_size": 4,
	"translate_tabs_to_spaces": true,
}

Code blocks

A block usually starts after a ":" and ends where the indentation returns to the previous level.

if 4 > 3:
    print("111")
    print("222")

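No semicolons and no braces.

vowels.sort(reverse=True): a keyword argument sets reverse directly, without having to pass the parameters that come before it

Variables, built-in functions and classes, high-level built-in data structures

Static languages, dynamic languages, scripting languages, glue code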

Built-in Functions

type()

class type(object)

dir()

dir lists all attributes and methods of a class or object: dir(class)

input([prompt])

print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

a = np.array([[1,2,3,4]])
print(str(a.shape) + "123")
print(a.shape + "123")

(1, 4)123
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-40d703a12efd> in <module>()
     16 a = np.array([[1,2,3,4]])
     17 print(str(a.shape) + "123")
---> 18 print(a.shape + "123")

TypeError: can only concatenate tuple (not "str") to tuple

Printing a value on its own needs no str(); to join it to a string with +, convert it with str() first.

int() float() str() list() tuple()

class int(x=0)

class int(x, base=10)

class float([x])

>>> float('+1.23')
1.23
>>> float('   -12345\n')
-12345.0
>>> float('1e-003')
0.001
>>> float('+1E6')
1000000.0
>>> float()
0.0

max() min() len() sum()

open()

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised.

range() starts at 0 by default and excludes stop, like array indexing in C

class range(stop)

class range(start, stop[, step])

  • start

    The value of the start parameter (or 0 if the parameter was not supplied)

  • stop

    The value of the stop parameter

  • step

    The value of the step parameter (or 1 if the parameter was not supplied)

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25]

Others

abs() delattr() hash() memoryview() set()
all() dict() help() setattr()
any() hex() next() slice()
ascii() divmod() id() object() sorted()
bin() enumerate() oct() staticmethod()
bool() eval()
breakpoint() exec() isinstance() ord()
bytearray() filter() issubclass() pow() super()
bytes() iter()
callable() format() property()
chr() frozenset() vars()
classmethod() getattr() locals() repr() zip()
compile() globals() map() reversed() __import__()
complex() hasattr() round()

None

Without None, as in C, finding a minimum needs an extra flag to record whether the variable is still in its initial state.
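A sketch of the minimum search mentioned above, with None standing in for "nothing seen yet" so that no separate flag is needed. The function name smallest is an illustrative assumption.

```python
def smallest(values):
    """Return the smallest item, or None for an empty sequence (illustrative)."""
    result = None                    # None means "not initialised yet"
    for v in values:
        if result is None or v < result:
            result = v
    return result

print(smallest([9, 41, 12, 3, 74, 15]))
print(smallest([]))
```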

string

Slicing: [0:4] covers indices 0 1 2 3 and excludes 4

>>> fruit = 'banana'
>>> fruit[:3]
'ban'
>>> fruit[3:]
'ana'
>>> fruit = 'banana'
>>> len(fruit)
6

The expression fruit[-1] yields the last letter, fruit[-2] yields the second to last, and so on.

>>> 'a' in 'banana'
True
>>> 'seed' in 'banana'
False

Strings are immutable: their contents are read-only and cannot be changed in place

>>> greeting = 'Hello, world!'
>>> greeting[0] = 'J'
TypeError: 'str' object does not support item assignment

string methods

find

str.find(sub[, start[, end]])

Return the lowest index in the string where substring sub is found within the slice s[start:end].

Return -1 if sub is not found.

>>> line = '  Here we go  '
>>> line.strip()
'Here we go'

>>> line = 'Have a nice day'
>>> line.startswith('h')
False
>>> line.lower()
'have a nice day'
>>> line.lower().startswith('h')
True

strip and lower return a new string; the original string is not changed.

Format operator

>>> camels = 42
>>> 'I have spotted %d camels.' % camels
'I have spotted 42 camels.'

>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')
'In 3 years I have spotted 0.1 camels.'

file

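To read a file, loop over the handle with for ... in; the handle behaves as a sequence of lines

file.read() reads all the data at once

rstrip also strips the trailing newline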

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

# Code: http://www.py4e.com/code3/search7.py

list

Lists are mutable

Looping with for item in list is effectively read-only: rebinding the loop variable does not change the list

huangyangdeMacBook-Pro:python_test yang$ cat test.py
tmp = [1,2,3]

print(tmp)

for item in tmp:
    print(item)
    item = 4

print(tmp)

for i in range(len(tmp)):
    print(tmp[i])
    tmp[i] = 4

print(tmp)
huangyangdeMacBook-Pro:python_test yang$ python3 test.py
[1, 2, 3]
1
2
3
[1, 2, 3]
1
2
3
[4, 4, 4]

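For lists, + concatenates and * repeats

append adds a single item; extend adds every item of another list

sort(*, key=None, reverse=False)

sort(), like most list methods, modifies the list in place and returns None.

Removing items: pop takes an index, remove takes a value, del uses index or slice notation

del is a statement, not a built-in function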

x = t.pop(1)
t.remove('b')
del t[1]

Lists and strings (list split join)

For example, list('abc') returns ['a', 'b', 'c'] and list( (1, 2, 3) ) returns [1, 2, 3]. If no argument is given, the constructor creates a new empty list, [].

>>> s = 'pining for the fjords'
>>> t = s.split()
>>> print(t)
['pining', 'for', 'the', 'fjords']
>>> print(t[2])
the

str.split(delimiter) turns a string into a list; delimiter.join(list) turns a list back into a string

str.split(sep=None, maxsplit=-1)

>>> '1 2 3'.split()
['1', '2', '3']
>>> '1,2,3'.split(',')
['1', '2', '3']
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
>>> t = ['pining', 'for', 'the', 'fjords']
>>> delimiter = ' '
>>> delimiter.join(t)
'pining for the fjords'

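Two identical string literals defined one after another may share the same id (CPython can reuse immutable strings), but two equal lists always have different ids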

dict

>>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp)
{'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp['two'])
dos
>>> len(eng2sp)
3
>>> print(eng2sp['four'])
KeyError: 'four'
>>> 'one' in eng2sp
True
>>> 'uno' in eng2sp
False

For a dict, in searches only the keys; to search the values, get them with the values() method first

>>> d = {"one": 1, "two": 2, "three": 3, "four": 4}
>>> d
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> list(d)
['one', 'two', 'three', 'four']
>>> list(d.keys())
['one', 'two', 'three', 'four']
>>> list(d.values())
[1, 2, 3, 4]
>>> list(d.items())
[('one', 1), ('two', 2), ('three', 3), ('four', 4)]

Reverse lookup in a dict

Looking up a key from its value is a little more involved. Generally there are three ways to do it (only the first is recorded here):

A. Combine keys(), values() and index()

>>> list(student.keys())[list(student.values()).index('1004')]

Result: '小明'
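Another way to do the reverse lookup, sketched under assumptions: the student dictionary below is a made-up stand-in (the notes do not show its contents), and keys_for_value is a hypothetical helper. It walks items() once and, unlike index(), returns every matching key and does not raise when the value is absent.

```python
# Hypothetical data: name -> student number.
student = {"小明": "1004", "小红": "1002"}

def keys_for_value(mapping, target):
    """Return all keys whose value equals target (empty list if none)."""
    return [key for key, value in mapping.items() if value == target]

print(keys_for_value(student, "1004"))
print(keys_for_value(student, "9999"))
```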

get(key[, default])

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
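A short sketch of get() in its most common role, counting with a default of 0. The sample sentence is made up for illustration.

```python
counts = {}
for word in "the quick the lazy the".split():
    # get() returns 0 the first time a word is seen, so no KeyError occurs.
    counts[word] = counts.get(word, 0) + 1

missing = counts.get("dog")   # no default given, so the result is None

print(counts, missing)
```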

Common mistake

parameters['W' + str(l)]
parameters[ W + str(l)]   # error: without quotes, W is an undefined name, not the string 'W'

Tuple

Tuples are immutable

A tuple is a sequence of values much like a list. The values stored in a tuple can be any type, and they are indexed by integers. The important difference is that tuples are immutable. Tuples are also comparable and hashable so we can sort lists of them and use tuples as key values in Python dictionaries.

>>> t = 'a', 'b', 'c', 'd', 'e'
>>> type(t)
<class 'tuple'>
>>> t = ['a', 'b', 'c', 'd', 'e']
>>> type(t)
<class 'list'>
>>> t = ('a', 'b', 'c', 'd', 'e')
>>> type(t)
<class 'tuple'>

>>> t1 = ('a',)
>>> type(t1)
<class 'tuple'>
>>> t2 = ('a')
>>> type(t2)
<class 'str'>

You can’t modify the elements of a tuple, but you can replace one tuple with another:

>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',) + t[1:]
>>> t
('A', 'b', 'c', 'd', 'e')


>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',t[1:])
>>> t
('A', ('b', 'c', 'd', 'e'))

Tuple assignment

>>> m = [ 'have', 'fun' ]
>>> x, y = m
>>> x
'have'
>>> y
'fun'

>>> m = [ 'have', 'fun' ]
>>> (x, y) = m
>>> x
'have'
>>> y
'fun'

A particularly clever application of tuple assignment allows us to swap the values of two variables in a single statement:

>>> a, b = b, a

>>> addr = '[email protected]'
>>> uname, domain = addr.split('@')

Dictionaries and tuples

>>> d = {'a':10, 'b':1, 'c':22}
>>> t = list(d.items())
>>> t
[('b', 1), ('a', 10), ('c', 22)]
>>> t.sort()
>>> t
[('a', 10), ('b', 1), ('c', 22)]


for key, val in list(d.items()):
    print(val, key)

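Braces, square brackets, parentheses

A dict is defined with {}

A list is defined with []

A tuple is defined with ()

But all three are indexed with []

Similar to declarations in C:

d = {} # create an empty dict d
d = [] # create an empty list d
d = () # create an empty tuple d

Common libraries that must be imported but ship with Python (the standard library)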

random

import random

for i in range(2):
    x = random.random()
    print(x)
    
0.11132867921152356
0.5950949227890241

>>> random.randint(5, 10)
5
>>> random.randint(5, 10)
9

>>> t = [1, 2, 3]
>>> random.choice(t)
2
>>> random.choice(t)
3

Regular expressions

^ Matches the beginning of the line.

$ Matches the end of the line.

. Matches any character (a wildcard).

\s Matches a whitespace character.

\S Matches a non-whitespace character (opposite of \s).

* Applies to the immediately preceding character(s) and indicates to match zero or more times.

*? Applies to the immediately preceding character(s) and indicates to match zero or more times in “non-greedy mode”.

+ Applies to the immediately preceding character(s) and indicates to match one or more times.

+? Applies to the immediately preceding character(s) and indicates to match one or more times in “non-greedy mode”.

? Applies to the immediately preceding character(s) and indicates to match zero or one time.

?? Applies to the immediately preceding character(s) and indicates to match zero or one time in “non-greedy mode”.

[aeiou] Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.

[a-z0-9] You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.

[^A-Za-z] When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.

( ) When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall().

\b Matches the empty string, but only at the start or end of a word.

\B Matches the empty string, but not at the start or end of a word.

\d Matches any decimal digit; equivalent to the set [0-9].

\D Matches any non-digit character; equivalent to the set [^0-9].
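A small sketch of three entries from the table above: greedy versus non-greedy matching, and parentheses used for extraction with findall(). The sample strings are assumptions chosen for illustration.

```python
import re

text = "From: Using the : character"

# Greedy: .+ grabs as much as it can, so the match ends at the last colon.
greedy = re.findall(r"F.+:", text)

# Non-greedy: .+? stops at the first colon that completes the match.
lazy = re.findall(r"F.+?:", text)

# Parentheses: the whole pattern must match, but only the group is returned.
line = "X-DSPAM-Confidence: 0.8475"
number = re.findall(r"^X\S*: ([0-9.]+)", line)

print(greedy, lazy, number)
```

The patterns are written as raw strings (r"...") so that backslash sequences such as \S reach the re module unchanged.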

re.search

# Search for lines that start with 'X' followed by any non
# whitespace characters and ':'
# followed by a space and any number.
# The number can include a decimal.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)

# Code: http://www.py4e.com/code3/re10.py

When we run the program, we see the data nicely filtered to show only the lines we are looking for.

X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
X-DSPAM-Confidence: 0.6178
X-DSPAM-Probability: 0.0000

re.findall and extracting

# Search for lines that start with 'X' followed by any
# non whitespace characters and ':' followed by a space
# and any number. The number can include a decimal.
# Then print the number if it is greater than zero.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)

# Code: http://www.py4e.com/code3/re11.py

Instead of calling search(), we add parentheses around the part of the regular expression that represents the floating-point number to indicate we only want findall() to give us back the floating-point number portion of the matching string.

The output from this program is as follows:

['0.8475']
['0.0000']
['0.6178']
['0.0000']
['0.6961']
['0.0000']
..

socket

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(),end='')

mysock.close()

# Code: http://www.py4e.com/code3/socket1.py

urllib

import urllib.request

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())

# Code: http://www.py4e.com/code3/urllib1.py

Two ways to use urllib.request.urlopen

1. Loop over the handle and process each line inside the loop, decoding a line when its data is needed
import urllib.request, urllib.parse, urllib.error

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')

counts = dict()
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)

# Code: http://www.py4e.com/code3/urlwords.py

2. read() all the bytes at once and hand them to another library to process in one go
# Search for link values within the downloaded HTML
import urllib.request, urllib.parse, urllib.error
import re

url = input('Enter - ')
html = urllib.request.urlopen(url).read()
links = re.findall(b'href="(http://.*?)"', html)
for link in links:
    print(link.decode())

# Code: http://www.py4e.com/code3/urlregex.py

HTTP error 403 in Python 3 Web Scraping

from urllib.request import Request, urlopen

req = Request('http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()

xml.etree.ElementTree

XML: in results = tree.findall("comments/comment/count") the path leaves out the root element's own name; start from the first-level child and write down to the element you want

find

import xml.etree.ElementTree as ET

data = '''
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>'''

tree = ET.fromstring(data)
print('Name:', tree.find('name').text)
print('Attr:', tree.find('email').get('hide'))

# Code: http://www.py4e.com/code3/xml1.py

findall

import xml.etree.ElementTree as ET

input = '''
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''

stuff = ET.fromstring(input)
lst = stuff.findall('users/user')
print('User count:', len(lst))

for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get('x'))

# Code: http://www.py4e.com/code3/xml2.py

json

json.loads returns a dict or a list

json.loads

import json

data = '''
[
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck"
  } ,
  { "id" : "009",
    "x" : "7",
    "name" : "Brent"
  }
]'''

info = json.loads(data)
print('User count:', len(info))

for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])

# Code: http://www.py4e.com/code3/json2.py

BeautifulSoup does not need decode() first

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

With json, decode the bytes first and then parse

uh = urllib.request.urlopen(url)
data = uh.read().decode()

try:
    js = json.loads(data)
except:
    js = None

if not js or 'status' not in js or js['status'] != 'OK':
    print('==== Failure To Retrieve ====')
    print(data)
    quit()

# print(json.dumps(js, indent=1))

place_id = js["results"][0]["place_id"]

print(place_id)

sqlite3

CREATE TABLE

import sqlite3

conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Tracks')
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')

conn.close()

# Code: http://www.py4e.com/code3/db1.py

INSERT INTO & SELECT

import sqlite3

conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()

cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('Thunderstruck', 20))
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('My Way', 15))
conn.commit()

print('Tracks:')
cur.execute('SELECT title, plays FROM Tracks')
for row in cur:
     print(row)

cur.execute('DELETE FROM Tracks WHERE plays < 100')
conn.commit()

cur.close()

# Code: http://www.py4e.com/code3/db2.py

Programming with multiple tables

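About INTEGER PRIMARY KEY

Using INTEGER PRIMARY KEY AUTOINCREMENT and rowid/INTEGER PRIMARY KEY in SQLite
When designing tables in SQLite, each table usually has its own integer id as the primary key, and that key can be read back directly after an insert.
SQLite already adds a rowid to every table internally, and this rowid can be used as an implicit column,
but it is maintained by the SQLite engine. Before 3.0 the rowid was a 32-bit integer; from 3.0 on it is a 64-bit integer. This internal rowid can serve as each table's id primary key.

INSERT OR IGNORE means that if the new row conflicts with a record already in the table (for example on a UNIQUE column), the new row is skipped.
A join without condition is a cross join. A cross join repeats each row for the left hand table for each row in the right hand table.
fetchone() returns None when there is no result.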
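A sketch that exercises the three points above against an in-memory database. The Artist table and its rows are made up for illustration and are not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway database, nothing is written to disk
cur = conn.cursor()
cur.execute("CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

# The id column is filled in automatically and is readable right after the insert.
cur.execute("INSERT OR IGNORE INTO Artist (name) VALUES (?)", ("AC/DC",))
first_id = cur.lastrowid

# Same name again: the UNIQUE conflict makes SQLite skip the row silently.
cur.execute("INSERT OR IGNORE INTO Artist (name) VALUES (?)", ("AC/DC",))
cur.execute("SELECT COUNT(*) FROM Artist")
row_count = cur.fetchone()[0]

# An INTEGER PRIMARY KEY column is an alias for the internal rowid.
cur.execute("SELECT id, rowid FROM Artist WHERE name = ?", ("AC/DC",))
id_and_rowid = cur.fetchone()

# No matching row: fetchone() gives None rather than raising.
cur.execute("SELECT id FROM Artist WHERE name = ?", ("Nobody",))
missing = cur.fetchone()

conn.close()
print(first_id, row_count, id_and_rowid, missing)
```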

time

import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i]*x2[i]
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

Common libraries that must be imported and also installed manually

BeautifulSoup

# To run this, you can install BeautifulSoup
# https://pypi.python.org/pypi/beautifulsoup4

# Or download the file
# http://www.py4e.com/code3/bs4.zip
# and unzip it in the same directory as this file

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

# Code: http://www.py4e.com/code3/urllinks.py

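Open question: does tags = soup('a') return only the last tag? (Calling soup('a') is shorthand for soup.find_all('a'), so it returns every <a> tag.)

Installing bs4

Option 1:

sudo easy_install beautifulsoup4

This may fail with a TLS version problem (easy_install is also deprecated); if so, use option 2

Option 2:

curl 'https://bootstrap.pypa.io/get-pip.py' > get-pip.py

sudo python3 get-pip.py

sudo pip install beautifulsoup4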

class

class PartyAnimal:
   x = 0

   def __init__(self):
     print('I am constructed')

   def party(self) :
     self.x = self.x + 1
     print('So far',self.x)

   def __del__(self):
     print('I am destructed', self.x)


an = PartyAnimal()
an.party()
PartyAnimal.party(an)
The two calls above are equivalent

Equivalent ways of calling

numpy.reshape(a, newshape, order='C')

a = np.arange(6).reshape((3, 2))

a = np.reshape(np.arange(6), (3, 2))

Open questions

print

>>> print("hyhy", "1111")
hyhy 1111
>>> print("hyhy" + "1111")
hyhy1111

How do you get rid of the space?

  1. print(...)
  2. print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False) # end='\n' adds the newline; pass sep='' to drop the space
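A short sketch of sep and end in action. Output is captured in an io.StringIO buffer only so that the exact characters can be inspected; that buffer is an illustrative choice, not part of the original notes.

```python
import io

buffer = io.StringIO()

# sep='' removes the space that print normally puts between arguments.
print("hyhy", "1111", sep="", file=buffer)

# sep and end can be any strings.
print("a", "b", sep="-", end="!\n", file=buffer)

output = buffer.getvalue()
print(repr(output))
```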

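todo

Markdown related

When moving a directory, links are external by default; how can local images be moved along with it?

Naming conventions

OK

Good reference code

How to build up a large collection of reference code

Which calls must be wrapped in try, because otherwise a traceback is certain

Jupyter notebook

Jupyter Notebook: introduction, installation and usage tutorial

Installing with Anaconda

Anaconda (official website) is a distribution that makes it easy to obtain and manage packages and to manage environments in one place. It bundles conda, Python and more than 180 scientific packages with their dependencies.

You can choose a download yourself from Anaconda's official download page.

numpy

Official documentation: https://docs.scipy.org/doc/numpy-1.10.1/index.html

np.copy

Plain assignment behaves like a C++ reference: later modifications affect the original array. Use np.copy to get an independent copy.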

parameters_values, _ = dictionary_to_vector(parameters)

thetaplus = np.copy(parameters_values)       
thetaplus[i][0] = thetaplus[i][0] + epsilon     

axis

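Getting the number of rows and columns (2-D case)

Before iterating over a matrix you usually get its row and column counts. The length of each dimension of an ndarray is available through its shape attribute

import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])

print(a.shape)          # returns a tuple: (2, 5)
print(a.shape[0])       # number of rows, returns 2
print(a.shape[1])       # number of columns, returns 5


hy:

Structurally, axes 0, 1, 2 run from the outermost nesting level inward: the outermost is 0, then 1, then 2.

As for rows and columns, the data is organized row by row, so rows come before columns.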

a = np.array([
	[[1,2,3,4,5],[6,7,8,9,10]],
	[[1,2,3,4,5],[6,7,8,9,10]],
	])

print(a.shape) 

(2, 2, 5)
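# shape[0] is the outermost (third) dimension, size 2; shape[1] and shape[2] are the rows and columns

v = image.reshape(image.shape[0]*image.shape[1]*image.shape[2], 1)

When reshape gets two arguments, they are the numbers of rows and columns. After reshape, the shape of image itself is unchanged; only v has the new shape.


https://blog.csdn.net/taotao223/article/details/79187823

axis

A 2-D array is simpler: shape (3,4) is an array with three rows and four columns

sum(axis=0) collapses the row dimension: the numbers in each column are added together

To summarize, axis=n means that dimension n is the one that disappears from the result. This applies to sum; cumsum is different: it accumulates along the axis and the shape stays the same

This is used a lot: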

x_sum = np.sum(x_exp, axis = 1, keepdims = True)

np.sum

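Summation: sum()

sum() adds up matrix elements; it can sum per row, per column, or over the whole matrix

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

print(a.sum())           # sum over the whole matrix
# result 21

print(a.sum(axis=0)) # sum along axis 0 (down the rows): one total per column
# result [5 7 9]

print(a.sum(axis=1)) # sum along axis 1 (across the columns): one total per row
# result [ 6 15]

Summary of common NumPy methods

This part lists commonly used methods of the numpy module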

Creating matrices (with ndarray objects)

With Python's numpy module you generally work with the ndarray object it provides.
Creating an ndarray is simple: pass a list as the argument.

import numpy as np 

# create a 1-D ndarray
a = np.array([1,2,3,4,5])

# create a 2-D ndarray
a2 = np.array([[1,2,3,4,5],[6,7,8,9,10]])

# higher-dimensional arrays follow the same pattern

Slicing matrices

Slicing by row and column

A matrix is sliced the same way as a list, with [] (square brackets)

import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])

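print(a[0:1])       # first row, returns [[1 2 3 4 5]]
print(a[1,2:5])     # second row, third to fifth columns, returns [ 8  9 10]

print(a[1,:])       # second row, returns [ 6  7  8  9 10]
print(a[1:,2:])     # rows after the first, columns from the third on, returns [[ 8  9 10]]

Selecting by condition

Selecting by condition means passing a boolean expression on the array itself inside [] (square brackets)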

import numpy as np

a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
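b = a[a>6]      # select the elements of a greater than 6; the result is a 1-D array
print(b)        # returns [ 7  8  9 10]

# The boolean expression first produces a boolean matrix, which is then passed to [] to do the selection
print(a>6)


# returns
[[False False False False False]
 [False  True  True  True  True]]

Conditional selection is most often used to set the elements that satisfy a condition to a specific value.
For example, set every element greater than 6 to 0.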

import numpy as np

a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(a)


# the matrix starts as
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

a[a>6] = 0
print(a)


# the matrix after zeroing the elements greater than 6
[[1 2 3 4 5]
 [6 0 0 0 0]]

Merging matrices

Matrices can be merged with numpy's hstack and vstack functions

import numpy as np

a1 = np.array([[1,2],[3,4]])
a2 = np.array([[5,6],[7,8]])

# Note: the arrays must be passed in as a list or a tuple
print(np.hstack([a1,a2]))

# horizontal merge, result:
[[1 2 5 6]
 [3 4 7 8]]

print(np.vstack((a1,a2)))

# vertical merge, result:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

# Matrices can also be merged with concatenate.

np.concatenate( (a1,a2), axis=0 )       # equivalent to np.vstack( (a1,a2) )
np.concatenate( (a1,a2), axis=1 )       # equivalent to np.hstack( (a1,a2) )

Creating matrices with functions

numpy ships with functions that create ndarray objects, which makes it easy to build common or regular matrices.

arange

import numpy as np

a = np.arange(10)       # from 0 up to 10 (exclusive) by default, step 1
print(a)                # returns [0 1 2 3 4 5 6 7 8 9]

a1 = np.arange(5,10)    # from 5 up to 10 (exclusive), step 1
print(a1)               # returns [5 6 7 8 9]

a2 = np.arange(5,20,2)  # from 5 up to 20 (exclusive), step 2
print(a2)               # returns [ 5  7  9 11 13 15 17 19]

linspace

linspace() is very similar to MATLAB's linspace: it creates a given number of evenly spaced values, in other words an arithmetic sequence.

import numpy as np

a = np.linspace(0,10,7) # arithmetic sequence of 7 numbers, first 0, last 10
print(a) 


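# result
[  0.           1.66666667   3.33333333   5.         6.66666667  8.33333333  10.        ]

logspace

linspace generates arithmetic sequences, while logspace generates geometric sequences.
The example below generates a geometric sequence of 5 numbers whose first term is 10^0 and whose last term is 10^4.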

import numpy as np

a = np.logspace(0,4,5)
print(a)


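# result
[  1.00000000e+00   1.00000000e+01   1.00000000e+02   1.00000000e+03
   1.00000000e+04]

ones, zeros, eye, empty

ones creates a matrix of all 1s
zeros creates a matrix of all 0s
eye creates an identity matrix
empty creates an uninitialized matrix (it still holds whatever values were in memory)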

import numpy as np

a_ones = np.ones((3,4))         # create a 3*4 matrix of all 1s
print(a_ones)

# result
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]


a_zeros = np.zeros((3,4))       # create a 3*4 matrix of all 0s
print(a_zeros)

# result
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]


a_eye = np.eye(3)               # create a 3x3 identity matrix
print(a_eye)

# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]


a_empty = np.empty((3,4))       # create a 3*4 uninitialized matrix
print(a_empty)
# result (arbitrary values, different on every run)
[[  9.25283328e+086,               nan,   6.04075076e-309,    1.53957654e-306],
  [  3.60081101e+228,   8.59109220e+115,   5.83022290e+252,     7.29515154e-315],
   [  8.73990008e+245,  -1.12621655e-279,   8.06565391e-273,     8.35428692e-308]]

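fromstring: getting the ASCII codes of characters

fromstring() converts a string into an ndarray, which is useful when text has to be turned into numbers, for example to get a string's sequence of ASCII codes. In current NumPy this binary use of fromstring is deprecated; np.frombuffer on the encoded bytes gives the same result.

import numpy as np

a = "abcdef"
b = np.frombuffer(a.encode("ascii"), dtype=np.int8)   # each ASCII character is 8 bits, so dtype is np.int8
print(b)                                # returns [ 97  98  99 100 101 102]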

fromfunction

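fromfunction() builds each element of a matrix from its row and column index.
For example, create a matrix in which every element is the sum of its row index and column index.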

import numpy as np

def func(i,j): 
    return i+j

a = np.fromfunction(func,(5,6)) 
# the first argument is the function; the second is a list or tuple giving the size of the matrix
print(a)


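# returns
[[ 0.  1.  2.  3.  4.  5.]
 [ 1.  2.  3.  4.  5.  6.]
 [ 2.  3.  4.  5.  6.  7.]
 [ 3.  4.  5.  6.  7.  8.]
 [ 4.  5.  6.  7.  8.  9.]]
# note that row and column indices both start at 0

Matrix operations

Common matrix operators

ndarray overloads many operators; they perform element-wise operations between matrices.
For example: + - * / % **

Common matrix functions

Likewise, numpy defines many functions that are applied to every element of a matrix.
The list below assumes numpy has been imported with import numpy as np

a is an ndarray.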

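  • np.sin(a) sine of each element of a, sin(x)

  • np.cos(a) cosine of each element of a, cos(x)

  • np.tan(a) tangent of each element of a, tan(x)

  • np.arcsin(a) inverse sine of each element of a, arcsin(x)

  • np.arccos(a) inverse cosine of each element of a, arccos(x)

  • np.arctan(a) inverse tangent of each element of a, arctan(x)

  • np.exp(a) exponential of each element of a, e^x

  • np.sqrt(a) square root of each element of a, √x

    hy:

abs: absolute value

square: computes the square

For example: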

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(np.sin(a))

# result
[[ 0.84147098  0.90929743  0.14112001]
 [-0.7568025  -0.95892427 -0.2794155 ]]

print(np.arcsin(a))

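# result
# RuntimeWarning: invalid value encountered in arcsin
[[ 1.57079633         nan         nan]
 [        nan         nan         nan]]

When an element lies outside the function's domain, a RuntimeWarning is issued and the result is nan (not a number).

Matrix multiplication (dot product)

Matrix multiplication requires that the number of columns of the first matrix equals the number of rows of the second.
The function for matrix multiplication is dot
For example: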

import numpy as np

a1 = np.array([[1,2,3],[4,5,6]])        # a1 is a 2*3 matrix
a2 = np.array([[1,2],[3,4],[5,6]])      # a2 is a 3*2 matrix

print(a1.shape[1]==a2.shape[0])         # True, the condition for matrix multiplication holds
print(a1.dot(a2))

# a1.dot(a2) corresponds to a1*a2 in MATLAB
# whereas a1*a2 in Python corresponds to a1.*a2 in MATLAB

# result
[[22 28]
 [49 64]]

Matrix transpose A^T

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

print(a.transpose())

# result
[[1 4]
 [2 5]
 [3 6]]


### An even simpler way to transpose is a.T
a = np.array([[1,2,3],[4,5,6]])
print(a.T)

# result
[[1 4]
 [2 5]
 [3 6]]

Matrix inverse a^-1

To invert a matrix, import numpy.linalg and use its inv function.
The matrix must be square (same number of rows and columns) and non-singular.

import numpy as np
import numpy.linalg as lg

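a = np.array([[1,2,3],[4,5,6],[7,8,9]])
# note: this matrix is singular (its determinant is 0), so the huge values below
# are rounding artifacts; depending on the NumPy build, inv may raise LinAlgError instead
print(lg.inv(a))

# result
[[ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]
 [  9.00719925e+15  -1.80143985e+16   9.00719925e+15]
 [ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]]

a = np.eye(3)               # 3x3 identity matrix
print(lg.inv(a))            # the inverse of the identity matrix is itself

# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

Getting matrix information (such as the mean)

Maximum and minimum

max and min return the largest and smallest elements; they work on the whole matrix, per row, or per column.
For example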

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
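print(a.max())              # maximum of the whole matrix, result: 6
print(a.min())              # result: 1

# The keyword argument axis selects per-column or per-row maxima (minima)
# axis=0: along the row direction, i.e. the max (min) of each column
# axis=1: along the column direction, i.e. the max (min) of each row
# for example

print(a.max(axis=0))
# result [4 5 6]

print(a.max(axis=1))
# result [3 6]

# argmax returns the position of the maximum element (argmin does the same for the minimum)
print(a.argmax(axis=1))
# result [2 2]

Mean: mean()

mean() returns the average of the matrix elements. Likewise, it works on the whole matrix, per row, or per column.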

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.mean())             # result: 3.5

# likewise, the keyword argument axis chooses the direction along which to average
print(a.mean(axis=0))       # result [ 2.5  3.5  4.5]
print(a.mean(axis=1))       # result [ 2.  5.]

Variance: var()

The variance function is var(); it is equivalent to mean(abs(x - x.mean())**2), where x is the matrix.

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.var())              # result 2.91666666667

print(a.var(axis=0))        # result [ 2.25  2.25  2.25]
print(a.var(axis=1))        # result [ 0.66666667  0.66666667]

Standard deviation: std()

The standard deviation function is std().
std() is equivalent to sqrt(mean(abs(x - x.mean())**2)), or to sqrt(x.var()).

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
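print(a.std())              # result 1.70782512766

print(a.std(axis=0))        # result [ 1.5  1.5  1.5]
print(a.std(axis=1))        # result [ 0.81649658  0.81649658]

Median: median()

The median is the value in the middle once the sequence is sorted; with an even number of items, it is the average of the two middle values.

For example, the sequence [5,2,6,4,2] sorts to [2,2,4,5,6]; the middle value is 4, so the median is 4.

Likewise, the sequence [5,2,6,4,3,2] sorts to [2,2,3,4,5,6]; there is an even number of items and the two middle values are 3 and 4, so the median is 3.5.

The median function is median(), called as numpy.median(x,[axis]); axis selects the direction, and the default axis=None takes the median of all the numbers.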

import numpy as np
x = np.array([[1,2,3],[4,5,6]])

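print(np.median(x))         # median of all the numbers
# result
3.5

print(np.median(x,axis=0))  # median along the first dimension
# result
[ 2.5  3.5  4.5]

print(np.median(x,axis=1))  # median along the second dimension
# result
[ 2.  5.]

Cumulative sum: cumsum()

The cumulative sum at a position is the sum of all elements up to and including that position.

For example, the sequence [1,2,3,4,5] has the cumulative sum [1,3,6,10,15]: the first element is 1, the second is 1+2=3, ..., the fifth is 1+2+3+4+5=15.

The cumulative sum function is cumsum(); it works per row, per column, or over the whole matrix.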

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

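print(a.cumsum())               # cumulative sum over the whole (flattened) matrix
# result [ 1  3  6 10 15 21]

print(a.cumsum(axis=0))         # cumulative sum along axis 0 (down the rows)
# result
[[1 2 3]
 [5 7 9]]

print(a.cumsum(axis=1))         # cumulative sum along axis 1 (across the columns)
# result
[[ 1  3  6]
 [ 4  9 15]]

References

Adapted from: smallpi
See also: array vs. matrix in numpy

me

Matrix norms: numpy.linalg.norm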

https://blog.csdn.net/bitcarmanlee/article/details/51945271

https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.linalg.norm.html#numpy.linalg.norm

The Frobenius norm is given by [R41]:

\|A\|_F = \left[ \sum_{i,j} |a_{i,j}|^2 \right]^{1/2}

Signature of norm():

def norm(x, ord=None, axis=None, keepdims=False):
    """
    Matrix or vector norm.

    This function is able to return one of eight different matrix norms,
    or one of an infinite number of vector norms (described below), depending
    on the value of the ``ord`` parameter.
    """

hy: keepdims keeps the reduced axes in the result so that it can be combined directly with the original x; without it those axes are dropped (reducing a 2-D array along one axis gives a 1-D array).

keepdims : bool, optional

If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.
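A small sketch of why keepdims matters, assuming numpy is installed. The 2x2 array is a made-up example whose rows have norms 5 and 10; with keepdims=True the norms keep shape (2, 1) and broadcast against x row by row.

```python
import numpy as np

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# Without keepdims the reduced axis disappears: shape (2,)
norms_flat = np.linalg.norm(x, axis=1)

# With keepdims the reduced axis stays with size one: shape (2, 1)
norms_kept = np.linalg.norm(x, axis=1, keepdims=True)

# (2, 2) / (2, 1) broadcasts so that each row is divided by its own norm.
normalized = x / norms_kept

print(norms_flat.shape, norms_kept.shape)
print(normalized)
```

Dividing by norms_flat instead would broadcast along the wrong axis for a square matrix like this one, which is the kind of silent mistake keepdims avoids.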

A more detailed description of the computation:

    The following norms can be calculated:

    =====  ============================  ==========================
    ord    norm for matrices             norm for vectors
    =====  ============================  ==========================
    None   Frobenius norm                2-norm
    'fro'  Frobenius norm                --
    'nuc'  nuclear norm                  --
    inf    max(sum(abs(x), axis=1))      max(abs(x))
    -inf   min(sum(abs(x), axis=1))      min(abs(x))
    0      --                            sum(x != 0)
    1      max(sum(abs(x), axis=0))      as below
    -1     min(sum(abs(x), axis=0))      as below
    2      2-norm (largest sing. value)  as below
    -2     smallest singular value       as below
    other  --                            sum(abs(x)**ord)**(1./ord)
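    =====  ============================  ==========================

The table above should make this clear.

reshape

https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.reshape.html#numpy.reshape

numpy.reshape(a, newshape, order='C')

Using -1: one dimension (not only the last) may be given as -1; it is then computed automatically from the array size and the other dimensions instead of being worked out by hand.

If newshape is a single integer n, the result is a 1-D array of length n, i.e. shape (n,); this is not the same as (1, n).

newshape is a tuple. When there is only one dimension the parentheses can be omitted, but normally keep them: to avoid mistakes, make a habit of writing them even for a single dimension.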

a = np.arange(6).reshape((3, 2))
a = np.arange(6).reshape(3, 2)
a = np.reshape(np.arange(6), (3, 2))
a = np.reshape(np.arange(6), 3, 2)  # error: here 2 is taken as the order argument

W1 = np.random.randn(n_h, n_x)*0.01
W1 = np.random.randn((n_h, n_x))*0.01  # error: randn takes the dimensions as separate arguments, not a tuple; remember this special case

b1 = np.zeros((n_h,1))  # the standard form
b1 = np.zeros(n_h,1)  # error: the shape must be a tuple (here 1 is taken as dtype)

A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b*c*d, a) is to use:

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X
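A sketch that checks the shapes involved, assuming numpy is installed. The array X below is made-up data of shape (2, 3, 4, 5), standing in for something like a small batch of images.

```python
import numpy as np

# Hypothetical data: a = 2 samples, each of shape (b, c, d) = (3, 4, 5).
X = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Keep the first axis, let -1 work out b*c*d, then transpose.
X_flatten = X.reshape(X.shape[0], -1).T

# A single integer gives a 1-D array; (1, -1) gives one row.
v = np.arange(6).reshape(6)
row = np.arange(6).reshape(1, -1)

print(X_flatten.shape, v.shape, row.shape)
```

After the transpose each column of X_flatten holds one flattened sample, so X_flatten[:, 0] contains the same values as X[0].ravel().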

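Multiplication in NumPy (three forms: dot, multiply, *)

dot is standard matrix multiplication

multiply is element-wise multiplication

* has more than one meaning (element-wise for ndarray, matrix product for np.matrix), so it is best avoided where the intent could be unclear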

https://www.jianshu.com/p/fd2999f41d84

Math functions

>>> np.log10(100)
2.0
>>> np.log(np.e)
1.0
>>> np.log2(4)
2.0

square: square
sqrt: square root


numpy.random.randn

Return a sample (or samples) from the “standard normal” distribution.

filled with random floats sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1

For random samples from N(\mu, \sigma^2), use:

sigma * np.random.randn(...) + mu


Reposted from blog.csdn.net/sinat_37026077/article/details/84622281