Python的版本
基本数据类型
与大多数语言一样,Python有许多基本类型,包括整数,浮点数,布尔值和字符串。这些数据类型的行为方式与其他编程语言相似。
Numbers(数字类型):代表的是整数和浮点数,它原理与其他语言相同:
# -*- coding: UTF-8 -*-
x = 3
print(type(x)) # Prints "<class 'int'>"
print(x) # Prints "3"
print(x + 1) # Addition; prints "4"
print(x - 1) # Subtraction; prints "2"
print(x * 2) # Multiplication; prints "6"
print(x ** 2) # Exponentiation; prints "9"
x += 1
print(x) # Prints "4"
x *= 2
print(x) # Prints "8"
y = 2.5
print(type(y)) # Prints "<class 'float'>"
print(y, y + 1, y * 2, y ** 2) # Prints "2.5 3.5 5.0 6.25"
注意,与许多语言不同,Python没有一元增量(x+)或递减(x-)运算符。
Python还有用于复数的内置类型;你可以在这篇文档中找到所有的详细信息。
Booleans(布尔类型): Python实现了所有常用的布尔逻辑运算符,但它使用的是英文单词而不是符号 (&&, ||, etc.):
# -*- coding: UTF-8 -*-
t = True
f = False
print(type(t)) # Prints "<class 'bool'>"
print(t and f) # Logical AND; prints "False"
print(t or f) # Logical OR; prints "True"
print(not t) # Logical NOT; prints "False"
print(t != f) # Logical XOR; prints "True"
Strings(字符串类型):Python对字符串有很好的支持:
hello = 'hello' # String literals can use single quotes
world = "world" # or double quotes; it does not matter.
print(hello) # Prints "hello"
print(len(hello)) # String length; prints "5"
hw = hello + ' ' + world # String concatenation
print(hw) # prints "hello world"
hw12 = '%s %s %d' % (hello, world, 12) # sprintf style string formatting
print(hw12) # prints "hello world 12"
String对象有许多有用的方法;例如:
s = "hello"
print(s.capitalize()) # Capitalize a string; prints "Hello"
print(s.upper()) # Convert a string to uppercase; prints "HELLO"
print(s.rjust(7)) # Right-justify a string, padding with spaces; prints " hello"
print(s.center(7)) # Center a string, padding with spaces; prints " hello "
print(s.replace('l', '(ell)')) # Replace all instances of one substring with another;
# prints "he(ell)(ell)o"
print(' world '.strip()) # Strip leading and trailing whitespace; prints "world"
容器(Containers)
Python包含几种内置的容器类型:列表、字典、集合和元组。
列表(Lists)
列表其实就是Python中的数组,但是可以它可以动态的调整大小并且可以包含不同类型的元素:
xs = [3, 1, 2] # Create a list
print(xs, xs[2]) # Prints "[3, 1, 2] 2"
print(xs[-1]) # Negative indices count from the end of the list; prints "2"
xs[2] = 'foo' # Lists can contain elements of different types
print(xs) # Prints "[3, 1, 'foo']"
xs.append('bar') # Add a new element to the end of the list
print(xs) # Prints "[3, 1, 'foo', 'bar']"
x = xs.pop() # Remove and return the last element of the list
print(x, xs) # Prints "bar [3, 1, 'foo']"
我们将在numpy数组的上下文中再次看到切片
循环Loops:你可以循环遍历列表的元素,如下所示:
animals = ['cat', 'dog', 'monkey']
for animal in animals:
print(animal)
# Prints "cat", "dog", "monkey", each on its own line.
如果要访问循环体内每个元素的索引,请使用内置的 enumerate 函数:
animals = ['cat','dog','monkey']
for idx, animal in enumerate(animals):
print('#%d: %s' % (idx + 1, animal))
列表推导式(List comprehensions): 编程时,我们经常想要将一种数据转换为另一种数据。 举个简单的例子,思考以下计算平方数的代码:
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
squares.append(x ** 2)
print(squares) # Prints [0, 1, 4, 9, 16]
你可以使用 列表推导式 使这段代码更简单:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares) # Prints [0, 1, 4, 9, 16]
列表推导还可以包含条件:
nums = [0,1,2,3,4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares) #Prints "[0,4,16]"
字典
字典存储(键,值)对,类似于Java中的Map或Javascript中的对象。你可以像这样使用过它:
d = {'cat': 'cute', 'dog': 'furry'} # Create a new dictionary with some data
print(d['cat']) # Get an entry from a dictionary; prints "cute"
print('cat' in d) # Check if a dictionary has a given key; prints "True"
d['fish'] = 'wet' # Set an entry in a dictionary
print(d['fish']) # Prints "wet"
# print(d['monkey']) # KeyError: 'monkey' not a key of d
print(d.get('monkey', 'N/A')) # Get an element with a default; prints "N/A"
print(d.get('fish', 'N/A')) # Get an element with a default; prints "wet"
del d['fish'] # Remove an element from a dictionary
print(d.get('fish', 'N/A')) # "fish" is no longer a key; prints "N/A"
(循环)Loops:迭代词典中的键很容易:
d = {"person":2,'cat':4,'spider':8}
for animal in d:
legs = d[animal]
print('A %s has %d legs' % (animal,legs))
运行结果:
A person has 2 legs
A cat has 4 legs
A spider has 8 legs
如果要访问键及其对应的值,请使用items方法:
nums = [0,1,2,3,4]
even_num_to_square = {x : x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)
集合(Sets)
集合是不同元素的无序集合。举个简单的例子,请思考下面的代码:
animals = {'cat', 'dog'}
print('cat' in animals) # Check if an element is in a set; prints "True"
print('fish' in animals) # prints "False"
animals.add('fish') # Add an element to a set
print('fish' in animals) # Prints "True"
print(len(animals)) # Number of elements in a set; prints "3"
animals.add('cat') # Adding an element that is already in the set does nothing
print(len(animals)) # Prints "3"
animals.remove('cat') # Remove an element from a set
print(len(animals)) # Prints "2"
循环(Loops):遍历集合的语法与遍历列表的语法相同;但是,由于集合是无序的,因此不能假设访问集合元素的顺序:
animals = {'cat','dog','fish'}
for idx,animal in enumerate(animals):
print('#%d:%s' % (idx + 1,animal))
运行结果:
#1:dog
#2:fish
#3:cat
集合推导式(Set comprehensions):就像列表和字典一样,我们可以很容易使用集合来构造集合:
from math import sqrt
nums = {int(sqrt(x)) for x in range(30)}
print(nums) # Prints "{0, 1, 2, 3, 4, 5}"
元组(Tuples)
元组是(不可变的)有序值列表。元组在很多类似于列表;其中一个重要的区别是元组可以用作字典中的键和集合的元素,而列表则不能。这是一个简单的例子:
d = {(x, x + 1): x for x in range(10)} # Create a dictionary with tuple keys
t = (5, 6) # Create a tuple
print(type(t)) # Prints "<class 'tuple'>"
print(d[t]) # Prints "5"
print(d[(1, 2)]) # Prints "1"
函数(Functions)
Python函数使用def关键字定义。例如:
def sign(x):
if x > 0:
return 'positive'
elif x < 0:
return 'negative'
else:
return 'zero'
for x in [-1,0,1]:
print(sign(x))
运行结果:
negative
zero
positive
我们经常定义函数来获取可选的关键字参数,如下所示:
def hello(name,loud=False):
if loud:
print('HELLO,%s!' % name.upper())
else:
print('Hello,%s' % name)
hello('Bob') #Prints "Hello,Bob"
hello('Fred',loud=True) #Prints "Hello,FRED!"
运行结果:
Hello,Bob
HELLO,FRED!
类(Classes)
在Python中定义类的语法很简单:
class Greeter(object):
def __init__(self,name):
self.name = name
#Instance method
def greet(self,loud=False):
if loud:
print('HELLO,%s!' % self.name.upper())
else:
print('Hello,%s' % self.name)
g = Greeter('Fred')
g.greet()
g.greet(loud=True)
运行结果:
Hello,Fred
HELLO,FRED!
NumPy
Numpy是Python中科学计算的核心库。它提供了一个高性能的多维数组对象,以及用于处理这些数组的工具。如果你已经熟悉MATLAB,你可能会发现这篇教程对于你从MATLAB切换到学习Numpy很有帮助。
数组(Arrays)
numpy数组是一个值网格,所有类型都相同,并由非负整数元组索引。 维数是数组的排名; 数组的形状是一个整数元组,给出了每个维度的数组大小。
我们可以从嵌套的Python列表初始化numpy数组,并使用方括号访问元素:
import numpy as np
a = np.array([1, 2, 3]) # Create a rank 1 array
print(type(a)) # Prints "<class 'numpy.ndarray'>"
print(a.shape) # Prints "(3,)"
print(a[0], a[1], a[2]) # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print(a) # Prints "[5, 2, 3]"
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
print(b.shape) # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0]) # Prints "1 2 4"
Numpy还提供了许多创建数组的函数:
import numpy as np
a = np.zeros((2,2)) # Create an array of all zeros
print(a) # Prints "[[ 0. 0.]
# [ 0. 0.]]"
b = np.ones((1,2)) # Create an array of all ones
print(b) # Prints "[[ 1. 1.]]"
c = np.full((2,2), 7) # Create a constant array
print(c) # Prints "[[ 7. 7.]
# [ 7. 7.]]"
d = np.eye(2) # Create a 2x2 identity matrix
print(d) # Prints "[[ 1. 0.]
# [ 0. 1.]]"
e = np.random.random((2,2)) # Create an array filled with random values
print(e) # Might print "[[ 0.91940167 0.08143941]
# [ 0.68744134 0.87236687]]"
数组索引
Numpy提供了几种索引数组方法。
切片(Slicing): 与Python列表类似,可以对numpy数组进行切片。由于数组可能是多维的,因此必须为数组的每个维指定一个切片:
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
# [6 7]]
b = a[:2, 1:3]
# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1]) # Prints "2"
b[0, 0] = 77 # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) # Prints "77"
你还可以将整数索引与切片索引混合使用。 但是,这样做会产生比原始数组更低级别的数组。 请注意,这与MATLAB处理数组切片的方式完全不同:
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :] # Rank 1 view of the second row of a
row_r2 = a[1:2, :] # Rank 2 view of the second row of a
print(row_r1, row_r1.shape) # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape) # Prints "[[5 6 7 8]] (1, 4)"
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape) # Prints "[ 2 6 10] (3,)"
print(col_r2, col_r2.shape) # Prints "[[ 2]
# [ 6]
# [10]] (3, 1)"
整数数组索引:使用切片索引到Numpy数组时,生成的数组视图将始终是原始数组的子数组。相反,整数数组索引允许你使用另外一个数组中的数据构造任意数组。这是一个例子:
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
# An example of integer array indexing.
# The returned array will have shape (3,) and
print(a[[0, 1, 2], [0, 1, 0]]) # Prints "[1 4 5]"
# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]])) # Prints "[1 4 5]"
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]]) # Prints "[2 2]"
# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]])) # Prints "[2 2]"
整数数组索引的一个有用技巧是从矩阵的每一行中选择或改变一个元素:
import numpy as np
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
print(a) # prints "array([[ 1, 2, 3],
# [ 4, 5, 6],
# [ 7, 8, 9],
# [10, 11, 12]])"
# Create an array of indices
b = np.array([0, 2, 0, 1])
# Select one element from each row of a using the indices in b
print(a[np.arange(4), b]) # Prints "[ 1 6 7 11]"
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10
print(a) # prints "array([[11, 2, 3],
# [ 4, 5, 16],
# [17, 8, 9],
# [10, 21, 12]])
布尔数组索引: 布尔数组索引允许你选择数组的任意元素。通常,这种类型的索引用于选择满足某些条件的数组元素。下面是一个例子:
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2) # Find the elements of a that are bigger than 2;
# this returns a numpy array of Booleans of the same
# shape as a, where each slot of bool_idx tells
# whether that element of a is > 2.
print(bool_idx) # Prints "[[False False]
# [ True True]
# [ True True]]"
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx]) # Prints "[3 4 5 6]"
# We can do all of the above in a single concise statement:
print(a[a > 2]) # Prints "[3 4 5 6]"
数据类型
每个numpy数组都是相同类型元素的网格。Numpy提供了一组可用于构造数组的大量数值数据类型。Numpy在创建数组时尝试猜测数据类型,但构造数组的函数通常还包含一个可选参数来显式指定数据类型。这是一个例子:
import numpy as np
x = np.array([1, 2]) # Let numpy choose the datatype
print(x.dtype) # Prints "int32"
x = np.array([1.0, 2.0]) # Let numpy choose the datatype
print(x.dtype) # Prints "float64"
x = np.array([1, 2], dtype=np.int64) # Force a particular datatype
print(x.dtype) # Prints "int64"
数组中的数学
基本数学函数在数组上以元素方式运行,既可以作为运算符重载,也可以作为numpy模块中的函数:
import numpy as np
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
# Elementwise sum; both produce the array
# [[ 6.0 8.0]
# [10.0 12.0]]
print(x + y)
print(np.add(x, y))
# Elementwise difference; both produce the array
# [[-4.0 -4.0]
# [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))
# Elementwise product; both produce the array
# [[ 5.0 12.0]
# [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))
# Elementwise division; both produce the array
# [[ 0.2 0.33333333]
# [ 0.42857143 0.5 ]]
print(x / y)
print(np.divide(x, y))
# Elementwise square root; produces the array
# [[ 1. 1.41421356]
# [ 1.73205081 2. ]]
print(np.sqrt(x))
请注意,与MATLAB不同,*是元素乘法,而不是矩阵乘法。 我们使用dot函数来计算向量的内积,将向量乘以矩阵,并乘以矩阵。 dot既可以作为numpy模块中的函数,也可以作为数组对象的实例方法:
import numpy as np
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
v = np.array([9,10])
w = np.array([11, 12])
# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))
# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
# [43 50]]
print(x.dot(y))
print(np.dot(x, y))
Numpy为在数组上执行计算提供了许多有用的函数;其中最有用的函数之一是 SUM:
import numpy as np
x = np.array([[1,2],[3,4]])
print(np.sum(x)) # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0)) # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1)) # Compute sum of each row; prints "[3 7]"
除了使用数组计算数学函数外,我们经常需要对数组中的数据进行整形或其他操作。这种操作的最简单的例子是转置一个矩阵;要转置一个矩阵,只需使用一个数组对象的T属性:
import numpy as np
x = np.array([[1,2], [3,4]])
print(x) # Prints "[[1 2]
# [3 4]]"
print(x.T) # Prints "[[1 3]
# [2 4]]"
# Note that taking the transpose of a rank 1 array does nothing:
v = np.array([1,2,3])
print(v) # Prints "[1 2 3]"
print(v.T) # Prints "[1 2 3]"
广播(Broadcasting)
广播是一种强大的机制,它允许numpy在执行算术运算时使用不同形状的数组。通常,我们有一个较小的数组和一个较大的数组,我们希望多次使用较小的数组来对较大的数组执行一些操作。
例如,假设我们要向矩阵的每一行添加一个常数向量。我们可以这样做:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x) # Create an empty matrix with the same shape as x
# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
y[i, :] = x[i, :] + v
# Now y is the following
# [[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]
print(y)
这会凑效; 但是当矩阵 x 非常大时,在Python中计算显式循环可能会很慢。注意,向矩阵 x 的每一行添加向量 v 等同于通过垂直堆叠多个 v 副本来形成矩阵 vv,然后执行元素的求和x 和 vv。 我们可以像如下这样实现这种方法:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1)) # Stack 4 copies of v on top of each other
print(vv) # Prints "[[1 0 1]
# [1 0 1]
# [1 0 1]
# [1 0 1]]"
y = x + vv # Add x and vv elementwise
print(y) # Prints "[[ 2 2 4
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
Numpy广播允许我们在不实际创建v的多个副本的情况下执行此计算。考虑这个需求,使用广播如下:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v # Add v to each row of x using broadcasting
print(y) # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
y=x+v行即使x具有形状(4,3)和v具有形状(3,),但由于广播的关系,该行的工作方式就好像v实际上具有形状(4,3),其中每一行都是v的副本,并且求和是按元素执行的。
将两个数组一起广播遵循以下规则:
如果数组不具有相同的rank,则将较低等级数组的形状添加1,直到两个形状具有相同的长度。
如果两个数组在维度上具有相同的大小,或者如果其中一个数组在该维度中的大小为1,则称这两个数组在维度上是兼容的。
如果数组在所有维度上兼容,则可以一起广播。
广播之后,每个数组的行为就好像它的形状等于两个输入数组的形状的元素最大值。
在一个数组的大小为1且另一个数组的大小大于1的任何维度中,第一个数组的行为就像沿着该维度复制一样
如果对于以上的解释依然没有理解,请尝试阅读这篇文档或这篇解释中的说明。
支持广播的功能称为通用功能。你可以在这篇文档中找到所有通用功能的列表。
以下是广播的一些应用:
import numpy as np
# Compute outer product of vectors
v = np.array([1,2,3]) # v has shape (3,)
w = np.array([4,5]) # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4 5]
# [ 8 10]
# [12 15]]
print(np.reshape(v, (3, 1)) * w)
# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
# [5 7 9]]
print(x + v)
# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5 6 7]
# [ 9 10 11]]
print((x.T + w).T)
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))
# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2 4 6]
# [ 8 10 12]]
print(x * 2)