Data mining with NumPy: advanced linear algebra

Simple array operations

Refer to linalg.py in the numpy folder for more information.

import numpy
import numpy.linalg as linalg
a = numpy.array([
    [1, 2],
    [3, 4]
], dtype=float)
print("a: \n", a)

print("Transpose of a: \n", a.transpose())
print("Inverse of a: \n", linalg.inv(a))

u = numpy.eye(2)  # 2×2 identity matrix; eye = I
print("Identity matrix u: \n", u)
print("Trace of u: \n", numpy.trace(u))
j = numpy.array([
    [0, -1],
    [1, 0]
], dtype=float)
print("j: \n", j)
print("j×j: \n", numpy.dot(j, j))  # matrix product

y = numpy.array([
    [5],
    [7]
], dtype=float)
print("y: \n", y)
print("Solution of the linear system: \n", linalg.solve(a, y))
print("Eigenvalues and eigenvectors of j: \n", linalg.eig(j))

"E:\Python 3.6.2\python.exe" F:/PycharmProjects/test.py
a: 
 [[ 1.  2.]
 [ 3.  4.]]
Transpose of a: 
 [[ 1.  3.]
 [ 2.  4.]]
Inverse of a: 
 [[-2.   1. ]
 [ 1.5 -0.5]]
Identity matrix u: 
 [[ 1.  0.]
 [ 0.  1.]]
Trace of u: 
 2.0
j: 
 [[ 0. -1.]
 [ 1.  0.]]
j×j: 
 [[-1.  0.]
 [ 0. -1.]]
y: 
 [[ 5.]
 [ 7.]]
Solution of the linear system: 
 [[-3.]
 [ 4.]]
Eigenvalues and eigenvectors of j: 
 (array([ 0.+1.j,  0.-1.j]), array([[ 0.70710678+0.j        ,  0.70710678-0.j        ],
       [ 0.00000000-0.70710678j,  0.00000000+0.70710678j]]))

Process finished with exit code 0
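
The printed results can also be checked numerically. The sketch below is an addition to the original script: it reuses the same a, y, and j, and uses numpy.allclose to compare floating-point results within a tolerance. The variable names inverse_ok, solve_ok, and eig_ok are chosen here for illustration.

```python
import numpy
import numpy.linalg as linalg

a = numpy.array([[1, 2], [3, 4]], dtype=float)
y = numpy.array([[5], [7]], dtype=float)
j = numpy.array([[0, -1], [1, 0]], dtype=float)

# The inverse times the original matrix should give the identity matrix.
inverse_ok = numpy.allclose(numpy.dot(linalg.inv(a), a), numpy.eye(2))

# The solution x of a.x = y should reproduce y when multiplied by a.
x = linalg.solve(a, y)
solve_ok = numpy.allclose(numpy.dot(a, x), y)

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors,
# so j.v should equal v with each column scaled by its eigenvalue.
eigenvalues, eigenvectors = linalg.eig(j)
eig_ok = numpy.allclose(numpy.dot(j, eigenvectors), eigenvectors * eigenvalues)

print(inverse_ok, solve_ok, eig_ok)
```

Checking a result against its defining equation is usually more robust than comparing printed digits, since the printed formatting differs between NumPy versions.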

Matrix class

This is a brief introduction to the matrix class.

import numpy
import numpy.linalg as linalg

a = numpy.matrix('1, 2; 3, 4', dtype=float)
print("a: \n", a)
print("TYPE a: ", type(a))

print("Transpose of a: \n", a.T)

x = numpy.matrix('5, 7', dtype=float).T
print("a×x: \n", a*x)

print("Solution of the linear system: \n", linalg.solve(a, x))

"E:\Python 3.6.2\python.exe" F:/PycharmProjects/test.py
a: 
 [[ 1.  2.]
 [ 3.  4.]]
TYPE a:  <class 'numpy.matrixlib.defmatrix.matrix'>
Transpose of a: 
 [[ 1.  3.]
 [ 2.  4.]]
a×x: 
 [[ 19.]
 [ 43.]]
Solution of the linear system: 
 [[-3.]
 [ 4.]]

Process finished with exit code 0
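
The NumPy documentation no longer recommends numpy.matrix for new code. As a sketch of the alternative (an addition to the original text, not the author's script), the same computation can be written with plain arrays and the @ operator, which is available from Python 3.5 onward:

```python
import numpy
import numpy.linalg as linalg

a = numpy.array([[1, 2], [3, 4]], dtype=float)
x = numpy.array([[5], [7]], dtype=float)  # column vector of shape (2, 1)

product = a @ x                 # matrix product, the array counterpart of a*x
solution = linalg.solve(a, x)   # solve accepts plain arrays as well

print("a @ x: \n", product)
print("solve(a, x): \n", solution)
```

With arrays, * is element-wise multiplication, so @ (or numpy.dot) has to be used wherever the matrix class would use *.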

Indexing: Comparing matrices and 2D arrays

Note that there are some important differences between arrays and matrices in NumPy.

NumPy provides two fundamental objects: an N-dimensional array object (ndarray) and a universal function object (ufunc). Other objects are built on top of them.

In particular, matrices are two-dimensional array objects that inherit from NumPy array objects.

For both arrays and matrices, an index must consist of a proper combination of one or more of the following: integer scalars, ellipses, a list of integers or booleans, a tuple of integers or booleans, and a one-dimensional array of integers or booleans.

A matrix can be used as an index for a matrix, but usually an array, a list, or another form is needed to accomplish a given task.

As usual in Python, indices are 0-based. Traditionally we represent a two-dimensional array or matrix as a rectangular grid of rows and columns, where movement along axis 0 moves across the rows and movement along axis 1 moves across the columns.

import numpy
import numpy.linalg as linalg

# Create an array and a matrix to slice

A = numpy.arange(12)
print("A :", A)
A.shape = 3, 4
M = numpy.asmatrix(A.copy())  # copy A, then wrap the copy as a matrix (numpy.mat was removed in NumPy 2.0)
print("TYPE A: {}, TYPE M: {}".format(type(A), type(M)))

print("A: \n", A)
print("M: \n", M)

"E:\Python 3.6.2\python.exe" F:/PycharmProjects/test.py
A : [ 0  1  2  3  4  5  6  7  8  9 10 11]
TYPE A: <class 'numpy.ndarray'>, TYPE M: <class 'numpy.matrixlib.defmatrix.matrix'>
A: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
M: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Process finished with exit code 0
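
The index forms and axis convention described earlier can each be tried on the same 3×4 array. The following is a short sketch added for illustration; the variable names are chosen here and are not from the original post.

```python
import numpy

A = numpy.arange(12).reshape(3, 4)

element = A[2, 1]                                      # integer scalars: one element
first_column = A[..., 0]                               # ellipsis: all rows, column 0
rows_by_list = A[[0, 2]]                               # list of integers: rows 0 and 2
rows_by_mask = A[numpy.array([True, False, True])]     # boolean array: the same rows

# Summing along axis 0 moves across the rows and leaves one value per column;
# summing along axis 1 moves across the columns and leaves one value per row.
column_sums = A.sum(axis=0)
row_sums = A.sum(axis=1)

print(element, first_column)
print(column_sums, row_sums)
```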

Basic slicing uses slice objects or integers. For example, A[:] and M[:] are evaluated much like ordinary Python slices. It is important to note, however, that slicing a NumPy array does not copy the data; a slice is a new view of the same data.

>>> print(A[:]); print(A[:].shape)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(3, 4)
>>> print(M[:]); print(M[:].shape)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(3, 4)
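
Because a slice is a view, writing through it changes the original array. The sketch below is an addition to the original text; it contrasts a slice with an explicit copy and uses numpy.shares_memory to confirm which one shares data with A.

```python
import numpy

A = numpy.arange(12).reshape(3, 4)

row_view = A[1, :]          # basic slicing: a view onto A's data
row_copy = A[1, :].copy()   # explicit copy: independent data

row_view[0] = 100           # writes through to A
row_copy[1] = -1            # leaves A untouched

print(A[1])
print(numpy.shares_memory(A, row_view), numpy.shares_memory(A, row_copy))
```

Call .copy() whenever a slice is going to be modified and the original data must stay intact.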

Use comma-separated indices to index along multiple axes.

import numpy
import numpy.linalg as linalg
# Create an array and a matrix to slice

A = numpy.arange(12)
print("A :", A)
A.shape = 3, 4
M = numpy.asmatrix(A.copy())  # copy A, then wrap the copy as a matrix (numpy.mat was removed in NumPy 2.0)

print("A: \n", A)
print("M: \n", M)

# A single colon on a two-dimensional array yields a one-dimensional array, whereas on a matrix it yields a two-dimensional matrix.
print("A[:, 1]: {} shape of A[:, 1]: {}".format(A[:, 1], A[:, 1].shape))
print("M[:, 1]: \n {} shape of M[:, 1]: {}".format(M[:, 1], M[:, 1].shape))

"E:\Python 3.6.2\python.exe" F:/PycharmProjects/test.py
A : [ 0  1  2  3  4  5  6  7  8  9 10 11]
A: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
M: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
A[:, 1]: [1 5 9] shape of A[:, 1]: (3,)
M[:, 1]: 
 [[1]
 [5]
 [9]] shape of M[:, 1]: (3, 1)

Process finished with exit code 0

Using a single colon on a two-dimensional array produces a one-dimensional array, whereas on a matrix it produces a two-dimensional matrix. For example, the slice M[2,:] produces a matrix of shape (1,4); by contrast, a slice of an array always produces an array of the lowest possible dimension. For example, if C is a three-dimensional array, C[...,1] produces a two-dimensional array and C[1,:,1] produces a one-dimensional array. From this point on, we show only the array slicing result whenever the corresponding matrix slicing result is the same.
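
The dimension rule can be checked on a concrete three-dimensional array. In the sketch below, which is an addition to the original text, C is a hypothetical 2×3×4 array built only for illustration:

```python
import numpy

C = numpy.arange(24).reshape(2, 3, 4)                 # hypothetical 3-D array
M = numpy.asmatrix(numpy.arange(12).reshape(3, 4))    # 3×4 matrix

shape_2d = C[..., 1].shape     # last axis indexed away: two dimensions remain
shape_1d = C[1, :, 1].shape    # two axes indexed away: one dimension remains
shape_row = M[2, :].shape      # a matrix slice always stays two-dimensional

print(shape_2d, shape_1d, shape_row)
```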

import numpy
import numpy.linalg as linalg
# Create an array and a matrix to slice

A = numpy.arange(12)
print("A :", A)
A.shape = 3, 4
M = numpy.asmatrix(A.copy())  # copy A, then wrap the copy as a matrix (numpy.mat was removed in NumPy 2.0)

print("A: \n", A)
print("M: \n", M)

# A single colon on a two-dimensional array yields a one-dimensional array, whereas on a matrix it yields a two-dimensional matrix.
print("A[:, 1]: {} shape of A[:, 1]: {}".format(A[:, 1], A[:, 1].shape))  # a column slice
print("M[:, 1]: \n {} shape of M[:, 1]: {}".format(M[:, 1], M[:, 1].shape))

print("A[1, :]: {} shape of A[1, :]: {}".format(A[1, :], A[1, :].shape))  # a row slice
print("M[1, :]: {} shape of M[1, :]: {}".format(M[1, :], M[1, :].shape))

"E:\Python 3.6.2\python.exe" F:/PycharmProjects/test.py
A : [ 0  1  2  3  4  5  6  7  8  9 10 11]
A: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
M: 
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
A[:, 1]: [1 5 9] shape of A[:, 1]: (3,)
M[:, 1]: 
 [[1]
 [5]
 [9]] shape of M[:, 1]: (3, 1)
A[1, :]: [4 5 6 7] shape of A[1, :]: (4,)
M[1, :]: [[4 5 6 7]] shape of M[1, :]: (1, 4)


Process finished with exit code 0

Now suppose we want columns 1 and 3 of the array (counting from 0). One way is to index with a list:

# Select columns 1 and 3 (counting from 0)
print("M[:, [1, 3]]: {} shape of M[:, [1, 3]]: {}".format(M[:, [1, 3]], M[:, [1, 3]].shape))

M[:, [1, 3]]: [[ 1  3]
 [ 5  7]
 [ 9 11]] shape of M[:, [1, 3]]: (3, 2)

A slightly more complicated way is to use the take() method:

print("M[:, ].take([1, 3], axis=1): \n", M[:, ].take([1, 3], axis=1))

If we want to skip the first row, we can do this:

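# Skip the first row
print("Slicing A without its first row: ", A[1:, ].take([1, 3], axis=1))
# Skip the first row, this time using the cross product ix_()
print("Slicing A without its first row: ", A[numpy.ix_((1, 2), (1, 3))])

Slicing A without its first row:  [[ 5  7]
 [ 9 11]]
Slicing A without its first row:  [[ 5  7]
 [ 9 11]]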
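
The three approaches shown so far (a list index, take(), and ix_()) select the same elements. As a quick check, added here as a sketch with variable names chosen for illustration:

```python
import numpy

A = numpy.arange(12).reshape(3, 4)

by_list = A[1:, [1, 3]]                   # slice the rows, list the columns
by_take = A[1:].take([1, 3], axis=1)      # the same columns through take()
by_mesh = A[numpy.ix_((1, 2), (1, 3))]    # rows 1 and 2 crossed with columns 1 and 3

print(by_list)
print(numpy.array_equal(by_list, by_take), numpy.array_equal(by_list, by_mesh))
```

The list form is usually the shortest; ix_() becomes useful once the rows are also chosen by an arbitrary list or condition rather than a slice.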

Now let's do something more complicated. Say we want to keep the columns whose first row is greater than 1. One way is to create a boolean index:

>>> A[0,:]>1
array([False, False, True, True], dtype=bool)
>>> A[:,A[0,:]>1]
array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

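Indexing the matrix in the same way is less convenient. The result shown here, which keeps only the first row, is what older NumPy versions returned; the version used for this post raised an IndexError instead.

>>> M[0,:]>1
matrix([[False, False, True, True]], dtype=bool)
>>> M[:,M[0,:]>1]
matrix([[2, 3]])  # in the version used here this raised IndexError: too many indices for array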

The problem is that slicing a matrix produces another matrix, so the boolean mask M[0,:]>1 is two-dimensional. A matrix has an A attribute that returns its values as an array, so we simply make the following substitution:

>>> M[:,M.A[0,:]>1]
matrix([[ 2,  3],
        [ 6,  7],
        [10, 11]])
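
The difference between the two masks is easiest to see from their shapes. The sketch below is an addition to the original text; it assumes numpy.asmatrix is available (it exists in both older and current NumPy releases), and the names mask_2d and mask_1d are chosen here for illustration.

```python
import numpy

A = numpy.arange(12).reshape(3, 4)
M = numpy.asmatrix(A.copy())

mask_2d = M[0, :] > 1      # comparison on a matrix slice: still two-dimensional
mask_1d = M.A[0, :] > 1    # comparison on the underlying array: one-dimensional

print(mask_2d.shape, mask_1d.shape)
print(M[:, mask_1d])       # the 1-D mask selects whole columns
```

A boolean index consumes as many axes as it has dimensions, which is why the two-dimensional mask cannot stand in for a single column index.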

If we want to slice conditionally along both axes of the matrix, we have to adjust the strategy slightly. Instead of:

>>> A[A[:,0]>2,A[0,:]>1]      # takes only the diagonal of the selected block
array([ 6, 11])
>>> M[M.A[:,0]>2,M.A[0,:]>1]
matrix([[ 6, 11]])

we need to use the cross product function ix_():

>>> A[numpy.ix_(A[:,0]>2,A[0,:]>1)]
array([[ 6,  7],
       [10, 11]])
>>> M[numpy.ix_(M.A[:,0]>2,M.A[0,:]>1)]
matrix([[ 6,  7],
        [10, 11]])

print(A[numpy.ix_(A[:, 0] > 2, A[0, :] > 1)]) # Equivalent to A[A[:, 0] > 2, :][:, A[ 0, :] > 1]
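
To see why the two forms differ: two index arrays passed together are paired element by element, while ix_() reshapes them so that every selected row is combined with every selected column. A sketch of the comparison, added for illustration with names chosen here:

```python
import numpy

A = numpy.arange(12).reshape(3, 4)
rows = A[:, 0] > 2     # [False, True, True]         -> rows 1 and 2
cols = A[0, :] > 1     # [False, False, True, True]  -> columns 2 and 3

paired = A[rows, cols]             # pairs (1, 2) and (2, 3) only
mesh = A[numpy.ix_(rows, cols)]    # every selected row with every selected column
chained = A[rows, :][:, cols]      # the same block, selected in two steps

# ix_ turns the masks into index arrays shaped for broadcasting.
row_index, col_index = numpy.ix_(rows, cols)
print(row_index.shape, col_index.shape)
print(paired, mesh, sep="\n")
```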