[Machine learning][Part3] Basics of NumPy vector and matrix operations

It has been a long time since I last worked with mathematics, and machine learning requires some mathematical background, so here I review the relevant basics.

Vector

Vectors are ordered arrays of numbers. In notation, vectors are written as lowercase bold letters. All elements of a vector have the same type; a vector does not, for example, mix characters and numbers. The number of elements in the array is often called its dimension. Elements are referenced by index. In mathematical settings the index usually runs from 1 to n; in computer science, and in these labs, it runs from 0 to n-1. The code in this article uses the 0 to n-1 convention.
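
A minimal sketch of the two indexing conventions, assuming a hypothetical four-element vector `x` that is not part of the original lab: the mathematical element x_1 corresponds to `x[0]` in code, and x_n corresponds to `x[n-1]`.

```python
import numpy as np

# Hypothetical 4-element vector, used only for illustration.
x = np.array([10, 20, 30, 40])

n = x.shape[0]      # number of elements (the dimension of the vector)
first = x[0]        # math notation x_1 -> code index 0
last = x[n - 1]     # math notation x_n -> code index n-1

print(n, first, last)
```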

Arrays in NumPy

The basic data structure of NumPy is an indexable n-dimensional array containing elements of the same type (dtype).
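
As a small illustration of "n-dimensional" and "same type", the sketch below inspects the `ndim`, `shape`, and `dtype` attributes of a 1-D and a 2-D array. The array names are hypothetical.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])       # 1-D array (a vector)
M = np.array([[1, 2], [3, 4]])      # 2-D array (a matrix)

print(v.ndim, v.shape, v.dtype)
print(M.ndim, M.shape, M.dtype)     # integer dtype; the exact width depends on the platform

# Mixing integers and floats yields one common dtype for every element.
mixed = np.array([1, 2.5])
print(mixed.dtype)
```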

Operations on one-dimensional vectors:

Vector creation
Create a one-dimensional vector of a given shape. The shape argument can be an integer or a tuple, and it specifies the shape of the array to create.
import numpy as np

a = np.zeros(4); print(f"np.zeros(4) :   a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.zeros((4,)); print(f"np.zeros((4,)) :  a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.random.random_sample(4); print(f"np.random.random_sample(4): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
Some creation routines do not take a shape tuple:
a = np.arange(4.); print(f"np.arange(4.):     a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.random.rand(4);  print(f"np.random.rand(4): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
Create a one-dimensional vector from specified values:
a = np.array([5, 4, 3, 2]); print(f"np.array([5,4,3,2]):  a = {a},     a shape = {a.shape}, a data type = {a.dtype}")
a = np.array([5., 4, 3, 2]); print(f"np.array([5.,4,3,2]): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
Vector operations
Accessing vector elements: elements can be retrieved by indexing and by slicing, much as with Python lists.
  • Access by index:
a = np.arange(10)
print(a)
# Get the element at index 2
print(f"a[2].shape: {a[2].shape} a[2]  = {a[2]}, Accessing an element returns a scalar")

# Get the last element
print(f"a[-1]={a[-1]}")

# The index must be within the valid range of the vector, otherwise an error is raised
try:
    c = a[10]
except Exception as e:
    print(e)
  • Slice to get elements
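# Slicing selects elements using three parameters (start:stop:step).
# A slice is half-open: it includes the value at index=start but not the value at index=stop.
# Vector slicing
a = np.arange(10)
print(f"a         = {a}")

# Get the 5 elements from index=2 up to, but not including, index=7; the third parameter 1 means step=1, i.e. consecutive values (start:stop:step)
c = a[2:7:1];     print("a[2:7:1] = ", c)

# Get the elements from index=2 up to, but not including, index=7; the third parameter 2 means step=2, i.e. every other index (start:stop:step)
c = a[2:7:2];     print("a[2:7:2] = ", c)

# Get all values at index 3 and above
c = a[3:];        print("a[3:]    = ", c)

# Get all values below index 3
c = a[:3];        print("a[:3]    = ", c)

# Get all values
c = a[:];         print("a[:]     = ", c)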
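
To make the half-open rule concrete, here is a short sketch. It relies on the fact that a slice `start:stop:step` visits the same indices as Python's `range(start, stop, step)`. The variable names are illustrative only.

```python
import numpy as np

a = np.arange(10)

s = a[2:7:2]                          # visits indices 2, 4, 6; index 7 is excluded
expected_len = len(range(2, 7, 2))    # same index sequence as the slice

rev = a[::-1]                         # a negative step walks the vector backward

print(s, expected_len, rev)
```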
  • Operations on a single vector
a = np.array([1, 2, 3, 4])
print(f"a:       {a}")
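# Negate every element of the vector
b = -a
print(f"b:      {b}")
# Sum all elements of the vector and return the sum as a scalar
b = np.sum(a)
print(f"b = np.sum(a) : {b}")
# Compute the mean of the vector
b = np.mean(a)
print(f"b = np.mean(a): {b}")
# Square every element of the vector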
b = a**2
print(f"b = a**2      : {b}")
  • Element-wise operations on vectors: most NumPy arithmetic that works on numbers also works on vectors.
# Vector a + vector b; the two vectors must have the same length, otherwise an error is raised
a = np.array([1, 2, 3, 4])
b= np.array([-1, -2, 3, 4])

print(f"Binary operators work element wise: {a + b}")

# Operations between a scalar and a vector

a = np.array([1, 2, 3, 4])
b = 5 * a
print(f"b = 5 * a : {b}")
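
The length requirement can be checked directly. The sketch below uses a hypothetical shorter vector `c` and shows that NumPy raises a `ValueError` when the shapes do not match. One exception worth knowing: a length-1 array is stretched to match, in the same way a scalar is.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([1, 2])                 # hypothetical vector of a different length

try:
    d = a + c                        # shapes (4,) and (2,) are incompatible
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

shifted = a + np.array([10])         # a length-1 array behaves like a scalar

print(mismatch_error)
print(shifted)
```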
  • Dot product of two vectors

Write a custom function that implements the dot product:

def my_dot(a, b):
    """
    Compute the dot product of two vectors

    Args:
      a (ndarray (n,)):  input vector
      b (ndarray (n,)):  input vector with same dimension as a

    Returns:
      x (scalar): dot product of a and b
    """

    x = 0
    for i in range(a.shape[0]):
        x = x + a[i] * b[i]
    return x


# test my_dot()

a = np.array([1,2,3,4])
b = np.array([-1, 4, 3, 2])

print(f"my_dot(a, b) = {my_dot(a, b)}")

Using the dot product function in NumPy:

# Use np.dot to compute the dot product; it returns a scalar
a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])
c = np.dot(a, b)
print(f"NumPy 1-D np.dot(a, b) = {c}, np.dot(a, b).shape = {c.shape} ")
c = np.dot(b, a)
print(f"NumPy 1-D np.dot(b, a) = {c}, np.dot(a, b).shape = {c.shape} ")
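
For 1-D vectors, the dot product is the sum of the element-wise products, so it can be written in several equivalent ways. The sketch below is only an illustration of that equivalence and uses the same example values as above.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])

d1 = np.dot(a, b)        # library function
d2 = a @ b               # matrix-multiplication operator, same result for 1-D inputs
d3 = np.sum(a * b)       # element-wise product, then sum

print(d1, d2, d3)
```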

Next, compare the speed of the two dot product implementations.

# Compare NumPy's dot with the hand-written my_dot; NumPy is far more efficient
import time

np.random.seed(1)
a = np.random.rand(10000000)  # very large arrays
b = np.random.rand(10000000)

tic = time.time()  # capture start time
c = np.dot(a, b)
toc = time.time()  # capture end time

print(f"np.dot(a, b) =  {c:.4f}")
print(f"Vectorized version duration: {1000*(toc-tic):.4f} ms ")

tic = time.time()  # capture start time
c = my_dot(a,b)
toc = time.time()  # capture end time

print(f"my_dot(a, b) =  {c:.4f}")
print(f"loop version duration: {1000*(toc-tic):.4f} ms ")

del a, b  # remove these big arrays from memory

Running the code produces the output below. NumPy's np.dot takes far less time than the Python loop (about 6.5 ms versus about 2430 ms in this run).

my_dot(a, b) = 24
NumPy 1-D np.dot(a, b) = 24, np.dot(a, b).shape = () 
NumPy 1-D np.dot(b, a) = 24, np.dot(a, b).shape = () 
np.dot(a, b) =  2501072.5817
Vectorized version duration: 6.5184 ms 
my_dot(a, b) =  2501072.5817
loop version duration: 2430.3420 ms 
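
The absolute timings depend on the machine. As a sketch of a slightly more robust measurement, `time.perf_counter` can be used in place of `time.time`, since it is a high-resolution clock intended for timing short intervals. The sketch assumes much smaller arrays than the run above, and `loop_dot` is a hypothetical stand-in for the hand-written loop.

```python
import time
import numpy as np

def loop_dot(a, b):
    """Hypothetical stand-in for a hand-written loop dot product."""
    x = 0.0
    for i in range(a.shape[0]):
        x += a[i] * b[i]
    return x

rng = np.random.default_rng(1)       # seeded generator for repeatable inputs
a = rng.random(100_000)              # smaller than the arrays above, to keep the run short
b = rng.random(100_000)

tic = time.perf_counter()
c_vec = np.dot(a, b)
vec_ms = 1000 * (time.perf_counter() - tic)

tic = time.perf_counter()
c_loop = loop_dot(a, b)
loop_ms = 1000 * (time.perf_counter() - tic)

print(f"np.dot:   {c_vec:.4f} in {vec_ms:.4f} ms")
print(f"loop_dot: {c_loop:.4f} in {loop_ms:.4f} ms")
```

Both versions should agree up to floating-point rounding; only the elapsed time differs.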

Matrix

A matrix is a two-dimensional array whose elements are all of the same type. Matrices are generally written as bold capital letters. A matrix has two dimensions, m and n, where m is the number of rows and n is the number of columns. An individual element is accessed with two indices, one for the row and one for the column.
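
A minimal sketch of these terms, assuming a hypothetical 2 x 3 matrix `X`: `X.shape` returns (m, n), and `X[i, j]` selects the element in row i and column j, with both indices starting at 0.

```python
import numpy as np

# Hypothetical matrix with m = 2 rows and n = 3 columns.
X = np.array([[1, 2, 3],
              [4, 5, 6]])

m, n = X.shape
elem = X[1, 2]       # row index 1, column index 2

print(m, n, elem)
```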

Matrix operations

Create matrix

Matrices are created the same way as vectors, except that the shape argument must be a tuple.
a = np.zeros((1, 5))
print(f"a shape = {a.shape}, a = {a}")

a = np.zeros((2, 1))
print(f"a shape = {a.shape}, a = {a}")

a = np.random.random_sample((1, 1))
print(f"a shape = {a.shape}, a = {a}")

# 2. Create a matrix from specified elements
a= np.array([[5],
             [4],
             [3]])
print(f" a shape = {a.shape}, np.array: a = {a}")
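
Besides passing a shape tuple, a matrix can also be built by reshaping a 1-D range. The sketch below (variable names are illustrative) shows how a `-1` in `reshape` asks NumPy to infer that dimension from the total number of elements, and what happens when the element count does not divide evenly.

```python
import numpy as np

v = np.arange(6)             # 1-D vector [0 1 2 3 4 5]
A = v.reshape(-1, 2)         # row count inferred: 6 / 2 = 3 rows
B = v.reshape(2, -1)         # column count inferred: 6 / 2 = 3 columns

print(A.shape, B.shape)

# The total number of elements must be preserved, so 6 elements cannot fill rows of 4.
try:
    v.reshape(-1, 4)
    reshape_error = None
except ValueError as err:
    reshape_error = str(err)

print(reshape_error)
```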

Matrix operations

# 3. Matrix operations
# 3.1 Index access
# reshape is a convenient way to create a matrix
a = np.arange(6).reshape(-1, 2)  # reshape(-1, 2) produces a matrix with 6/2 rows and 2 columns, i.e. 3 rows by 2 columns
print(f"a.shape:{a.shape},\na={a}")
# Access a single element
print(f"\na[2,0].shape:{a[2, 0].shape},a[2,0]={a[2, 0]}, type(a[2,0])={type(a[2, 0])} Accessing an element returns a scalar\n")
# Access a row
print(f"a[2].shape:{a[2].shape},a[2] = {a[2]},type(a[2]) = {type(a[2])}")

# 3.2 Slice access
a = np.arange(20).reshape(-1, 10)
print(f"a=\n{a}")

# Access 5 consecutive elements in one row (start:stop:step)
print("a[0,2:7:1]=",a[0, 2:7:1], "a[0,2:7:1].shape=", a[0, 2:7:1].shape, "a 1-D array")

# Access 5 consecutive elements in each of the two rows (start:stop:step)
print("a[:, 2:7:1] = \n", a[:, 2:7:1], ",  a[:, 2:7:1].shape =", a[:, 2:7:1].shape, "a 2-D array")

# Access all elements of the matrix
print("a[:,:] = \n", a[:,:], ",  a[:,:].shape =", a[:,:].shape)

# Access all elements of one row, method 1
print("a[1,:] = ", a[1,:], ",  a[1,:].shape =", a[1,:].shape, "a 1-D array")
# Access all elements of one row, method 2
print("a[1]   = ", a[1],   ",  a[1].shape   =", a[1].shape, "a 1-D array")
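
One detail worth noting in the shapes printed above is that an integer index removes the corresponding axis, while a slice keeps it. A short sketch, using the same 2 x 10 matrix and illustrative variable names:

```python
import numpy as np

a = np.arange(20).reshape(-1, 10)    # 2 rows, 10 columns

row_1d = a[1]          # integer index drops the row axis
row_2d = a[1:2]        # slice keeps the row axis
col_1d = a[:, 3]       # one column as a 1-D array
block = a[:, 2:7]      # both rows, columns 2 through 6

print(row_1d.shape, row_2d.shape, col_1d.shape, block.shape)
```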

Origin blog.csdn.net/x1987200567/article/details/133316561