Common NumPy operations for data mining

Universal functions (ufuncs)

NumPy provides familiar mathematical functions such as sin, cos, and exp.

In NumPy these are called "universal functions" (ufuncs). They operate element-wise on an array and produce an array as output.

More functions: all, alltrue, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, conjugate, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sometrue, sort, std, sum, trace, transpose, var, vdot, vectorize, where. See also: NumPy examples.

import numpy
b = numpy.arange(3)
print(numpy.exp(b))
print(numpy.sin(b))
print(numpy.sqrt(b))
[ 1.          2.71828183  7.3890561 ]
[ 0.          0.84147098  0.90929743]
[ 0.          1.          1.41421356]
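
As a quick check of the element-wise behaviour, the sketch below compares numpy.exp with an explicit Python loop over math.exp, and then applies a two-argument ufunc, numpy.maximum. The variable names (via_ufunc, via_loop, c, pairwise_max) are illustrative and not part of the original example.

```python
import math
import numpy

b = numpy.arange(3)

# A ufunc applies the operation to every element and returns a new array.
via_ufunc = numpy.exp(b)

# The same computation written as an explicit Python loop.
via_loop = numpy.array([math.exp(x) for x in b])

# True when both approaches agree up to floating-point tolerance.
print(numpy.allclose(via_ufunc, via_loop))

# Binary ufuncs pair up the elements of two arrays of the same shape.
c = numpy.array([2, 0, 1])
pairwise_max = numpy.maximum(b, c)
print(pairwise_max)  # element-wise maximum of [0 1 2] and [2 0 1]
```

The loop version gives the same values, but the ufunc runs in compiled code and is usually much faster on large arrays.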

Indexing, Slicing and Iterating

One-dimensional arrays can be indexed, sliced, and iterated over, just like lists and other Python sequences.

import numpy
a = numpy.arange(10)**3
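print(a[2])  # indexing
print(a[2:5])  # slicing
a[:6:2] = 100  # from index 0 up to (but not including) 6, in steps of 2: set every selected element to 100
print(a)
b = a[::-1]  # a in reverse order; a itself is not reordered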
print(b)
for i in a:
    print(i**(1/3))
8
[ 8 27 64]
[100   1 100  27 100 125 216 343 512 729]
[729 512 343 216 125 100  27 100   1 100]
4.64158883361
1.0
4.64158883361
3.0
4.64158883361
5.0
6.0
7.0
8.0
9.0
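
One detail worth checking is what "not reordered" means for a[::-1]: basic slicing returns a view onto the same data rather than a copy. The sketch below illustrates this under that assumption and also shows numpy.cbrt as an alternative to the i**(1/3) loop; the names c and roots are illustrative only.

```python
import numpy

a = numpy.arange(10) ** 3

# Basic slicing returns a view: no data is copied.
b = a[::-1]
print(numpy.shares_memory(a, b))

# Writing through the view also changes the original array.
b[0] = -1
print(a[-1])  # the last element of a is now -1

# An explicit copy is independent of the original.
c = a[::-1].copy()
c[0] = 0
print(a[-1])  # still -1, the copy did not touch a

# numpy.cbrt computes cube roots element-wise in a single call.
roots = numpy.cbrt(numpy.array([1, 8, 27, 64]))
print(roots)
```

If the reversed array must be modified without affecting a, calling .copy() as above is the usual way to do it.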

Multidimensional arrays can have one index per axis. These indices are given as a comma-separated tuple.

The fromfunction function builds an array from a custom function f: the value that f returns is stored in a. The arguments x and y passed to f are the indices of the array along each axis, so f must take as many parameters as the array has axes.

The code below creates a two-dimensional array of shape (5, 4). Here x runs from 0 to 4 (5-1) along axis 0, and y runs from 0 to 3 (4-1) along axis 1.

import numpy
def f(x, y):
    return 10*x + y

a = numpy.fromfunction(f, (5, 4), dtype=int)
print(a)
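print(a[2, 3])  # indexing
print(a[0:5, 1])  # slicing: rows 0 to 4 along axis 0, taking the element at index 1 (the second one) along axis 1
print(a[1:3, :])  # rows 1 and 2 along axis 0 (index 3 is excluded), with all elements along axis 1
print(a[-1])  # when fewer indices than axes are given, the missing indices are treated as complete slices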
[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]
23
[ 1 11 21 31 41]
[[10 11 12 13]
 [20 21 22 23]]
[40 41 42 43]
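
To make the role of x and y more concrete, the sketch below rebuilds the same array by hand. It relies on numpy.indices, which returns one index array per axis; fromfunction passes arrays like these to f in a single call rather than calling f once per element. The name same is illustrative only.

```python
import numpy

def f(x, y):
    return 10 * x + y

a = numpy.fromfunction(f, (5, 4), dtype=int)

# numpy.indices builds one index array per axis, each of shape (5, 4).
x, y = numpy.indices((5, 4))

# Calling f on the index arrays directly gives the same result.
same = numpy.array_equal(a, f(x, y))
print(same)

# x holds the position along axis 0, y the position along axis 1.
print(x[:, 0])  # [0 1 2 3 4]
print(y[0, :])  # [0 1 2 3]
```

Because f receives whole arrays, it should be written with operations that work element-wise on arrays, as 10*x + y does.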

The expression in square brackets in b[i] is treated as i followed by as many instances of : as are needed to represent the remaining axes. NumPy also allows you to write this with dots, as b[i,...].

The dots (...) stand for as many colons as are needed to produce a complete index tuple. If x is an array of rank 5 (i.e. it has 5 axes), then:

  • x[1,2,...] is equivalent to x[1,2,:,:,:],
  • x[...,3] is equivalent to x[:,:,:,:,3],
  • x[4,...,5,:] is equivalent to x[4,:,:,5,:].
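
These equivalences can be checked directly. The sketch below uses a hypothetical 5-axis array whose shape (5, 3, 4, 6, 4) was chosen only so that every index in the list above is in range.

```python
import numpy

# Hypothetical array with 5 axes, large enough for the indices used below.
x = numpy.arange(5 * 3 * 4 * 6 * 4).reshape(5, 3, 4, 6, 4)

# Each comparison prints True if the two index expressions select the same data.
first = numpy.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
second = numpy.array_equal(x[..., 3], x[:, :, :, :, 3])
third = numpy.array_equal(x[4, ..., 5, :], x[4, :, :, 5, :])
print(first, second, third)

# The dots expand to however many axes are left over.
print(x[1, 2, ...].shape)     # (4, 6, 4)
print(x[4, ..., 5, :].shape)  # (3, 4, 4)
```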

Iterating over a multidimensional array is done with respect to the first axis:

for row in a:
    print(row)   # prints the five sub-arrays along axis 0, i.e. the first axis
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
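
Since iteration always follows the first axis, one way to walk over columns instead is to iterate over the transpose. The sketch below assumes the same 5x4 array as above, rebuilt here with a lambda so that the block is self-contained; columns and rows_match are illustrative names.

```python
import numpy

a = numpy.fromfunction(lambda x, y: 10 * x + y, (5, 4), dtype=int)

# Iterating over a.T walks along what was axis 1, so each item is a column.
columns = [col.tolist() for col in a.T]
print(columns[0])  # [0, 10, 20, 30, 40]

# Iterating over a is equivalent to indexing the first axis with 0, 1, 2, ...
rows_match = all(numpy.array_equal(row, a[i]) for i, row in enumerate(a))
print(rows_match)
```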

To operate on each element of an array, we can use the flat attribute, which is an iterator over all the elements of the array:

for elem in a.flat:  # prints every element
    print(elem)
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43
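
When the position of each element is needed as well as its value, numpy.ndenumerate is one option. The sketch below assumes the same 5x4 array, rebuilt so that the block is self-contained; flat_values and pairs are illustrative names.

```python
import numpy

a = numpy.fromfunction(lambda x, y: 10 * x + y, (5, 4), dtype=int)

# a.flat visits the elements in row-major (C) order.
flat_values = [int(v) for v in a.flat]
print(flat_values[:5])  # [0, 1, 2, 3, 10]

# numpy.ndenumerate yields (index tuple, value) pairs in the same order.
pairs = [(idx, int(val)) for idx, val in numpy.ndenumerate(a)]
print(pairs[5])  # ((1, 1), 11)
```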

See also: [], ..., newaxis, ndenumerate, indices, index_exp in the NumPy examples.
