45 Skills Every NumPy Professional Should Master

1. Description

NumPy (short for Numerical Python) is at the heart of most data science and machine learning projects.

Much of the data-driven ecosystem depends in some way on NumPy and its core functionality. This makes it one of the most important and game-changing libraries ever created for Python.

Given NumPy's wide use in industry and academia, familiarity with its methods and syntax is essential for Python programmers.

However, if you are new to NumPy and trying to get a firm grip on it, things can seem daunting and overwhelming at first if you start with the official NumPy documentation.

I've been there myself, and this blog is meant to help you get started with NumPy. In it, I will draw on my own experience with NumPy and share 45 specific methods that I use almost all the time.

You can refer to the code for this article here.

2. NumPy usage examples

2.1 Import library

        Of course, if you want to use the NumPy library, you should import it.

import numpy as np
import pandas as pd

        The widely adopted convention here is to set the alias to np. We'll also be using pandas here and there, so let's import that too. 

2.2 (1–10) NumPy array creation methods

        Below are some of the most common ways to create NumPy arrays.

#1) From a Python list

To convert a Python list to a NumPy array, use np.array():

a = [1, 2, 3]
np.array(a)

We can verify the type of the created object using Python's built-in type():

a = [1, 2, 3]
type(np.array(a))

In the above demo, we created a one-dimensional array.

One-dimensional array (picture from the author)

However, we can also create a multidimensional NumPy array by passing a list of lists to np.array():

a = [[1,2,3], [4,5,6]]
np.array(a)
Two-dimensional array

To create a NumPy array of a specific data type, pass the dtype argument:

a = [[1,2,3], [4,5,6]]
np.array(a, dtype = np.float32)
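To make the effect of dtype more concrete, here is a minimal sketch that builds the same list with and without an explicit dtype and inspects the result. The variable names are illustrative only, and the default integer type depends on the platform, so no particular default is assumed.

```python
import numpy as np

a = [[1, 2, 3], [4, 5, 6]]

default_arr = np.array(a)                   # dtype inferred from the data (an integer type)
float_arr = np.array(a, dtype=np.float32)   # dtype set explicitly

print(default_arr.dtype)    # a platform-dependent integer type
print(float_arr.dtype)      # float32
print(float_arr.itemsize)   # 4 (bytes per element)
```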

#2) Create a NumPy array of zeros

It is common to need a NumPy array filled with zeros. You can create one with np.zeros():

np.zeros(5)

>>

array([0., 0., 0., 0., 0.])

For multidimensional NumPy arrays:

np.zeros((2, 3))

>>

array([[0., 0., 0.],
       [0., 0., 0.]])

#3) Create a NumPy array of ones

If you want to create an array filled with ones instead of zeros, use np.ones():

np.ones((2, 3))

>>

array([[1., 1., 1.],
       [1., 1., 1.]])

#4) Create an identity matrix

In the identity matrix, the diagonal is filled with "1s" and all entries except the diagonal are "0s", as follows:

Identity matrix (picture from the author)

Use np.eye() to create an identity matrix:

np.eye(3)

>>

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

#5) Create an equally spaced NumPy array using a specific step

To generate equally spaced values within a given interval, use np.arange():

  • Generate values with start=0, stop=10, step=1:
np.arange(10)
>>array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
  • Generate values with start=5, stop=11, step=1:
np.arange(5, 11)
>>array([ 5,  6,  7,  8,  9, 10])
  • Generate values with start=5, stop=11, step=2:
np.arange(5, 11, 2)
>>array([5, 7, 9])

The stop value is not included in the final array, and by default step=1.
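As a rough sketch of how start, stop and step interact, the example below counts down with a negative step and checks the length of the result for integer arguments. The variable names are my own and only serve the illustration.

```python
import numpy as np

down = np.arange(10, 0, -2)      # count down from 10; stop=0 is excluded
print(down)                      # [10  8  6  4  2]

# For integer arguments, the number of elements is ceil((stop - start) / step).
start, stop, step = 5, 11, 2
n = -(-(stop - start) // step)   # integer ceiling division
print(n, len(np.arange(start, stop, step)))   # 3 3
```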

#6) Create an equally spaced NumPy array with a specific array size

This is similar to np.arange() discussed above, but with np.linspace() you specify how many values (num) to generate within the interval, and the values are evenly spaced. Unlike np.arange(), the stop value is included by default.

np.linspace(start = 10, stop = 20, num = 5)
>>array([10. , 12.5, 15. , 17.5, 20. ])
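Two optional arguments of np.linspace() are worth knowing about. The sketch below, using the same interval as above, shows retstep (which also returns the spacing) and endpoint (which controls whether stop is included). Treat it as an illustration rather than a complete tour of the function.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(10, 20, num=5, retstep=True)
print(values, step)    # [10.  12.5 15.  17.5 20. ] 2.5

# endpoint=False excludes the stop value, so the spacing becomes (20 - 10) / 5.
open_values = np.linspace(10, 20, num=5, endpoint=False)
print(open_values)     # [10. 12. 14. 16. 18.]
```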

#7–8) Generate a random NumPy array

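  • To generate a random array of integers, use np.random.randint() (low is inclusive, high is exclusive):
np.random.randint(low = 5, high = 16, size = 5)
>>array([12,  9,  8,  8, 13])
  • To generate random floating-point samples in the interval [0.0, 1.0), use np.random.random():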
np.random.random(size = 10)
>> array([0.13011502, 0.13624477, 0.63199788, 0.62565385, 0.47521946,
       0.31121428, 0.11785969, 0.49575226, 0.77330761, 0.77047183])
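The outputs above change on every run. If you need reproducible draws, one option is NumPy's Generator interface, created with np.random.default_rng(). The sketch below is not part of the original 45 methods, and the seed value 42 is an arbitrary choice for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is arbitrary

ints = rng.integers(low=5, high=16, size=5)   # integers in [5, 16)
floats = rng.random(size=10)                  # floats in [0.0, 1.0)

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(seed=42)
same_ints = rng_again.integers(low=5, high=16, size=5)
print(np.array_equal(ints, same_ints))   # True
```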

#9–10) Create NumPy arrays from a pandas Series

If you want to convert a pandas Series to a NumPy array, you can use either np.array() or np.asarray():

s = pd.Series([1,2,3,4], name = "col")
np.array(s)
>> array([1, 2, 3, 4])
s = pd.Series([1,2,3,4], name = "col")
np.asarray(s)
>> array([1, 2, 3, 4])
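Both calls give the same values here. Where np.array() and np.asarray() clearly differ is when the input is already a NumPy array: np.array() copies by default, while np.asarray() returns the input itself when no conversion is needed. A small sketch of that behaviour, with illustrative variable names:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

copied = np.array(x)     # makes a new copy by default
same = np.asarray(x)     # returns the input itself when no conversion is needed

x[0] = 100
print(copied[0])   # 1   (the copy is unaffected)
print(same[0])     # 100 (same object as x)
print(same is x)   # True
```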

2.3 (11–21) NumPy array manipulation methods

        Next, we'll discuss some of the most widely used methods for manipulating NumPy arrays.

#11) Shape of a NumPy array

You can determine the shape of a NumPy array using the np.shape() function or the ndarray.shape attribute, as follows:

a = np.ones((2, 3))
print("Shape of the array - Method 1:", np.shape(a))
print("Shape of the array - Method 2:", a.shape)
>> 
Shape of the array - Method 1: (2, 3)
Shape of the array - Method 2: (2, 3)

#12) Reshaping NumPy arrays

Reshaping refers to giving a NumPy array a new shape without changing its data.

You can change the shape with np.reshape() or the equivalent ndarray.reshape() method:

a = np.arange(10)
a.reshape((2, 5))
>> array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
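Two details of reshaping are easy to miss, so here is a short sketch of them: passing -1 lets NumPy infer one dimension, and for a contiguous array such as the one from np.arange(), the reshaped result is a view that shares memory with the original.

```python
import numpy as np

a = np.arange(10)

b = a.reshape((2, -1))    # -1 lets NumPy infer the second dimension (5)
print(b.shape)            # (2, 5)

# For a contiguous array such as this one, reshape returns a view,
# so writing through b also changes a.
b[0, 0] = 99
print(a[0])               # 99
print(np.shares_memory(a, b))   # True
```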

#13–14) Transpose a NumPy array

If you want to transpose a NumPy array, you can use the np.transpose() method or the ndarray.T attribute:

a = np.arange(12).reshape((6, 2))
a.transpose()
>>
array([[ 0,  2,  4,  6,  8, 10],
       [ 1,  3,  5,  7,  9, 11]])
a = np.arange(12).reshape((6, 2))
a.T
>> 
array([[ 0,  2,  4,  6,  8, 10],
       [ 1,  3,  5,  7,  9, 11]])

#15–17) Concatenate multiple NumPy arrays to form one NumPy array

You can use np.concatenate() to join a sequence of arrays into a new NumPy array. The axis argument selects the axis to join along, and axis=None flattens the inputs first:

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)
>>
array([[1, 2],
       [3, 4],
       [5, 6]])

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b.T), axis=1)
>> 
array([[1, 2, 5],
       [3, 4, 6]])
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=None)
>>
array([1, 2, 3, 4, 5, 6])

#18) Flatten a NumPy array

If you want to collapse an entire NumPy array into a single dimension, you can use ndarray.flatten():

a = np.array([[1,2], [3,4]])
a.flatten()
>>
array([1, 2, 3, 4])
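As a brief illustration of two properties of ndarray.flatten(): the result is always a copy, so changing it leaves the original untouched, and the order argument controls whether elements are read row by row (the default) or column by column.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

flat = a.flatten()         # always a copy of the data
flat[0] = 99
print(a[0, 0])             # 1 (the original is unchanged)

col_major = a.flatten(order="F")   # read the elements column by column
print(col_major)           # [1 3 2 4]
```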

#19) Unique elements of a NumPy array

To determine the unique elements of a NumPy array, use np.unique(). With the axis argument, it returns unique rows (axis=0) or unique columns (axis=1):

a = np.array([[1, 2], [2, 3]])
np.unique(a)
>>
array([1, 2, 3])

a = np.array([[1, 2, 3], [1, 2, 3], [2, 3, 4]])
np.unique(a, axis=0)
>>
array([[1, 2, 3],
       [2, 3, 4]])
a = np.array([[1, 1, 3], [1, 1, 3], [1, 1, 4]])
np.unique(a, axis=1)
>>
array([[1, 3],
       [1, 3],
       [1, 4]])
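A common follow-up question is how often each unique value occurs. One way to answer it is the return_counts argument of np.unique(), sketched below on a small made-up array.

```python
import numpy as np

a = np.array([1, 2, 2, 3, 3, 3])

# return_counts=True also returns how many times each unique value appears.
values, counts = np.unique(a, return_counts=True)
print(values)   # [1 2 3]
print(counts)   # [1 2 3]
```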

#20) Squeeze a NumPy array

Use np.squeeze() if you want to remove axes of length 1 from a NumPy array, as follows:

x = np.array([[[0], [1], [2]]])

>>> x.shape
(1, 3, 1)

np.squeeze(x).shape
>>
(3,)
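By default np.squeeze() removes every axis of length 1. The sketch below shows how the axis argument removes only a chosen one, and how np.expand_dims() adds a length-1 axis back. It reuses the same x as above.

```python
import numpy as np

x = np.array([[[0], [1], [2]]])          # shape (1, 3, 1)

first_removed = np.squeeze(x, axis=0)    # drop only the first axis
last_removed = np.squeeze(x, axis=2)     # drop only the last axis
print(first_removed.shape)               # (3, 1)
print(last_removed.shape)                # (1, 3)

# np.expand_dims inserts a new axis of length 1 at the given position.
restored = np.expand_dims(np.squeeze(x), axis=0)
print(restored.shape)                    # (1, 3)
```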

#21) Convert NumPy arrays to Python lists

To get a Python list from a NumPy array, use ndarray.tolist():

a = np.array([[1, 1, 3], [1, 1, 3], [1, 1, 4]])
a.tolist()
>>
[[1, 1, 3], [1, 1, 3], [1, 1, 4]]

2.4 (22–33) Mathematical operations on NumPy arrays

NumPy provides a wide variety of element-wise mathematical functions that you can apply to NumPy arrays. You can read about all available math operations here. Below, let's discuss some of the most commonly used ones.

#22–24) Trigonometric functions

a = np.array([1,2,3])
print("Trigonometric Sine   :", np.sin(a))
print("Trigonometric Cosine :", np.cos(a))
print("Trigonometric Tangent:", np.tan(a))
>>
Trigonometric Sine   : [0.84147098 0.90929743 0.14112001]
Trigonometric Cosine : [ 0.54030231 -0.41614684 -0.9899925 ]
Trigonometric Tangent: [ 1.55740772 -2.18503986 -0.14254654]
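Note that these functions interpret their input in radians, which is why np.sin(1) above is about 0.84. If your angles are in degrees, one way to handle it is to convert them first with np.deg2rad(). A minimal sketch with made-up angles:

```python
import numpy as np

degrees = np.array([0, 30, 90])
radians = np.deg2rad(degrees)          # convert degrees to radians

sines = np.sin(radians)
print(np.round(sines, 2))              # [0.  0.5 1. ]
```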

#25–28) Rounding functions

  • Use np.floor() to return the element-wise floor.
  • Use np.ceil() to return the element-wise ceiling.
  • Use np.rint() to round each element to the nearest integer.
>>> a = np.linspace(1, 2, 5)
array([1.  , 1.25, 1.5 , 1.75, 2.  ])


>>> np.floor(a)
array([1., 1., 1., 1., 2.])

>>> np.ceil(a)
array([1., 2., 2., 2., 2.])

>>> np.rint(a)
array([1., 1., 2., 2., 2.])

  • Use np.round() to round to a given number of decimal places:
a = np.linspace(1, 2, 7)
np.round(a, 2) # 2 decimal places
>>
array([1.  , 1.17, 1.33, 1.5 , 1.67, 1.83, 2.  ])
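One behaviour of the rounding functions that often surprises people: values exactly halfway between two integers are rounded to the nearest even integer, not always upward. The short sketch below illustrates this on made-up values for both the nearest-integer and the decimal-places rounding functions.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

rint_result = np.rint(halves)     # ties go to the nearest even integer
round_result = np.round(halves)   # same tie-breaking rule
print(rint_result)    # [0. 2. 2. 4.]
print(round_result)   # [0. 2. 2. 4.]
```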

#29–30) Exponential and logarithmic

  • Use np.exp() to calculate the element-wise exponential.
  • Use np.log() to calculate the element-wise natural logarithm.
>>> a = np.arange(1, 6)
array([1, 2, 3, 4, 5])

>>> np.exp(a).round(2)
array([  2.72,   7.39,  20.09,  54.6 , 148.41])

>>> np.log(a).round(2)
array([0.  , 0.69, 1.1 , 1.39, 1.61])

#31–32) sum and product

  • Use np.sum() to calculate the sum of array elements, either over the whole array or along an axis:
a = np.array([[1, 2], [3, 4]])

>>> np.sum(a)
10

>>> np.sum(a, axis = 0)
array([4, 6])

>>> np.sum(a, axis = 1)
array([3, 7])

  • Use np.prod() to calculate the product of array elements:
a = np.array([[1, 2], [3, 4]])

>>> np.prod(a)
24

>>> np.prod(a, axis = 0)
array([3, 8])

>>> np.prod(a, axis = 1)
array([ 2, 12])
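Reductions along an axis drop that axis from the result. When you want to combine the result with the original array, the keepdims argument can help. Here is a small sketch, with illustrative names, that normalizes each row so it sums to 1.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# keepdims=True keeps the reduced axis with length 1, giving shape (2, 1).
row_sums = np.sum(a, axis=1, keepdims=True)
print(row_sums.shape)          # (2, 1)

# The (2, 1) result broadcasts against the (2, 2) array row by row.
normalized = a / row_sums
print(normalized.sum(axis=1))  # each row now sums to 1
```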

#33) Square root

Use np.sqrt() to calculate the element-wise square root of an array:

a = np.array([[1, 2], [3, 4]])
np.sqrt(a)
>>
array([[1.        , 1.41421356],
       [1.73205081, 2.        ]])

2.5 (34–36) Matrix and vector operations

#34) Dot Product

If you want to compute the dot product of two NumPy arrays, use np.dot():

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [1, 1]])
np.dot(a, b)
>>
array([[3, 3],
       [7, 7]])

#35) Matrix Products

To compute the matrix product of two NumPy arrays, use np.matmul() or the @ operator in Python:

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [1, 1]])

>>> np.matmul(a, b)
array([[3, 3],
       [7, 7]])

>>> a@b
array([[3, 3],
       [7, 7]])

Note: In this case, the outputs of np.matmul() and np.dot() are the same, but they can differ, for example when the inputs have more than two dimensions. You can read about their differences here.
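To give one concrete case where the two functions disagree, the sketch below applies both to three-dimensional inputs. np.matmul() treats the leading axis as a batch of matrices, while np.dot() combines every matrix of the first array with every matrix of the second. The shapes are arbitrary choices for the illustration.

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

batched = np.matmul(a, b)   # multiplies matching pairs of matrices
crossed = np.dot(a, b)      # pairs every matrix in a with every matrix in b

print(batched.shape)   # (2, 3, 5)
print(crossed.shape)   # (2, 3, 2, 5)
```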

#36) Vector Norm

Vector norms are a family of functions for measuring the length of a vector. I already have a post about vector norms. Use np.linalg.norm() to find matrix or vector norms:

a = np.arange(-4, 5)

>>> np.linalg.norm(a) ## L2 Norm
7.745966692414834

>>> np.linalg.norm(a, 1) ## L1 Norm
20.0
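To see what these norms compute, the sketch below reproduces the L2 and L1 values by hand and compares them with np.linalg.norm(). It also shows the maximum norm, selected with np.inf. The variable names are mine.

```python
import numpy as np

a = np.arange(-4, 5)

l2_manual = np.sqrt(np.sum(a ** 2))   # square root of the sum of squares
l1_manual = np.sum(np.abs(a))         # sum of absolute values
linf = np.linalg.norm(a, np.inf)      # largest absolute value

print(np.isclose(l2_manual, np.linalg.norm(a)))      # True
print(np.isclose(l1_manual, np.linalg.norm(a, 1)))   # True
print(linf)                                          # 4.0
```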

2.6 (37–38) Sorting methods

#37) Sort a NumPy array

To sort an array, use np.sort(), which returns a sorted copy. To sort the array in place instead, use the ndarray.sort() method.

a = np.array([[1,4],[3,1]])

>>> np.sort(a) ## sort based on rows
array([[1, 4],
       [1, 3]])

>>> np.sort(a, axis=None) ## sort the flattened array
array([1, 1, 3, 4])

>>> np.sort(a, axis=0) ## sort based on columns
array([[1, 1],
       [3, 4]])
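The difference between the sorted copy and the in-place sort is easy to check on a small example. In the sketch below, np.sort() leaves the original array alone, while ndarray.sort() reorders it and returns None.

```python
import numpy as np

a = np.array([3, 1, 2])

sorted_copy = np.sort(a)   # new array; a is left as it was
print(sorted_copy)         # [1 2 3]
print(a)                   # [3 1 2]

result = a.sort()          # sorts a itself and returns None
print(a)                   # [1 2 3]
print(result)              # None
```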

#38) Indices that sort a NumPy array

To return the indices that would sort the array, use np.argsort():

x = np.array([3, 1, 2])
np.argsort(x)
>>
array([1, 2, 0])
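A typical use of these indices is to reorder a second array so it follows the sorted order of the first. The sketch below uses made-up scores and labels for that, and reverses the indices to get descending order.

```python
import numpy as np

scores = np.array([3, 1, 2])
names = np.array(["c", "a", "b"])   # hypothetical labels, one per score

order = np.argsort(scores)          # indices that sort scores ascending
print(names[order])                 # ['a' 'b' 'c']

descending = scores[order[::-1]]    # reverse the indices for descending order
print(descending)                   # [3 2 1]
```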

2.7 (39–42) Search methods

#39) Index of the maximum value

To return the index of the largest value along an axis, use np.argmax():

>>> a = np.random.randint(1, 20, 10).reshape(2,5)
array([[15, 13, 10,  1, 18],
       [14, 19, 19, 17,  8]])

>>> np.argmax(a) ## index in a flattened array
6

>>> np.argmax(a, axis=0) ## indices along columns
array([0, 1, 1, 1, 0])

>>> np.argmax(a, axis=1) ## indices along rows
array([4, 1])

To find an index in a non-flattened array, you can do:

ind = np.unravel_index(np.argmax(a), a.shape)
ind
>>
(1, 1)

#40) Index of the minimum value

Likewise, if you want to return the index of the smallest value along an axis, use np.argmin():

>>> a = np.random.randint(1, 20, 10).reshape(2,5)
array([[15, 13, 10,  1, 18],
       [14, 19, 19, 17,  8]])

>>> np.argmin(a) ## index in a flattened array
3

>>> np.argmin(a, axis=0) ## indices along columns
array([1, 0, 0, 0, 1])

>>> np.argmin(a, axis=1) ## indices along rows
array([3, 4])

#41) Select by condition

If you want to choose between two arrays (or values) element by element based on a condition, use np.where():

>>> a = np.random.randint(-10, 10, 10)
array([ 2, -3,  6, -3, -8,  4, -6, -2,  6, -4])

>>> np.where(a < 0, 0, a)
array([2, 0, 6, 0, 0, 4, 0, 0, 6, 0])
"""
if element < 0:
    return 0
else:
    return element
"""

#42) Indices of non-zero elements

To determine the indices of the non-zero elements in a NumPy array, use np.nonzero():

a = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])
np.nonzero(a)
>>
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
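The result is a tuple with one index array per dimension (rows first, then columns). As a short sketch of how to use it, the tuple can index the original array directly to pull out the non-zero values, and np.argwhere() gives the same positions as one (row, column) pair per element.

```python
import numpy as np

a = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])

rows, cols = np.nonzero(a)
values = a[rows, cols]        # the non-zero values, in row-major order
print(values)                 # [3 4 5 6]

pairs = np.argwhere(a)        # one [row, col] pair per non-zero element
print(pairs.tolist())         # [[0, 0], [1, 1], [2, 0], [2, 1]]
```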

2.8 (43–45) Statistical methods

Next, let's look at ways to compute standard statistics on NumPy arrays. You can find all statistical techniques supported by NumPy here.

#43) Mean

To find the mean of the values in a NumPy array, optionally along an axis, use np.mean():

a = np.array([[1, 2], [3, 4]])

>>> np.mean(a)
2.5

>>> np.mean(a, axis = 1) ## along the row axis
array([1.5, 3.5])

>>> np.mean(a, axis = 0) ## along the column axis
array([2., 3.])

#44) Median

To calculate the median of a NumPy array, use np.median():

a = np.array([[1, 2], [3, 4]])

>>> np.median(a)
2.5

>>> np.median(a, axis = 1) ## along the row axis
array([1.5, 3.5])

>>> np.median(a, axis = 0) ## along the column axis
array([2., 3.])

#45) Standard Deviation

To compute the standard deviation of a NumPy array along a specified axis, use np.std():

a = np.array([[1, 2], [3, 4]])

>>> np.std(a)
1.118033988749895

>>> np.std(a, axis = 1) ## along the row axis
array([0.5, 0.5])

>>> np.std(a, axis = 0) ## along the column axis
array([1., 1.])
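By default np.std() computes the population standard deviation, dividing by the number of elements N. If you need the sample standard deviation, which divides by N - 1, you can pass ddof=1. The sketch below compares the two on the same array.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

population_std = np.std(a)         # divides by N (ddof=0 is the default)
sample_std = np.std(a, ddof=1)     # divides by N - 1

print(population_std)
print(sample_std)
print(sample_std > population_std)   # True
```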


Origin blog.csdn.net/gongdiwudu/article/details/131904005