1. Description
NumPy (short for Numerical Python) is at the heart of most data science and machine learning projects.
The entire data-driven ecosystem depends in some way on NumPy and its core functionality. This makes it one of the most important and game-changing libraries in the Python ecosystem.
Given NumPy's wide use in industry and academia, familiarity with its methods and syntax is essential for Python programmers.
However, if you are new to NumPy and trying to get a firm grip on it, things can seem daunting and overwhelming if you start with the official NumPy documentation.
I've been there myself, and this blog is meant to help you get started with NumPy. In other words, in this blog, I will review my experience with NumPy and share 45 specific methods that I use almost all the time.
You can refer to the code for this article here .
2. NumPy usage examples
2.1 Import library
Of course, if you want to use the NumPy library, you should import it.
import numpy as np
import pandas as pd
The widely adopted convention here is to set the alias to np. We'll also be using pandas here and there, so let's import that too.
2.2 (1–10) NumPy array creation methods
Below are some of the most common ways to create NumPy arrays.
#1) From a Python list
To convert a Python list to a NumPy array, use np.array():
a = [1, 2, 3]
np.array(a)
We can verify the type of the resulting object with Python's built-in type():
a = [1, 2, 3]
type(np.array(a))
In the above demo, we created a one-dimensional array.
One-dimensional array (picture from the author)
However, we can also create a multidimensional NumPy array by passing a list of lists to np.array():
a = [[1,2,3], [4,5,6]]
np.array(a)
To create a NumPy array of a specific data type, pass the dtype argument:
a = [[1,2,3], [4,5,6]]
np.array(a, dtype = np.float32)
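As a small sketch of what the dtype argument changes, the snippet below compares the inferred type with the requested one. The variable names are illustrative, and astype() is shown only as one possible way to convert an array after it has been created.

```python
import numpy as np

a = [[1, 2, 3], [4, 5, 6]]

default_arr = np.array(a)                  # dtype inferred from the data (an integer type)
float_arr = np.array(a, dtype=np.float32)  # dtype requested explicitly

print(default_arr.dtype)
print(float_arr.dtype)                     # float32

# astype() returns a converted copy of an existing array
converted = default_arr.astype(np.float32)
print(np.array_equal(converted, float_arr))  # True
```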
#2) Create a NumPy array of zeros
It is often useful to create a NumPy array filled with zeros. This can be done with np.zeros():
np.zeros(5)
>>
array([0., 0., 0., 0., 0.])
For multidimensional NumPy arrays:
np.zeros((2, 3))
>>
array([[0., 0., 0.],
[0., 0., 0.]])
#3) Create a NumPy array of ones
If you want to create an array filled with ones instead of zeros, use np.ones():
np.ones((2, 3))
>>
array([[1., 1., 1.],
[1., 1., 1.]])
#4) Create an identity NumPy array
In the identity matrix, the diagonal is filled with "1s" and all entries except the diagonal are "0s", as follows:
Identity matrix (picture from the author)
Use np.eye() to create an identity matrix:
np.eye(3)
>>
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
#5) Create an equally spaced NumPy array using specific steps
To generate equally spaced values within a given interval, use np.arange():
- Generate values with start=0, stop=10, step=1:
np.arange(10)
>>array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- Generate values with start=5, stop=11, step=1:
np.arange(5, 11)
>>array([ 5, 6, 7, 8, 9, 10])
- Generate values with start=5, stop=11, step=2:
np.arange(5, 11, 2)
>>array([5, 7, 9])
The stop value is not included in the final array. By default, step=1.
#6) Create an equally spaced NumPy array with a specific array size
This is similar to np.arange() discussed above, but with np.linspace() you specify how many numbers (num) to generate within an interval, and the numbers are evenly spaced. Unlike np.arange(), the stop value is included by default.
np.linspace(start = 10, stop = 20, num = 5)
>>array([10. , 12.5, 15. , 17.5, 20. ])
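The following sketch shows two optional np.linspace() parameters that were not used above: endpoint, which controls whether stop is included, and retstep, which also returns the spacing. The variable names are illustrative.

```python
import numpy as np

# Default: the stop value (20) is part of the result
with_stop = np.linspace(start=10, stop=20, num=5)

# endpoint=False leaves the stop value out, which changes the spacing
without_stop = np.linspace(start=10, stop=20, num=5, endpoint=False)

# retstep=True additionally returns the spacing between consecutive values
values, step = np.linspace(start=10, stop=20, num=5, retstep=True)

print(with_stop)     # [10.  12.5 15.  17.5 20. ]
print(without_stop)  # [10. 12. 14. 16. 18.]
print(step)          # 2.5
```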
#7–8) Generate a random NumPy array
- To generate a random array of integers, use np.random.randint():
np.random.randint(low = 5, high = 16, size = 5)
>>array([12, 9, 8, 8, 13])
- To generate random floating point samples, use np.random.random():
np.random.random(size = 10)
>> array([0.13011502, 0.13624477, 0.63199788, 0.62565385, 0.47521946,
0.31121428, 0.11785969, 0.49575226, 0.77330761, 0.77047183])
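The outputs above change on every run. If reproducible results are needed, one option (not used in the examples above) is NumPy's Generator API, created with np.random.default_rng(). This is a minimal sketch; the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)         # seeded generator, so reruns give the same numbers

ints = rng.integers(low=5, high=16, size=5)  # integers in [5, 16), like np.random.randint
floats = rng.random(size=10)                 # floats in [0.0, 1.0), like np.random.random

# A second generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(seed=42)
same_ints = rng_again.integers(low=5, high=16, size=5)
print(np.array_equal(ints, same_ints))       # True
```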
#9–10) Generating NumPy arrays from pandas series
If you want to convert a pandas Series to a NumPy array, you can use either np.array() or np.asarray():
s = pd.Series([1,2,3,4], name = "col")
np.array(s)
>> array([1, 2, 3, 4])
s = pd.Series([1,2,3,4], name = "col")
np.asarray(s)
>> array([1, 2, 3, 4])
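The two calls return the same values here, but they differ in whether they copy their input. The sketch below uses a plain NumPy array as input to keep the behavior easy to verify. Whether the result shares memory with a pandas Series depends on the pandas version and settings, so that case is not shown.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

copied = np.array(original)     # np.array() copies its input by default
aliased = np.asarray(original)  # np.asarray() returns the input itself when it is already a matching ndarray

original[0] = 99

print(copied[0])            # 1  (unaffected by the change)
print(aliased[0])           # 99 (same underlying object)
print(aliased is original)  # True
```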
2.3 (11–21) NumPy array manipulation methods
Next, we'll discuss some of the most widely used methods for manipulating NumPy arrays.
#11) Shape of a NumPy array
You can determine the shape of a NumPy array using either the np.shape() function or the ndarray.shape attribute:
a = np.ones((2, 3))
print("Shape of the array - Method 1:", np.shape(a))
print("Shape of the array - Method 2:", a.shape)
>>
Shape of the array - Method 1: (2, 3)
Shape of the array - Method 2: (2, 3)
#12) Reshaping NumPy arrays
Reshaping refers to giving a NumPy array a new shape without changing its data.
You can change the shape with np.reshape() or the equivalent ndarray.reshape() method:
a = np.arange(10)
a.reshape((2, 5))
>> array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
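As a short sketch of two reshape details not covered above: one dimension can be given as -1 so that NumPy infers it, and a shape that does not match the number of elements raises a ValueError.

```python
import numpy as np

a = np.arange(10)

# -1 asks NumPy to infer that dimension from the array size
two_rows = a.reshape((2, -1))
print(two_rows.shape)  # (2, 5)

# The total number of elements must stay the same, otherwise a ValueError is raised
try:
    a.reshape((3, 4))
except ValueError as err:
    print("reshape failed:", err)
```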
#13–14) Transpose a NumPy array
If you want to transpose a NumPy array, you can use np.transpose() (or the equivalent ndarray.transpose() method) or the ndarray.T attribute:
a = np.arange(12).reshape((6, 2))
a.transpose()
>>
array([[ 0, 2, 4, 6, 8, 10],
[ 1, 3, 5, 7, 9, 11]])
a = np.arange(12).reshape((6, 2))
a.T
>>
array([[ 0, 2, 4, 6, 8, 10],
[ 1, 3, 5, 7, 9, 11]])
#15–17) Concatenate multiple NumPy arrays to form one NumPy array
You can use np.concatenate() to join a sequence of arrays into a new NumPy array:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)
>>
array([[1, 2],
[3, 4],
[5, 6]])
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b.T), axis=1)
>>
array([[1, 2, 5],
[3, 4, 6]])
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=None)
>>
array([1, 2, 3, 4, 5, 6])
For 2-D arrays such as these, axis=0 gives the same result as np.vstack(), and axis=1 gives the same result as np.hstack().
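The equivalence with np.vstack() and np.hstack() can be checked directly. This sketch reuses the 2-D arrays a and b from the examples above.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

rows_stacked = np.vstack((a, b))    # same as concatenating along axis=0
cols_stacked = np.hstack((a, b.T))  # same as concatenating along axis=1

print(np.array_equal(rows_stacked, np.concatenate((a, b), axis=0)))    # True
print(np.array_equal(cols_stacked, np.concatenate((a, b.T), axis=1)))  # True
```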
#18) Flatten a NumPy array
If you want to collapse an entire NumPy array into a single dimension, you can use ndarray.flatten():
a = np.array([[1,2], [3,4]])
a.flatten()
>>
array([1, 2, 3, 4])
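One detail worth knowing is that flatten() always returns a copy. The sketch below contrasts it with ndarray.ravel(), a related method not covered in this list, which returns a view of the original data when the memory layout allows it.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

flat_copy = a.flatten()  # always a new array
flat_view = a.ravel()    # a view of the same data when the memory layout allows it

flat_copy[0] = 100       # does not touch a
print(a[0, 0])           # 1

flat_view[0] = 100       # writes through to a, because this ravel() result is a view
print(a[0, 0])           # 100
```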
#19) Unique elements of a NumPy array
To determine the unique elements of a NumPy array, use np.unique():
a = np.array([[1, 2], [2, 3]])
np.unique(a)
>>
array([1, 2, 3])
a = np.array([[1, 2, 3], [1, 2, 3], [2, 3, 4]])
np.unique(a, axis=0)
>>
array([[1, 2, 3],
[2, 3, 4]])
a = np.array([[1, 1, 3], [1, 1, 3], [1, 1, 4]])
np.unique(a, axis=1)
>>
array([[1, 3],
[1, 3],
[1, 4]])
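np.unique() can also report how often each value occurs. A minimal sketch using the optional return_counts parameter:

```python
import numpy as np

a = np.array([[1, 2], [2, 3]])

# return_counts=True returns a second array with the number of occurrences
values, counts = np.unique(a, return_counts=True)
print(values)  # [1 2 3]
print(counts)  # [1 2 1]
```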
#20) Squeeze a NumPy array
Use np.squeeze() if you want to remove axes of length 1 from a NumPy array:
x = np.array([[[0], [1], [2]]])
>>> x.shape
(1, 3, 1)
np.squeeze(x).shape
>>
(3,)
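By default, np.squeeze() removes every axis of length 1. The sketch below shows the optional axis parameter, which removes only the chosen axis, and np.expand_dims(), a related function not covered in this list, which adds one back.

```python
import numpy as np

x = np.array([[[0], [1], [2]]])                # shape (1, 3, 1)

only_first = np.squeeze(x, axis=0)             # remove just the leading length-1 axis
print(only_first.shape)                        # (3, 1)

restored = np.expand_dims(only_first, axis=0)  # add a length-1 axis back
print(restored.shape)                          # (1, 3, 1)
```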
#21) Convert NumPy arrays to Python lists
To get a Python list from a NumPy array, use ndarray.tolist():
a = np.array([[1, 1, 3], [1, 1, 3], [1, 1, 4]])
a.tolist()
>>
[[1, 1, 3], [1, 1, 3], [1, 1, 4]]
2.4 (22–33) Mathematical operations on NumPy arrays
NumPy provides a wide variety of element-wise mathematical functions that you can apply to NumPy arrays. You can read about all available math operations here. Below, let's discuss some of the most commonly used ones.
#22–24) Trigonometric functions
a = np.array([1,2,3])
print("Trigonometric Sine :", np.sin(a))
print("Trigonometric Cosine :", np.cos(a))
print("Trigonometric Tangent:", np.tan(a))
>>
Trigonometric Sine : [0.84147098 0.90929743 0.14112001]
Trigonometric Cosine : [ 0.54030231 -0.41614684 -0.9899925 ]
Trigonometric Tangent: [ 1.55740772 -2.18503986 -0.14254654]
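Note that these functions interpret their input as radians, which is why np.sin(1) above is about 0.84. If your angles are in degrees, one way to handle this is to convert them first with np.deg2rad(), as in this small sketch:

```python
import numpy as np

degrees = np.array([0, 30, 90])
radians = np.deg2rad(degrees)   # convert degrees to radians first

sines = np.sin(radians)
print(np.round(sines, 2))       # [0.  0.5 1. ]
```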
#25–28) Rounding functions
- Use np.floor() to return the element-wise floor.
- Use np.ceil() to return the element-wise ceiling.
- Use np.rint() to round each element to the nearest integer.
>>> a = np.linspace(1, 2, 5)
array([1. , 1.25, 1.5 , 1.75, 2. ])
>>> np.floor(a)
array([1., 1., 1., 1., 2.])
>>> np.ceil(a)
array([1., 2., 2., 2., 2.])
>>> np.rint(a)
array([1., 1., 2., 2., 2.])
- Round to a given number of decimal places using np.round() (the older alias np.round_() was removed in NumPy 2.0):
a = np.linspace(1, 2, 7)
np.round(a, 2) # 2 decimal places
>>
array([1. , 1.17, 1.33, 1.5 , 1.67, 1.83, 2. ])
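One behavior that can surprise newcomers is that NumPy rounds values lying exactly halfway between two integers to the nearest even integer, which is why np.rint(1.5) gave 2 above. A short sketch:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# Values exactly halfway between two integers go to the nearest even integer
print(np.rint(halves))   # [0. 2. 2. 4.]
print(np.round(halves))  # [0. 2. 2. 4.]
```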
#29–30) Exponential and logarithmic
- Use np.exp() to calculate the element-wise exponential.
- Use np.log() to calculate the element-wise natural logarithm.
>>> a = np.arange(1, 6)
array([1, 2, 3, 4, 5])
>>> np.exp(a).round(2)
array([ 2.72, 7.39, 20.09, 54.6 , 148.41])
>>> np.log(a).round(2)
array([0. , 0.69, 1.1 , 1.39, 1.61])
#31–32) sum and product
- Use np.sum() to calculate the sum of array elements:
a = np.array([[1, 2], [3, 4]])
>>> np.sum(a)
10
>>> np.sum(a, axis = 0)
array([4, 6])
>>> np.sum(a, axis = 1)
array([3, 7])
- Use np.prod() to calculate the product of array elements:
a = np.array([[1, 2], [3, 4]])
>>> np.prod(a)
24
>>> np.prod(a, axis = 0)
array([3, 8])
>>> np.prod(a, axis = 1)
array([2, 12])
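Reductions along an axis drop that axis from the result. The sketch below uses the optional keepdims parameter to keep it as a length-1 axis, which makes the result easy to combine with the original array. The row-normalization step is only an illustrative use case.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

row_sums = np.sum(a, axis=1, keepdims=True)  # shape (2, 1) instead of (2,)
print(row_sums.shape)                        # (2, 1)

# The kept axis lets the result broadcast against the original array
row_fractions = a / row_sums
print(np.sum(row_fractions, axis=1))         # each row now sums to (approximately) 1
```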
#33) Square root
Use np.sqrt() to calculate the element-wise square root of an array:
a = np.array([[1, 2], [3, 4]])
np.sqrt(a)
>>
array([[1. , 1.41421356],
[1.73205081, 2. ]])
2.5 (34–36) Matrix and vector operations
#34) Dot Product
If you want to compute the dot product of two NumPy arrays, use np.dot():
a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [1, 1]])
np.dot(a, b)
>>
array([[3, 3],
[7, 7]])
#35) Matrix Products
To compute the matrix product of two NumPy arrays, use np.matmul() or Python's @ operator:
a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [1, 1]])
>>> np.matmul(a, b)
array([[3, 3],
[7, 7]])
>>> a@b
array([[3, 3],
[7, 7]])
Note: In this case, the outputs of np.matmul() and np.dot() are the same, but they can differ substantially, for example for arrays with more than two dimensions. You can read about their differences here.
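One place where the two functions differ is with inputs that have more than two dimensions. The sketch below uses arrays of ones purely to compare the output shapes.

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats the leading axis as a batch of matrices
print(np.matmul(a, b).shape)  # (2, 3, 5)

# dot sums over the last axis of a and the second-to-last axis of b,
# pairing every matrix in a with every matrix in b
print(np.dot(a, b).shape)     # (2, 3, 2, 5)
```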
#36) Vector Norm
A vector norm is one of a family of functions used to measure the length of a vector. I already have a post about vector norms that you can read for more detail. The L2 and L1 norms can be computed as follows:
a = np.arange(-4, 5)
>>> np.linalg.norm(a) ## L2 Norm
7.745966692414834
>>> np.linalg.norm(a, 1) ## L1 Norm
20.0
Use np.linalg.norm() to find matrix or vector norms.
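To make the definitions concrete, the following sketch recomputes the L2 and L1 norms by hand and compares them with np.linalg.norm(). It also shows the maximum norm (ord=np.inf), which is not used above.

```python
import numpy as np

a = np.arange(-4, 5)

l2_manual = np.sqrt(np.sum(a ** 2))   # square root of the sum of squares
l1_manual = np.sum(np.abs(a))         # sum of absolute values
max_norm = np.linalg.norm(a, np.inf)  # largest absolute value

print(np.isclose(l2_manual, np.linalg.norm(a)))     # True
print(np.isclose(l1_manual, np.linalg.norm(a, 1)))  # True
print(max_norm)                                     # 4.0
```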
2.6 (37–38) Sorting methods
#37) Sorting a NumPy array
To return a sorted copy of an array, use np.sort(). To sort the array in place, use ndarray.sort() instead.
a = np.array([[1,4],[3,1]])
>>> np.sort(a) ## sort based on rows
array([[1, 4],
[1, 3]])
>>> np.sort(a, axis=None) ## sort the flattened array
array([1, 1, 3, 4])
>>> np.sort(a, axis=0) ## sort based on columns
array([[1, 1],
[3, 4]])
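The difference between np.sort() and ndarray.sort() is easiest to see side by side. This sketch also shows one common way to get descending order, by reversing the sorted copy.

```python
import numpy as np

a = np.array([3, 1, 2])

sorted_copy = np.sort(a)       # a itself is left unchanged
print(a)                       # [3 1 2]
print(sorted_copy)             # [1 2 3]

descending = np.sort(a)[::-1]  # reverse the sorted copy for descending order
print(descending)              # [3 2 1]

a.sort()                       # sorts a in place and returns None
print(a)                       # [1 2 3]
```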
#38) Indices that sort a NumPy array
To return the indices that would sort the array, use np.argsort():
x = np.array([3, 1, 2])
np.argsort(x)
>>
array([1, 2, 0])
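The returned indices can be used to reorder the array itself, or any other array of the same length. In this sketch, labels is a hypothetical companion array added for illustration.

```python
import numpy as np

x = np.array([3, 1, 2])
labels = np.array(["c", "a", "b"])  # hypothetical labels paired with x

order = np.argsort(x)
print(x[order])       # [1 2 3]
print(labels[order])  # ['a' 'b' 'c']
```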
2.7 (39–42) Searching methods
#39) Index of the maximum value
To return the index of the largest value along an axis, use np.argmax():
>>> a = np.random.randint(1, 20, 10).reshape(2,5)
array([[15, 13, 10, 1, 18],
[14, 19, 19, 17, 8]])
>>> np.argmax(a) ## index in a flattened array
6
>>> np.argmax(a, axis=0) ## indices along columns
array([0, 1, 1, 1, 0])
>>> np.argmax(a, axis=1) ## indices along rows
array([4, 1])
To find an index in a non-flattened array, you can do:
ind = np.unravel_index(np.argmax(a), a.shape)
ind
>>
(1, 1)
#40) Index of the minimum value
Likewise, if you want to return the index of the smallest value along an axis, use np.argmin():
>>> a = np.random.randint(1, 20, 10).reshape(2,5)
array([[15, 13, 10, 1, 18],
[14, 19, 19, 17, 8]])
>>> np.argmin(a) ## index in a flattened array
3
>>> np.argmin(a, axis=0) ## indices along columns
array([1, 0, 0, 0, 1])
>>> np.argmin(a, axis=1) ## indices along rows
array([3, 4])
#41) Search by condition
If you want to choose between two arrays (or values) based on a condition, use np.where():
>>> a = np.random.randint(-10, 10, 10)
array([ 2, -3, 6, -3, -8, 4, -6, -2, 6, -4])
>>> np.where(a < 0, 0, a)
array([2, 0, 6, 0, 0, 4, 0, 0, 6, 0])
"""
if element < 0:
return 0
else:
return element
"""
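The example above replaces values with a constant. The sketch below shows the same three-argument form choosing between two arrays, plus the one-argument form, which returns the indices where the condition holds. The array b is made up for illustration.

```python
import numpy as np

a = np.array([2, -3, 6, -3, -8])
b = np.array([10, 20, 30, 40, 50])

# Three-argument form: take from b where the condition holds, else from a
chosen = np.where(a < 0, b, a)
print(chosen)        # [ 2 20  6 40 50]

# One-argument form: return the indices where the condition holds
negative_idx = np.where(a < 0)[0]
print(negative_idx)  # [1 3 4]
```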
#42) Indices of non-zero elements
To determine the indices of the non-zero elements in a NumPy array, use np.nonzero():
a = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])
np.nonzero(a)
>>
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
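The result is a tuple with one index array per dimension (rows first, then columns). As a short sketch, these can be used to pull out the non-zero values or to build (row, column) pairs:

```python
import numpy as np

a = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])

rows, cols = np.nonzero(a)
print(a[rows, cols])  # [3 4 5 6]

# Pair up the row and column indices as (row, col) coordinates
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords)         # [(0, 0), (1, 1), (2, 0), (2, 1)]
```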
2.8 (43–45) Statistical methods
Next, let's look at ways to compute standard statistics on NumPy arrays. You can find all statistical techniques supported by NumPy here.
#43) Mean
To find the mean of the values in a NumPy array along an axis, use np.mean():
a = np.array([[1, 2], [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis = 1) ## along the row axis
array([1.5, 3.5])
>>> np.mean(a, axis = 0) ## along the column axis
array([2., 3.])
#44) Median
To calculate the median of a NumPy array, use np.median():
a = np.array([[1, 2], [3, 4]])
>>> np.median(a)
2.5
>>> np.median(a, axis = 1) ## along the row axis
array([1.5, 3.5])
>>> np.median(a, axis = 0) ## along the column axis
array([2., 3.])
#45) Standard Deviation
To compute the standard deviation of a NumPy array along a specified axis, use np.std():
a = np.array([[1, 2], [3, 4]])
>>> np.std(a)
1.118033988749895
>>> np.std(a, axis = 1) ## along the row axis
array([0.5, 0.5])
>>> np.std(a, axis = 0) ## along the column axis
array([1., 1.])
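By default, np.std() computes the population standard deviation (dividing by N). If you need the sample standard deviation (dividing by N - 1), the optional ddof parameter can be used, as in this minimal sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

population_std = np.std(a)      # divides by N (ddof=0, the default)
sample_std = np.std(a, ddof=1)  # divides by N - 1

print(population_std)
print(sample_std)
```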