NumPy Diagrams (2): Matrices

Contents

Matrices

Matrix initialization

The axis parameter

Matrix operations

Operators +, -, *, /, //, ** and @

Transpose and reshape

Joining matrices: hstack and vstack

Splitting matrices: hsplit and vsplit

Replicating matrices: tile and repeat

Deleting rows and columns: delete

Inserting rows and columns: insert

Appending and padding: append and pad

Meshgrid

Matrix statistics

Matrix sorting

Matrices

Matrix initialization

The matrix initialization syntax is similar to that of vectors:

The double parentheses are needed here because the second positional parameter is reserved for dtype.
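A minimal sketch of matrix initialization, with shapes and values chosen only for illustration. It shows why the shape has to be passed as a single tuple:

```python
import numpy as np

# A matrix from nested lists: each inner list is one row.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# The shape is one tuple argument, hence the double parentheses;
# a second positional argument would be read as dtype.
z = np.zeros((3, 2))
o = np.ones((3, 2), dtype=int)
f = np.full((3, 2), 7)
e = np.eye(3)

print(a.shape)  # (2, 3)
print(z.shape)  # (3, 2)
```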

Random matrices are also generated in much the same way as random vectors:
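One possible sketch using the Generator interface from np.random; the seed, shapes, and distribution parameters are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # seed chosen arbitrarily

u = rng.random((3, 2))                 # uniform floats in [0, 1)
n = rng.normal(10, 2, size=(3, 2))     # normal, mean 10, std 2
k = rng.integers(0, 10, size=(3, 2))   # integers from 0 to 9
```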

Two-dimensional indexing syntax is more convenient than that of nested lists:

As with one-dimensional arrays, slicing returns a view: no copy is actually made. When the array is modified, the changes are also reflected in the slice.
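A short sketch of 2D indexing and of the view behavior, using an illustrative 3x4 array:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]

print(a[1, 2])   # 7 (nested lists would need a[1][2])
print(a[1, :])   # second row: [5 6 7 8]
print(a[:, 1])   # second column: [ 2  6 10]

block = a[:2, 1:3]   # a view into a, not a copy
a[0, 1] = 100        # modify the original array ...
print(block[0, 0])   # 100: ... and the slice sees the change

independent = a[:2, 1:3].copy()  # an explicit copy is decoupled from a
```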

The axis parameter

In many operations (such as summation), we need to tell NumPy whether to operate across rows or across columns. To have a notation that works for any number of dimensions, NumPy introduces the concept of an axis: the value of the axis argument is the position of the index in question. The first index is axis=0, the second is axis=1, and so on.

So in a two-dimensional array, axis=0 is column-wise and axis=1 is row-wise.
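For example, summing a small illustrative matrix along each axis:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = a.sum()           # 21: no axis, everything is summed
by_col = a.sum(axis=0)    # [5 7 9]: the first index (rows) is collapsed
by_row = a.sum(axis=1)    # [ 6 15]: the second index (columns) is collapsed
```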

Matrix operations

Operators +, -, *, /, //, ** and @

In addition to the ordinary operators (such as +, -, *, /, //, and **), which work element-wise, there is the @ operator, which computes a matrix product:
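A small sketch contrasting element-wise operators with the matrix product; the operands are arbitrary:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[1, 2],
              [2, 1]])

print(a + b)    # [[2 4] [5 5]]
print(a * b)    # element-wise product: [[1 4] [6 4]]
print(a // b)   # element-wise floor division: [[1 1] [1 4]]
print(a ** b)   # element-wise power: [[1 4] [9 4]]
print(a @ b)    # matrix product: [[ 5  4] [11 10]]
```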

Building on the vector operations covered in the first part, NumPy also allows element-wise mixed operations between a vector and a matrix, and even between two vectors, through broadcasting:
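A sketch of such mixed operations, with illustrative values. The 1D array and the (2, 1) column are stretched to the shape of the matrix before the element-wise operation is applied:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])
row = np.array([10, 20, 30])   # shape (3,), behaves like a row vector
col = np.array([[1],
                [2]])          # shape (2, 1), a column vector

print(m * row)     # each row of m is multiplied by row
# [[ 10  40  90]
#  [ 40 100 180]]

print(m + col)     # each column of m has col added to it
# [[2 3 4]
#  [6 7 8]]

print(row * col)   # two vectors broadcast to a (2, 3) matrix
# [[10 20 30]
#  [20 40 60]]
```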

Transpose and reshape

As you can see from the example above, in a 2D context row vectors and column vectors are treated differently.

By default, 1D arrays are treated as row vectors in 2D operations. So when multiplying a matrix by a row vector, you can use shape (n,) or (1, n) and the resulting values will be the same.

If you need a column vector, transposing a 1D array will not produce one: the transpose of a 1D array is the same 1D array, and only a 2D row vector can be transposed into a column vector:

The two operations that do make a two-dimensional column vector out of a one-dimensional array are reshape and indexing with newaxis:

Here the -1 argument tells reshape to calculate the size of that dimension automatically, and None in square brackets serves as a shortcut for np.newaxis, which adds an empty axis at the specified position.
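A short sketch of these conversions on an illustrative three-element vector:

```python
import numpy as np

v = np.array([1, 2, 3])     # 1D array, shape (3,)

print(v.T.shape)            # (3,): transposing a 1D array changes nothing

col_a = v.reshape(-1, 1)    # shape (3, 1); -1 lets NumPy work out the 3
col_b = v[:, None]          # same result; None is np.newaxis
row_2d = v[None, :]         # explicit 2D row vector, shape (1, 3)

col_c = row_2d.T            # a 2D row vector does transpose to a column
back_to_1d = col_a.reshape(-1)   # and back to a plain 1D array
```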

Therefore, there are three types of vectors in NumPy in total: 1D arrays, 2D row vectors, and 2D column vectors. Here is a schematic diagram of the explicit conversions between them:

According to the rules, one-dimensional arrays are implicitly interpreted as two-dimensional row vectors, so converting between these two is usually unnecessary, and the corresponding area is marked in gray.

Joining matrices: hstack and vstack

There are two main functions for joining matrices:

Both functions work fine when stacking only matrices or only vectors. But when it comes to mixed stacking of 1D arrays and matrices, only vstack works as expected, while hstack raises a dimension mismatch error.

This is because, as mentioned above, 1D arrays are interpreted as row vectors, not column vectors. A workaround is to convert the array to a column vector, or to use column_stack, which does so automatically:
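The behavior described above can be sketched as follows; the arrays are illustrative:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[7, 8, 9],
              [10, 11, 12]])

side_by_side = np.hstack((a, b))   # shape (2, 6)
on_top = np.vstack((a, b))         # shape (4, 3)

row = np.array([7, 8, 9])          # 1D, as long as a has columns
col = np.array([7, 8])             # 1D, as long as a has rows

with_row = np.vstack((a, row))     # works: row is treated as a row vector

try:
    np.hstack((a, col))            # a 2D and a 1D input cannot be joined
    hstack_failed = False
except ValueError:
    hstack_failed = True

with_col = np.hstack((a, col[:, None]))     # explicit column vector
with_col_auto = np.column_stack((a, col))   # conversion done automatically
```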

Splitting matrices: hsplit and vsplit

The reverse of stacking is splitting:
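A minimal sketch of both functions on illustrative arrays. The second argument is either a number of equal parts or a list of split positions:

```python
import numpy as np

a = np.arange(12).reshape(2, 6)
left, middle, right = np.hsplit(a, 3)   # three blocks of shape (2, 2)
first, rest = np.hsplit(a, [1])         # split before column 1: (2, 1) and (2, 5)

b = np.arange(12).reshape(4, 3)
top, bottom = np.vsplit(b, 2)           # two blocks of shape (2, 3)

# Stacking the pieces back restores the original matrix.
restored = np.hstack((left, middle, right))
```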

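Replicating matrices: tile and repeat

A matrix can be replicated in two ways: tile repeats the whole matrix as a block, like copy-pasting, while repeat duplicates each element in place, like printing several copies of every page before moving on to the next one.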
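The difference is easiest to see on a small illustrative matrix:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

tiled = np.tile(a, (1, 2))      # the whole block is pasted twice side by side
# [[1 2 1 2]
#  [3 4 3 4]]

big = np.tile(a, (2, 3))        # 2 copies down, 3 across: shape (4, 6)

cols = a.repeat(2, axis=1)      # each column is duplicated in place
# [[1 1 2 2]
#  [3 3 4 4]]

rows = a.repeat(2, axis=0)      # each row is duplicated in place
# [[1 2]
#  [1 2]
#  [3 4]
#  [3 4]]
```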

Deleting rows and columns: delete

Specific rows and columns can be removed with delete:
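A sketch with an illustrative 3x4 array. Note that delete returns a new array and leaves the original untouched:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]

no_row1 = np.delete(a, 1, axis=0)            # drop row 1
no_cols = np.delete(a, [0, 2], axis=1)       # drop columns 0 and 2
no_mid = np.delete(a, np.s_[1:3], axis=1)    # drop a slice of columns
```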

Inserting rows and columns: insert

The inverse operation is insert:
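A sketch with illustrative values. The second argument is the index before which the new row or column is placed:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

zero_row = np.insert(a, 1, 0, axis=0)        # a row of zeros before row 1
# [[1 2]
#  [0 0]
#  [3 4]]

new_col = np.insert(a, 1, [7, 8], axis=1)    # a column before column 1
# [[1 7 2]
#  [3 8 4]]

last_row = np.insert(a, 2, [5, 6], axis=0)   # index 2 is past the end: appended
# [[1 2]
#  [3 4]
#  [5 6]]
```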

Appending and padding: append and pad

Like hstack, append cannot automatically transpose 1D arrays, so once again you need to either reshape the vector or add a dimension to it, or use column_stack instead:
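A sketch of append with illustrative arrays, including the failing case:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
col = np.array([7, 8])

more_rows = np.append(a, np.array([[7, 8, 9]]), axis=0)   # shape (3, 3)

try:
    np.append(a, col, axis=1)          # a 1D array is not a column
    append_failed = False
except ValueError:
    append_failed = True

more_cols = np.append(a, col[:, None], axis=1)   # add a dimension first
same_cols = np.column_stack((a, col))            # or let column_stack do it

flat = np.append(a, col)   # without axis both inputs are flattened: shape (8,)
```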

In fact, if all you need is to add constant values to the borders of the array, the pad function will suffice:
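A sketch of constant padding; the widths and fill values are arbitrary:

```python
import numpy as np

a = np.ones((2, 3), dtype=int)

framed = np.pad(a, 1, mode='constant')                    # zeros all around: (4, 5)
rows_only = np.pad(a, ((1, 1), (0, 0)), mode='constant')  # above and below: (4, 3)
custom = np.pad(a, ((0, 0), (2, 1)), mode='constant',
                constant_values=-1)                       # 2 left, 1 right, filled with -1
```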

Meshgrid

Suppose we need to create a matrix whose elements are computed from their row and column indices.

The obvious ways of filling it element by element are slow because they use Python loops. The MATLAB way of dealing with this kind of problem is to create a meshgrid:

The meshgrid function accepts arbitrary sets of index values, mgrid accepts only slices, and indices can only generate complete index ranges. fromfunction calls the provided function only once, with the full I and J index matrices as its arguments.
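A minimal sketch of these approaches. The target matrix from the original figure is not reproduced here, so the 2x3 matrix with entries c[i, j] = i - j used below is an assumption made purely for illustration:

```python
import numpy as np

# Assumed target: c[i, j] = i - j for a 2x3 matrix (illustrative only).

# Slow: a Python-level loop over every element.
c_loop = np.array([[i - j for j in range(3)] for i in range(2)])

# meshgrid: accepts arbitrary index vectors.
i, j = np.arange(2), np.arange(3)
I, J = np.meshgrid(i, j, indexing='ij')   # both of shape (2, 3)
c_mesh = I - J

# mgrid: the same index matrices, written as slices.
I2, J2 = np.mgrid[0:2, 0:3]
c_mgrid = I2 - J2

# indices: always the complete index range for a given shape.
I3, J3 = np.indices((2, 3))
c_indices = I3 - J3

# fromfunction: the lambda is called once with whole index arrays.
c_func = np.fromfunction(lambda I, J: I - J, (2, 3))
```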

But actually, there is a better way in NumPy. There is no need to spend memory on the entire I and J matrices. It is enough to store vectors of the correct shape, and broadcasting will take care of the rest:
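A sketch of the memory-saving variants, again assuming an illustrative 2x3 matrix with entries i - j. Only a (2, 1) column and a (1, 3) row are stored:

```python
import numpy as np

i, j = np.arange(2), np.arange(3)

# sparse=True returns a (2, 1) column and a (1, 3) row instead of full matrices.
I, J = np.meshgrid(i, j, sparse=True, indexing='ij')
c_sparse = I - J              # broadcasting expands the result to (2, 3)

# ogrid is the slice-based counterpart.
I2, J2 = np.ogrid[0:2, 0:3]
c_ogrid = I2 - J2

# The same thing written by hand.
c_manual = i[:, None] - j
```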

Without the indexing='ij' argument, meshgrid expects its arguments in the opposite order: J, I = np.meshgrid(j, i). This is the "xy" mode, which is useful for visualizing 3D plots.
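The equivalence of the two modes can be checked on small illustrative index vectors:

```python
import numpy as np

i, j = np.arange(2), np.arange(3)

I, J = np.meshgrid(i, j, indexing='ij')   # matrix-style order
J_xy, I_xy = np.meshgrid(j, i)            # default 'xy' order, arguments swapped

same = np.array_equal(I, I_xy) and np.array_equal(J, J_xy)
```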

Besides initializing functions over a 2D or 3D grid, meshes can also be used to index arrays:
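A sketch of mesh-based indexing that picks the intersection of chosen rows and columns; the array and the indices are illustrative. np.ix_ is shown as a closely related helper, not as something taken from the text above:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Rows 0 and 2 crossed with columns 1 and 3.
I, J = np.meshgrid([0, 2], [1, 3], indexing='ij', sparse=True)
picked = a[I, J]
# [[ 1  3]
#  [ 9 11]]

picked_ix = a[np.ix_([0, 2], [1, 3])]   # builds the same kind of open mesh
```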

Matrix statistics

Just like the statistical functions covered earlier, their two-dimensional counterparts accept the axis parameter and aggregate along the corresponding axis:
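For example, on a small illustrative matrix:

```python
import numpy as np

a = np.array([[4, 8, 5],
              [9, 3, 1]])

overall_max = a.max()        # 9
col_max = a.max(axis=0)      # [9 8 5]: one value per column
row_max = a.max(axis=1)      # [8 9]: one value per row
col_mean = a.mean(axis=0)    # [6.5 5.5 3. ]
```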

In two dimensions and higher, argmin and argmax called without axis return the index of the minimum or maximum value in the flattened array; np.unravel_index converts that index to coordinates:
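A sketch of the flattened index and its conversion, on an illustrative matrix:

```python
import numpy as np

a = np.array([[4, 8, 5],
              [9, 3, 1]])

flat_max = a.argmax()                            # 3: position in the flattened array
max_pos = np.unravel_index(flat_max, a.shape)    # row 1, column 0
min_pos = np.unravel_index(a.argmin(), a.shape)  # row 1, column 2

per_col = a.argmax(axis=0)   # [1 0 0]: row index of the maximum in each column
per_row = a.argmin(axis=1)   # [0 2]: column index of the minimum in each row
```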

The all and any functions can also use the axis parameter:
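A brief sketch on an illustrative boolean mask:

```python
import numpy as np

a = np.array([[4, 8, 5],
              [9, 3, 1]])
mask = a > 3
# [[ True  True  True]
#  [ True False False]]

all_cols = mask.all(axis=0)   # [ True False False]
all_rows = mask.all(axis=1)   # [ True False]
any_rows = mask.any(axis=1)   # [ True  True]
```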

Matrix sorting

While the axis argument is useful for the functions listed above, it is of little help for two-dimensional sorting, because each row or column is sorted independently and the rows are not kept intact:
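A sketch of what sorting along an axis actually does, with illustrative values. Note that the rows of the result no longer match any row of the input:

```python
import numpy as np

a = np.array([[3, 1],
              [1, 2],
              [2, 0]])

by_axis0 = np.sort(a, axis=0)   # every column sorted on its own
# [[1 0]
#  [2 1]
#  [3 2]]

by_axis1 = np.sort(a, axis=1)   # every row sorted on its own
# [[1 3]
#  [1 2]
#  [0 2]]
```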

axis is by no means a replacement for the key argument of Python's list sort. However, NumPy has several ways to sort by columns:

1. Sort the array by the first column: a[a[:,0].argsort()]

Here argsort returns the indices that would sort the first column, and indexing the original array with them reorders whole rows.

This trick can be repeated, starting with the least significant column, but every subsequent sort must be stable so that it does not scramble the results of the previous ones:

a = a[a[:,2].argsort()]
a = a[a[:,1].argsort(kind='stable')]
a = a[a[:,0].argsort(kind='stable')]

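2. There is a helper function lexsort that sorts by all the given columns in the way described above, but it always operates on rows and treats the last row as the primary key, so the input has to be transposed and flipped, for example:

  • a[np.lexsort(np.flipud(a[:,[2,5]].T))]: sort by column 2 first, then (where the values in column 2 are equal) by column 5;

  • a[np.lexsort(np.flipud(a.T))]: sort by all columns from left to right.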

3. The sort function also has an order argument, but it is neither fast nor easy to use if you start with a normal (unstructured) array.

4. It may be a better choice to do the sort in pandas, since this particular operation is more readable and less error-prone there:

  • pd.DataFrame(a).sort_values(by=[2,5]).to_numpy(): sort by column 2, then by column 5;

  • pd.DataFrame(a).sort_values(by=list(range(a.shape[1]))).to_numpy(): sort by all columns from left to right.
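The NumPy-only options from the list above can be sketched as follows. The 4x3 array is illustrative, so the column numbers differ from the ones used in the bullets:

```python
import numpy as np

a = np.array([[2, 1, 9],
              [1, 5, 3],
              [2, 0, 4],
              [1, 5, 1]])

# Option 1: sort whole rows by the first column only.
by_first = a[a[:, 0].argsort(kind='stable')]

# Repeated stable sorts, least significant column first.
b = a[a[:, 2].argsort()]
b = b[b[:, 1].argsort(kind='stable')]
b = b[b[:, 0].argsort(kind='stable')]

# Option 2: lexsort over all columns, left to right.
c = a[np.lexsort(np.flipud(a.T))]

# lexsort over selected columns: column 0 first, then column 2.
d = a[np.lexsort(np.flipud(a[:, [0, 2]].T))]

print(b)
# [[1 5 1]
#  [1 5 3]
#  [2 0 4]
#  [2 1 9]]
```

Both b and c match what Python's sorted would give for the list of rows, which is a convenient way to check the result on small inputs.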

Origin blog.csdn.net/weixin_43145427/article/details/124317071