General introduction
To find out what kind of data a NumPy (.npy) file contains, you can load the file with the NumPy library in Python and inspect its contents. The usual steps are:
 Import the Numpy library: First, make sure you have the Numpy library installed and import it:
import numpy as np
 Load NumPy files: Use the np.load() function to load .npy files:
data = np.load('your_file.npy')
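As a minimal sketch of the load step, the snippet below first writes a small array to a temporary .npy file so that there is something to load. The file name sample.npy and the array contents are made up for illustration; in practice you would point np.load() at your own file.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data; normally the .npy file already exists.
original = np.arange(12, dtype=np.int32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.npy")  # illustrative file name
    np.save(path, original)  # write the array in .npy format
    data = np.load(path)     # read it back as a NumPy array

print(type(data))
print(np.array_equal(data, original))
```

If a file stores arbitrary Python objects instead of plain numbers or strings, np.load() refuses to read it unless allow_pickle=True is passed. Only enable that option for files you trust.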

Check the data's properties: Once the .npy file is loaded, you can check the array's properties to determine its type. Here are some common properties and their meanings:
data.dtype: the data type of the elements. For example, int32 is a 32-bit integer, float64 is a 64-bit floating-point number, and <U5 is a Unicode string of at most 5 characters.
data.shape: the shape of the data, i.e. the size of each dimension. For example, (100, 3) is a 2D array with 100 rows and 3 columns, and (64, 64, 3) could be a 64x64 pixel image with 3 channels.
data.ndim: the number of dimensions of the data. For example, 2 means two-dimensional data, 3 means three-dimensional data, and so on.
data.size: the total number of elements in the data.
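The four properties can be collected in one place with a small helper. This is only a sketch: describe() is a hypothetical function name, and the two arrays are made-up stand-ins for whatever np.load() returned.

```python
import numpy as np


def describe(data):
    """Collect dtype, shape, ndim and size of an array into a dict (hypothetical helper)."""
    return {
        "dtype": str(data.dtype),
        "shape": data.shape,
        "ndim": data.ndim,
        "size": data.size,
    }


# Made-up arrays standing in for the contents of a loaded .npy file.
table = np.zeros((100, 3), dtype=np.float64)
image = np.zeros((64, 64, 3), dtype=np.uint8)

print(describe(table))
print(describe(image))
```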

Determine the type based on attributes: The values of these attributes give you a first guess at what the .npy file contains. For example, an integer array with 2 dimensions might be a grayscale image of pixel values, although it could equally be a table of counts. A floating-point array with 1 dimension is probably a one-dimensional numeric series. These are hints only; the attributes cannot tell you for certain what the data means.
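The kind of reasoning described here could be written down roughly as follows. The function guess_kind() and its labels are hypothetical, and the rules are deliberately crude heuristics, not a reliable classifier.

```python
import numpy as np


def guess_kind(data):
    """Return a rough, heuristic label for an array (hypothetical helper)."""
    if data.dtype.kind in ("U", "S"):  # Unicode or byte strings
        return "text"
    is_number = np.issubdtype(data.dtype, np.number)
    # Assumption: a trailing axis of length 3 or 4 hints at RGB/RGBA channels.
    if is_number and data.ndim == 3 and data.shape[-1] in (3, 4):
        return "possibly a color image (height, width, channels)"
    if is_number and data.ndim == 2:
        return "2D table or grayscale image"
    if is_number and data.ndim == 1:
        return "1D numeric series"
    return "unknown"


print(guess_kind(np.zeros((64, 64, 3), dtype=np.uint8)))
print(guess_kind(np.zeros((100, 3))))
print(guess_kind(np.zeros(10)))
print(guess_kind(np.array(["a", "b"])))
```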

Visualize the data (optional): If you're not sure about the data type, you can try visualizing the data to understand it better. For example, for image data, you can use Matplotlib to display the image. For numerical data, you can draw a histogram or a line chart.
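Before drawing anything, you can get a quick feel for numeric data by computing histogram counts with NumPy alone. The sketch below uses made-up values and prints a rough text histogram; if Matplotlib is installed, the same array could instead be passed to its plotting functions (for example matplotlib.pyplot.hist for numbers or matplotlib.pyplot.imshow for images).

```python
import numpy as np

# Made-up numeric data standing in for a loaded 1D array.
values = np.array([0.1, 0.4, 0.5, 0.9, 1.2, 1.8, 1.9, 2.0])

# Count how many values fall into each of 4 equal-width bins.
counts, bin_edges = np.histogram(values, bins=4)

# Print one row per bin: its range and one '#' per value in it.
for count, left, right in zip(counts, bin_edges[:-1], bin_edges[1:]):
    print(f"{left:.3f} to {right:.3f}: {'#' * int(count)}")
```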
data.dtype
data.dtype returns the data type of the elements stored in the NumPy array. The result is a NumPy dtype object that describes the type of every element in the array.
NumPy supports multiple data types. The following are some common NumPy data types and their identifiers:
int8, int16, int32, int64: signed integers of 8, 16, 32 and 64 bits respectively.
uint8, uint16, uint32, uint64: unsigned integers of 8, 16, 32 and 64 bits respectively.
float32, float64: floating-point numbers of 32 and 64 bits respectively.
complex64, complex128: complex numbers of 64 and 128 bits respectively (two 32-bit or two 64-bit floats for the real and imaginary parts).
<U{n}: Unicode string, where {n} is the maximum number of characters in the string.
For example, if a NumPy array has a data type of int32, then the elements in the array are all 32-bit signed integers. If the data type is float64, then the elements in the array are all 64-bit double-precision floating-point numbers.
<U319 means that the data type of the NumPy array is a Unicode string with a maximum of 319 characters per element. This is the NumPy data type used for text data: each element in the array is a Unicode string that can contain letters, numbers, symbols and other special characters.
For example, if you have a NumPy array of data type <U319, each element of the array can hold up to 319 characters of text. You can use indexes to access individual strings in the array and perform text processing or analysis such as searching, splitting and replacing.
Please note that the {n} in <U{n} is the maximum number of characters a string of this data type can hold, and the leading < indicates little-endian byte order. When you create a string array yourself, choose a width large enough for your text data.
You can check the data type of a NumPy array using data.dtype.
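The fixed width of a Unicode dtype is easiest to understand by example. The sketch below uses a made-up array of short words instead of 319-character strings; the behavior is the same, only the width differs. It also shows two element-wise text functions from the np.char module.

```python
import numpy as np

words = np.array(["cat", "horse", "ox"])
print(words.dtype)           # a U5 dtype: the longest string has 5 characters
print(words.dtype.itemsize)  # bytes per element (4 bytes per character)

# Assigning a longer string silently truncates it to the fixed width.
words[0] = "elephant"
print(words[0])

# Element-wise text operations on the whole array.
print(np.char.upper(words))
print(np.char.find(words, "o"))  # index of the first "o", or -1 if absent
```

Because of the silent truncation, it is worth checking the width in data.dtype before writing new strings into an existing array.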
data.shape
data.shape returns the shape of a NumPy array, that is, the size of the array along each of its dimensions. It is a tuple with one entry per dimension.
For example, if you have a NumPy array data, you can use data.shape to get its shape in the form (n1, n2, n3, ...), where n1, n2, n3, etc. are the sizes of each dimension. The length of the tuple equals the number of dimensions of the array.
Here are some examples:

For a one-dimensional array, the shape will be (n,), where n is the length of the array.
For a two-dimensional array (matrix), the shape will be (n1, n2), where n1 is the number of rows and n2 is the number of columns.
For a three-dimensional array, the shape will be (n1, n2, n3).
For higher-dimensional arrays, the shape will contain one size per dimension.
For example, if you have a NumPy array of shape (3, 4), representing a matrix with 3 rows and 4 columns, then data.shape returns (3, 4).
You can use data.shape to get information about the dimensions of an array so you can understand its structure when processing and analyzing the data.
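The shapes described above can be checked directly. The arrays in this sketch are made up for illustration; the last lines show two common uses of the shape tuple, unpacking it into variables and reshaping without changing the number of elements.

```python
import numpy as np

vector = np.arange(5)                 # one-dimensional
matrix = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
cube = np.zeros((2, 3, 4))            # three-dimensional

print(vector.shape)  # (5,)
print(matrix.shape)  # (3, 4)
print(cube.shape)    # (2, 3, 4)

rows, cols = matrix.shape  # unpack the tuple into named sizes
flat = matrix.reshape(-1)  # same 12 elements, flattened to one dimension
print(rows, cols, flat.shape)
```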
data.ndim
data.ndim returns the number of dimensions (axes) of a NumPy array. Older NumPy documentation sometimes calls this the rank of the array, which is unrelated to the rank of a matrix in linear algebra.
For example, for a one-dimensional array, data.ndim returns 1. For a 2D matrix with rows and columns, data.ndim returns 2. For a three-dimensional array, data.ndim returns 3, and so on.
The number of dimensions matters when you manipulate arrays because it determines how many indices you need to access a single element. For example, for a two-dimensional array you need two indices, one for the row and one for the column. The number of dimensions is also the length of the shape tuple.
Here are some examples:
 One-dimensional array: data.ndim returns 1
 Two-dimensional array (matrix): data.ndim returns 2
 Three-dimensional array: data.ndim returns 3
 Higher-dimensional arrays: data.ndim returns the corresponding value
By inspecting data.ndim, you can determine the number of dimensions of the NumPy array you are working with, which helps you index and manipulate the array correctly when writing code.
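A brief sketch of the relationship between ndim, the shape tuple and indexing, using made-up arrays. It also includes a zero-dimensional array, a case the examples above do not cover: an array that wraps a single value has ndim equal to 0 and an empty shape.

```python
import numpy as np

scalar = np.array(7)                       # zero-dimensional
vector = np.array([1, 2, 3])               # one-dimensional
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # two-dimensional

for arr in (scalar, vector, matrix):
    # ndim always equals the length of the shape tuple.
    print(arr.ndim, arr.shape, arr.ndim == len(arr.shape))

# A 2D array needs two indices to reach one element: row 1, column 2.
print(matrix[1, 2])

# Adding an axis raises ndim by one without changing the data.
expanded = vector[np.newaxis, :]
print(expanded.ndim, expanded.shape)
```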
data.size
data.size returns the total number of elements in a NumPy array, counted across all dimensions.
For example, if you have a NumPy array of shape (3, 4), representing a matrix with 3 rows and 4 columns, then data.size returns 3 * 4 = 12 because the array contains a total of 12 elements.
By checking data.size, you can determine the number of elements in the array, which is useful when analyzing and processing array data. For example, it tells you how many values an iteration over all elements will visit, how many values a statistic is computed from, and the upper bound for a flat index so that your operations do not go out of bounds.