Rasterio: parameters and usage of the rasterio.open function (using GPM IMERG Early nc-to-tif conversion as an example)

Table of contents

01 Preface

02 Parameter description

03 Usage


01 Preface

I have been using ENVI IDL recently, and it has left me exhausted. Partly this is because I have not studied it in enough depth, and partly because there is so little material on IDL: apart from a few old (though well-written) blogs, almost everything comes from the official documentation. So it is better to have a second tool ready, and Python should also be used for processing remote sensing images. After all, Python's strength in artificial intelligence has a relatively large influence on remote sensing. Nowadays, unless the United States launches a few more satellites, domestic remote sensing students can hardly get by: the papers are all competing on computer algorithms and artificial intelligence.

Without further ado, I will first describe the parameters. For usage, I will convert a GPM IMERG Early nc file into a tif file to illustrate how rasterio is used.

(I chose rasterio over gdal deliberately: rasterio's syntax feels more Pythonic, while gdal's API leans toward C-style conventions.)

02 Parameter description

Basic form:

rasterio.open(fp, mode='r', driver=None, width=None, height=None, count=None, crs=None, transform=None, dtype=None, nodata=None, sharing=False, **kwargs)

The rasterio.open function creates a DatasetReader or DatasetWriter object, which are used to read and write raster data, respectively.

fp: the path (string) of the file to be opened;

mode: string (optional; the default is 'r', read-only). The mode in which the file is opened, one of 'r', 'w', 'r+', 'w+': read-only, write, read/write on an existing file, and write/read on a new file. 'r+' and 'w+' both allow reading and writing, but they treat the file differently. 'r+' requires the file to exist (it reports an error otherwise) and updates the existing data in place, whereas 'w+' creates the file if it does not exist and, if it does exist, discards its previous content. 'w' also creates (or overwrites) the file; the documented difference is that 'w' is for writing only, while 'w+' supports both writing and reading;

driver: the name of the format driver. It is usually omitted in 'r' and 'r+' modes, because the format of an existing file is detected automatically. When creating a file you should specify it; for a GeoTIFF file, use driver='GTiff'. The driver names are the same as in gdal; see the "Raster drivers" page of the GDAL documentation for the names used by the different file formats;

width: the width of the image, that is, the number of columns of the image grid matrix;

height: the height of the image, that is, the number of rows of the image grid matrix;

count: the number of bands of the image;

crs: the coordinate reference system; it can be a string, a dictionary, or a CRS object.

transform: the affine transform of the file, which can be an Affine object or a list or tuple of 6 elements. These 6 elements represent (x_size, skew_y, x_upper_left, skew_x, -y_size, y_upper_left), where x is lon and y is lat. Note that this is the coefficient order of Affine, not the order of a GDAL geotransform.

You can calculate these yourself; the rotation coefficients skew_x and skew_y are rarely needed, so just fill in 0.0 for both. (Later on I compute the transform with a helper function instead, because I am lazy.)

On a remote sensing image the origin of the pixel coordinate system is the upper-left corner, so the corner passed in is the longitude and latitude of the upper-left corner. Counting starts from that corner, left to right and top to bottom, so the resolution on x is positive as usual, while the resolution on y takes a negative sign because latitude decreases from top to bottom.

dtype: the data type of the file; either a string or a NumPy dtype works.

nodata: the pixel value that marks invalid data in the matrix; an int, float, or nan can be passed in.

sharing: bool. Whether to share dataset handles; it should be left as False in multithreaded programs. When processing large amounts of data, the operating system may run out of available file descriptors, which can cause programs to crash or fail to open further files. With sharing enabled, a handle to the same file can be reused in multiple places instead of being opened again, which reduces the overhead of opening new files.

03 Usage

That covers the parameters. I will not go through the usage in detail; time is limited, so please check the documentation for the rest.

Here is how to convert a GPM IMERG Early NC4 file into a GeoTIFF file.

import netCDF4 as nc
import rasterio
from rasterio.transform import from_origin

# preparation
in_path = r'F:\ExtremePrecipitation\data\GPM IMERG Early\3B-HHR-E.MS.MRG.3IMERG.20180312-S000000-E002959.0000.V06B.HDF5.SUB.nc4'
out_path = r'F:\ExtremePrecipitation\TEMP\output.tif'

# get the precipitation, lon and lat variables of the nc4 file
with nc.Dataset(in_path) as dataset:  # netCDF4 Dataset object, opened read-only
    # the variable has shape (1, 630, 510) == (time, lon, lat); [0] drops the time axis
    precipitation = dataset.variables['precipitationCal'][0, :, :]
    lon = dataset.variables['lon'][:]  # shape=(630,), this means the cols == 630
    lat = dataset.variables['lat'][:]  # shape=(510,), this means the rows == 510

# get the basic info of the precipitation
rows = len(lat)
cols = len(lon)
lon_res = float(abs(lon[1] - lon[0]))  # assume the lon and lat are equally spaced
lat_res = float(abs(lat[1] - lat[0]))
# assumption: lon/lat hold cell centers, so move half a cell out to the corner of the upper-left pixel
lon_upper_left = float(lon.min()) - lon_res / 2
lat_upper_left = float(lat.max()) + lat_res / 2

# rasterio expects (rows, cols) with the northernmost row first
precipitation = precipitation.T  # (lon, lat) -> (lat, lon)
if lat[0] < lat[-1]:  # lat runs south to north, so flip the rows
    precipitation = precipitation[::-1, :]

# write the precipitation data into a tif file
with rasterio.open(out_path, 'w', driver='GTiff',
                   height=rows, width=cols,
                   count=1, dtype=precipitation.dtype,
                   crs='+proj=latlong',
                   transform=from_origin(lon_upper_left, lat_upper_left, lon_res, lat_res)) as dst:
    dst.write(precipitation, 1)  # write the precipitation dataset into the first band of the tif file


Origin blog.csdn.net/m0_63001937/article/details/131498205