Follow my WeChat official account [极智视界] for more of my notes.
Hello everyone, I am Jizhi Vision. This article will explain the TensorRT Constant operator.
The Constant operator is TensorRT's constant layer. A typical use case: the next operator takes two inputs (for example a multiplication, dot product, or concatenation of two matrices), and one of the matrices has to be read offline. The Constant operator builds the tensor that holds this offline data.
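To make the scenario concrete, here is a minimal NumPy-only sketch, not TensorRT code. It assumes the offline matrix is stored in `.npy` format (an in-memory buffer stands in for the file) and uses matrix multiplication as the two-input operator. The names `buffer`, `weights`, `x` and `y` are made up for illustration.

```python
import io

import numpy as np

# Stand-in for a weight file saved offline; a real project would use a path on disk.
buffer = io.BytesIO()
np.save(buffer, np.arange(12, dtype=np.float32).reshape(3, 4))
buffer.seek(0)

# "Offline read": this array is what would later be handed to network.add_constant().
weights = np.load(buffer)

# Runtime input, the other operand of the two-input operator.
x = np.ones((2, 3), dtype=np.float32)

# NumPy equivalent of the matrix multiplication that consumes both tensors.
y = x @ weights
print(y.shape)
```

In a TensorRT network, `weights` would become the output of a Constant layer and `x` would be a regular network input.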
Here is how to build a Constant operator in TensorRT:
# Add a Constant layer with add_constant
constantLayer = network.add_constant([1], np.array([1], dtype=np.float32))
# Reset the constant data (data is a NumPy array)
constantLayer.weights = data
# Reset the constant shape
constantLayer.shape = data.shape
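Before an array is handed to `add_constant`, it can help to normalize it first. The sketch below is a precaution rather than a documented requirement of the article: `prepare_constant` is a hypothetical helper that converts the array to a C-contiguous float32 buffer and returns the matching shape. The array should also be kept alive until the engine has been built, because TensorRT reads the weight memory at build time.

```python
import numpy as np


def prepare_constant(array, dtype=np.float32):
    """Hypothetical helper: return (shape, data) ready to pass to add_constant."""
    # ascontiguousarray copies only when the dtype or memory layout has to change.
    data = np.ascontiguousarray(array, dtype=dtype)
    return tuple(data.shape), data


# A transposed view is a typical non-contiguous array.
transposed = np.arange(6, dtype=np.float64).reshape(2, 3).T
shape, data = prepare_constant(transposed)
print(shape, data.dtype, data.flags["C_CONTIGUOUS"])
```

The result would then be used as `network.add_constant(shape, data)`.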
Let's see a practical example:
import numpy as np
from cuda import cudart
import tensorrt as trt
# Tensor dimensions in NCHW layout
nIn, cIn, hIn, wIn = 1, 3, 4, 5
# Data stored in the constant
data = np.arange(nIn * cIn * hIn * wIn, dtype=np.float32).reshape(nIn, cIn, hIn, wIn)
np.set_printoptions(precision=8, linewidth=200, suppress=True)
cudart.cudaDeviceSynchronize()
logger = trt.Logger(trt.Logger.ERROR)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
config = builder.create_builder_config()
#------------------------------------------------------------------------------# Operator-specific section
# Add the Constant layer
constantLayer = network.add_constant(data.shape, data)
#------------------------------------------------------------------------------# Operator-specific section
# The network has no inputs; the Constant layer's output is the only binding
network.mark_output(constantLayer.get_output(0))
# Build the serialized engine and deserialize it for inference
engineString = builder.build_serialized_network(network, config)
engine = trt.Runtime(logger).deserialize_cuda_engine(engineString)
context = engine.create_execution_context()
_, stream = cudart.cudaStreamCreate()
# Allocate the host and device buffers for the output
outputH0 = np.empty(context.get_binding_shape(0), dtype=trt.nptype(engine.get_binding_dtype(0)))
_, outputD0 = cudart.cudaMallocAsync(outputH0.nbytes, stream)
# Run inference and copy the result back to the host
context.execute_async_v2([int(outputD0)], stream)
cudart.cudaMemcpyAsync(outputH0.ctypes.data, outputD0, outputH0.nbytes, cudart.cudaMemcpyKind.cudaMemcpyDeviceToHost, stream)
cudart.cudaStreamSynchronize(stream)
print("outputH0:", outputH0.shape)
print(outputH0)
# Release the CUDA resources
cudart.cudaStreamDestroy(stream)
cudart.cudaFree(outputD0)
- Output tensor shape: (1, 3, 4, 5), the same as the shape of data
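Since the network only emits the constant, the output can be checked against the original array. The sketch below is NumPy-only and assumes the engine ran as intended: `matches_constant` is a hypothetical helper, and `simulated_output` stands in for `outputH0`, so no GPU is needed to try it.

```python
import numpy as np

nIn, cIn, hIn, wIn = 1, 3, 4, 5
# Same array that the example passes to add_constant.
expected = np.arange(nIn * cIn * hIn * wIn, dtype=np.float32).reshape(nIn, cIn, hIn, wIn)


def matches_constant(output, reference):
    """Hypothetical check: same shape, dtype and values as the constant."""
    return (
        output.shape == reference.shape
        and output.dtype == reference.dtype
        and np.array_equal(output, reference)
    )


# Stand-in for outputH0; with a real engine, pass outputH0 instead.
simulated_output = expected.copy()
print(expected.shape)
print(matches_constant(simulated_output, expected))
```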
That wraps up my explanation of the TensorRT Constant operator. I hope it helps a little with your learning.