Learning TensorRT (3): Extending TensorRT with Custom Layers

  This article records my understanding of Chapter 4, "Extending TensorRT with Custom Layers," of the "TensorRT-Developer-Guide".

Adding a Custom Layer with the C++ API

  A custom layer is added by extending the IPluginV2Ext and IPluginCreator classes:

  1. IPluginV2Ext: an upgraded version of IPluginV2 and the base class to implement for a custom plugin; it adds versioning and the handling of additional formats and single precision;
  2. IPluginCreator: the creator class for the custom layer, through which the plugin's name, version, and parameter information can be obtained. It also provides methods to create the plugin during the network build phase and to deserialize it during the inference phase.

  A finished plugin can be statically registered with REGISTER_TENSORRT_PLUGIN(pluginCreator) and then queried and used through getPluginRegistry(). The following official plugins are already implemented:

  • RPROI_TRT
  • Normalize_TRT
  • PriorBox_TRT
  • GridAnchor_TRT
  • NMS_TRT
  • LReLU_TRT
  • Reorg_TRT
  • Region_TRT
  • Clip_TRT
// Look up the plugin in the global registry; creator is an IPluginCreator object
auto creator = getPluginRegistry()->getPluginCreator(pluginName, pluginVersion);
const PluginFieldCollection* pluginFC = creator->getFieldNames();

// Fill in the layer's parameters; pluginData must first be allocated
// on the heap via PluginField
PluginFieldCollection *pluginData = parseAndFillFields(pluginFC, layerFields);

// Create a new plugin object from the layer name and plugin parameters;
// it is created on the heap and must be released explicitly
IPluginV2 *pluginObj = creator->createPlugin(layerName, pluginData);

// Add a layer to the network and bind it to the plugin;
// layer is an IPluginV2Layer object
auto layer = network.addPluginV2(&inputs[0], int(inputs.size()), pluginObj);

// TODO: finish building the network and serialize the engine

// Destroy the plugin object
pluginObj->destroy();

// TODO: release TensorRT resources: network, engine, builder
// TODO: free allocated memory, e.g. the network parameter info in pluginData

  TensorRT serializes the IPluginV2 plugin's attributes into the engine, looks the plugin up in the plugin registry during deserialization, and destroys the internal plugin object through the IPluginV2::destroy() interface.
  In previous versions, users had to create plugins during deserialization through the nvinfer1::IPluginFactory class; in the current version, addPluginV2 is sufficient. For example:

// Parse the network with the Caffe parser and add the plugin.
// If the plugin is created with IPluginExt, it must be paired with
// nvinfer1::IPluginFactory and nvcaffeparser1::IPluginFactoryExt
class FooPlugin : public IPluginExt
{
	// TODO: implement the plugin
};
class MyPluginFactory : 
public nvinfer1::IPluginFactory, 
public nvcaffeparser1::IPluginFactoryExt
{
	// TODO: factory methods that create the plugin
};

// If the plugin is created and registered with IPluginV2, implementing
// nvinfer1::IPluginFactory is no longer required; instead, registration
// is done through nvcaffeparser1::IPluginFactoryV2 and IPluginCreator
class FooPlugin : public IPluginV2
{
	// TODO: implement the plugin
};
class FooPluginFactory : public nvcaffeparser1::IPluginFactoryV2
{
	virtual nvinfer1::IPluginV2* createPlugin(...)
	{
		// TODO: create and return the plugin object, e.g. FooPlugin
	}
	bool isPlugin(const char* name)
	{
		// TODO: decide from the layer name whether this plugin applies
	}
};
class FooPluginCreator : public IPluginCreator
{
	// TODO: implement all plugin-creation methods
};
REGISTER_TENSORRT_PLUGIN(FooPluginCreator);

  Concrete examples of plugin creation can be found in:

  • samplePlugin: adding a custom plugin to a Caffe network;
  • sampleFasterRCNN: using a TensorRT-registered plugin with a Caffe network;
  • sampleUffSSD: adding a plugin to a UFF (TensorFlow) network.

Using a Custom Plugin

  This part is similar to the creation flow described above. Note that with the Caffe parser, a custom plugin can be used via setPluginFactoryV2 and IPluginFactoryV2. The plugin created during deserialization is then destroyed internally through IPluginExt::destroy() without a manual call; users only need to destroy the plugin objects they created themselves during the build phase.

API Description

The IPluginV2 API

  1. Methods that report the plugin's output data structure, so adjacent layers can check whether they can connect to it:

  • getNbOutputs: reports the number of output tensors;
  • getOutputDimensions: validates the input dimensions and returns the output dimensions;
  • supportsFormat: declares which data types and formats the plugin supports (NCHW, NC/2HW2, NHWC8, etc.; see PluginFormatType) and how precision is handled;
  • getOutputDataType: returns the data type of the plugin's output.

  2. A method that reports how much workspace the plugin needs to store data beyond its inputs and outputs, called by the builder so the space can be pre-allocated:

  • getWorkspaceSize

  3. A plugin goes through several stages: configuration, initialization, execution, and termination. Only execution runs multiple times; configuration, initialization, and termination each happen once. Memory acquired in initialize must be released in terminate; other memory must be released in destroy. The required plugin methods are:

  • configurePlugin: configures the input and output attributes (number, dimensions, types, broadcasting, format selection, maximum batch size) so the plugin can select the most appropriate algorithm and data structures;
  • initialize: called after the plugin is configured and the inference engine is created; prepares for execution according to the configured data structures;
  • enqueue: the actual plugin execution; receives the batch size to run, the input pointers, the output pointers, the workspace pointer, and the CUDA stream;
  • terminate: called when the engine context is destroyed; releases all resources held by the plugin;
  • clone: called whenever a separate copy of the plugin is needed (for a newly created builder, network, or engine);
  • destroy: called when the builder, network, or engine is destroyed; the plugin releases its corresponding resources;
  • set/getPluginNamespace: sets or gets the plugin's namespace; defaults to "" (empty).

  4. Broadcasting of input and output tensors across a batch can be supported through IPluginV2Ext, which requires implementing:

  • canBroadcastInputAcrossBatch: determines whether an input tensor can be broadcast across a batch. Returning true means it can; TensorRT will not replicate the input and will use a single copy. Returning false means it cannot, and TensorRT will replicate the input tensor;
  • isOutputBroadcastAcrossBatch: whether the output at the given index is broadcast across the batch.

The IPluginCreator API

  IPluginCreator provides the methods used to look plugins up in the registry and create them:

  • getPluginName: returns the plugin name; used together with getPluginType;
  • getPluginVersion: returns the plugin version; TensorRT's internal plugins default to 1;
  • getFieldNames: returns a PluginFieldCollection holding the names and types of the parameters needed to add the plugin;
  • createPlugin: creates a plugin from a given PluginFieldCollection, with the actual parameter values filled in;
  • deserializePlugin: called internally by the TensorRT engine according to the plugin name and version; returns the plugin object to use for inference;
  • set/getPluginNamespace: the namespace of the plugin library the creator belongs to; defaults to "" (empty).

Migrating from 5.x.x to 5.1.x

  Version 5.x.x does not have getOutputDataType, isOutputBroadcastAcrossBatch, or canBroadcastInputAcrossBatch, and configureWithFormat has been upgraded to configurePlugin. These new interfaces must be implemented when migrating to 5.1.x.

virtual nvinfer1::DataType getOutputDataType(int index, const nvinfer1::DataType* inputTypes, int nbInputs) const = 0;
virtual bool isOutputBroadcastAcrossBatch(int outputIndex, const bool* inputIsBroadcasted, int nbInputs) const = 0;
virtual bool canBroadcastInputAcrossBatch(int inputIndex) const = 0;
virtual void configurePlugin(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, const DataType* inputTypes, const DataType* outputTypes, const bool* inputIsBroadcast, const bool* outputIsBroadcast, PluginFormat floatFormat, int maxBatchSize) = 0;



Origin blog.csdn.net/yangjf91/article/details/98184540