TensorRT IPlugin Base Class: Source Code Walkthrough

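The comments in the TensorRT IPlugin base class source are already fairly detailed. This post only adds a few supplementary annotations on top of them, as groundwork for later posts that explain how the Plugin code for the other Caffe layers is implemented.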

class IPlugin
{
public:
 /**
 * \brief get the number of outputs from the layer
 *
 * \return the number of outputs
 *
 * this function is called by the implementations of INetworkDefinition and IBuilder. In particular, it is called prior to any call to initialize().
 */
 // Step 1: called from ICaffeParser::parse(), i.e. while the Caffe prototxt and caffemodel are being parsed.
 virtual int getNbOutputs() const = 0;
 /**
 * \brief get the dimension of an output tensor
 *
 * \param index the index of the output tensor
 * \param inputs the input tensors
 * \param nbInputDims the number of input tensors
 *
 * this function is called by the implementations of INetworkDefinition and IBuilder. In particular, it is called prior to any call to initialize().
 */
 // Step 2: called from IBuilder::buildCudaEngine(), i.e. while the engine is being built.
 virtual Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) = 0;

 /**
 * \brief configure the layer
 *
 * this function is called by the builder prior to initialize(). It provides an opportunity for the layer to make algorithm choices on the basis
 * of its weights, dimensions, and maximum batch size
 *
 * \param inputDims the input tensor dimensions
 * \param nbInputs the number of inputs
 * \param outputDims the output tensor dimensions
 * \param nbOutputs the number of outputs
 * \param maxBatchSize the maximum batch size
 *
 * the dimensions passed here do not include the outermost batch size (i.e. for 2-D image networks, they will be 3-dimensional CHW dimensions)
 */
 // Step 3: called from IBuilder::buildCudaEngine(), i.e. while the engine is being built.
 virtual void configure(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, int maxBatchSize) = 0;

 /**
 * \brief initialize the layer for execution. This is called when the engine is created.
 *
 *
 * \return 0 for success, else non-zero (which will cause engine termination.)
 *
 */
 // Step 5: called from IBuilder::buildCudaEngine(), i.e. while the engine is being built.
 virtual int initialize() = 0;

 /**
 * \brief shutdown the layer. This is called when the engine is destroyed
 */
 // Step 8: called from ICudaEngine::destroy(), i.e. after the engine has been built and serialization has finished.
 virtual void terminate() = 0;


 /**
 * \brief find the workspace size required by the layer
 *
 * this function is called during engine startup, after initialize(). The workspace size returned should be sufficient for any
 * batch size up to the maximum
 *
 * \return the workspace size
 */
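 // Step 4: called from IBuilder::buildCudaEngine(), i.e. while the engine is being built.
 // It is numbered before initialize() (step 5) here, although the header comment above says it runs after initialize().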
 virtual size_t getWorkspaceSize(int maxBatchSize) const = 0;


 /**
 * \brief execute the layer
 *
 * \param batchSize the number of inputs in the batch
 * \param inputs the memory for the input tensors
 * \param outputs the memory for the output tensors
 * \param workspace workspace for execution
 * \param stream the stream in which to execute the kernels
 *
 * \return 0 for success, else non-zero (which will cause engine termination.)
 */
 virtual int enqueue(int batchSize, const void*const * inputs, void** outputs, void* workspace, cudaStream_t stream) = 0;

 /**
 * \brief find the size of the serialization buffer required
 *
 * \return the size of the serialization buffer
 */
 // Step 6: called from ICudaEngine::serialize(), i.e. after the engine has been built.
 virtual size_t getSerializationSize() = 0;

 /**
 * \brief serialize the layer
 *
 * \param buffer a pointer to a buffer of size at least that returned by getSerializationSize()
 *
 * \see getSerializationSize()
 */
 // Step 7: called from ICudaEngine::serialize(), i.e. after the engine has been built.
 virtual void serialize(void* buffer) = 0;
protected:
 virtual ~IPlugin() {}
};
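
To make the numbered steps in the annotations above easier to follow, here is a minimal, self-contained sketch of a pass-through plugin and a driver that calls its methods in that order. It is only an illustration under several assumptions: `IdentityPlugin`, `Dims3`, `StubStream` and `runLifecycle` are hypothetical names, not part of TensorRT. `Dims3` and `StubStream` stand in for `Dims` and `cudaStream_t` so the code compiles without NvInfer.h or CUDA, and the class does not derive from the real `IPlugin`. The driver mimics the call order described in the annotations and does not reproduce what the parser, builder or engine do internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-ins so the sketch compiles without NvInfer.h or CUDA (assumptions).
struct Dims3 { int c, h, w; };   // simplified replacement for Dims (CHW only)
using StubStream = void*;        // placeholder for cudaStream_t

// Hypothetical pass-through layer: one input, one output of the same shape.
// A real plugin would derive from IPlugin and override its virtual methods.
class IdentityPlugin
{
public:
    std::vector<std::string> calls;  // records the order of lifecycle calls

    int getNbOutputs() { calls.push_back("getNbOutputs"); return 1; }

    Dims3 getOutputDimensions(int index, const Dims3* inputs, int nbInputDims)
    {
        calls.push_back("getOutputDimensions");
        assert(index == 0 && nbInputDims == 1);
        return inputs[0];            // output shape equals input shape
    }

    void configure(const Dims3* inputDims, int, const Dims3*, int, int)
    {
        calls.push_back("configure");
        // Dimensions exclude the batch size, so this is elements per sample.
        mCount = static_cast<size_t>(inputDims[0].c) * inputDims[0].h * inputDims[0].w;
    }

    size_t getWorkspaceSize(int) { calls.push_back("getWorkspaceSize"); return 0; }
    int initialize() { calls.push_back("initialize"); return 0; }
    void terminate() { calls.push_back("terminate"); }

    // Host-side copy for illustration; a real plugin would launch work on the stream.
    int enqueue(int batchSize, const void* const* inputs, void** outputs, void*, StubStream)
    {
        std::memcpy(outputs[0], inputs[0], batchSize * mCount * sizeof(float));
        return 0;
    }

    size_t getSerializationSize() { calls.push_back("getSerializationSize"); return sizeof(mCount); }

    void serialize(void* buffer)
    {
        calls.push_back("serialize");
        std::memcpy(buffer, &mCount, sizeof(mCount));  // the only state worth saving
    }

private:
    size_t mCount = 0;
};

// Hypothetical driver that follows steps 1 to 8 from the annotations.
std::vector<std::string> runLifecycle(IdentityPlugin& p)
{
    const int maxBatchSize = 4;
    Dims3 in{3, 2, 2};
    p.getNbOutputs();                                 // step 1: parsing
    Dims3 out = p.getOutputDimensions(0, &in, 1);     // step 2: engine build
    p.configure(&in, 1, &out, 1, maxBatchSize);       // step 3
    p.getWorkspaceSize(maxBatchSize);                 // step 4
    p.initialize();                                   // step 5
    std::vector<char> buf(p.getSerializationSize());  // step 6: serialization
    p.serialize(buf.data());                          // step 7
    p.terminate();                                    // step 8: engine destroyed
    return p.calls;
}
```

Calling `runLifecycle` on a fresh `IdentityPlugin` returns the eight method names in the numbered order. `enqueue()` is left out of the driver because the annotations do not assign it a step; it runs at inference time, once per batch.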