NVIDIA Official Implementation: Quantization Aware Training (QAT) and Sparsity

1. Quantization Aware Training (QAT)

1.1 Overview

https://developer.nvidia.com/blog/achieving-fp32-accuracy-for-int8-inference-using-quantization-aware-training-with-tensorrt/

https://github.com/NVIDIA/TensorRT/tree/master/tools/pytorch-quantization

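TensorRT's pytorch_quantization is a PyTorch plugin that implements fake quantization: during training, tensors are quantized to integer levels and immediately dequantized back to floating point, so the network learns to tolerate the rounding error it will see at INT8 inference time.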
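To make the idea of fake quantization concrete, here is a minimal dependency-free sketch of symmetric per-tensor quantize-then-dequantize. It is not the pytorch_quantization implementation; the function name `fake_quantize`, the `num_bits` parameter, and the choice of a symmetric range of [-127, 127] scaled by the maximum absolute value are assumptions made for illustration.

```python
def fake_quantize(values, num_bits=8):
    """Quantize floats to signed integer levels, then dequantize back.

    Hypothetical helper for illustration only: symmetric, per-tensor,
    with the scale derived from the maximum absolute value (amax).
    Returns (dequantized_values, scale).
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    amax = max((abs(v) for v in values), default=0.0)
    if amax == 0.0:
        # All zeros (or empty input): nothing to quantize.
        return [0.0 for _ in values], 1.0

    scale = amax / qmax                     # size of one integer step
    dequantized = []
    for v in values:
        q = round(v / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        dequantized.append(q * scale)       # back to floating point
    return dequantized, scale


# Example: the outputs stay floats, but only 255 distinct levels are possible.
weights = [0.02, -0.75, 0.31, 1.20]
dq, step = fake_quantize(weights)
for w, d in zip(weights, dq):
    print(f"{w:+.4f} -> {d:+.4f} (error {abs(w - d):.5f}, step {step:.5f})")
```

Each output differs from its input by at most half a step, which is the rounding error the model is exposed to during quantization aware training. In a real training setup the rounding step also needs a gradient rule (commonly a straight-through estimator), which this sketch leaves out.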

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# 2021-10-29 14:24
import os
import torch


Reposted from blog.csdn.net/weixin_38346042/article/details/131096740