Quantization is a key technique to enable the efficient deployment of deep neural networks. In this talk, we present an overview of techniques for quantizing convolutional neural networks for inference with integer weights and activations. We explore simple and advanced […]
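As a rough illustration of what inference with integer weights and activations involves, the sketch below maps real values to unsigned 8-bit integers with a uniform affine quantizer and maps them back. The abstract does not name a specific scheme, so the affine mapping, the 8-bit width, and the helper names `choose_qparams`, `quantize`, and `dequantize` are all assumptions made for this example.

```python
def choose_qparams(x_min, x_max, num_bits=8):
    """Pick a scale and zero-point for an unsigned affine quantizer (assumed scheme)."""
    q_min, q_max = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that real zero maps to an exact integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        # Degenerate range (all zeros): any positive scale works.
        return 1.0, 0
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = int(round(q_min - x_min / scale))
    # Keep the zero-point inside the integer range.
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map real values to integers in [0, 2**num_bits - 1], clipping out-of-range inputs."""
    q_max = 2 ** num_bits - 1
    return [max(0, min(q_max, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate real values."""
    return [scale * (q - zero_point) for q in q_values]


# Hypothetical weight values, chosen only for illustration.
weights = [-1.0, -0.5, 0.0, 0.7, 3.0]
scale, zero_point = choose_qparams(min(weights), max(weights))
q_weights = quantize(weights, scale, zero_point)
recovered = dequantize(q_weights, scale, zero_point)
```

In this sketch, values inside the calibrated range come back with an error of at most half a quantization step (`scale / 2`), and real zero is represented exactly, which matters for operations such as zero padding in convolutions.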