In this session, we’ll explain two neural network quantization techniques, quantization-aware training (QAT) and post-training quantization (PTQ), and when to use each. We’ll also discuss what each technique needs in order to be implemented efficiently: for example, QAT requires preparation […]
