When converting floating-point networks to low-precision equivalents for high-performance inference, the primary objective is to compress the network as far as possible whilst maintaining fidelity to the original floating-point model. This task becomes particularly challenging when only a reduced or unlabelled dataset is available.
Data may be limited for commercial or legal reasons: for example, companies may be unwilling to share valuable data and labels that represent a substantial investment of resources; or the collector of the original dataset may not be permitted to share it for data privacy reasons. We present a method based on distillation that allows high-fidelity, low-precision networks to be produced for a wide range of network types, using the original trained network in place of a labelled dataset. Our proposed approach is directly applicable across multiple domains (e.g. classification, segmentation and style transfer) and can be adapted to numerous network compression techniques.
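To make the idea concrete, the following is a minimal sketch (not the paper's exact procedure) of distillation-based training for a quantized network in a classification setting, assuming PyTorch: the full-precision teacher supplies soft targets on unlabelled inputs, and the low-precision student is trained to match them. The names `teacher`, `student`, and `distill_step`, and the KL-divergence loss with temperature, are illustrative assumptions rather than the method described above.

```python
# Hypothetical sketch: distilling a full-precision "teacher" into a
# quantized "student" using unlabelled inputs only.
import torch
import torch.nn.functional as F

def distill_step(teacher, student, optimizer, inputs, temperature=1.0):
    """One distillation step: the student matches the teacher's soft outputs."""
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(inputs)   # full-precision outputs act as targets
    student_logits = student(inputs)       # forward pass through the low-precision network
    # KL divergence between softened student and teacher distributions,
    # scaled by T^2 as is standard for temperature-based distillation.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the loss depends only on the teacher's outputs, the loop above runs over unlabelled (or even synthetic) inputs, which is what allows the original labelled dataset to be dispensed with.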