The use of deep neural networks (DNNs) has expanded rapidly in recent years, with DNNs making their way into diverse markets. Autonomous vehicles, ADAS, robotics, drones, surveillance, augmented/virtual reality, smartphones, smart-home, and IoT products all increasingly demand high-performance, power-efficient AI inference at the edge. The Cadence Tensilica standalone AI processor IP, a deep neural-network accelerator, delivers both high performance and power efficiency across a compute range from 0.5 to 100 TMACs. This presentation describes the architecture of the latest Tensilica standalone AI processor family and how it exploits the inherent sparsity in weights and activations to provide a compute-, latency-, and bandwidth-optimized solution. The talk also covers how the Tensilica Neural Network Compiler and its configurability options are used to optimize system performance and shorten the deployment process from months to days.
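The benefit of exploiting sparsity can be illustrated with a toy sketch (a hypothetical software analogy, not the Tensilica hardware's actual logic): a multiply-accumulate loop that skips any product where a weight or activation is zero, so effective compute shrinks in proportion to the sparsity of the operands.

```python
# Toy illustration (not Tensilica's implementation): zero-skipping
# multiply-accumulate over sparse weights and activations.

def sparse_dot(weights, activations):
    """Dot product that skips MACs where either operand is zero."""
    acc = 0
    macs = 0  # count of multiplies actually performed
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # a zero operand contributes nothing; skip the MAC
        acc += w * a
        macs += 1
    return acc, macs

# Only the pairs where both operands are nonzero cost a MAC.
weights = [2, 0, 3, 0, 1, 0]
activations = [1, 5, 0, 4, 2, 0]
result, macs_done = sparse_dot(weights, activations)
print(result, macs_done)  # 2*1 + 1*2 = 4, using 2 MACs instead of 6
```

In hardware, the same idea is realized by detecting zero operands and gating or skipping the corresponding compute and memory traffic, which is where the latency and bandwidth savings come from.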