Date: Tuesday, May 23
Start Time: 2:40 pm
End Time: 3:10 pm
When used correctly, transformer neural networks can deliver greater accuracy with less computation. But transformers are challenging for existing AI engine architectures because they rely on many compute functions that previously prevalent convolutional neural networks (CNNs) did not require. In this talk, we explore key transformer compute requirements and highlight how they differ from those of CNNs. We then introduce Flex Logix's InferX X1 AI accelerator silicon IP and show how its dynamic TPU array architecture executes transformer neural networks efficiently. We also explain how InferX integrates into your system and how it scales to meet varying cost and performance requirements.
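To make the contrast concrete, here is a minimal NumPy sketch (not code from the talk or from Flex Logix) of the core self-attention operations that distinguish transformers from CNNs: activation-by-activation matrix multiplies, row-wise softmax, and layer normalization, none of which appear in a plain convolution pipeline.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable row-wise softmax -- a transformer-specific op
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean, unit variance
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
    # Unlike a convolution, both operands of each matmul are
    # runtime activations, not fixed weights.
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
x = layer_norm(rng.standard_normal((4, 8)))  # 4 tokens, 8-dim embeddings
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

A CNN accelerator tuned for fixed-weight convolutions must add support for these dynamic matmuls and nonlinear reductions to run transformers efficiently, which is the architectural gap the talk addresses.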