Large language models (LLMs) often demand hand-coded conversion scripts for deployment on each distinct processor-specific software stack, a process that's time-consuming and error-prone. In this session, we introduce a model-agnostic approach designed to streamline LLM deployment, especially on NVIDIA GPUs. We'll demonstrate how our automated approach cuts through the complexity of constantly evolving software stacks, enabling faster, more reliable LLM adoption. Attendees will gain practical insight into a future-proof strategy that broadens coverage of new and upcoming LLM architectures, all while reducing manual coding effort.