Title | : | Efficient AI Through Specialization and Sparse Computing |
Speaker | : | Dr. Sanchari Sen (Staff Research Scientist at IBM) |
Details | : | Wed, 18 Dec, 2024 10:00 AM @ SSB 233 (MR1) |
Abstract | : | Artificial Intelligence (AI) models have undergone tremendous improvements over the past decade and are now widely deployed in a range of products and services involving generation and analysis of image, video, speech and text. However, the immense compute, memory and energy demands of these AI models pose challenges to their more sustainable use and deployment. In this talk, I will discuss hardware-software specialization and sparsity-aware computing as techniques to overcome these challenges.
Designing specialized hardware accelerators for AI has emerged as an attractive option to address the demands of AI workloads. Extracting the maximum benefit from an accelerator also requires a specialized software stack that can map any AI workload onto it. In the first half of this talk, I will discuss the details of IBM’s AIU accelerator and its software stack for improving the efficiency of AI model execution. In addition to specialization, exploiting sparsity, or the presence of zero values in AI workloads, has also emerged as a promising approach to improve their efficiency. I will then present SparCE, a set of lightweight micro-architectural and instruction set extensions that enable exploiting sparsity in general-purpose processor cores. These extensions dynamically detect zero values when they are loaded and skip subsequent instructions that those zeros render redundant, yielding both performance and energy improvements in AI workloads.
Bio: Sanchari Sen is a Staff Research Scientist at IBM T. J. Watson Research Center, Yorktown Heights, New York, USA, where she has been working since 2020. She received the B.Tech. degree in Electronics and Electrical Communication Engineering from IIT Kharagpur, India and the PhD degree in Electrical and Computer Engineering from Purdue University, West Lafayette, Indiana, USA. She previously interned with AMD Research, Austin, Texas and IBM T. J. Watson Research Center, Yorktown Heights, New York, USA. Her current research interests include hardware and software techniques for efficient deep learning, domain-specific accelerator designs and approximate computing. She has authored over 25 papers in top-tier conferences and journals on machine learning, design automation and computer architecture. She also holds several US patents related to efficient deep learning on different hardware platforms. She has received two Outstanding Research Division Achievement Awards and an Outstanding Technical Achievement Award from IBM Research. She was a recipient of the Ross Fellowship in 2015 and the Bilsland Dissertation Fellowship in 2019 from Purdue University. She was also awarded the Institute Silver Medal in 2011 for her academic performance at IIT Kharagpur. |