Accelerating the Pace of AWS Inferentia Chip Development, From Concept to End Customer Use

Speaker: Randy Huang, Amazon (AWS)

Date: October 18, 2022 at 3:30pm

Location: EER 3.646

Abstract: In this talk, I will detail the process and the decisions we have made to bring AWS Inferentia from a one-page press release to general availability. Our process starts by working backward from customers and asking how we could bring real benefits to their use cases. We will show that by separating one-way from two-way door decisions, we can navigate technical and strategic decisions at AWS velocity and bring a deep-learning accelerator to the marketplace quickly.

Bio: Randy is a principal engineer of Inferentia and Trainium, custom chips designed by AWS to enable highly cost-effective, low-latency inference and training performance at any scale. Prior to joining AWS, he led the architecture group at Tabula, designing and building three-dimensional field-programmable gate arrays (3-D FPGAs). Randy received his Ph.D. from the University of California, Berkeley.