Kun Tan
2012 Labs, Huawei
Keynote Title:
Time & Venue
9:30 - 10:30, 4 December 2024 (Wednesday)
InterContinental Grand Stanford HK
Abstract
Modern large AI models demand tremendous computing power and high-speed interconnections, which can be provided only by purpose-built infrastructure, or AI SuperPODs. However, managing such powerful computing infrastructure is a challenging task. Enterprise developers may face many difficult issues, such as long downtime as the infrastructure grows more complex, the great effort required to parallelize their AI workloads, non-linear performance due to non-optimized resource scaling, and inefficient resource utilization. We argue that a developer-centric platform that contains a full software stack and provides a serverless abstraction is needed to ease the burden on enterprise developers of AI applications. In this talk, I will first discuss the challenges and opportunities in building a SuperPOD and its software stack, and then introduce some of our early efforts to build a serverless AI platform. We will discuss our design of a distributed serverless kernel for both general-purpose and high-performance computing. Finally, I will talk about several joint system optimizations that our platform enables through its native integration of full-stack capabilities.
Biography
Dr. Kun Tan is the Director and Chief Expert of the Distributed and Parallel Software Lab, 2012 Labs, Huawei. He is a Huawei Scientist. His team develops cutting-edge AI framework, cloud native, serverless, big data analytics, and cloud networking technologies for many Huawei products. Before joining Huawei, he was Research Manager and Senior Researcher in the Wireless and Networking Group at Microsoft Research Asia. He won the USENIX NSDI Best Paper Award in 2009 and the USENIX Test-of-Time Award in 2019.