SEMLA 2021
Event Information
About this event
AIOps: From Research Innovations to Industrial Adoptions
Speaker: Yingnong Dang, Principal Data Scientist Manager at Microsoft Azure
Abstract
The scale and complexity of cloud computing keep increasing. This makes it challenging to build and manage cloud computing systems that are highly efficient and reliable, enable high customer satisfaction, and achieve high engineering productivity. In this talk, I will first share an AIOps vision of infusing AI into the cloud computing platform and the DevOps process. I will then share a few AIOps efforts in Microsoft Azure to demonstrate how an AIOps solution can be built and adopted in industrial settings. Specifically, I will share how Azure uses intelligent anomaly detection and correlation to safeguard the rollouts of hundreds of component payloads to millions of machines spread across 60+ Azure regions on five continents (project Gandalf, safe deployment).
I will also share how we made Azure resilient against failures by employing ML-based prediction and an online learning mechanism (project Narya). I will then talk about the lessons we learned from engineering AIOps solutions, and a few open challenges in cloud computing that call for more research and innovation in related areas, including software engineering and systems.
Presentation and slides in English
________________________________________________________________________
Biography of the speaker
Yingnong Dang is a Principal Data Scientist Manager at Microsoft Azure. Yingnong’s focus is on building analytics and ML solutions for improving Azure infrastructure availability and capacity, boosting engineering productivity, and increasing customer satisfaction. Yingnong and the team have a close partnership with Microsoft Research and academia. Before joining Azure in December 2013, Yingnong was a researcher at the Microsoft Research Asia lab. His research areas include software analytics, data visualization, data mining, and human-computer interaction. As a researcher, he transferred various technologies to Microsoft product teams, including code clone analysis, crash dump analysis, and performance trace analysis. He holds 45+ U.S. patents and has published papers in top conferences including ICSE, FSE, VLDB, USENIX ATC, and NSDI.