Technical Program
Keynotes
Challenges and Opportunities of AI for Network Management in Data Centers
Minlan Yu
Professor at Harvard University
Abstract: Network management of data centers is increasingly challenging due to the dramatic growth in the scale of AI infrastructure, the increasing variety and speed of devices, and the complex dependencies across components. In this talk, I'll take performance diagnosis and mitigation for large-scale LLM training as an example, and discuss our recent innovations across the diagnostic pipeline: rapidly localizing faulty machines by identifying abnormal metric patterns via machine learning, identifying the root cause by capturing intricate dependencies in collective communications, and mitigating the impact of interruptions to minimize training downtime. I'll then end the talk by zooming out to broader network operations with Confucius, a production-ready, multi-agent LLM framework for intent-driven network management.
Minlan Yu is a Professor at Harvard University. Her research focuses on network management, cloud networking, and applying machine learning to improve the reliability and performance of large-scale data center infrastructures. Her work addresses challenges in diagnosing and mitigating failures in modern AI-driven computing environments and designing intelligent frameworks for autonomous network operations. More information can be found at her official bio page.
Human-Centric Communication Networks and Systems: Engineering Paradigm Changes
Tobias Hoßfeld
Professor at the Chair of Communication Networks, University of Würzburg, Germany
Abstract: Communication networks are evolving beyond simple connectivity into integrated networking and computing infrastructures, forming intelligent, AI-native systems. The paradigm of human-centric communication networks and systems aims to optimize measurable human, societal, and operational outcomes rather than purely technical performance indicators. This shift requires rigorous quantification; without it, human-centricity remains rhetoric. The central engineering challenge therefore becomes: how can we design and operate human-centric communication networks while satisfying technical limitations and regulatory constraints, such as the EU AI Act? This talk examines AI-native communication systems and services through a human-centric lens, where Quality of Experience and system-level QoE, as well as energy efficiency and sustainability, become measurable design objectives. In this view, human-centricity manifests in novel quantifiable metrics and evaluation frameworks, whose architectural implications and associated open research questions will be discussed.
Tobias Hoßfeld has been a professor at the Chair of Communication Networks at the University of Würzburg, Germany, since 2018. He finished his PhD in 2009 and his professorial thesis (habilitation) in 2013. From 2014 to 2018, he was head of the Chair "Modeling of Adaptive Systems" at the University of Duisburg-Essen, Germany. His honors include several awards for his PhD thesis on the performance evaluation of future internet applications and emerging user behavior; the Fred W. Ellersick Prize 2013 (IEEE Communications Society) for one of his articles on QoE; and the VDE ITG 2024 award for the definition of the scalability index. He is a member of the editorial boards of IEEE Communications Surveys & Tutorials, Springer Quality and User Experience, and ACM SIGMM Records, and the elected chairperson of the VDE ITG expert group "Communication Networks and Systems" within the German society of Information Technology (ITG). More information: https://comnets.org/hossfeld.
