Hi, my name is Zizhao Mo (莫梓钊). I received my Ph.D. degree in Computer Science from the University of Macau, where I was fortunate to be advised by Prof. Huanle Xu at the Cloud and Distributed Systems Lab.
My research interests broadly revolve around machine learning systems (MLSys), with a focus on scheduling, parallelization optimization, and resource management. My research aims to improve the efficiency of ML workloads in terms of latency, cost, and throughput. I also have a specific interest in designing efficient systems over heterogeneous resources, including CPU-GPU and heterogeneous GPU platforms.
My research projects:
- Resource allocation in GPU clusters. Designing fine-grained scheduling policies for (heterogeneous) GPU clusters to optimize the performance and resource efficiency of deep learning training jobs.
- LLM inference optimization. Proposing optimization techniques for LLM inference services, primarily to improve system-level metrics such as throughput and latency.
🔥 News
- 2026.01: I have been invited to give a talk at Cloudflare in March.
- 2025.06: 🎉🎉 I defended my Ph.D. thesis!
📝 Publications
Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism
- Accepted at SC’25 (CCF-A, CSRankings), a top conference in supercomputing.
- Authors: Zizhao Mo, Jianxiong Liao, Huanle Xu, Zhi Zhou, Cheng-Zhong Xu
Fast and Fair Training for Deep Learning in Heterogeneous GPU Clusters
- Accepted at ICS’25 (CCF-B, CSRankings), a top conference in supercomputing.
- Authors: Zizhao Mo, Huanle Xu, Wing Cheong Lau
Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
- Accepted at ASPLOS’24 (CCF-A, CSRankings), a top conference in system architecture.
- Authors: Zizhao Mo, Huanle Xu, Cheng-Zhong Xu
Optimal Resource Efficiency with Fairness in Heterogeneous GPU Clusters
- Accepted at Middleware’24 (CCF-B).
- Authors: Zizhao Mo, Huanle Xu, Wing Cheong Lau
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
- Accepted at ISCA’24 (CCF-A, CSRankings), a top conference in system architecture.
- Authors: Liao Chen, Shutian Luo, Chenyu Lin, Zizhao Mo, Huanle Xu, Kejiang Ye, Cheng-Zhong Xu
Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach
- Accepted at SC’23 (CCF-A, CSRankings), a top conference in supercomputing.
- Authors: Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, Cheng-Zhong Xu
🎖 Honors and Awards
- 2021 - 2025: Ph.D. Scholarship. University of Macau.
- 2024: Travel grant for ASPLOS’24.
- 2021: Arthur and Louis May Scholarship. Hong Kong University of Science and Technology.
🔖 Academic Activities
Reviewer for:
- IEEE Transactions on Computers (IEEE TC)
- ACM Transactions on Architecture and Code Optimization (ACM TACO)
- IEEE Transactions on Services Computing (IEEE TSC)
- Journal of Systems Architecture (JSA)
- IEEE Transactions on Consumer Electronics
📖 Education
- 2021 - 2025: Ph.D. in Computer Science, University of Macau.
- 2020 - 2021: MSc in Information Technology, Hong Kong University of Science and Technology.
- 2014 - 2018: B.Eng. in Software Engineering, South China University of Technology.