Hi, my name is Zizhao Mo (莫梓钊). I received my Ph.D. degree in Computer Science from the University of Macau, where I was fortunate to be advised by Prof. Huanle Xu in the Cloud and Distributed Systems Lab.

My research interests broadly revolve around machine learning systems (MLSys), with a focus on scheduling, parallelization, and resource management. My research aims to improve the efficiency of ML workloads in terms of latency, cost, and throughput. I am also particularly interested in designing efficient systems over heterogeneous resources, including CPU-GPU and heterogeneous GPU platforms.

My research projects:

  • Resource allocation in GPU clusters. Designing fine-grained scheduling policies for (heterogeneous) GPU clusters to optimize the performance and resource efficiency of deep learning training jobs.
  • LLM inference optimization. Proposing optimization techniques for LLM inference serving, primarily focusing on improving system-level metrics such as throughput and latency.

🔥 News

  • 2026.01: I have been invited to give a talk at Cloudflare in March.
  • 2025.06:  🎉🎉 I defended my Ph.D. thesis!

📝 Publications

Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism

  • Accepted by SC’25 (CCF-A, CSRankings), a top conference in supercomputing.
  • Authors: Zizhao Mo, Jianxiong Liao, Huanle Xu, Zhi Zhou, Cheng-Zhong Xu

Fast and Fair Training for Deep Learning in Heterogeneous GPU Clusters

  • Accepted by ICS’25 (CCF-B, CSRankings), a top conference in supercomputing.
  • Authors: Zizhao Mo, Huanle Xu, Wing Cheong Lau

Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters

  • Accepted by ASPLOS’24 (CCF-A, CSRankings), a top conference in computer architecture and systems.
  • Authors: Zizhao Mo, Huanle Xu, Cheng-Zhong Xu

Optimal Resource Efficiency with Fairness in Heterogeneous GPU Clusters

  • Accepted by Middleware’24 (CCF-B).
  • Authors: Zizhao Mo, Huanle Xu, Wing Cheong Lau

Derm: SLA-aware Resource Management for Highly Dynamic Microservices

  • Accepted by ISCA’24 (CCF-A, CSRankings), a top conference in computer architecture.
  • Authors: Liao Chen, Shutian Luo, Chenyu Lin, Zizhao Mo, Huanle Xu, Kejiang Ye, Cheng-Zhong Xu

Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach

  • Accepted by SC’23 (CCF-A, CSRankings), a top conference in supercomputing.
  • Authors: Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, Cheng-Zhong Xu

🎖 Honors and Awards

  • 2021 - 2025: Ph.D. Scholarship. University of Macau.
  • 2024: Travel grant for ASPLOS’24.
  • 2021: Arthur and Louis May Scholarship. Hong Kong University of Science and Technology.

🔖 Academic Activities

Reviewer for:

  • IEEE Transactions on Computers (IEEE TC)
  • ACM Transactions on Architecture and Code Optimization (ACM TACO)
  • IEEE Transactions on Services Computing (IEEE TSC)
  • Journal of Systems Architecture (JSA)
  • IEEE Transactions on Consumer Electronics

📖 Education

  • 2021 - 2025: Ph.D. in Computer Science, University of Macau.
  • 2020 - 2021: MSc in Information Technology, Hong Kong University of Science and Technology.
  • 2014 - 2018: B.Eng. in Software Engineering, South China University of Technology.