Chuanxiong Guo

Principal Researcher

About

I am a Principal Researcher in the Systems Research Group at Microsoft Research Redmond. I currently work on systems availability and troubleshooting, and on networking. Before that, I was a Principal Software Engineering Manager at Microsoft Azure. Before joining Microsoft Azure, I was a Senior Researcher in the Wireless and Networking Group at Microsoft Research Asia, where I worked on Data Center Networking (DCN) research. My areas of interest include large-scale networked systems, data center networking, cloud computing, and systems availability and troubleshooting.

Projects

CloudBrain for Automatic Troubleshooting for the Cloud

Established: January 1, 2016

Service availability, arguably the single most important KPI for cloud computing, can be brought down by various incidents. The state of the art in incident troubleshooting, however, still relies on the (exhausting) effort of human experts. Our ongoing project, CloudBrain, aims to invent new algorithms and build systems for automatic, real-time troubleshooting of large-scale cloud systems. At the algorithm level, CloudBrain tries to construct global views by connecting subcomponents of the systems, and then localize…

RDMA for Cloud Computing

Established: May 1, 2013

In this project, we have introduced a series of technologies, including DCQCN congestion control and DSCP-based PFC, and addressed a set of challenges, including PFC deadlock, RDMA transport livelock, PFC pause frame storms, and the slow-receiver symptom, to make RDMA scalable, safe, and deployable in production at large scale. We are currently working on understanding and preventing RDMA deadlock, and on RDMA support for future AI infrastructure. RDMA Congestion Control: Modern datacenter applications demand…

Publications

2017

2016

2015

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2001

2000

Other

I have served on the following technical program committees:

Talks

I recently gave two talks, at SIGCOMM APNet 2017 and Hot Interconnects 2017, both about RDMA. I hope these talks convey the message that RDMA can be deployed in production data centers at scale, and that there are interesting research problems ahead.

RDMA in Data Centers: Looking Back and Looking Forward, in SIGCOMM APNet 2017

RDMA Deployments: From Cloud Computing to Machine Learning, in Hot Interconnects 2017