
Lidong Zhou

Principal Researcher and Research Manager

About

Lidong Zhou has been a Principal Researcher and Research Manager of the Systems Research Group at Microsoft Research Redmond since September 2014. Previously, he was a Principal Researcher at Microsoft Research Asia, where he managed the systems research group. Before joining Microsoft Research Asia in 2008, he was a researcher at Microsoft Research Silicon Valley for 6 years, after receiving his Ph.D. from Cornell University. His research interests are in distributed systems, storage systems, operating systems, system security, and wireless communications. His research has advanced the state of the art in both the theory and practice of scalable and reliable distributed systems powering online cloud services, and has had direct real-world impact through his technical contributions to deployed large-scale services.

Lidong has played a significant technical role in the design and development of a wide range of large-scale distributed systems supporting major Microsoft services. He was an architect for the Kirin system, which went live as Bing Search’s new index generation pipeline in November 2009, thanks to a geo-distributed team of more than 100 people. He initiated the Tiger project with the Search Technology Center in Beijing and the Bing Platform team in Bellevue to increase index serving cost-efficiency using flash and to enable innovations in relevance. He has also worked with the Big Data infrastructure teams at Microsoft to significantly improve performance, reliability, manageability, and user experience, using a combination of techniques in systems, databases, program analysis, compiler optimization, software engineering, and data science. He has recently been working on cloud architecture and abstractions, and is currently leading a collaboration with Azure teams on near-real-time monitoring and diagnostics infrastructure and on improving Azure reliability and availability in general.

Lidong serves on the editorial board of ACM Transactions on Storage and has served on the program committees for SOSP (2011/2013), OSDI (2010/2012), ASPLOS (2017), SoCC (2016), NSDI (2015), EuroSys (2014), PODC (2006), and DISC (2009). He was the Program Co-Chair for the 7th Workshop on Large-Scale Distributed Systems and Middleware (LADIS) in 2013 and for the 1st ACM SIGCOMM Asia-Pacific Workshop on Systems (APSys) in 2010. He is the general co-chair of the upcoming SOSP 2017 in Shanghai, after years of effort to bring the top systems conference to the Asia-Pacific region.

 

Publications and Patents

 

Refereed Articles in Conferences

Srinath Setty, Chunzhi Su, Jacob R. Lorch, Lidong Zhou, Hao Chen, Parveen Patel, and Jinglei Ren, Realizing the Fault-Tolerance Promise of Cloud Storage Using Locks with Intent, In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI’16), November 2016.

Wei Lin, Haochuan Fan, Zhengping Qian, Junwei Xu, Sen Yang, Jingren Zhou, and Lidong Zhou, StreamScope: Continuous Reliable Distributed Processing of Big Data Streams, In Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI’16): 439–453, March 2016

Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, and Lidong Zhou, GraM: Scaling Graph Computation to the Trillions, In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC’15): 408-421, August 2015

Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing, in Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14): 285–300, October 2014

Tian Xiao, Zhenyu Guo, Hucheng Zhou, Jiaxing Zhang, Xu Zhao, Chencheng Ye, Xi Wang, Wei Lin, Wenguang Chen, and Lidong Zhou, Cybertron: Pushing the Limit on I/O Reduction in Data-Parallel Programs, In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA’14): 895–908, October 2014

Tian Xiao, Jiaxing Zhang, Hucheng Zhou, Zhenyu Guo, Sean McDirmid, Wei Lin, Wenguang Chen, and Lidong Zhou, Nondeterminism in MapReduce Considered Harmful? An Empirical Study on Noncommutative Aggregators in MapReduce Programs, In Proceedings of the 36th International Conference on Software Engineering, Software Engineering in Practice (ICSE SEIP’14): 44-53, April 2014

Zhenyu Guo, Chuntao Hong, Mao Yang, Dong Zhou, Lidong Zhou, Li Zhuang, Rex: Replication at the Speed of Multicore, In Proceedings of the 9th European Conference on Computer Systems (EuroSys’14): 11:1–11:14, April 2014

Wentao Han, Youshan Miao, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Wenguang Chen, and Enhong Chen, Chronos: A Graph Engine for Temporal Graph Analysis, In Proceedings of the 9th European Conference on Computer Systems (EuroSys’14): 1:1–1:14, April 2014

Zhenyu Guo, Sean McDirmid, Mao Yang, Li Zhuang, Pu Zhang, Yingwei Luo, Tom Bergan, Peter Bodik, Madan Musuvathi, Zheng Zhang, and Lidong Zhou, Failure Recovery: When the Cure Is Worse Than the Disease, in Proceedings of the 14th Workshop on Hot Topics in Operating Systems (HotOS XIV), May 2013

Zhengping Qian, Yong He, Chunzhi Su, Zhuojie Wu, Hongyu Zhu, Taizhi Zhang, Lidong Zhou, Yuan Yu, and Zheng Zhang, TimeStream: Reliable Stream Computation in the Cloud, in Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13): 1-14, April 2013

Chuntao Hong, Mao Yang, Lidong Zhou, Lintao Zhang, Dong Zhou, and Chia-pao Kuo, KuaFu: Closing the Parallelism Gap in Database Replication, in Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE’13): 1186-1195, April 2013

Zhenyu Guo, Xuepeng Fan, Rishan Chen, Jiaxing Zhang, Hucheng Zhou, Sean McDirmid, Chang Liu, Wei Lin, Jingren Zhou, and Lidong Zhou, Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE, in Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI’12): 121–133, October 2012

Vijayan Prabhakaran, Ming Wu, Xuetian Weng, Frank McSherry, Lidong Zhou, and Maya Haridasan, Managing Large Graphs on Multi-Cores with Graph Awareness, in Proceedings of the 2012 USENIX Annual Technical Conference (USENIX ATC’12): 41–52, June 2012

Jiaxing Zhang, Hucheng Zhou, Rishan Chen, Xuepeng Fan, Zhenyu Guo, Haoxiang Lin, Jack Y. Li, Wei Lin, Jingren Zhou, and Lidong Zhou, Optimizing Data Shuffling in Data-Parallel Computation by Understanding User-Defined Functions, in Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI’12):295-308, April 2012

Raymond Cheng, Ji Hong, Aapo Kyrola, Youshan Miao, Xuetian Weng, Ming Wu, Fan Yang, Lidong Zhou, Feng Zhao, and Enhong Chen, Kineograph: Taking the Pulse of a Fast-Changing and Connected World, In Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys’12): 85-98, April 2012

Huayang Guo, Ming Wu, Lidong Zhou, Gang Hu, Junfeng Yang, Lintao Zhang: Practical Software Model Checking via Dynamic Interface Reduction, in Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP’11): 265-278, October 2011

Zhenyu Guo, Dong Zhou, Haoxiang Lin, Mao Yang, Fan Long, Chaoqiang Deng, Changshu Liu, Lidong Zhou, G2: A Graph Processing System for Diagnosing Distributed Systems, in Proceedings of the 2011 USENIX Annual Technical Conference (USENIX ATC’11): 299-312, June 2011

Ming Wu, Fan Long, Xi Wang, Zhilei Xu, Haoxiang Lin, Xuezheng Liu, Zhenyu Guo, Huayang Guo, Lidong Zhou, and Zheng Zhang, Language-Based Replay via Data Flow Cut, in Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’10): 197-206, November 2010

Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping Qian, and Lidong Zhou, Distributed Systems Meet Economics: Pricing in the Cloud, in 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud’10): 6-6, June 2010

Bingsheng He, Mao Yang, Zhenyu Guo, Rishan Chen, Bing Su, Wei Lin, and Lidong Zhou, Comet: Batched Stream Processing for Data Intensive Distributed Computing, In Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC’10): 63-74, May 2010

Leslie Lamport, Dahlia Malkhi, Lidong Zhou: Brief Announcement: Vertical Paxos and Primary-Backup Replication. In Proceedings of the 28th Annual ACM Symposium on Principles of Distributed Computing (PODC’09): 312-313, August 2009

Junfeng Yang, Tisheng Chen, Ming Wu, Zhilei Xu, Xuezheng Liu, Haoxiang Lin, Mao Yang, Fan Long, Lintao Zhang, and Lidong Zhou, MODIST: Transparent Model Checking of Unmodified Distributed Systems, in Proceedings of the 6th Symposium on Networked Systems Design and Implementation (NSDI’09): 213-228, April 2009

Bingsheng He, Mao Yang, Zhenyu Guo, Rishan Chen, Wei Lin, Bing Su, Hongyi Wang, Lidong Zhou, Wave Computing in the Cloud, in Proceedings of the 12th Workshop on Hot Topics in Operating Systems (HotOS XII), April 2009

Vijayan Prabhakaran, Thomas L. Rodeheffer, Lidong Zhou, Transactional Flash, in Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI’08): 147-160, December 2008

Lidong Zhou, Vijayan Prabhakaran, Venugopalan Ramasubramanian, Roy Levin, Chandramohan A. Thekkath, Graceful Degradation via Versions: Specifications and Implementations, in Proceedings of The 26th Annual ACM Symposium on Principles of Distributed Computing (PODC’07):264-273, August 2007

Manuel Costa, Miguel Castro, Lidong Zhou, Lintao Zhang, Marcus Peinado: Bouncer: Securing Software by Blocking Bad Input. In Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007 (SOSP’07): 117-130, October 2007

Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang, Paul Barham, Vigilante: End-to-End Containment of Internet Worms, in Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP’05): 133–147, October 2005, Award Paper

Dahlia Malkhi, Florin Oprea, Lidong Zhou, Ω Meets Paxos: Leader Election and Stability without Eventual Timely Links, in Proceedings of the 19th International Conference on Distributed Computing (DISC’05): 199-213, September 2005

Lidong Zhou, Michael A. Marsh, Fred B. Schneider, Anna Redz, Distributed Blinding for Distributed ElGamal Re-encryption, in Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05): 815–824, June 2005

Lidong Zhou, Lintao Zhang, Frank McSherry, Nicole Immorlica, Manuel Costa, Steve Chien, A First Look at Peer-to-Peer Worms: Threats and Defenses, in Proceedings of the 4th International Workshop on Peer-To-Peer Systems (IPTPS ’05), Volume 3640 of Lecture Notes in Computer Science: 24-35, February 2005

John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, Lidong Zhou, Boxwood: Abstractions as the Foundation for Storage Infrastructure, in Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation (OSDI’04): 105-120, December 2004

Atul Adya, Paramvir Bahl, Jitendra Padhye, Alec Wolman, Lidong Zhou, A Multi-Radio Unification Protocol for IEEE 802.11 Wireless Networks, in Proceedings of the 1st International Conference on Broadband Networks (BROADNETS’04): 344-354, October 2004

 

Refereed Articles in Journals

Youshan Miao, Wentao Han, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Enhong Chen, and Wenguang Chen, ImmortalGraph: A System for Storage and Analysis of Temporal Graphs, ACM Transactions on Storage (TOS), Volume 11, Issue 3, Article No. 14:1-34, July 2015

Xuepeng Fan, Zhenyu Guo, Hai Jin, Xiaofei Liao, Jiaxing Zhang, Hucheng Zhou, Sean McDirmid, Wei Lin, Jingren Zhou, and Lidong Zhou, Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE, IEEE Transactions on Parallel & Distributed Systems (TPDS), Volume 26, Number 6: 1718-1731, June 2015

Martin Hutle, Dahlia Malkhi, Ulrich Schmid, Lidong Zhou: Chasing the Weakest System Model for Implementing Ω and Consensus. IEEE Transactions on Dependable and Secure Computing, Volume 6, Issue 4: 269-281, October-December 2009

John MacCormick, Nicholas Murphy, Venugopalan Ramasubramanian, Udi Wieder, Junfeng Yang, Lidong Zhou: Kinesis: A new approach to replica placement in distributed storage systems. ACM Transactions on Storage (TOS), Volume 4, Number 4, Article 11: 1-28, January 2009

Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang, Paul Barham: Vigilante: End-to-End Containment of Internet Worm Epidemics. ACM Transactions on Computer Systems, Volume 26, Number 4, Article 9: 1-68, December 2008

John MacCormick, Chandramohan A. Thekkath, Marcus Jager, Kristof Roomp, Lidong Zhou, Ryan Peterson: Niobe: A practical replication protocol, ACM Transactions on Storage (TOS), Volume 3, Number 4, Article 1: 1-43, February 2008

Lili Qiu, Paramvir Bahl, Ananth Rao, Lidong Zhou: Troubleshooting wireless mesh networks. ACM SIGCOMM Computer Communication Review, Volume 36, Number 5: 17-28, October 2006

Fred B. Schneider, Lidong Zhou: Implementing Trustworthy Services Using Replicated State Machines. IEEE Security & Privacy, Volume 3, Issue 5: 34-43, September/October 2005

Lidong Zhou, Fred B. Schneider, Robbert van Renesse, APSS: Proactive secret sharing in asynchronous systems. ACM Transactions on Information and System Security, Volume 8, Number 3: 259-286, August 2005

Lidong Zhou, Fred B. Schneider, Robbert van Renesse, COCA: A Secure Distributed On-line Certification Authority, ACM Transactions on Computer Systems (TOCS), Volume 20, Number 4: 329-368, November 2002

Lidong Zhou and Zygmunt J. Haas. Securing Ad Hoc Networks. IEEE Network: The Magazine of Global Internetworking, Volume 13, Issue 6: 24-30, November 1999

 

Invited Articles

Leslie Lamport, Dahlia Malkhi, and Lidong Zhou: Reconfiguring a State Machine. ACM SIGACT News, Volume 41, Issue 1: 63-73, March 2010

Lidong Zhou, Building Reliable Large-Scale Distributed Systems: When Theory Meets Practice, ACM SIGACT News, Volume 40, Issue 3: 78-85, September 2009

 

Granted Patents

Multi-radio unification protocol, with Alastair Wolman, Atul Adya, Paramvir Bahl, Jitendra D. Padhye, Patent No. 7,065,376 (Filed on November 26, 2003, granted on June 20, 2006); Patent No. 7,283,834 (Filed on February 24, 2006, granted on October 16, 2007); Patent No. 8,078,208 (Filed on February 23, 2007, granted on December 13, 2011)

Balanced prefetching exploiting structured data, with Chandramohan A. Thekkath, John P. MacCormick, Nicholas Charles Murphy, Patent No. 7,529,891 (Filed on September 19, 2005, granted on May 5, 2009)

Fault detection and diagnosis, with Lili Qiu, Paramvir Bahl, Ananth Rajagopala Rao, Patent No. 7,583,587 (Filed on June 30, 2004, granted on September 1, 2009)

What-if analysis for network diagnostics, with Lili Qiu, Paramvir Bahl, Ananth Rajagopala Rao, Patent No. 7,606,165 (Filed on June 30, 2004, granted on October 20, 2009)

Methods and systems for removing data inconsistencies for a network simulation, with Lili Qiu, Paramvir Bahl, Ananth Rajagopala Rao, Patent No. 7,613,105 (Filed on June 30, 2004, granted on November 3, 2009)

Performing a deletion of a node in a tree data storage structure, with Chandramohan A. Thekkath, Patent No. 7,630,998 (Filed on June 10, 2005, granted on December 8, 2009)

Data replication in a distributed system, with William R. Hoffman, Marcus J. Jager, John P. MacCormick, Kristof Roomp, Chandramohan A. Thekkath, Patent No. 7,636,868 (Filed on June 27, 2006, granted on December 22, 2009)

Implementing a tree data storage structure in a distributed environment, with Chandramohan A. Thekkath, Patent No. 7,730,101 (Filed on June 10, 2005, granted on June 1, 2010)

Efficient recovery of replicated data items, with John P. MacCormick, Chandramohan A. Thekkath, Patent No. 7,734,573 (Filed on December 14, 2004, granted on June 8, 2010)

Gracefully degradable versioned storage systems, with Vijayan Prabhakaran, Venugopalan Ramasubramanian, Roy Levin, Chandramohan A. Thekkath, Patent No. 7,849,354 (Filed on June 12, 2007, granted on December 7, 2010)

Virtually synchronous Paxos, with Dahlia Malkhi, Leslie B. Lamport, Patent No. 7,849,223 (Filed on December 7, 2007, granted on December 7, 2010)

Distributed system checker, with Junfeng Yang, Lintao Zhang, Zhenyu Guo, Xuezheng Liu, Jian Tang, Mao Yang, Patent No. 7,984,332 (Filed on November 17, 2008, granted on July 19, 2011)

Extensible browser platform for web applications, with Shiding Lin, Chandramohan A. Thekkath, Dahlia Malkhi, Zheng Zhang, Patent No. 8,190,703 (Filed on April 23, 2008, granted on May 29, 2012)

Automatic filter generation and generalization, with Marcus Peinado, Manuel Costa, Miguel Castro, Lintao Zhang, Patent No. 8,316,448 (Filed on October 26, 2007, granted on November 20, 2012)

Sharing data over trusted networks, with Dahlia Malkhi, Yaacov Fernandess, Patent No. 8,560,630 (Filed on February 28, 2007, granted on October 15, 2013)

Dynamic interface reduction for software model checking, with Ming Wu, Huayang Guo, Yi Yang, Gang Hu, Lintao Zhang, Tisheng Chen, Patent No. 8,671,396 (Filed on May 30, 2011, granted on March 11, 2014)

Platform for continuous mobile-cloud services, with Fan Yang, Zhengping Qian, Xiuwei Chen, Ivan Beschastnikh, Li Zhuang, Guobin Shen, Patent No. 8,745,434 (Filed on May 16, 2011, granted on June 3, 2014)

Usable security of online password management with sensor-based authentication, with Guobin Shen, Fan Yang, Patent No. 9,141,779 (Filed on May 19, 2011, granted on September 22, 2015)

Platform for continuous graph update and computation, with Fan Yang, Ming Wu, Aapo Kyrola, Raymond Cheng, Youshan Miao, Xuetian Weng, Ji Hong, Patent No. 9,244,983 (Filed on April 5, 2012, granted on January 26, 2016)

Data-parallel computation management, with Jiaxing Zhang, Hucheng Zhou, Zhenyu Guo, Haoxiang Lin, Patent No. 9,383,982 (Filed on September 12, 2012, granted on July 5, 2016)