1. AI-Transformed Medical Research
This project focuses on developing medical foundation models that integrate multiple modalities, including text, audio, image, video, and other signals. By leveraging these comprehensive models, we aim to explore and innovate across several research themes, including medical LLMs, multi-modal medical LLM integration, education agents, human-agent interaction, biomedical data synthesis, and clinical translation and deployment.
1) Foundation Model: This theme delves into the development of medical LLMs for medical diagnostics, treatment planning, and patient care. Leveraging medical-specific data and advanced pre-training and fine-tuning algorithms, we develop medical foundation models to improve the accuracy and efficiency of medical decision-making, ultimately enhancing patient outcomes.
2) Multi-Modal Medical LLM Integration: Building a universal model for all medical modalities is unrealistic due to limited data and the need for tailored domain knowledge integration for each modality. Instead, we unify unimodal foundation models into a single framework, fine-tuned for multi-modality tasks. This approach enables scalable and flexible deployment across diverse medical scenarios while preserving modality-specific strengths.
3) Agentic System: This theme focuses on building a multi-agentic system powered by LLMs and VLMs to support doctors and medical researchers in their professional development and clinical practice. The system simulates collaborative environments involving expert agents that represent clinicians, researchers, and patients. It facilitates clinical scenario simulations for diagnostic reasoning and treatment planning; research collaboration simulations to support hypothesis generation, literature synthesis, and experimental design; and interactive training modules that enhance communication, decision-making, and interdisciplinary teamwork.
4) AI-Transformed Medical Education: This theme leverages AI to revolutionize medical education for both students and professionals. The system provides personalized learning pathways using adaptive LLMs, immersive simulations with agent-based role-play (e.g., educators, patients, peers), real-time feedback and assessment to reinforce clinical reasoning and communication skills, and scalable platforms for continuing medical education (CME) and certification.
5) Medical Data Synthesis: This theme focuses on building biomedical data synthesis models that learn implicit medical distributions, generate diverse synthetic samples, and support causal reasoning to enhance the generalization of medical AI. By introducing a temporal disease progression framework that integrates longitudinal imaging, patient-specific conditions, and medical interventions, our approach seeks to provide a more comprehensive perspective on disease progression and treatment responses.
6) Clinical Translation and Deployment: To improve patient care, we focus on bridging the gap between research and real-world application by enabling the translation and deployment of AI systems in clinical settings. This includes collaborating with healthcare institutions, adapting and validating foundation models, and integrating solutions into healthcare workflows.
Through these research themes, our project aims to achieve several key goals: cultivating future medical talent, fostering technological breakthroughs, and accelerating translation to practical applications by developing state-of-the-art models for improved patient care and publishing papers in top-tier conferences and journals.
-
- Medical LLMs
- Multi-modal LLMs
- Medical AI
- Medical Agents
- Medical Education
- Medical Data Synthesis
- Medical Diagnosis
- Disease Progression Prediction
-

Shujie Liu (Engaging Lead)
Principal Researcher
We invite researchers passionate about advancing AI in healthcare to join our pioneering research initiatives. Our programs focus on AI foundation models, agentic medical research, and transformative approaches to medical education. We aim to develop intelligent systems that not only analyze and interpret complex medical data but also assist in decision-making, personalized care, and adaptive learning for clinicians and students. We are particularly interested in scholars exploring multimodal AI for medical imaging and diagnostics, autonomous agents for clinical workflows, and innovative educational platforms powered by AI. By fostering interdisciplinary collaboration across medicine, AI, and human-computer interaction, we strive to create groundbreaking technologies that redefine the future of healthcare and medical training.

Xinxing Xu (Engaging Lead)
Principal Research Manager
We invite researchers passionate about advancing AI in healthcare to join our pioneering initiatives spanning multimodal AI, generative AI, agentic AI, and the translation and deployment of AI systems in real-world clinical environments. We aim to bridge the gap between cutting-edge AI research and clinical impact by ensuring that AI innovations are not only scientifically rigorous but also validated, adapted, and seamlessly integrated into healthcare workflows. We are committed to transforming foundational AI advances into practical, trustworthy tools that enhance patient outcomes, clinical decision-making, and healthcare system efficiency.

Zilong Wang
Senior Researcher
We invite researchers with strong interests in advancing the intersection of artificial intelligence and real-world healthcare practice. Our research focuses on evaluating and benchmarking foundation models and AI agents in realistic medical contexts, emphasizing their capabilities, reliability, and collaboration with clinicians. We are particularly interested in developing and assessing practical foundation models for clinical decision support, diagnostic reasoning, and workflow optimization.
Our work aims to bridge the gap between AI research and clinical application by exploring user interaction, human-AI collaboration, and agentic intelligence in healthcare environments. We welcome visiting scholars and collaborators passionate about building, adapting, and rigorously testing foundation models and agent-driven systems that advance trustworthy, effective, and human-centered AI for medicine.

Xinyang Jiang
Senior Researcher
Our research projects centre on developing and applying foundation models using diverse types of medical data, and on unifying different foundation models and modalities to address significant challenges in healthcare research. We are particularly interested in visiting scholars and collaborators with research interests and expertise in medical AI and multi-modal intelligence, especially those aiming to develop novel foundation models that advance diagnosis, prognosis, and treatment personalization.

Jinglu Wang
Senior Researcher
We invite researchers with strong interests in the intersection of artificial intelligence, medicine, and education, particularly those exploring agentic AI. Our overarching goal is to advance the scientific foundations and practical applications of agentic AI systems for healthcare and medical education. We aim to investigate how autonomous, communicative, and trustworthy AI agents can enhance medical reasoning, clinical training, and knowledge dissemination. This includes developing frameworks that integrate multi-modal understanding, collaborative reasoning, and interactive learning environments. We welcome visiting scholars and collaborators who seek to contribute to this emerging paradigm of agentic intelligence in healthcare and medical education, and to jointly explore its theoretical, technical, and societal implications.

Jingjing Fu
Senior Researcher
We invite researchers with strong interests in advancing the frontier of medical data synthesis and shaping its transformative impact on healthcare AI. Our work focuses on developing medical world models and advanced generative frameworks that learn implicit medical distributions, generate diverse and high-fidelity synthetic data, and support causal reasoning to enhance model generalization and robustness. We welcome visiting scholars and collaborators passionate about multimodal generative modeling, disease trajectory simulation, and causal inference in medicine.

Chang Xu
Senior Researcher
We invite researchers passionate about advancing medical AI through foundation models, time series analysis, generative modeling, and agent systems. Our research focuses on developing medical foundation models that integrate multi-modal data—including time series, text, and imaging—to enhance diagnostic reasoning, clinical decision-making, and data-driven discovery. We also explore intelligent medical agents and synthetic data generation frameworks that enable adaptive learning, collaborative reasoning, and robust generalization. We welcome visiting scholars eager to contribute to building trustworthy, interpretable, and impactful AI systems for next-generation healthcare.
2. Agentic AI: Reimagining Future Human–Agent Communication and Collaboration
Agentic AI marks a shift from passive systems to active collaborators—AI agents that engage in sustained reasoning, understand complex multimodal environments, and interact naturally with humans over extended time and contexts. Our vision is to build intelligent systems that participate meaningfully in knowledge discovery, content creation, communication, and decision-making.
We organize our understanding of Agentic AI into three interrelated categories, each reflecting a different facet of the ecosystem these systems must inhabit:
1. Foundations & Frameworks
Agentic AI requires new computational foundations to operate effectively across time, context, and modalities. We seek to develop compact and grounded multimodal representations that allow systems to perceive and reason over complex visual, auditory, and sensor-rich environments.
To support long-term engagement and contextual reasoning, we explore advances in semantic memory and process memory—compression mechanisms that allow agents to retain knowledge across interactions and reason over long horizons. Retrieval-augmented pipelines further enrich reasoning with dynamic access to knowledge bases and structured memory.
Agents should also coordinate and plan over extended workflows. We encourage research on multi-agent collaboration, process-aligned action spaces, and models that align with the semantics of user-driven tasks. Contributions in this category might include novel architectures, datasets, training methods, or theoretical insights that strengthen the core reasoning and planning capabilities of agentic systems.
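As a concrete (toy) illustration of the retrieval-augmented memory idea above, the sketch below retrieves the most relevant past interaction snippets before reasoning, so an agent can carry knowledge across sessions. The `MemoryStore` class and its token-overlap scoring are illustrative assumptions for this sketch, not part of any specific framework or the systems described here.

```python
# Minimal sketch of a retrieval-augmented agent memory.
# Assumptions: whitespace tokenization and cosine similarity over raw token
# counts stand in for a real embedding-based retriever; MemoryStore and
# recall() are hypothetical names, not a published API.
from collections import Counter
import math

class MemoryStore:
    """Stores past interaction snippets and retrieves the most relevant ones."""

    def __init__(self):
        self.entries = []  # list of (text, token-count bag)

    def add(self, text):
        self.entries.append((text, Counter(text.lower().split())))

    def recall(self, query, k=2):
        q = Counter(query.lower().split())

        def score(entry):
            _, bag = entry
            # cosine similarity over token counts (0 if no overlap)
            dot = sum(q[t] * bag[t] for t in q)
            norm = math.sqrt(sum(v * v for v in q.values())) * \
                   math.sqrt(sum(v * v for v in bag.values()))
            return dot / norm if norm else 0.0

        return [t for t, _ in sorted(self.entries, key=score, reverse=True)[:k]]

memory = MemoryStore()
memory.add("patient prefers morning appointments")
memory.add("user asked about EEG preprocessing pipelines")
memory.add("agent summarized three papers on neural compression")

# Retrieved snippets would typically be prepended to the agent's prompt
# context before the next reasoning step.
context = memory.recall("which appointments does the patient prefer", k=1)
```

In a production system the bag-of-words scorer would be replaced by learned (possibly compressed) embeddings, and the store would distinguish semantic memory (facts) from process memory (how a task was carried out), but the retrieve-then-reason loop is the same.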
2. User Experiences and Human-Agent Interfaces
Human–agent interaction must evolve to meet the demands of fluid, multimodal collaboration. We envision interfaces that are generative and dynamic, constructed to suit user intent and task context. Proposals should consider how agents generate or adapt interfaces in real time, across devices, modalities, and immersive environments.
Agents should be capable of interactive visualization, using media to communicate internal states, uncertainties, and possible outcomes. Interaction should be audience- and context-adaptive, with personalized behaviors that respect different usage settings and preserve user privacy.
We also encourage work on process-aware collaboration, where agents sustain memory across sessions and adapt over time. Topics such as communicative effectiveness, trust, interpretability, and longitudinal studies are central. Researchers should also consider how interface design, memory, and communication strategies can support natural, explainable, and long-term agentic interaction.
3. Applications & Societal Impact
We see Agentic AI as a transformative force across domains like science, healthcare, education, and enterprise decision-making. These applications require agents to perform deep research—gathering evidence, testing hypotheses, and constructing trustworthy narratives with transparency and provenance.
In the media domain, agentic AI systems could enable advanced content generation, curation, and personalization by understanding and synthesizing information from multiple modalities such as text, audio, and video. These agents could support content creators by automating research, summarization, and fact-checking, or by generating interactive and adaptive media experiences tailored to audience preferences. Additionally, agents could facilitate media analysis at scale—detecting trends, ensuring provenance, and helping organizations manage and distribute content more effectively.
Finally, as these systems scale, we must ensure they operate responsibly. We encourage proposals on provenance tracking, watermarking, privacy-preserving design, and energy-efficient deployment. We also welcome organizational and societal studies that examine how agentic systems are adopted, trusted, and evaluated in the wild.
-
- Agentic deep research model training: data, tools, and algorithms
- Deep research applications in science, healthcare, enterprise, and education
- Media foundations: multimodal representation learning, neural compression, multimodal understanding, world models
- Context and memory: semantic memory, process memory, long-context reasoning, retrieval-augmented systems
- Agentic capabilities: planning, tool use, multi-agent collaboration, process-aligned action spaces
- Generative and adaptive user interfaces: UI generation, audience/context adaptation, interactive visualization, cross-device/immersive experiences
- Human-centered AI application and evaluation: process-aware collaboration, communicative effectiveness, trust and interpretability, longitudinal and organizational studies
- Responsible media and systems: provenance, watermarking, privacy-preserving design, efficiency, and deployment at scale
-

Yan Lu (Engaging Lead)
Partner Research Manager
We welcome applicants who possess a strong background in Multimedia, Immersive AI, Agentic AI, HCI, or related fields, including individuals with experience in neural video communication, computer vision, 3D and graphics, audio and speech, and other related domains. We also expect participants to be passionate about exploring the paradigm shifts in building future media and communication experiences and eager to contribute to the development of an advanced agentic media ecosystem that can enhance human learning and creativity.

Chong Luo (Engaging Lead)
Sr. Principal Research Manager
My team is particularly excited about advancing the deep research agent aspect of Agentic AI. We aim to develop models capable of conducting autonomous, evidence-based research across scientific and enterprise domains. We hope to collaborate closely with visiting researchers on data, tools, and training algorithms that empower agents to reason, hypothesize, and construct transparent knowledge artifacts. Through this collaboration, we expect to jointly explore real-world applications in science, healthcare, and education, and to establish a foundation for trustworthy agentic research systems.

Jiahao Li
Principal Researcher
We welcome candidates with expertise in multimedia and AI, such as neural compression, representation learning, and generative modeling. We value individuals passionate about investigating paradigm shifts in video research and contributing to advanced AI systems, including agentic capabilities like long-horizon planning, tool use, and memory-centric workflows, to propel communication, decision-making, and knowledge creation. Through seamless collaboration, we are excited to push the frontiers of AI and media technologies together.

Yun Wang
Senior Researcher
We welcome applicants with strong backgrounds in Human–AI Interaction, Intelligent Systems, and related areas such as visualization, multimodal communication, and cognitive modeling. We are particularly interested in individuals who aspire to rethink how intelligence is expressed, shared, and co-evolved between humans and AI. We value researchers who not only advance technical capability but also interrogate the underlying paradigms of collaboration, reasoning, and communication. Visiting researchers are encouraged to explore new frameworks, representations, and processes that connect human cognition with computational intelligence—toward more transparent, adaptive, and meaning-centered systems that augment human understanding and creativity.

Xun Guo
Principal Researcher
We welcome applicants with a strong background in Multimedia, Agentic AI, and Multimodal Learning, and particularly value experience in generative models for video, vision-language, and other modalities. Participants are encouraged to explore agentic AI systems that understand user intent across modalities (e.g., video, audio, and text) and generate interactive outputs accordingly, or contribute to related research areas such as multimodal learning, video generation, and vision-language alignment. These efforts aim to support intuitive, intent-driven user experiences and advance the frontier of future media technologies.

Kai Qiu
Senior Researcher
We welcome applicants with a strong background in reinforcement learning, LLMs, and agent-based AI. We also expect participants to be passionate about exploring paradigm shifts in creating advanced autonomous agents, such as designing agents with long-horizon planning, tool-use integration, and memory-driven reasoning, and eager to contribute to the development of next-generation agentic AI systems capable of autonomously tackling complex tasks and augmenting human capabilities.

Qi Dai
Principal Researcher
We welcome applicants who are excited to push the frontier of agentic AI research. Ideal candidates will have a solid foundation in reinforcement learning, agents, LLMs, and VLMs. We are particularly interested in researchers who can imagine and build systems in which agents perceive multimodal inputs, remember and reason over extended time horizons, and dynamically orchestrate external tools to solve open-ended tasks. If you are passionate about transforming these ingredients into robust, adaptive agents that expand human potential, we encourage you to apply.

Bei Liu
Senior Researcher
We welcome visiting researchers who are motivated, collaborative, and passionate about advancing agentic AI. Candidates with independent ideas and strong interest in general agentic models, Vision-Language Models, and related areas—such as multimodal reasoning, long-horizon reasoning, or reinforcement learning—are highly encouraged to join us and engage in open, productive exchange.
3. Brain and Artificial Intelligence
We conduct interdisciplinary research at the intersection of artificial intelligence (AI) and neuroscience, aiming both to leverage AI for advancing our understanding of the brain and to use these insights to improve AI systems and brain health. On one front, brain-inspired AI seeks to incorporate neurobiological principles to create energy-efficient, robust, and human-like intelligence, driving the development of next-generation AI technologies. On the other front, human brain signals (e.g., EEG and fMRI) are inherently noisy, with low signal-to-noise ratios that hinder their practical utility. To overcome this challenge, we strive to develop foundational models of brain activity capable of decoding a wide range of human perceptions and supporting efforts to address neurological disorders.
-
- Brain-inspired AI
- Brain-computer Interface
- AI for Brain Health
-

Dongsheng Li (Engaging Lead)
Principal Research Manager
As AI is transforming the world, it is equally crucial to deepen our understanding of biological intelligence. The integration of AI and brain science research holds immense potential to drive innovation, enhance human well-being, and expand our knowledge of both artificial and biological intelligence. I am eager to collaborate with researchers in this field to revolutionize existing paradigms through interdisciplinary studies, advancing both AI and brain science in groundbreaking ways.

Yansen Wang
Senior Researcher
Our brain, encapsulated within a structure no larger than our two fists, is a marvel of complexity and power. It governs every facet of our thoughts, emotions, and actions, enabling feats from artistic expression to scientific innovation. Yet, much of its inner workings remain shrouded in mystery, with countless questions about consciousness, memory, and cognition still unanswered. Understanding the brain is not only essential for unraveling these mysteries but also for constructing interfaces between the brain and artificial intelligence. I hope to see a better synergy in the future.

Dongqi Han
Senior Researcher
Geoffrey Hinton once said, “I have always been convinced that the only way to get artificial intelligence to work is to do the computation in a way similar to the human brain.” This conviction reveals a deep philosophical and practical bridge between neuroscience and AI: by studying how the brain computes, learns, and adapts, we gain principles that can guide more effective artificial systems; conversely, AI models provide testable hypotheses and experimental platforms for exploring brain function. Synergizing AI and brain science means accelerating mutual progress — using neural insights to inspire new architectures, and leveraging AI to simulate, analyze, and even predict biological behavior — all toward unraveling the deeper mysteries of intelligence in nature and in machines.

Mingqing Xiao
Researcher
The human brain remains the only known realization of general intelligence, exhibiting remarkable efficiency and adaptability. Its energy efficiency, learning efficiency, and generalization capability are unparalleled compared to current AI models. Drawing inspiration from the brain at multiple levels—from spatiotemporal neural dynamics, through learning principles, to high-level representations—grounded in solid theoretical foundations, is essential for advancing AI toward more robust and human-like intelligence. At the same time, this synergy works both ways: AI can serve as a surrogate computational model to simulate or as a tool to decode brain processes, deepening our understanding of human brains. Exploring this bidirectional relationship between AI and neuroscience holds immense promise for shaping the next generation of intelligent systems and unraveling the mysteries of biological intelligence.
4. Rethink systems and networking in the era of AI
Rapid advances in AI are transforming modern computing infrastructure – bringing both fundamental challenges and unprecedented opportunities.
We call for proposals that tackle these challenges and harness the emerging opportunities of the AI era. Submissions may span multiple dimensions of AI and computing systems, including but not limited to:
1) AI-assisted self-driving networking infrastructure;
2) Disruptive AI-assisted methods for lowering the barrier to building secure and reliable systems;
3) Next-generation software and hardware architectures for AI;
4) Development and debugging tools for intelligent systems.
We particularly encourage unconventional and multidisciplinary approaches that break traditional boundaries and push the frontiers of AI and systems research.
-
- RL infrastructure
- Agentic AI systems
- Storage systems for AI
- Verification for AI
- LLM-Assisted Network Optimization
- AI Networking Infrastructure
- Network Protocol Troubleshooting
- Infrastructure-aware resource allocation across compute, network, and devices
- Adaptive configuration of collective communication algorithms
- Network path selection and routing optimization for AI workloads
- Autonomous device configuration and tuning
- Integration of workload sensing with system-level orchestration for self-optimizing AI infrastructure
-

Yongqiang Xiong (Engaging Lead)
Sr. Principal Research Manager
Join us to revisit, rethink, and rebuild system and networking infrastructure by AI and for AI, with a holistic, clean-slate perspective.

Fan Yang (Engaging Lead)
Sr. Principal Research Manager
We welcome young systems researchers to join us in exploring the disruptive system opportunities in the era of AI.

Jing Liu
Senior Researcher
We welcome systems researchers to join us and build strong systems for the AI era together.

Baotong Lu
Senior Researcher
We welcome systems researchers to explore the next generation of data systems in the era of AI.

Li Lyna Zhang
Principal Researcher
We welcome researchers who are passionate about advancing agentic RL infrastructure and building general agents!

Yi Zhu
Senior Researcher
We are exploring how AI can help build safer and more efficient distributed systems in the LLM era. Our current focus spans distributed pretraining, reinforcement learning-based post-training, and inference engine optimization—both in terms of performance and correctness. These areas are critical for scaling LLMs across heterogeneous infrastructure. We welcome collaborators interested in pushing the boundaries of system reliability and efficiency, especially those passionate about bridging AI and systems through novel learning-driven approaches.

Ran Shu
Senior Researcher
We welcome young researchers in the systems and networking field to explore cutting-edge techniques in AI infrastructure simulation.

Wenxue Cheng
Senior Researcher
We welcome passionate researchers to explore cross-domain techniques for reimagining and optimizing AI infrastructure.

Zhixiong Niu
Senior Researcher
Join us to explore the next wave of system and networking innovations in the AI era.
5. Societal AI
With the rise of large-scale AI models, such as Large Language Models (LLMs), we are witnessing a transformation in how these technologies are integrated into various aspects of our society. These models stand out for two key reasons: 1) General-purpose functionality: LLMs can perform a wide range of tasks, from translation and question answering to code completion and more; 2) Human-like competence: They have demonstrated the ability to perform many tasks at a level comparable to humans, making them accessible and versatile tools for various domains.
While these powerful models offer significant societal benefits, they also introduce unforeseen challenges. These challenges arise not only from technical complexities but also from the broader social implications of widespread AI adoption. As Brad Smith aptly noted, “The more powerful the tool, the greater the benefit or damage it can cause.”
To ensure that AI’s integration into society is harmonious, synergistic, and resilient—minimizing any potential side effects—it is critical to foster Societal AI research. This emerging field prioritizes a multi-disciplinary approach, bringing together computer science and social science to address the complex dynamics of AI’s role in shaping our world.
We invite researchers from both fields to join us in this exciting endeavor. Together, we can explore innovative solutions that ensure the responsible and equitable advancement of AI technologies.
-
- AI’s impact on human cognition, learning, and creativity
- AI’s role in reshaping work and global business models
- Designing fair and inclusive AI systems
- Ensuring AI safety, reliability, and control
- Aligning AI with human values and ethics
- Optimizing human-AI collaboration
- Evaluating AI in unforeseen tasks and environments
- Enhancing AI transparency and interpretability
- AI’s transformation of social science research
- Evolving regulatory frameworks for AI governance
-

Xing Xie (Engaging Lead)
Partner Research Manager
We warmly invite researchers from both computer science and social sciences to join us in exploring the exciting frontier of societal AI. As we stand at the intersection of technological innovation and social impact, this collaboration provides a unique opportunity to bridge diverse fields of expertise. We are confident that this interdisciplinary approach will not only advance our knowledge of intelligence but also help shape AI technologies that are aligned with human values and needs. Ultimately, this partnership holds great promise for generating long-term benefits for human society, ensuring that AI serves as a force for positive social change.

Xiaoyuan Yi
Senior Researcher
Through this collaboration, we aim to investigate AI value alignment grounded in value systems defined in social sciences, focusing on two research questions: 1) Which value frameworks best support AI development, balancing safety and capability while maximizing utility in multi-party human–AI collaboration? 2) How can we align AI with plural, diverse, and personalized values across cultures, groups, and individuals to enhance user experience and maximize well-being? To address these questions, we need a deep, interdisciplinary understanding of human and AI behavior and the motivational factors behind them, and we must operationalize these cross-disciplinary concepts and methodologies into quantitative, algorithmic forms.

Jianxun Lian
Principal Researcher
Through this collaboration, we aim to explore the synergy between AI and social science. For instance, how can we leverage social science to evaluate, interpret, and shape the validity, authenticity, and safety of AI behaviors? Conversely, how can a well-aligned AI transform the field of social science research? To answer these questions, we need to gain a deeper understanding of AI behaviors from an interdisciplinary perspective, particularly of those traits that reflect human-likeness, such as emotions, motives, preferences, beliefs, theory of mind, and other social intelligence.

Fangzhao Wu
Principal Researcher
We look forward to working together on interesting real-world problems, such as responsible and reliable AI and AI’s impact on society, and to publishing papers in high-quality journals.
6. Spatial Intelligence
Spatial intelligence stands as a critical frontier in the evolution of AI, demanding not only a deep understanding of three-dimensional environments but also the capacity to act effectively within them. Unlike existing foundation models that focus on static data in language or vision, spatially intelligent systems require dynamic reasoning—predicting how an environment changes and choosing actions to accomplish meaningful goals. Our vision is to develop foundation models that seamlessly unify perception, reasoning, and action, enabling agents to generalize across diverse environments and tasks with minimal adaptation.
Central to this approach is the dual focus on Embodied AI and 3D Vision Foundation Models. In Embodied AI, we aim to build robotic foundation models (e.g., large Vision-Language-Action frameworks) that can handle a wide variety of tasks in complex physical or simulated worlds. Concurrently, 3D Vision Foundation Models provide the ability to reconstruct, generate, and understand intricate three-dimensional scenes, serving as a pivotal technique for spatial AI. By integrating latent action learning for compact yet expressive action representations, world models for predictive understanding and planning, and reinforcement learning for experience-driven policy improvement, we lay the groundwork for truly generalist agents.
These efforts will catalyze significant advances in robotics, games, industrial automation, and digital twins, and serve as the backbone of next-generation autonomous systems. By prioritizing high-level spatial reasoning over narrow, low-level control, our approach aims to yield robust, adaptable AI capable of excelling in unfamiliar environments. Our research group stands at the forefront of these developments, anticipating new technical breakthroughs and fostering enduring collaborations that will shape the future of spatial intelligence.
-
- Embodied AI & Multimodal Perception
- Robotic Vision-Language-Action Models
- 3D Computer Vision Foundation Models
- Latent Action Learning
- World Modeling
- Reinforcement Learning

Baining Guo (Engaging Lead)
Technical Fellow
We invite researchers with a deep enthusiasm for Spatial Intelligence and Embodied AI to become part of our team. AI models with spatial understanding and physical interaction capabilities represent the next breakthrough in developing intelligent systems capable of perceiving, understanding, and interacting with the world in a more human-like way. If you have a strong research foundation in areas like foundation models, 3D computer vision, and robotics, we encourage you to join us. Together, let’s advance this new frontier in AI.

Jiang Bian (Engaging Lead)
Partner Research Manager
We are eager to collaborate with visiting researchers who are genuinely passionate about advancing spatial intelligence—particularly through breakthroughs in robotics foundation models, latent action learning, and reinforcement learning. Our team is focused on building foundation models and intelligent systems that can not only perceive and reason, but also generalize across diverse tasks and dynamic environments. We are looking for collaborators who are excited about developing adaptable agents, capable of transferring learned skills and strategies to novel and unfamiliar situations. If you are driven to tackle the challenges of generalization in embodied AI and bring practical solutions to real-world problems, we look forward to your valuable contribution.

Jiaolong Yang
Principal Research Manager
I’m eager to welcome professionals with a solid background in 3D computer vision, robotics, LLM/VLM/VLA, reinforcement learning, video generation, and related fields. I expect our visiting researcher to have a strong passion for Spatial Intelligence and Embodied AI and be prepared to collaborate closely with our team during their visit. I also hope to build long-term partnerships with our visiting scholars, working together to produce influential contributions in the forthcoming surge of advancements in Embodied AI.

Li Zhao
Principal Researcher
I am eager to engage with visiting scholars who share a strong passion for latent action learning, reinforcement learning, and robotic foundation models, and who are motivated by the challenge of enabling intelligent agents to generalize effectively. Our goal is to create systems that not only perform well in specific scenarios, but can robustly adapt and succeed across a wide range of environments and objectives. If you are interested in foundational research that bridges perception, dynamic decision making, and action generalization in embodied AI, your expertise and creativity will be instrumental as we push the boundaries of what autonomous agents can achieve.

Hao Chen
Senior Research PM
I look forward to collaborating closely with visiting scholars who share a strong passion for Embodied AI. In particular, I hope we can work together to explore practical applications of embodied intelligence in real-world scenarios. I also look forward to engaging in deep discussions about the future development of Embodied AI, its integration with industry, and the challenges and opportunities ahead. Additionally, I am excited about the possibility of jointly creating impressive and innovative demos.

Yaobo Liang
Senior Researcher
We welcome researchers with a strong passion for Embodied AI to join our team. Embodied AI is the next frontier in creating intelligent systems that can perceive, understand, and interact with the real world in a more human-like manner. When AI can talk and see, let’s give it a body! If you have a solid research background in foundation models, 3D computer vision, and robotics, we invite you to join us. Let’s work together to push this new AI frontier.

Yu Deng
Senior Researcher
I look forward to collaborating closely with visiting researchers on topics related to foundation models in Embodied AI, as well as 3D/4D reconstruction and generation. I hope participants will have a solid background in 3D computer vision and graphics, along with a passion for tackling cutting-edge challenges in these fields. Let’s work together to advance research frontiers and make a big impact.

Sicheng Xu
Senior Researcher
I’m eager to work with visiting scholars on foundation models in 3D computer vision and Embodied AI, including 3D/4D reconstruction and generation, and robotic VLA models. I hope participants will have a solid background in related fields and a deep passion for working on high-impact projects and tackling grand challenges.

Chuheng Zhang
Senior Researcher
I’m enthusiastic about collaborating with visiting scholars on VLA models and RL across diverse applications and domains. I’m particularly interested in exploring the theoretical foundations and practical implementations of VLA, including their integration with RL algorithms, foundation models, and real-world robotic systems. I look forward to cooperating with you to make significant progress in identifying, exploring, and solving the most fundamental problems in VLA.

Kaixin Wang
Senior Researcher
I’m excited to collaborate with visiting scholars on topics such as world models, latent action learning, reinforcement learning, and their applications in spatial intelligence. I look forward to building long-term collaborations and working together to address meaningful real-world challenges.
The six research fields mentioned above are the primary focus for collaboration in the Microsoft Research Asia StarTrack Scholars 2026. In addition to these, applicants can also choose a field of their interest from the following options: Heterogeneous Extreme Computing, Intelligent Cloud and Edge, Intelligent Multimedia, Internet Graphics, Machine Learning, Media Computing, Multi-Modal Computing, Natural Language Computing, Networking Infrastructure, Social Computing, Systems, Trustworthy Systems, Visual Computing.