The Swiss Joint Research Center (Swiss JRC) is a collaborative research engagement between Microsoft Research and the two universities that make up the Swiss Federal Institutes of Technology: ETH Zurich (Eidgenössische Technische Hochschule Zürich, which serves German-speaking students) and EPFL (École Polytechnique Fédérale de Lausanne, which serves French-speaking students). The Swiss JRC is a continuation of a collaborative engagement that was initiated by Steve Ballmer in 2008, when the same three partners embarked on ICES (Microsoft Innovation Cluster for Embedded Software). Currently, the Swiss JRC supports eleven research projects in cutting-edge research areas, including artificial intelligence (AI), mixed reality (MR), customized hardware for data centers, systems, and security.
The Swiss JRC is governed by a Steering Committee consisting of representatives from ETH Zurich, EPFL, Microsoft Switzerland, and Microsoft Research. The committee ensures the smooth operation of the engagement for all three partners and decides on strategic direction. It reviews, ranks, and selects projects based on their potential scientific advance, impact, and alignment with Microsoft research priorities, and it invites research teams to annual workshops where they provide progress updates and exchange ideas.
Swiss JRC Steering Committee
EPFL projects (2019-2020)
Hands in Contact for Augmented Reality
EPFL PI(s): Pascal Fua, Mathieu Salzmann, Helge Rhodin
Microsoft PI(s): Bugra Tekin, Sudipta Sinha, Federica Bogo, Marc Pollefeys
In recent years, there has been tremendous progress in camera-based 6D object pose, hand pose and human 3D pose estimation. These tasks can now be performed in real time, but not yet at the level of accuracy required to properly capture how people interact with each other and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object, types on a keyboard, or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered for the resulting models to be used by AR devices, such as the HoloLens device or consumer-level video see-through AR devices. This remains a challenge, especially given the fact that hands are often severely occluded in the egocentric views that are the norm in AR.
We will, therefore, work on accurately capturing the interaction between hands and the objects they touch and manipulate. At the heart of this work will be the precise modeling of contact points and the resulting physical forces between interacting hands and objects. This is essential for two reasons. First, objects in contact exert forces on each other; their pose and motion can only be accurately captured and understood if reaction forces at contact points and areas are modeled jointly. Second, touch and touch-force devices, such as keyboards and touch-screens, are the most common human-computer interfaces, and by sensing contact and contact forces purely visually, everyday objects could be turned into tangible interfaces that react as if they were equipped with touch-sensitive electronics. For instance, a soft cushion could become a non-intrusive input device that, unlike virtual mid-air menus, provides natural force feedback.
In this talk, I will present some of our preliminary results and discuss our research agenda for the year to come.
Monitoring, Modelling, and Modifying Dietary Habits and Nutrition Based on Large-Scale Digital Traces
EPFL PI(s): Robert West, Arnaud Chiolero, Magali Rios-Leyvraz
Microsoft PI(s): Ryen White, Eric Horvitz, Emre Kiciman
The overall goal of this project is to develop methods for monitoring, modeling, and modifying dietary habits and nutrition based on large-scale digital traces. We will leverage data from both EPFL and Microsoft, to shed light on dietary habits from different angles and at different scales: Our team has access to logs of food purchases made on the EPFL campus with the badges carried by all EPFL members. Via the Microsoft collaborators involved, we have access to Web usage logs from IE/Edge and Bing, and via MSR’s subscription to the Twitter firehose, we gain full access to a major social media platform. Our agenda broadly decomposes into three sets of research questions: (1) Monitoring and modeling: How to mine digital traces for spatiotemporal variation of dietary habits? What nutritional patterns emerge? And how do they relate to, and expand, the current state of research in nutrition? (2) Quantifying and correcting biases: The log data does not directly capture food consumption, but provides indirect proxies; these are likely to be affected by data biases, and correcting for those biases will be an integral part of this project. (3) Modifying dietary habits: Our lab is co-organizing an annual EPFL-wide event called the Act4Change challenge, whose goal is to foster healthy and sustainable habits on the EPFL campus. Our close involvement with Act4Change will allow us to validate our methods and findings on the ground via surveys and A/B tests. Applications of our work will include new methods for conducting population nutrition monitoring, recommending better-personalized eating practices, optimizing food offerings, and minimizing food waste.
Photonic Integrated Multi-Wavelength Sources for Data Centers
EPFL PI(s): Tobias J. Kippenberg
Microsoft PI(s): Hitesh Ballani
The substantial increase in optical data transmission and cloud computing has fueled research into new technologies that can increase communication capacity. Optical communication through fiber, which traditionally has been used for long-haul communication, is now also employed for short-haul communication, even within data centers. In a similar vein, the increasing capacity crunch in optical fibers, driven in particular by video streaming, can only be met by exploiting two degrees of freedom: spatial and wavelength division multiplexing. Spatial multiplexing refers to the use of optical fibers that have multiple cores, allowing the same carrier wavelength to be transmitted in multiple cores. Wavelength division multiplexing (WDM, or dense WDM, DWDM) refers to the use of multiple optical carriers on the same fiber. A key advantage of WDM is the ability to increase line rates on existing legacy networks without having to replace existing SMF28 single-mode fibers. WDM is also expected to be employed in data centers. Yet to date, WDM implementation within data centers faces a key challenge: a CMOS-compatible, power-efficient multi-wavelength source. Existing solutions, such as multi-laser chips based on InP (as developed by Infinera), cannot be readily scaled to a larger number of carriers. As a result, the prevalent solution is to use a bank of multiple individual laser modules. This approach is not viable for data centers due to space and power constraints. Over the past years, a new technology developed by EPFL has rapidly matured: microresonator frequency combs, or microcombs, which satisfy these requirements. The potential of this new technology in telecommunications has recently been demonstrated with the use of microcombs for massively parallel coherent communication on the receiver and transmitter side. Yet to date, the use of such microcombs in data centers has not been addressed.
- Kippenberg, T. J., Gaeta, A. L., Lipson, M. & Gorodetsky, M. L. Dissipative Kerr solitons in optical microresonators. Science 361, eaan8083 (2018).
- Brasch, V. et al. Photonic chip–based optical frequency comb using soliton Cherenkov radiation. Science aad4811 (2015). doi:10.1126/science.aad4811
- Marin-Palomo, P. et al. Microresonator-based solitons for massively parallel coherent optical communications. Nature 546, 274–279 (2017).
- Trocha, P. et al. Ultrafast optical ranging using microresonator soliton frequency combs. Science 359, 887–891 (2018).
TTL-MSR: Taming Tail-Latency for Microsecond-scale RPCs
EPFL PI(s): Marios Kogias, Edouard Bugnion
Microsoft PI(s): Irene Zhang, Dan Ports
The deployment of a web-scale application within a datacenter can comprise hundreds of software components, deployed on thousands of servers organized in multiple tiers and interconnected by commodity Ethernet switches. These versatile components communicate with each other via Remote Procedure Calls (RPCs), with the cost of an individual RPC service typically measured in microseconds. The end-user performance, availability and overall efficiency of the entire system are largely dependent on the efficient delivery and scheduling of these RPCs. Yet, these RPCs are ubiquitously deployed today on top of general-purpose transport protocols such as TCP.
We propose to make RPCs first-class citizens of datacenter deployment. This requires revisiting the overall architecture, application API, and network protocols. Our research direction is based on a novel RPC-oriented protocol, R2P2, which separates control flow from data flow and provides in-network scheduling opportunities to tame tail latency. We are also building the tools that are necessary to scientifically evaluate microsecond-scale services.
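To illustrate why tail latency matters so much in multi-tier deployments, here is a minimal sketch that is not part of R2P2 or of the project itself. It assumes that a request fans out to `fanout` RPCs and that each RPC independently meets its latency target with probability `per_rpc_ok`; the function name, both parameters, and the independence assumption are hypothetical simplifications.

```python
def prob_request_hits_tail(fanout: int, per_rpc_ok: float = 0.99) -> float:
    """Probability that at least one of `fanout` RPCs misses its latency target.

    Assumption (illustrative only): RPC latencies are independent, and each
    RPC meets the target with probability `per_rpc_ok`.
    """
    if fanout < 0:
        raise ValueError("fanout must be non-negative")
    if not 0.0 <= per_rpc_ok <= 1.0:
        raise ValueError("per_rpc_ok must be a probability")
    # All RPCs are fast with probability per_rpc_ok ** fanout;
    # the request is slow whenever that does not happen.
    return 1.0 - per_rpc_ok ** fanout


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(prob_request_hits_tail(n), 3))
```

Under these assumptions, a request that waits on 100 RPCs is delayed by at least one 99th-percentile straggler about 63% of the time, which is why scheduling individual RPCs well has an outsized effect on end-user latency.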
ETH Zurich projects (2019-2020)
A Modular Approach for Lifelong Mapping from End-User Data
ETH Zurich PI(s): Roland Siegwart, Cesar Cadena, Juan Nieto
Microsoft PI(s): Johannes Schönberger, Marc Pollefeys
AR/VR allow new and innovative ways of visualizing information and provide a very intuitive interface for interaction. At their core, they rely only on a camera and inertial measurement unit (IMU) setup or a stereo-vision setup to provide the necessary data, either of which is readily available on most commercial mobile devices. Early adoptions of this technology have already been deployed in the real estate business, sports, gaming, retail, tourism, transportation and many other fields. The current technologies in visual-aided motion estimation and mapping on mobile devices have three main requirements to produce highly accurate 3D metric reconstructions: 1. An accurate spatial and temporal calibration of the sensor suite, a procedure which is typically carried out with the help of external infrastructure, like calibration markers, and by following a set of predefined movements; 2. Well-lit, textured environments and feature-rich, smooth trajectories; 3. The continuous and reliable operation of all sensors involved.
This project aims at relaxing these requirements, to enable continuous and robust lifelong mapping on end-user mobile devices. Thus, the specific objectives of this work are: 1. Formalize a modular and adaptable multi-modal sensor fusion framework for online map generation; 2. Improve the robustness of mapping and motion estimation by exploiting high-level semantic features; 3. Develop techniques for automatic detection and execution of sensor calibration in the wild. A modular SLAM (simultaneous localization and mapping) pipeline which is able to exploit all available sensing modalities can overcome the individual limitations of each sensor and increase the overall robustness of the estimation. Such an information-rich map representation allows us to leverage recent advances in semantic scene understanding, providing an abstraction from low-level geometric features – which are fragile to noise, sensing conditions and small changes in the environment – to higher-level semantic features that are robust against these effects. Using this complete map representation, we will explore new ways to detect miscalibrations and sensor failures, so that the SLAM process can be adapted online without the need for explicit user intervention.
Automatic Recipe Generation for ML.NET Pipelines
ETH Zurich PI(s): Ce Zhang
Microsoft PI(s): Matteo Interlandi
Project Altair: Infrared Vision and AI Decision-Making for Longer Drone Flights
ETH Zurich PI(s): Roland Siegwart, Nicholas Lawrance, Jen Jen Chung
Microsoft PI(s): Andrey Kolobov, Debadeepta Dey
A major factor restricting the utility of UAVs is the amount of energy aboard, which limits the duration of their flights. Birds face largely the same problem, but they are adept at using their vision to aid in spotting — and exploiting — opportunities for extracting extra energy from the air around them. Project Altair aims at developing infrared (IR) sensing techniques for detecting, mapping and exploiting naturally occurring atmospheric phenomena called thermals for extending the flight endurance of fixed-wing UAVs. In this presentation, we will introduce our vision and goals for this project.
QIRO - A Quantum Intermediate Representation for Program Optimisation
ETH Zurich PI(s): Torsten Hoefler, Renato Renner
Microsoft PI(s): Matthias Troyer, Martin Roetteler
QIRO will establish a new internal representation for compilation systems on quantum computers. Since quantum computation is still emerging, I will provide an introduction to the general concepts of quantum computation and a brief discussion of its strengths and weaknesses from a high-performance computing perspective. This talk is tailored for a computer science audience with basic (popular-science) or no background in quantum mechanics and will focus on the computational aspects. I will also discuss systems aspects of quantum computers and how to map quantum algorithms to their high-level architecture. I will close with the principles of practical implementation of quantum computers and outline the project.
Scalable Active Reward Learning for Reinforcement Learning
ETH Zurich PI(s): Andreas Krause
Microsoft PI(s): Sebastian Tschiatschek
Reinforcement learning (RL) is a promising paradigm in machine learning and has gained considerable attention in recent years, partly because of its successful application in previously unsolved challenging games like Go and Atari. While these are impressive results, applying reinforcement learning in most other domains, e.g. virtual personal assistants, self-driving cars or robotics, remains challenging. One key reason for this is the difficulty of specifying the reward function a reinforcement learning agent is intended to optimize. For instance, in a virtual personal assistant, the reward function might correspond to the user’s satisfaction with the assistant’s behavior and is difficult to specify as a function of observations (e.g. sensory information) available to the system. In such applications, an alternative to specifying the reward function is to actually query the user for the reward. This, however, is only feasible if the number of queries to the user is limited and the user’s response can be provided in a natural way such that the system’s queries are non-irritating. Similar problems arise in other application domains such as robotics, in which, for instance, the true reward can only be obtained by actually deploying the robot but an approximation to the reward can be computed by a simulator. In this case, it is important to optimize the agent’s behavior while simultaneously minimizing the number of costly deployments. This project’s aim is to develop algorithms for these types of problems via scalable active reward learning for reinforcement learning. The project’s focus is on scalability in terms of computational complexity (to scale to large real-world problems) and sample complexity (to minimize the number of costly queries).
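The simulator scenario above can be made concrete with a small sketch. This is an illustrative toy, not the project's algorithm: it assumes a fixed set of candidate policies, a cheap `simulate` function whose estimate is within a known bound `sim_error` of the true reward, and a costly `deploy` function that returns the true reward. All names and the error-bound assumption are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple


def select_with_few_deployments(
    policies: Sequence[str],
    simulate: Callable[[str], float],
    deploy: Callable[[str], float],
    sim_error: float,
) -> Tuple[Optional[str], int]:
    """Return the best policy and the number of costly deployments used.

    Assumption: |deploy(p) - simulate(p)| <= sim_error for every policy p.
    """
    # Rank candidates by the cheap simulated reward, best first.
    ranked = sorted(policies, key=simulate, reverse=True)
    best_policy: Optional[str] = None
    best_reward = float("-inf")
    deployments = 0
    for policy in ranked:
        # The true reward of this (and every later) candidate is at most
        # simulate(policy) + sim_error, so stop once that cannot beat the best.
        if simulate(policy) + sim_error <= best_reward:
            break
        reward = deploy(policy)  # costly query for the true reward
        deployments += 1
        if reward > best_reward:
            best_policy, best_reward = policy, reward
    return best_policy, deployments


if __name__ == "__main__":
    sim = {"a": 0.9, "b": 0.85, "c": 0.5, "d": 0.2}
    true = {"a": 0.8, "b": 0.88, "c": 0.55, "d": 0.25}
    print(select_with_few_deployments(list(sim), sim.get, true.get, 0.1))
```

In this toy run only two of the four candidates are deployed, because the simulator bound rules out the remaining two without querying the true reward.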
Skilled Assistive-Care Robots through Immersive Mixed-Reality Telemanipulation
ETH Zurich PI(s): Stelian Coros, Roi Poranne
Microsoft PI(s): Federica Bogo, Bugra Tekin, Marc Pollefeys
With this project, we aim to accelerate the development of intelligent robots that can assist those in need with a variety of everyday tasks. People suffering from physical impairments, for example, often need help dressing or brushing their own hair. Skilled robotic assistants would allow these persons to live an independent lifestyle. Even such seemingly simple tasks, however, require complex manipulation of physical objects, advanced motion planning capabilities, as well as close interactions with human subjects. We believe the key to robots being able to undertake such societally important functions is learning from demonstration. The fundamental research question is, therefore, how can we enable human operators to seamlessly teach a robot how to perform complex tasks? The answer, we argue, lies in immersive telemanipulation. More specifically, we are inspired by the vision of James Cameron’s Avatar, where humans are endowed with alternative embodiments. In such a setting, the human’s intent must be seamlessly mapped to the motions of a robot as the human operator becomes completely immersed in the environment the robot operates in. To achieve this ambitious vision, many technologies must come together: mixed reality as the medium for robot-human communication, perception and action recognition to detect the intent of both the human operator and the human patient, motion retargeting techniques to map the actions of the human to the robot’s motions, and physics-based models to enable the robot to predict and understand the implications of its actions.
Tiered NVM Designs, Software-NVM Interfaces, and Isolation Support
ETH Zurich PI(s): Onur Mutlu
Microsoft PI(s): Michael Cornwell, Kushagra Vaid
EPFL projects (2017-2018)
Coltrain: Co-located Deep Learning Training and Inference
EPFL PI(s): Babak Falsafi, Martin Jaggi
Microsoft Co-PI(s): Eric Chung
Deep Neural Networks (DNNs) have emerged as algorithms of choice for many prominent machine learning tasks, including image analysis and speech recognition. In datacenters, DNNs are trained on massive datasets to improve prediction accuracy. While the computational demands for performing online inference in an already trained DNN can be furnished by commodity servers, training DNNs often requires computational density that is orders of magnitude higher than that provided by modern servers. As such, operators often use dedicated clusters of GPUs for training DNNs. Unfortunately, dedicated GPU clusters introduce significant additional acquisition costs, break the continuity and homogeneity of datacenters, and are inherently not scalable.
FPGAs are appearing in server nodes either as daughter cards (e.g., Catapult) or coherent sockets (e.g., Intel HARP), providing a great opportunity to co-locate inference and training on the same platform. While these designs enable natural continuity for platforms, co-locating inference and training on a single node faces a number of key challenges. First, FPGAs inherently suffer from low computational density. Second, conventional training algorithms do not scale due to inherent high communication requirements. Finally, co-location may lead to contention, requiring mechanisms to prioritize inference over training. In this project, we will address these fundamental challenges in DNN inference/training co-location on servers with integrated FPGAs. Our goals are:
- Redesign training and inference algorithms to take advantage of DNNs’ inherent tolerance for low-precision operations.
- Identify good candidates for hard-logic blocks for the next generations of FPGAs.
- Redesign DNN training algorithms to aggressively approximate and compress intermediate results, to target communication bottlenecks and scale the training of single networks to an arbitrary number of nodes.
- Implement FPGA-based load balancing techniques in order to provide latency guarantees for inference tasks under heavy loads and enable the use of idle accelerator cycles to train networks when operating under lower loads.
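As an illustration of what low-precision operation means for the first goal above, the following is a minimal sketch of symmetric linear quantization to 8-bit integers. It is a generic textbook scheme, not the number format used in the project; the function names and the choice of a single per-tensor scale are assumptions made for this example.

```python
from typing import List, Tuple


def quantize(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    largest = max((abs(v) for v in values), default=0.0)
    scale = largest / qmax if largest > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from integer codes."""
    return [q * scale for q in codes]


if __name__ == "__main__":
    weights = [1.0, -0.5, 0.25, 0.0]
    codes, scale = quantize(weights)
    print(codes, dequantize(codes, scale))
```

Each restored value differs from the original by at most half a quantization step, which is the kind of bounded error that many DNN workloads can tolerate.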
Fast and Accurate Algorithms for Clustering
EPFL PI(s): Michael Kapralov, Ola Svensson
Microsoft Co-PI(s): Yuval Peres, Nikhil Devanur, Sebastien Bubeck
The task of grouping data according to similarity is a basic computational task with numerous applications. The right notion of similarity often depends on the application and different measures yield different algorithmic problems.
The goal of this project is to design faster and more accurate algorithms for fundamental clustering problems such as the k-means problem, correlation clustering and hierarchical clustering. We propose to perform a fine grained study of these problems and design algorithms that achieve optimal trade-offs between approximation quality, runtime and space/communication complexity, making our algorithms well-suited for modern data models such as streaming and MapReduce.
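For readers unfamiliar with the k-means problem mentioned above, here is a minimal sketch of Lloyd's heuristic, the standard baseline for it. It is not one of the algorithms developed in the project and carries no approximation guarantee; the deterministic initialization from the first `k` points is an assumption made to keep the example reproducible.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(points: List[Point], k: int, iters: int = 100) -> List[Point]:
    """Lloyd's heuristic: alternate assignment and center update until stable."""
    centers = [points[i] for i in range(k)]  # assumption: first k points as seeds
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(cl) if cl else centers[j] for j, cl in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, 2))
```

The k-means objective is the sum of squared distances from each point to its nearest center; Lloyd's heuristic only finds a local optimum of it, which is one reason algorithms with provable trade-offs are of interest.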
From Companion Drones to Personal Trainers
EPFL PI(s): Pascal Fua, Mathieu Salzmann
Microsoft Co-PI(s): Debadeepta Dey, Ashish Kapoor, Sudipta Sinha
Several companies are now launching drones that autonomously follow and film their owners, often by tracking a GPS device they are carrying. This holds the promise to fundamentally change the way in which drones are used by allowing them to bring back videos of their owners performing activities, such as playing sports, unimpeded by the need to control the drone. In this project, we propose to go one step further and turn the drone into a personal trainer that will not only film but also analyse the video sequences and provide advice on how to improve performance. For example, a golfer could be followed by such a drone that will detect when he swings and offer advice on how to improve the motion. Similarly, a skier coming down a slope could be given advice on how to better turn and carve. In short, the drone would replace the GoPro-style action cameras that many people now carry when exercising. Instead of recording what they see, it would film them and augment the resulting sequences with useful advice. To make this solution as lightweight as possible, we will strive to achieve this goal using the on-board camera as the sole sensor and free the user from the need to carry a special device that the drone locks onto. This will require:
- Detecting the subject in the video sequences acquired by the drone so as to keep him in the middle of its field of view. This must be done in real-time and integrated into the drone’s control system.
- Recovering the subject’s 3D pose as he moves from the drone’s videos. This can be done with a slight delay since the critique only has to be provided once the motion has been performed.
- Providing feedback. In both the golf and ski cases, this would mean quantifying leg, hips, shoulders, and head position during a swing or a turn, offering practical suggestions on how to change them, and showing how an expert would have performed the same action.
Near-Memory System Services
EPFL PI(s): Babak Falsafi
Microsoft Co-PI(s): Stavros Volos
Near-memory processing (NMP) is a promising approach to satisfy the performance requirements of modern datacenter services at a fraction of modern infrastructure’s power. NMP leverages emerging die-stacked DRAM technology, which (a) delivers high-bandwidth memory access, and (b) features a logic die, which provides the opportunity for dramatic data movement reduction – and consequently energy savings – by pushing computation closer to the data. In the precursor to this project (the MSR European PhD Scholarship), we evaluated algorithms suitable for database join operators near memory. We showed that, while sort join has been conventionally thought of as inferior to hash join in performance for CPUs, near-memory processing favors sequential over random memory access, making sort join superior in performance and efficiency as a near-memory service. In this project, we propose to answer the following questions:
- What data-specific functionality should be implemented near memory (e.g., data filtering, data reorganization, data fetch)?
- What ubiquitous, yet simple system-level functionality should be implemented near memory (e.g., security, compression, remote memory access)?
- How should the services be integrated with the system (e.g., how does the software use them)?
- How do we employ near-threshold logic in near-memory processing?
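The sort join discussed in the project description can be sketched as follows. This is a plain in-memory illustration of the general sort-merge join technique, not the near-memory implementation evaluated in the precursor project; the tuple layout `(key, payload)` is an assumption for the example. After sorting, both inputs are scanned strictly front to back, which is the sequential access pattern that the description says near-memory processing favors.

```python
from typing import Any, List, Tuple

Row = Tuple[Any, Any]  # (join key, payload); layout assumed for illustration


def sort_merge_join(left: List[Row], right: List[Row]) -> List[Tuple[Any, Any, Any]]:
    """Equi-join two relations on their first field using sort-merge."""
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])
    out: List[Tuple[Any, Any, Any]] = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit every pairing within the two runs, then skip past them.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((lk, l_row[1], r_row[1]))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    print(sort_merge_join([(2, "b"), (1, "a"), (2, "c")], [(2, "x"), (3, "y"), (1, "z")]))
```

A hash join would instead probe a hash table at effectively random addresses for each row, which is the access pattern the comparison above contrasts with.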
Revisiting Transactional Computing on Modern Hardware
EPFL PI(s): Rachid Guerraoui, Georgios Chatzopoulos
Microsoft Co-PI(s): Aleksandar Dragojevic
Modern hardware trends have changed the way we build systems and applications. Increasing memory (DRAM) capacities at reduced prices make keeping all data in-memory cost-effective, presenting opportunities for high performance applications such as in-memory graphs with billions of edges (e.g. Facebook’s TAO). Non-Volatile RAM (NVRAM) promises durability in the presence of failures, without the high price of disk accesses. Yet, even with this increase in inexpensive memory, storing the data in the memory of one machine is still not possible for applications that operate on TB of data, and systems need to distribute the data and synchronize accesses among machines.
This project proposes the design and building of support for high-level transactions on top of modern hardware platforms, using the Structured Query Language (SQL). The important question to be answered is whether transactions can get the maximum benefit of these modern networking and hardware capabilities, while offering a significantly easier interface for developers to work with. This project will require both research in the transactional support to be offered, including the operations that can be efficiently supported, as well as research in the execution plans for transactions in this distributed setting.
Towards Resource-Efficient Data Centers
EPFL PI(s): Florin Dinu
Microsoft PI(s): Christos Gkantsidis, Sergey Legtchenko
The goal of our project is to improve the utilization of server resources in data centers. Our proposed approach was to attain a better understanding of the resource requirements of data-parallel applications and then incorporate this understanding into the design of more informed and efficient data center (cluster) schedulers. While pursuing these directions, we have identified two related challenges that we believe hold the key to significant additional improvements in application performance as well as cluster-wide resource utilization. We will explore these two challenges as a continuation of our project. The two challenges are resource inter-dependency and time-varying resource requirements. Resource inter-dependency refers to the impact that a change in the allocation of one server resource (memory, CPU, network bandwidth, disk bandwidth) to an application has on that application’s need for the other resources. Time-varying resource requirements refers to the fact that over the lifetime of an application its resource requirements may vary. Studying these two challenges together holds the potential for improving resource utilization by aggressively but safely collocating applications on servers.
ETH Zurich projects (2017-2018)
Data Science with FPGAs in the Data Center
ETH Zurich PI(s): Gustavo Alonso
Microsoft Co-PI(s): Ken Eguro
While in the first phase of the project we explored the efficient implementation of data processing operators in FPGAs as well as the architectural issues involved in the integration of FPGAs as co-processors in commodity servers, in this new proposal we intend to focus on architectural aspects of in-network data processing. The choice is motivated by the growing gap between the bandwidth and very low latencies that modern networks support and the overhead of ingress and egress from VMs and applications running on conventional CPUs. A first goal is to explore the type of problems and algorithms that can be best run as the data flows through the network so as to be able to exploit the bare wire speed and allow off-loading of expensive computations to the FPGA. A second, but no less important, goal is to explore how to best operate FPGA-based accelerators when directly connected to the network and operating independently from the software part of the application. In terms of applications, the focus will remain on data processing (relational, No-SQL, data warehouses, etc.) with the intention of starting to move towards machine learning algorithms at the end of the two-year project. On the network side, the project will work on developing networking protocols suitable for this new configuration and on how to combine the network stack with the data processing stack.
Enabling Practical, Efficient and Large-Scale Computation Near Data to Improve the Performance and Efficiency of Data Center and Consumer Systems
ETH Zurich PI(s): Onur Mutlu, Luca Benini
Microsoft Co-PI(s): Derek Chiou
Today’s systems are overwhelmingly designed to move data to computation. This design choice goes directly against key trends in systems and technology that cause performance, scalability and energy bottlenecks:
- data access from memory is a key bottleneck as applications become more data-intensive and memory bandwidth and energy do not scale well,
- energy consumption is a key constraint, especially in mobile and server systems,
- data movement is very costly in terms of bandwidth, energy and latency, much more so than computation.
Our goal is to comprehensively examine the premise of adaptively performing computation near where the data resides, when it makes sense to do so, in an implementable manner and considering multiple new memory technologies, including 3D-stacked memory and non-volatile memory (NVM). We will examine practical hardware substrates and software interfaces to accelerate key computational primitives of modern data-intensive applications in memory, as well as runtime and software techniques that can take advantage of such substrates and interfaces. Our special focus will be on key data-intensive applications, including deep learning, neural networks, graph processing, bioinformatics (DNA sequence analysis and assembly), and in-memory data stores. Our approach is software/hardware cooperative, breaking the barriers between the two and melding applications, systems and hardware substrates for extremely efficient execution, while still providing efficient interfaces to the software programmer.
Human-Centric-Flight II: End-user Design of High-level Robotic Behavior
ETH Zurich PI(s): Otmar Hilliges
Microsoft Co-PI(s): Marc Pollefeys
Micro-aerial vehicles (MAVs) have been made accessible to end-users via the emergence of simple-to-use hardware and programmable software platforms and have seen a surge in consumer and research interest as a consequence. Clearly there is a desire to use such platforms in a variety of application scenarios, but manually flying quadcopters remains a surprisingly hard task even for expert users. More importantly, state-of-the-art technologies offer only very limited support for users who want to employ MAVs to reach a certain high-level goal. This is perhaps best illustrated by the currently most successful application area – that of aerial videography. While manual flight is hard, piloting and controlling a camera simultaneously is practically impossible. An alternative to manual control is offered via waypoint-based control of MAVs, shielding novices from the underlying complexities. However, this simplicity comes at the cost of flexibility, and existing flight planning tools are not designed with high-level user goals in mind.
Building on our own (MSR JRC funded) prior work, we propose an alternative approach to robotic motion planning. The key idea is to let the user work in solution-space – instead of defining trajectories the user would define what the resulting output should be (e.g., shot composition, transitions, area to reconstruct). We propose an optimization-based approach that takes such high-level goals as input and generates the trajectories and control inputs for a gimbal mounted camera automatically. We call this solution-space driven, inverse kinematic motion planning. Defining the problem directly in the solution space removes several layers of indirection and allows users to operate in a more natural way, focusing only on the application specific goals and the quality of the final result, whereas the control aspects are entirely hidden.
Tractable by Design
ETH Zurich PI(s): Thomas Hofmann, Aurélien Lucchi
Microsoft Co-PI(s): Sebastian Nowozin
The past decade has seen growth in the application of big data and machine learning systems. Probabilistic models of data are theoretically well understood and in principle provide an optimal approach to inference and learning from data. However, for richly structured data domains such as natural language and images, probabilistic models are often computationally intractable and/or have to make strong conditional independence assumptions to retain computational as well as statistical efficiency. As a consequence, they are often inferior in predictive performance when compared to current state-of-the-art deep learning approaches. A natural question is whether one can combine the benefits of deep learning with those of probabilistic models. The major conceptual challenge is to define deep models that are generative, i.e., that can be thought of as models of the underlying data-generating mechanism.
We thus propose to leverage and extend recent advances in generative neural networks to build rich probabilistic models for structured domains such as text and images. The extension of efficient probabilistic neural models will allow us to represent complex and multimodal uncertainty efficiently. To demonstrate the usefulness of the developed probabilistic neural models we plan to apply them to challenging multimodal applications such as creating textual descriptions for images or database records.
EPFL projects (2014-2016)
Authenticated Encryption: Security Notions, Constructions, and Applications
EPFL PI(s): Serge Vaudenay
Microsoft PI(s): Markulf Kohlweiss
For an encryption scheme to be practically useful, it must deliver on two complementary goals: the confidentiality and integrity of encrypted data. Historically, these goals were achieved by combining separate primitives, one to ensure confidentiality and another to guarantee integrity. This approach is neither the most efficient (for instance, it requires processing the input stream at least twice), nor does it protect against implementation errors. To address these concerns, the notion of Authenticated Encryption (AE), which simultaneously achieves confidentiality and integrity, was put forward as a desirable first-class primitive to be exposed by libraries and APIs to the end developer. Providing direct access to AE rather than requiring developers to orchestrate calls to several lower-level functions is seen as a step towards improving the quality of security-critical code.
An indication of both the importance of usable AE and the difficulty of getting it right is the number of standards that have been developed over the years. These specify different methods for AE: the CCM method is specified in IEEE 802.11i, IPsec ESP, and IKEv2; the GCM method is specified in NIST SP 800-38D; the EAX method is specified in ANSI C12.22; and ISO/IEC 19772:2009 defines six methods, including five dedicated AE designs and one generic composition method, namely Encrypt-then-MAC.
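To make the generic composition mentioned above concrete, here is a minimal sketch of Encrypt-then-MAC using only the Python standard library. It is not taken from any of the cited standards: the function names `seal` and `open_sealed` are hypothetical, and the keystream is a toy construction (SHA-256 in counter mode) standing in for a real cipher, so it should not be used to protect data. What it illustrates is the ordering: the tag is computed over the nonce and ciphertext, and the receiver verifies the tag before decrypting anything.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustrative only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then authenticate nonce + ciphertext with HMAC-SHA256."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key: bytes, mac_key: bytes, sealed: bytes) -> bytes:
    """Verify the tag first; decrypt only if it matches."""
    if len(sealed) < 16 + 32:
        raise ValueError("message too short")
    nonce, ciphertext, tag = sealed[:16], sealed[16:-32], sealed[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))
```

Note that the caller has to manage two independent keys and get the order of operations right, which is the kind of orchestration that a first-class AE primitive is meant to take off the developer's hands.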
Several security issues have recently arisen and been reported in the (mis)use of symmetric key encryption with authentication in practice. As a result, the cryptographic community has initiated the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR), to boost public discussions towards a better understanding of these issues, and to identify a portfolio of efficient and secure AE schemes.
Our project aims to contribute to the design, analysis, evaluation, and classification of the emerging AE schemes during the CAESAR competition. It affects many practical security protocols that use AE schemes as indispensable underlying primitives. Our work has broader implications for the theory of AE as an important research area in symmetric-key cryptography.
Scale-Out NUMA
EPFL PI(s): Edouard Bugnion, Babak Falsafi
Microsoft PI(s): Dushyanth Narayanan
The goal of the Scale-Out NUMA project is to deliver energy-efficient, low-latency access to remote memory in datacenter applications, with a focus on rack-scale deployments. Such infrastructure will become critical both for web-scale only applications and for scale-out analytics where the dataset can reside in the collective (but distributed) memory of a cluster of servers.
Our approach to the problem layers an RDMA-inspired programming model directly on top of a NUMA fabric via a stateless messaging protocol. To facilitate interactions between the application, the OS, and the fabric, soNUMA relies on the remote memory controller, a new architecturally exposed hardware block integrated into the node’s local coherence hierarchy.
Towards Resource-Efficient Data Centers
EPFL PI(s): Florin Dinu
Microsoft PI(s): Sergey Legtchenko
Our vision is of resource-efficient datacenters where the compute nodes are fully utilized. We see two challenges to realizing this vision. The first is the increasing hardware heterogeneity in datacenters. Heterogeneity, while both unavoidable and desirable, is handled inefficiently by today’s systems and algorithms. The second challenge is the aggressive scale-out of datacenters. Scale-out has made it convenient to disregard inefficiencies at the level of individual compute nodes, because it has historically been easy to expand to new resources. However, apart from being unnecessarily costly, such scale-out techniques are now becoming impractical due to the size of the datasets. Moreover, scale-out often adds new inefficiencies.
We argue that to meet these challenges, we must start from a thorough understanding of the resource requirements of today’s datacenter jobs. With this understanding, we aim to design new scheduling techniques that efficiently use resources, even in heterogeneous environments. Further, we aim to fundamentally change the way data-parallel processing systems are built and to make efficient compute node resource utilization a cornerstone of their design.
Our first goal is to automatically characterize the pattern of memory requirements of data-parallel jobs. Specifically, we want to go beyond current practices, which consider only peak memory usage. To better identify opportunities for efficient memory management, more granular information is necessary.
Our second goal is to use knowledge of the pattern of memory requirements to design informed scheduling algorithms that manage memory efficiently.
The third goal of the project is to design data-parallel processing systems that are efficient in terms of managing memory, not only by understanding task memory requirements, but also by shaping those memory requirements.
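The difference between peak-only and pattern-aware memory management can be illustrated with a small sketch. It is not the project's scheduler: the two admission checks, the per-step memory profiles, and the node capacity below are all hypothetical values chosen for illustration. The point is that two tasks whose peaks occur at different times may fit on one node even though the sum of their peaks exceeds its capacity.

```python
from typing import List


def fits_by_peak(profiles: List[List[int]], capacity: int) -> bool:
    """Peak-only admission: reserve each task's maximum for its whole lifetime."""
    return sum(max(p) for p in profiles) <= capacity


def fits_by_profile(profiles: List[List[int]], capacity: int) -> bool:
    """Profile-aware admission: check the combined demand at every time step."""
    horizon = max(len(p) for p in profiles)
    for t in range(horizon):
        demand = sum(p[t] for p in profiles if t < len(p))
        if demand > capacity:
            return False
    return True


# Hypothetical per-step memory demand (GB) of two tasks started together.
task_a = [2, 6, 2, 1]  # peaks early
task_b = [1, 2, 3, 6]  # peaks late
node_capacity = 8

print(fits_by_peak([task_a, task_b], node_capacity))     # False: 6 + 6 > 8
print(fits_by_profile([task_a, task_b], node_capacity))  # True: per-step demand is 3, 8, 5, 7
```

A real scheduler would also have to cope with profiles that are only estimates and with tasks that start at different times; the sketch ignores both.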
ETH Zurich projects (2014-2016)
ARRID: Availability and Reliability as a Resource for Large-Scale In-Memory Databases on Datacenter Computers
ETH Zurich PI(s): Torsten Hoefler
Microsoft PI(s): Miguel Castro
Disk-backed in-memory key/value stores are gaining significance as many industries move toward big data analytics. Storage space and query time requirements are challenging, since the analysis has to be performed at the lowest cost to be useful from a business perspective. Despite those cost constraints, today’s systems are heavily overprovisioned when it comes to resiliency. The undifferentiated three-copy approach can waste bandwidth and storage resources, which makes the overall system less efficient or more expensive.
We propose to revisit currently used resiliency schemes with the help of analytical hardware failure models. We will use those models to capture the exact tradeoff between the overhead due to replication and the resiliency requirements that are defined in a contract. Our key idea is to model reliability as an explicit resource that the user allocates consciously.
In previous work, we have been able to speed up scientific computing applications, as well as a distributed hashtable, on several hundred thousand cores by more than 20 percent, with the use of advanced RDMA programming techniques. We have also demonstrated low-cost resiliency schemes based on erasure coding for RDMA environments. In addition, we propose to apply our experience with large-scale RDMA programming to the design of in-memory databases, a problem very similar to distributed hashtables. To make reliability explicit, we plan to extend the key/value store with reliability attributes that allow the user to specify reliability and availability requirements for each key (or group of keys).
Our work may change the perspective on datacenter resiliency. Defining fine-grained, per-object resiliency levels and tuning them to the exact environment may provide large cost benefits and impact industry. For example, changing the standard three-replica scheme to erasure coding can easily save 30 percent of storage expenses.
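The idea of treating reliability as a resource can be sketched with a deliberately simple failure model. Everything below is an assumption for illustration, not the project's model: fragments are assumed to fail independently with the same probability `p_fail`, repair and bandwidth costs are ignored, and the candidate schemes (one to three replicas, and three erasure codes with k data and m parity fragments) are arbitrary. Given a per-key loss target, the hypothetical helper `cheapest_scheme` picks the scheme with the lowest storage overhead that still meets it.

```python
from math import comb


def loss_probability(n: int, tolerated: int, p_fail: float) -> float:
    """Probability that more than `tolerated` of `n` independent fragments fail."""
    return sum(
        comb(n, i) * p_fail**i * (1 - p_fail) ** (n - i)
        for i in range(tolerated + 1, n + 1)
    )


def cheapest_scheme(target_loss: float, p_fail: float):
    """Return (name, storage overhead, loss probability) of the cheapest
    candidate meeting `target_loss`, or None if no candidate does."""
    candidates = []
    for r in (1, 2, 3):  # r-way replication survives up to r - 1 failures
        candidates.append((f"{r}-replica", float(r), loss_probability(r, r - 1, p_fail)))
    for k, m in ((4, 2), (6, 3), (10, 4)):  # k data + m parity survives up to m failures
        candidates.append((f"EC({k}+{m})", (k + m) / k, loss_probability(k + m, m, p_fail)))
    feasible = [c for c in candidates if c[2] <= target_loss]
    return min(feasible, key=lambda c: c[1]) if feasible else None


# Hypothetical per-key requirements under a 1 percent fragment failure probability.
for key, target in (("scratch:tmp", 5e-2), ("orders:2014", 1e-5)):
    print(key, cheapest_scheme(target, p_fail=0.01))
```

Under these assumptions, a key with a loose target is stored once, while a key with a strict target is erasure coded at 1.4 times its size instead of being replicated at 3 times. The exact savings in a real deployment depend on the failure model and on costs the sketch leaves out.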
Efficient Data Processing Through Massive Parallelism and FPGA-Based Acceleration
ETH Zurich PI(s): Gustavo Alonso
Microsoft PI(s): Ken Eguro
One of the biggest challenges for software these days is to adapt to the rapid changes in hardware and processor architecture. On the one hand, extracting performance from modern hardware requires dealing with increasing levels of parallelism. On the other hand, the wide variety of architectural possibilities and multiplicity of processor types raise many questions in terms of the optimal platform for deploying applications.
In this project we will explore the efficient implementation of data processing operators in FPGAs, as well as the architectural issues involved in the integration of FPGAs as co-processors in commodity servers. The target application is big data and data processing engines (relational, NoSQL, data warehouses, etc.). Through this line of work, the project aims to explore architectures that will result in computing nodes with lower energy consumption and smaller physical size, yet capable of providing a performance boost to big data applications. FPGAs should be seen here not as a goal in themselves, but as an enabling platform for the exploration of different architectures and levels of parallelism that will allow us to bypass the inherent restrictions of conventional processors.
On the practical side, the project will focus both on the use of FPGAs as co-processors inside existing engines and on developing proof-of-concept implementations of data processing engines entirely implemented in FPGAs. In this area, the project complements ongoing efforts at Microsoft Research around Cipherbase, a trusted computing system based on SQL Server deployments in the cloud. On the conceptual side, the project will explore the development of data structures and algorithms capable of exploiting the massive parallelism available in FPGAs, with a view to gaining much-needed insights on how to adapt existing data processing systems to multi- and many-core architectures. Here, we expect to learn how to redesign standard relational data operators, as well as data mining and machine learning operators, to take better advantage of the increasing amounts of parallelism available in future processors.
Human-Centric Flight: Micro-Aerial Vehicles (MAVs) for Interaction, Videography, and 3D Reconstruction
ETH Zurich PI(s): Otmar Hilliges, Marc Pollefeys
Microsoft PI(s): Shahram Izadi
In recent years, robotics research has made tremendous progress, and it is becoming conceivable that robots will be as ubiquitous and irreplaceable in our daily lives as they are within industrial settings. Continued improvements in mechatronics and control, coupled with continued advances in consumer electronics, have made robots ever smaller, more autonomous, and more agile.
One area of recent advances in robotics is the notion of micro-aerial vehicles (MAVs) [14, 16]. These are small, very agile flying robots that can operate in 3D space, indoors and outdoors, and can carry small payloads, including input and output devices. They can navigate difficult environments, such as stairs, more easily than terrestrial robots, and hence can reach locations that no other robot, or indeed human, can reach.
Surprisingly, to date there is little research on such flying robots in an interactive context or on MAVs operating in close proximity to humans. In our project, we explore the opportunities that arise from aerial robots that operate in close proximity to and in collaboration with a human user. In particular, we are interested in developing a robotic platform in which a) the robot is aware of the human user and can navigate relative to the user; b) the robot can recognize various gestures from afar, as well as receive direct, physical manipulations; and c) the robot can carry small payloads, in particular input and output devices such as additional cameras or projectors.
Finally, we are developing novel algorithms that use the onboard cameras to track and recognize user input in real time and with very low latency, building on the now substantial body of research on gestural and natural interfaces. Gesture recognition can be used for MAV control (for example, controlling the camera) or to interact with virtual content.
Software-Defined Networks: Algorithms and Mechanisms
ETH Zurich PI(s): Roger Wattenhofer
Microsoft PI(s): Ratul Mahajan
The Internet is designed as a robust service that remains usable even when selfish participants are present. As a consequence, a loss in total performance must be accepted. However, if an entire wide-area network (WAN) is controlled by a single entity, why should one use the very techniques designed for the Internet? Large providers such as Microsoft, Amazon, or Google operate their own WANs, which cost them hundreds of millions of dollars per year; yet even their busier links average only 40–60 percent utilization.
This gives rise to Software Defined Networks (SDNs), which allow the separation of the data and the control plane in a network. A centralized controller can install and update rules all over the WAN, to optimize its goals.
Despite SDNs receiving a lot of attention in both theory and practice, many questions are still unanswered. Even though the control of the network is centralized, distributing the updates does not happen instantaneously. Numerous problems can occur, such as the dropping of packets, generation of loops, breaking the memory/bandwidth limit of switches/links, and missing packet coherence. These problems must be solved before SDNs can be broadly deployed.
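The loop problem can be made concrete with a small sketch. It is not an algorithm from the project: the function name `first_unsafe_step`, the single-destination forwarding tables, and the three-node topology are hypothetical. The controller's updates are assumed to take effect one switch at a time, and the check walks every intermediate rule set to see whether some switch can no longer reach the destination, either because of a loop or because of a missing rule.

```python
from typing import Dict, List, Optional


def first_unsafe_step(
    old: Dict[str, str], new: Dict[str, str], order: List[str], dest: str
) -> Optional[int]:
    """Apply per-switch next-hop updates in `order`. Return the index of the
    first update after which some switch cannot reach `dest` (forwarding
    loop or dead end), or None if every intermediate state is safe."""
    rules = dict(old)
    for step, switch in enumerate(order):
        rules[switch] = new[switch]
        for start in rules:
            node, seen = start, set()
            while node != dest:
                if node in seen or node not in rules:
                    return step
                seen.add(node)
                node = rules[node]
    return None


# Hypothetical next-hop rules toward destination "d".
old_rules = {"a": "b", "b": "d"}  # a -> b -> d
new_rules = {"a": "d", "b": "a"}  # b -> a -> d

print(first_unsafe_step(old_rules, new_rules, ["b", "a"], "d"))  # 0: a and b point at each other
print(first_unsafe_step(old_rules, new_rules, ["a", "b"], "d"))  # None: every state reaches d
```

Both the old and the new configuration are loop-free, yet one of the two update orders passes through a state with a transient loop, which is why the order and timing of rule updates matter.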
This research project sheds more light on these fundamental issues of SDNs and how they can be tackled. In parallel, we look at SDNs from a game-theoretic perspective.
Blogs & podcasts
Microsoft Research Podcast | April 10, 2019
Microsoft Switzerland | February 5, 2019
Microsoft continues to invest in research cooperation with ETH and EPFL, thereby strengthening Switzerland’s innovation power
Microsoft Switzerland | November 1, 2018
Microsoft Switzerland | October 29, 2018
Microsoft Research Blog | December 4, 2017
Microsoft Research Blog | February 21, 2017
Microsoft Research Blog | February 5, 2014
Microsoft Research Blog | February 5, 2014
In the news
DNA as storage medium of the future (German)
Neue Zürcher Zeitung | March 30, 2019
Smartglasses will replace the mobile phone (German, PDF)
Tages-Anzeiger | March 13, 2019
Man and Machine – Panel discussion at WEF 2019
ETH Zürich | February 12, 2019
Microsoft CEO Satya Nadella at ETH Zurich
Startup Ticker | November 1, 2018
ETH and Microsoft: Hunting for talent (German)
ETH News | November 1, 2018
Microsoft expands partnership with ETH Zurich (German)
Inside IT & Inside Channels | November 1, 2018
Microsoft and ETH are chasing talent (German)
Netzwoche | November 2, 2018
Microsoft CEO Satya Nadella visits ETH Zurich (German)
Computerworld | November 2, 2018
Microsoft extends cooperation with ETH, EPFL, partners with Mixed Reality & AI Zurich Lab
Telecompaper | November 2, 2018