EPFL projects (2022-2023)
EPFL PIs: Bruno Correia (and Michael Bronstein, Imperial)
Microsoft PIs: Max Welling, Chris Bishop
PhD Student: Freyr Sverisson, Arne Scheuing
Proteins play a crucial role in every form of life. The function of proteins is largely determined by their 3D structure and the way they interact with other molecules. Understanding the mechanisms that govern protein structure and their interactions with other molecules is a holy grail of biology that also paves the path to ground-breaking new applications in biotechnology and medicine. Over the past three decades, large amounts of structural data on proteins have been made available to the wider scientific community. This has created opportunities for machine learning (ML) approaches to improve our understanding of the governing principles of these molecules, as well as to develop computational approaches for the design of novel proteins and small-molecule drugs. The three-dimensional structures of proteins and other molecular objects are a natural fit for Geometric Deep Learning (GDL). In this proposal, we will develop GDL-based approaches that describe molecular entities using point clouds engraved with descriptors capturing physical features (geometry and chemistry) that will be optimized to describe different aspects of proteins. Specifically, through the aims of this grant we will attempt to: capture dynamic features of protein surfaces (Aim 1); leverage the surface descriptors to condition the generation of small molecules to engage specific pockets (Aim 2); couple new structure prediction algorithms with surface descriptor optimization for the design of new functions in proteins (Aim 3). Towards the generative aspects of our application (designing new surfaces, small molecules, proteins), a common problem is that the spaces to be sampled are extremely large and thus the expertise within the Microsoft Research team could be critical to reach a functional solution. Specifically, the expertise in variational autoencoders, equivariant architectures and Bayesian optimization will be of major importance.
In summary, we propose a novel approach powered by cutting-edge computational methods to model and design de novo proteins, with enormous potential to help address problems in medicine and biotechnology.
EPFL PIs: Alexander Mathis, Friedhelm Hummel, Silvestro Micera
Microsoft PIs: Marc Pollefeys
PhD Student: Haozhe Qi
Despite many advances in neuroprosthetics and neurorehabilitation, the techniques to measure, personalize, and thus optimize the functional improvements that patients gain with therapy are limited. Impairments are still assessed by standardized functional tests, which fail to capture everyday behaviour and quality of life, are poorly suited to personalization, and have to be performed by trained health care professionals in the clinical environment. By leveraging recent advances in motion capture and hardware, we will create novel metrics to evaluate, personalize and improve the dexterity of patients in their everyday life. We will utilize the EPFL Smart Kitchen platform to assess the naturalistic behaviour of healthy subjects, upper-limb amputees and stroke patients in the kitchen, filmed from a head-mounted camera (Microsoft HoloLens). We will develop a computer vision pipeline that is capable of measuring hand-object interactions in patients’ kitchens. Based on this novel, large-scale dataset collected in patients’ kitchens, we will derive metrics that measure dexterity in the “natural world,” as well as recovered and compensatory movements due to the pathology/assistive device. We will also use those data to assess novel control strategies for neuroprosthetics and design optimal, personalized rehabilitation treatment by leveraging virtual reality.
As machine learning (ML) models are becoming more complex, there has been a growing interest in making use of decentrally generated data (e.g., from smartphones) and in pooling data from many actors. At the same time, however, privacy concerns about organizations collecting data have risen. As an additional challenge, decentrally generated data is often highly heterogeneous, thus breaking assumptions needed by standard ML models. Here, we propose to “kill two birds with one stone” by developing Invariant Federated Learning, a framework for training ML models without directly collecting data, while not only being robust to, but even benefiting from, heterogeneous data. For the problem of learning from distributed data, the Federated Learning (FL) framework has been proposed. Instead of sharing raw data, clients share model updates to help train an ML model on a central server. We combine this idea with the recently proposed Invariant Risk Minimization (IRM) approach, a solution for causal learning. IRM aims to build models that are robust to changes in the data distribution and provide better out-of-distribution (OOD) generalization by using data from different environments during training. This integrates naturally with FL, where each client may be seen as constituting its own environment. We seek to gain robustness to distributional changes and better OOD generalization, as compared to FL methods based on standard empirical risk minimization. Previous work has further shown that causal models possess better privacy properties than associational models. We will turn these theoretical insights into practical algorithms to, e.g., provide Differential Privacy guarantees for FL.
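The baseline FL step described above, in which clients share model updates rather than raw data, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the proposed Invariant Federated Learning method: it uses plain federated averaging with standard empirical risk minimization on a toy linear model, and all names (`local_update`, `federated_average`, `make_client`) and the synthetic data are hypothetical.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """One client's training: a few gradient steps on its private (x, y)
    pairs for the model y = w * x + b. The raw data never leaves the client."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def make_client(n, x_low, x_high, rng):
    """Synthetic client data from y = 2x + 1 (assumed ground truth).
    Each client draws inputs from its own range, so clients are heterogeneous."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [
    make_client(20, -1.0, 0.0, rng),
    make_client(30, 0.0, 1.0, rng),
    make_client(50, -1.0, 1.0, rng),
]

global_weights = (0.0, 0.0)
for _ in range(200):  # communication rounds
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates, [len(d) for d in clients])

print("learned (w, b):", global_weights)
```

In this toy setting every client shares the same underlying relationship, so averaging works well. The heterogeneity that the project targets arises when the relationship between inputs and labels differs across clients, which is where an IRM-style objective would replace the plain local loss.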
The project proposed here integrates naturally with ideas pursued in the context of the Microsoft Turing Academic Program (MS-TAP), where the PI’s lab is already collaborating with Microsoft (including Emre Kıcıman, a co-author of this proposal) in order to make language models more robust via IRM.
The fundamental equations governing interacting quantum-mechanical matter in solids have been known for over 90 years. However, these equations are simply “much too complicated to be soluble” (Paul A. M. Dirac, 1929). Besides experiments, the main source of information that we have available originates from computational methods to simulate these systems. Machine learning approaches based on artificial neural networks (NN) have recently been shown to be a powerful new tool for simulating systems governed by the laws of quantum mechanics. The leading approach in the field, pioneered by Carleo and Troyer, is known as neural quantum states and has been successfully applied to several model quantum systems. For these prototypical and simplified, yet hard to solve, models of interacting quantum matter, neural quantum states have shown state-of-the-art or better performance. Despite this success, however, the application of neural quantum states to the ab-initio simulation of solids and materials is largely unexplored, both theoretically and computationally. Compared to the method for quantum spin systems, this requires methods that intrinsically work on continuous degrees of freedom, rather than discrete ones. Examples of important systems that can be studied with continuous-space methods are crystals and several phases of matter that show a periodic lattice structure. In this project, we will introduce deep-learning-based approaches for the ab-initio simulation of solids, with a focus on imposing physical symmetries and scalability. With a powerful and efficient computational method to simulate continuous-space atomic quantum systems, we will be able to access unprecedented regimes of accuracy for the descriptions of materials, especially in two dimensions, where strong interactions are dominant.
EPFL PI: Pascal Fua
Microsoft PIs: Chris Bishop
Research Engineer: Benoit Gherardi
We live in a three-dimensional world full of manufactured objects of ever-increasing complexity. To be functional, they require clever engineering and design. The search for energy-efficient designs of objects such as windmills exemplifies the challenges and promises of such engineering: the blades must have the right shapes to harness as much energy from the wind as possible by balancing lift and drag, and the whole assembly must be strong and light. With ever more powerful simulation techniques and the advent of digital sensors that enable precise measurements, shape engineering relies increasingly on the resulting algorithmic developments. As a result, Computer Aided Design (CAD) has become central to engineering but is not yet capable of addressing all the relevant issues simultaneously. Computer Vision and Computer Graphics are among the fields with the greatest potential for impact in CAD, especially given the remarkable progress that deep learning has fostered in these fields. For example, continuous deep implicit fields have recently emerged as one of the most promising 3D shape-modeling approaches for objects that can be represented by a single watertight surface.
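To make the notion of an implicit field concrete, the following Python sketch uses an analytic signed distance function for a sphere. It is only a stand-in under a stated assumption: a deep implicit field would replace the closed-form expression with a trained neural network, typically conditioned on a latent shape code. The function names `sphere_sdf` and `is_inside` are illustrative and do not come from the project.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere's surface:
    negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def is_inside(point, sdf=sphere_sdf):
    """The shape is the set of points where the field is non-positive;
    the watertight surface is the zero level set of the field."""
    return sdf(point) <= 0.0

# Query the field at a few 3D points.
for p in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
    print(p, sphere_sdf(p), is_inside(p))
```

Because the surface is defined implicitly as a level set, the same representation can be queried at any resolution, which is one reason such fields suit single watertight shapes.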
However, current approaches to modeling complex composite objects cannot jointly account for geometric, topological, and engineering constraints as well as for performance requirements. To remedy this, we will build latent models that can be used to represent and optimize complex composite shapes while strictly enforcing compatibility constraints between their components and controllability constraints on the whole. A central focus will be on developing training methods that guarantee that the output of the deep networks we train strictly obeys these constraints, something that existing methods that rely on adding ad hoc loss functions cannot do. The results will be integrated into Microsoft’s simulation platforms, AirSim and Bonsai, with a view to rapidly building and designing real-world robots.
EPFL PIs: Edouard Bugnion, Mathias Payer
Microsoft PIs: Adrien Ghosn
PhD Student: Charly Castes
Confidential computing is an increasingly popular means of enabling wider Cloud adoption. By offering confidential virtual machines and enclaves, Cloud service providers now host organizations, such as banks and hospitals, that abide by stringent legal requirements with regard to their clients’ data confidentiality. These technologies foster sufficient trust to enable such clients to transition to the Cloud, while protecting themselves against a potentially compromised or malicious host. Unfortunately, confidential computing solutions depend on bleeding-edge emerging hardware that (1) takes a long time to roll out at Cloud scale and (2) as a recent technology, lacks a clear consensus on both the underlying hardware mechanisms and the exposed programming model and is thus prone to frequent changes and potential security vulnerabilities. This proposal strives to explore the possibilities of building confidential systems without special hardware support. Instead, we will leverage existing commodity hardware that is already deployed in Cloud datacenters, combined with new programming language and formal method techniques, and identify how to provide similar or even more elaborate confidentiality and integrity guarantees than the existing confidential hardware. Achieving such a software/hardware co-design will enable Cloud providers to deploy new Cloud products for confidential computing without waiting for either the standardization or the wide installation of confidential hardware. The key goal of this project is the design and implementation of a trusted, attested, and formally verified monitor acting as a trusted intermediary between resource managers, such as a Cloud hypervisor or an OS, and their clients, e.g., confidential virtual machines and applications.
We plan to explore how commodity hardware features, such as hardware support for virtualization, can be leveraged in the implementation of such a solution with as little modification as possible to existing hypervisor implementations.
ETH Zurich projects (2022-2023)
Despite popular depictions in sci-fi movies and TV shows, robots remain limited in their ability to autonomously solve complex tasks. Indeed, even the most advanced commercial robots are only now just starting to navigate man-made environments while performing simple pick-and-place operations. In order to enable complex high-level behaviours, such as the abstract reasoning required to manoeuvre objects in highly constrained environments, we propose to leverage human intelligence and intuition. The challenge here is one of representation and communication. In order to communicate human insights about a problem to a robot, or to communicate a robot’s plans and intent to a human, it is necessary to utilize representations of space, tasks, and movements that are mutually intelligible for both human and robot. This work will focus on the problem of single and multi-robot motion planning with human guidance, where a human assists a team of robots in solving a motion-based task that is beyond the reasoning capabilities of the robot systems. We will exploit the ability of Mixed Reality (MR) technology to communicate spatial concepts between robots and humans, and will focus our research efforts on exploring the representations, optimization techniques, and multi-robot task planning necessary to advance the ability of robots to solve complex tasks with human guidance.
The goal of this research project is to reduce the trust placed in the Cloud Service Provider by increasing the control of the customer over the resources assigned to it in the cloud infrastructure.
We plan to investigate this specifically in the context of a device or chiplet owned by the client and then placed within the cloud infrastructure, an “Embassy Hardware Device”. Such a device would be able to control, manage, and retain access to the data while remaining inaccessible (in terms of data and control flow) to the Cloud Service Provider. Several research challenges need to be solved in order to develop an end-to-end working prototype.
DRAM is the prevalent technology used to architect main memory across a wide range of computing platforms. Unfortunately, DRAM suffers from the RowHammer vulnerability. RowHammer is caused by repeatedly accessing (i.e., hammering) a DRAM row such that the electromagnetic interference that develops due to the rapid DRAM row activations causes bit flips in DRAM rows that are physically near the hammered row. Prior research demonstrates that the RowHammer vulnerability of DRAM chips worsens as DRAM cell size and cell-to-cell spacing shrink. Numerous works demonstrate RowHammer attacks to escalate user privileges, obtain private keys, manipulate sensitive data, and destroy the accuracy of neural networks. Given that the RowHammer vulnerability of modern DRAM chips worsens and can be used to compromise a wide range of computing platforms, it is crucial to fundamentally understand and solve RowHammer to ensure secure and reliable DRAM operation. Our goal in this project is to
- rigorously study the unexplored aspects of RowHammer via experiments using hundreds of real DRAM chips, and leverage all the understanding we develop to
- experimentally analyze the security guarantees of existing RowHammer mitigation mechanisms (e.g., Target Row Refresh (TRR)),
- craft more effective RowHammer access patterns, and
- design completely secure, efficient, and low-cost RowHammer mitigation mechanisms.
Digital capture of human bodies is a rapidly growing research area in computer vision and computer graphics that puts scenarios such as life-like Mixed Reality (MR) virtual-social interactions into reach, albeit not without overcoming several challenging research problems. A core question in this respect is how to faithfully transmit a virtual copy of oneself so that a remote collaborator may perceive the interaction as immersive and engaging. To present a real alternative to face-to-face meetings, future AR/VR systems will crucially depend on the following two core building blocks:
- means to capture the 3D geometry and appearance (e.g., texture, lighting) of individuals with consumer-grade infrastructure (e.g., a single RGB-D camera) and with very little time and expertise and
- means to represent the captured geometry and appearance information in a fashion that is suitable for photorealistic rendering under fine-grained control over the underlying factors such as pose and facial expressions amongst others.
In this project, we plan to develop novel methods to learn animatable representations of humans from ‘cheap’ data sources alone. Furthermore, we plan to extend our own recent work on animatable neural implicit surfaces, such that it can represent not only the geometry but also the appearance of subjects in high visual fidelity. Finally, we plan to study techniques to enforce geometric and temporal consistency in such methods to make them suitable for MR and other telepresence downstream applications.
ETH Zurich PI: Christian Holz
Microsoft PIs: Tadas Baltrusaitis
PhD Student: Björn Braun
The passive measurement of cognitive stress and its impact on performance in cognitive tasks has huge potential for human-computer interaction (HCI) and affective computing, including workload optimization or “flow” understanding for future-of-work productivity scenarios, remote learning, automated tutor systems, as well as stress monitoring, mental health, and telehealth applications more generally. When cognitive demands exceed resources, people experience stress and task performance degrades. In this project, we will develop intelligent software experiences that reduce workers’ stress and optimize their cognitive resources. We will develop sensing models that capture the body’s autonomic nervous system (“fight or flight”) responses to cognitive demands in real time using information from multiple physiologic processes. These inputs will then help drive AI support that adapts to provide cognitive support while also maintaining autonomy (e.g., avoiding unnecessary and annoying interventions). Specifically, we will develop novel computer vision and signal processing approaches for measuring cardiovascular, respiratory, pupil/ocular, and dermal changes using ubiquitous sensors. For desktop environments, we will develop, evaluate, and demonstrate our methods using non-contact sensing (the webcams built into PCs). For head-mounted displays, we will adapt our methods to utilize signals originating from the wearer’s head using built-in headset sensors. In both cases, our developments will produce novel datasets, computational methods, and the results of in-situ evaluations in productivity scenarios. Using our novel methods, we will also investigate their implications for telehealth scenarios, which often contain cardiovascular and respiratory assessments. We will develop scenarios that guide the user while assessing these metrics and visually present the remote physician with the results for examination.
ETH Zurich PI: Sebastian Kozerke
Microsoft PI: Michael Hansen
PhD Student: Pietro Dirix
Cardiovascular Magnetic Resonance Imaging (MRI) has become a key imaging modality to diagnose, monitor and stratify patients suffering from a wide range of cardiovascular diseases. Using Flow MRI, time-resolved blood flow patterns can be quantified throughout the circulatory system providing information on the interplay of anatomical and hemodynamic conditions in health and disease.
Today, inference of Flow MRI data is based on data post-processing, which includes massive data reduction to yield metrics such as mean and peak flow, kinetic energy, and wall shear rates. As a consequence of the data reduction step, however, the wealth of information encoded in the data, including fundamental causal relations, is potentially missed. In addition, the dependency of the metrics on parameters of the measurement and image reconstruction process itself compromises the diagnostic yield and the reproducibility of the method, hence hampering further dissemination.
Here we propose to develop and implement a computational framework for Flow Tensor MRI data synthesis to train physics-based neural networks for image reconstruction and inference of the complex interplay of anatomy, coherent and incoherent flows in the aorta in-vivo. Using cloud-based, scalable computing resources, we will demonstrate that synthetically trained reconstruction and inference machines permit high-speed image reconstruction and inference to unravel complex structure-function relations using real-world in-vivo Flow Tensor MRI by exploiting the entirety of information contained in the data along with the information of the measurement process itself.
Our objective is to exploit recent developments in MR to enhance human capabilities with robotic assistance. Robots offer mobility and power but are not capable of performing complex tasks in challenging environments such as construction, contact-based inspection, cleaning, and maintenance. On the other hand, humans have excellent higher-order reasoning, and skilled workers have the experience and training to adapt to new circumstances quickly and effectively. However, they lack mobility and power. We envision reducing this limitation by empowering human operators with the assistance and the capabilities provided by a robot system. This requires a human-robot interface that fully leverages the capabilities of both the human operator and the robot system. In this project we aim to explore the problem of shared autonomy for physical interaction tasks in shared physical workspaces. We will explore how an operator can effectively command a robot system using an MR interface over a range of autonomy levels from low-level direct teleoperation to high-level task specification. We will develop methods for estimating the intent and comfort level of an operator to provide an intuitive and effective interface. Finally, we will explore how to pass information from the robot system back to the human operator for effective understanding of the robot’s plans. We will prove the value of mixed reality interfaces by enhancing human capabilities with robot systems through effective, bilateral communication for a wide variety of complex tasks.
The goal of this research project is to give visibility on whether any abuse is happening, particularly if it is happening from untrusted software (e.g., Operating System, Hypervisor) or trusted-but-erroneous software (e.g., Trusted Execution Environment management).
The key idea is to have a small, trusted software component check the runtime behavior of the untrusted and trusted-but-erroneous software. Such a minimal security monitor can restrict the privileged software’s capabilities and visibility over the system while still adequately managing the resources.
ETH Zurich PI: Kaveh Razavi
Microsoft PI: Boris Köpf
PhD Student: Flavien Solt
There is currently a large gap between the capabilities of Electronic Design Automation (EDA) tools and what is required to detect various classes of microarchitectural vulnerabilities pre-silicon. This project aims to bridge this gap by leveraging recent advances in software testing to produce the necessary knowledge and tools for effective hardware testing. Our driving hypothesis is that if we could provide crucial information about the privilege and domain of instructions and/or data in the microarchitecture during simulation or emulation, then we could easily detect many classes of microarchitectural vulnerabilities. As an example, with the right test cases, we could detect Meltdown-type vulnerabilities, since seemingly different variants all require an instruction that can access data from a different privilege domain.
Probabilistic data structures (PDS) are now very widely used in practice in the era of “big data”. They are used to process large data sets, often in a streaming setting, and to provide approximate answers to basic data exploration questions such as “Has a particular bit-string in this data stream been encountered before?” or “How many distinct bit-strings are there in this data set?”. They are increasingly supported in systems like Microsoft Azure Data Explorer, Google BigQuery, Apache Spark, Presto and Redis, and there is an active research community working on PDS within computer science. Generally, PDS are designed to perform well “in the average case”, where the inputs are selected independently at random from some distribution. We refer to this as the non-adversarial setting. However, they are increasingly being used in adversarial settings, where the inputs can be chosen by an adversary interested in causing the PDS to perform badly in some way, e.g. creating many false positives for a Bloom filter, or underestimating the set cardinality for a cardinality estimator. In recent work, we performed an in-depth analysis of the HyperLogLog (HLL) PDS and its security under adversarial input. The proposed research will extend our prior work in three directions:
- address the mergeability problem for HLL;
- extend our simulation-based framework for studying the correctness and security of HLL to other PDS in adversarial settings;
- study the specific case of cascaded Bloom filters, which have been proposed for use in CRLite, a privacy-preserving system for managing certificate revocation for the webPKI.
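The difference between the non-adversarial and adversarial settings can be illustrated with a small Bloom filter. The Python sketch below is a toy under stated assumptions, not the project's simulation framework: the class `BloomFilter`, the helper `adversarial_items`, the salted SHA-256 hashing scheme, and the parameters (1024 bits, 3 hash functions) are all hypothetical choices. The adversary is assumed to know the hash functions and picks inputs that each set three previously unset bits, which fills the filter faster than ordinary inputs and so raises the false positive rate.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big")
            % self.num_bits
            for i in range(self.num_hashes)
        ]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(item))

def adversarial_items(count, pool_size=5000):
    """Pick items that each set num_hashes fresh bits in a shadow filter,
    assuming the adversary can evaluate the filter's hash functions."""
    shadow = BloomFilter()
    chosen = []
    for i in range(pool_size):
        item = f"adv-{i}"
        positions = shadow._positions(item)
        distinct = len(set(positions)) == shadow.num_hashes
        if distinct and not any(shadow.bits[p] for p in positions):
            shadow.add(item)
            chosen.append(item)
            if len(chosen) == count:
                break
    return chosen

honest_items = [f"honest-{i}" for i in range(100)]
adv_items = adversarial_items(100)

honest, adversarial = BloomFilter(), BloomFilter()
for item in honest_items:
    honest.add(item)
for item in adv_items:
    adversarial.add(item)

probes = [f"probe-{i}" for i in range(10000)]  # none of these were inserted
for name, bf in [("honest", honest), ("adversarial", adversarial)]:
    fpr = sum(p in bf for p in probes) / len(probes)
    print(name, "bits set:", sum(bf.bits), "observed false positive rate:", fpr)
```

With the same number of insertions, the adversarially chosen inputs never collide with each other, so they set the maximum possible number of bits. Average-case analyses that assume independently random inputs do not cover this behaviour.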
EPFL projects (2019-2021)
EPFL PIs: Pascal Fua, Mathieu Salzmann
Microsoft PIs: Bugra Tekin, Sudipta Sinha, Federica Bogo, Marc Pollefeys
PhD Student: Mengshi Qi
In recent years, there has been tremendous progress in camera-based 6D object pose, hand pose, and human 3D pose estimation. These can now be done in real time but not yet to the level of accuracy required to properly capture how people interact with each other and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object, types on a keyboard, or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered for the resulting models to be used by AR devices, such as the HoloLens device or consumer-level video see-through AR ones. This remains a challenge, especially given the fact that hands are often severely occluded in the egocentric views that are the norm in AR. We will, therefore, work on accurately capturing the interaction between hands and the objects they touch and manipulate. At the heart of it will be the precise modeling of contact points and the resulting physical forces between interacting hands and objects. This is essential for two reasons. First, objects in contact exert forces on each other; their pose and motion can only be accurately captured and understood if reaction forces at contact points and areas are modeled jointly. Second, touch and touch-force devices, such as keyboards and touch-screens, are the most common human-computer interfaces, and by sensing contact and contact forces purely visually, everyday objects could be turned into tangible interfaces that react as if they were equipped with touch-sensitive electronics. For instance, a soft cushion could become a non-intrusive input device that, unlike virtual mid-air menus, provides natural force feedback. In this talk, I will present some of our preliminary results and discuss our research agenda for the year to come.
EPFL PIs: Robert West, Arnaud Chiolero
Microsoft PIs: Ryen White, Eric Horvitz, Emre Kiciman
PhD Student: Kristina Gligoric
The overall goal of this project is to develop methods for monitoring, modeling, and modifying dietary habits and nutrition based on large-scale digital traces. We will leverage data from both EPFL and Microsoft, to shed light on dietary habits from different angles and at different scales: Our team has access to logs of food purchases made on the EPFL campus with the badges carried by all EPFL members. Via the Microsoft collaborators involved, we have access to Web usage logs from IE/Edge and Bing, and via MSR’s subscription to the Twitter firehose, we gain full access to a major social media platform. Our agenda broadly decomposes into three sets of research questions: (1) Monitoring and modeling: How to mine digital traces for spatiotemporal variation of dietary habits? What nutritional patterns emerge? And how do they relate to, and expand, the current state of research in nutrition? (2) Quantifying and correcting biases: The log data does not directly capture food consumption, but provides indirect proxies; these are likely to be affected by data biases, and correcting for those biases will be an integral part of this project. (3) Modifying dietary habits: Our lab is co-organizing an annual EPFL-wide event called the Act4Change challenge, whose goal is to foster healthy and sustainable habits on the EPFL campus. Our close involvement with Act4Change will allow us to validate our methods and findings on the ground via surveys and A/B tests. Applications of our work will include new methods for conducting population nutrition monitoring, recommending better-personalized eating practices, optimizing food offerings, and minimizing food waste.
The substantial increase in optical data transmission and cloud computing has fueled research into new technologies that can increase communication capacity. Optical communication through fiber, which traditionally has been used for long-haul communication, is now also employed for short-haul communication, even within datacenters. In a similar vein, the increasing capacity crunch in optical fibers, driven in particular by video streaming, can only be met by two degrees of freedom: spatial and wavelength division multiplexing. Spatial multiplexing refers to the use of optical fibers that have multiple cores, allowing the same carrier wavelength to be transmitted in multiple cores. Wavelength division multiplexing (WDM, or dense WDM) refers to the use of multiple optical carriers on the same fiber. A key advantage of WDM is the ability to increase line rates on existing legacy networks, without the need to replace existing SMF28 single-mode fibers. WDM is also expected to be employed in datacenters. Yet to date, WDM implementation within datacenters faces a key challenge: a CMOS-compatible, power-efficient source of multiple wavelengths. Existing solutions, such as multi-laser chips based on InP (as developed by Infinera), cannot be readily scaled to a larger number of carriers. As a result, the prevalent solution today is to use a bank of multiple, individual laser modules. This approach is not viable for datacenters due to space and power constraints. Over the past years, a new technology developed by EPFL that satisfies these requirements has rapidly matured: microresonator frequency combs, or microcombs. The potential of this new technology in telecommunications has recently been demonstrated with the use of microcombs for massively parallel coherent communication on the receiver and transmitter side. Yet to date the use of such microcombs in datacenters has not been addressed.
EPFL PIs: Edouard Bugnion
Microsoft PIs: Irene Zhang, Dan Ports, Marios Kogias
PhD Student: Konstantinos Prasopoulos
The deployment of a web-scale application within a datacenter can comprise hundreds of software components, deployed on thousands of servers organized in multiple tiers and interconnected by commodity Ethernet switches. These versatile components communicate with each other via Remote Procedure Calls (RPCs), with the cost of an individual RPC service typically measured in microseconds. The end-user performance, availability and overall efficiency of the entire system are largely dependent on the efficient delivery and scheduling of these RPCs. Yet, these RPCs are ubiquitously deployed today on top of general-purpose transport protocols such as TCP. We propose to make RPCs first-class citizens of datacenter deployment. This requires revisiting the overall architecture, application API, and network protocols. Our research direction is based on a novel RPC-oriented protocol, R2P2, which separates control flow from data flow and provides in-network scheduling opportunities to tame tail latency. We are also building the tools that are necessary to scientifically evaluate microsecond-scale services.
ETH Zurich projects (2019-2021)
ETH Zurich PIs: Roland Siegwart, Cesar Cadena
Microsoft PIs: Johannes Schönberger, Marc Pollefeys
PhD Student: Lukas Schmid
AR/VR allow new and innovative ways of visualizing information and provide a very intuitive interface for interaction. At their core, they rely only on a camera and inertial measurement unit (IMU) setup or a stereo-vision setup to provide the necessary data, either of which is readily available on most commercial mobile devices. Early applications of this technology have already been deployed in the real estate business, sports, gaming, retail, tourism, transportation and many other fields. The current technologies in visual-aided motion estimation and mapping on mobile devices have three main requirements to produce highly accurate 3D metric reconstructions:
- An accurate spatial and temporal calibration of the sensor suite, a procedure which is typically carried out with the help of external infrastructure, like calibration markers, and by following a set of predefined movements.
- Well-lit, textured environments and feature-rich, smooth trajectories.
- The continuous and reliable operation of all sensors involved.
This project aims at relaxing these requirements, to enable continuous and robust lifelong mapping on end-user mobile devices. Thus, the specific objectives of this work are:
1. Formalize a modular and adaptable multi-modal sensor fusion framework for online map generation;
2. Improve the robustness of mapping and motion estimation by exploiting high-level semantic features;
3. Develop techniques for automatic detection and execution of sensor calibration in the wild.
A modular SLAM (simultaneous localization and mapping) pipeline which is able to exploit all available sensing modalities can overcome the individual limitations of each sensor and increase the overall robustness of the estimation.
Such an information-rich map representation allows us to leverage recent advances in semantic scene understanding, providing an abstraction from low-level geometric features – which are fragile to noise, sensing conditions and small changes in the environment – to higher-level semantic features that are robust against these effects. Using this complete map representation, we will explore new ways to detect miscalibrations and sensor failures, so that the SLAM process can be adapted online without the need for explicit user intervention.
The goal of this project is to mine ML.NET historical data, such as user telemetry and logs, to understand how ML.NET transformations and learners are used, and eventually to use this knowledge to automatically provide suggestions to data scientists using ML.NET. Suggestions can take the form of:
- Better or additional recipes for unexplored tasks (e.g., neural networks).
- Auto-completion suggestions for pipelines authored directly, for example in .NET or Python.
- Automatic generation of parameters and sweep strategies optimal for the task at hand.
We will try to develop a solution that is extensible, so that when new tasks, algorithms, etc. are added to the library, the suggestions are eventually upgraded as well. Additionally, the tool will have to interface with ML.NET and make it easy to add new recipes coming either from users or from the log mining tool.
Humans are social beings and frequently interact with one another, e.g. spending a large amount of their time being socially engaged, working in teams, or just being part of a crowd. Understanding human interaction from visual input is an important aspect of visual cognition and key to many applications including assistive robotics, human-computer interaction and AR/VR. Despite rapid progress in estimating the 3D pose and shape of a single person from RGB images, capturing and modelling human interactions is rather poorly studied in the literature. Particularly for first-person-view settings, the problem has drawn little attention from the computer vision community. We argue that it is essential for augmented reality glasses, e.g. Microsoft HoloLens, to capture and model the interactions between the camera wearer and others, as the interaction between humans characterises how they move, behave and perform tasks in a collaborative setting.
In this project, we aim to understand how to recognise and predict the interactions between humans in the first-person-view setting. To that end, we will create a 3D human-human interaction dataset where the goal is to capture rich and complex interaction signals, including body and hand poses, facial expressions and gaze directions, using Microsoft Kinect and HoloLens. We will develop models that can recognise the dynamics of human interactions and even predict the motion and activities of the interacting humans. We believe such models will facilitate various downstream applications for augmented reality glasses, e.g. Microsoft HoloLens.
ETH Zurich PIs: Roland Siegwart, Nicholas Lawrance, Jen Jen Chung
Microsoft PIs: Andrey Kolobov, Debadeepta Dey
PhD Student: Florian Achermann
A major factor restricting the utility of UAVs is the amount of energy aboard, which limits the duration of their flights. Birds face largely the same problem, but they are adept at using their vision to aid in spotting — and exploiting — opportunities for extracting extra energy from the air around them. Project Altair aims at developing infrared (IR) sensing techniques for detecting, mapping and exploiting naturally occurring atmospheric phenomena called thermals for extending the flight endurance of fixed-wing UAVs. In this presentation, we will introduce our vision and goals for this project.
ETH Zurich PIs: Torsten Hoefler, Renato Renner
Microsoft PIs: Matthias Troyer, Martin Roetteler
PhD Student: Niels Gleinig
QIRO will establish a new internal representation for compilation systems on quantum computers. Since quantum computation is still emerging, I will provide an introduction to the general concepts of quantum computation and a brief discussion of its strengths and weaknesses from a high-performance computing perspective. This talk is tailored for a computer science audience with basic (popular-science) or no background in quantum mechanics and will focus on the computational aspects. I will also discuss systems aspects of quantum computers and how to map quantum algorithms to their high-level architecture. I will close with the principles of practical implementation of quantum computers and outline the project.
Reinforcement learning (RL) is a promising paradigm in machine learning and has gained considerable attention in recent years, partly because of its successful application in previously unsolved challenging games like Go and Atari. While these are impressive results, applying reinforcement learning in most other domains, e.g. virtual personal assistants, self-driving cars or robotics, remains challenging. One key reason for this is the difficulty of specifying the reward function a reinforcement learning agent is intended to optimize. For instance, in a virtual personal assistant, the reward function might correspond to the user’s satisfaction with the assistant’s behavior and is difficult to specify as a function of observations (e.g. sensory information) available to the system. In such applications, an alternative to specifying the reward function is to actually query the user for the reward. This, however, is only feasible if the number of queries to the user is limited and the user’s response can be provided in a natural way such that the system’s queries are non-irritating. Similar problems arise in other application domains such as robotics in which, for instance, the true reward can only be obtained by actually deploying the robot but an approximation to the reward can be computed by a simulator. In this case, it is important to optimize the agent’s behavior while simultaneously minimizing the number of costly deployments. This project’s aim is to develop algorithms for these types of problems via scalable active reward learning for reinforcement learning. The project’s focus is on scalability in terms of computational complexity (to scale to large real-world problems) and sample complexity (to minimize the number of costly queries).
ETH Zurich PIs: Stelian Coros, Roi Poranne
Microsoft PIs: Federica Bogo, Bugra Tekin, Marc Pollefeys
PhD Student: Simon Zimmermann
With this project, we aim to accelerate the development of intelligent robots that can assist those in need with a variety of everyday tasks. People suffering from physical impairments, for example, often need help dressing or brushing their own hair. Skilled robotic assistants would allow these persons to live an independent lifestyle. Even such seemingly simple tasks, however, require complex manipulation of physical objects, advanced motion planning capabilities, as well as close interactions with human subjects. We believe the key to robots being able to undertake such societally important functions is learning from demonstration. The fundamental research question is, therefore, how can we enable human operators to seamlessly teach a robot how to perform complex tasks? The answer, we argue, lies in immersive telemanipulation. More specifically, we are inspired by the vision of James Cameron’s Avatar, where humans are endowed with alternative embodiments. In such a setting, the human’s intent must be seamlessly mapped to the motions of a robot as the human operator becomes completely immersed in the environment the robot operates in. To achieve this ambitious vision, many technologies must come together: mixed reality as the medium for robot-human communication, perception and action recognition to detect the intent of both the human operator and the human patient, motion retargeting techniques to map the actions of the human to the robot’s motions, and physics-based models to enable the robot to predict and understand the implications of its actions.
Over the past dozen years, touch input – seemingly well-understood – has become the predominant means of interacting with devices such as smartphones, tablets, and large displays. Yet we argue that much remains unknown – in the form of a seen but unnoticed vocabulary of natural touch – that suggests tremendous untapped potential. For example, touchscreens remain largely ignorant of the human activity, manual behavior, and context-of-use beyond the moment of finger-contact with the screen itself. In a sense, status quo interactions are trapped in a flatland of touch, while systems remain oblivious to the vibrant world of human behavior, activity, and movement that surrounds them. We posit that an entire vocabulary of naturally-occurring gestures – both in terms of the activity of the hands, as well as the subtle corresponding motion and compensatory movements of the devices themselves – exists in plain sight. Our intended outcome is a conceptual understanding as well as a deployable interactive system, both of which blend the naturally-occurring gestures – interactions users embody through their actions – with the explicit input through traditional touch operation.
This project examines the architecture and management of next-generation data center storage devices within the context of realistic data-intensive workloads. The aim is to investigate novel techniques that can greatly improve performance, cost, and efficiency in real world systems with real world applications, breaking the barriers between the applications and devices, such that the software can much more effectively and efficiently manage the underlying storage devices that consist of (potentially different types of) flash memory, emerging SCM (storage class memory) technologies, and (potentially different types of) DRAM memories. We realize that there is a disconnect in the communication between applications/software and the NVM devices: the interfaces and designs we currently have enable little communication of useful information from the application/software level (including the kernel) to the NVM devices, and vice versa, causing significant performance and efficiency loss and likely fueling higher “managed” storage device costs because applications cannot even communicate their requirements to the devices. We aim to fundamentally examine the software-NVM interfaces as well as designs for the underlying storage devices to minimize the disconnect in communication and empower applications and system software to more effectively manage the underlying devices, optimizing important system-level metrics that are of interest to the system designer or the application (at different points in time of execution).
EPFL projects (2017-2018)
EPFL PIs: Babak Falsafi, Martin Jaggi
Microsoft Co-PI: Eric Chung
Deep Neural Networks (DNNs) have emerged as algorithms of choice for many prominent machine learning tasks, including image analysis and speech recognition. In datacenters, DNNs are trained on massive datasets to improve prediction accuracy. While the computational demands for performing online inference in an already trained DNN can be furnished by commodity servers, training DNNs often requires computational density that is orders of magnitude higher than that provided by modern servers. As such, operators often use dedicated clusters of GPUs for training DNNs. Unfortunately, dedicated GPU clusters introduce significant additional acquisition costs, break the continuity and homogeneity of datacenters, and are inherently not scalable. FPGAs are appearing in server nodes either as daughter cards (e.g., Catapult) or coherent sockets (e.g., Intel HARP) providing a great opportunity to co-locate inference and training on the same platform. While these designs enable natural continuity for platforms, co-locating inference and training on a single node faces a number of key challenges. First, FPGAs inherently suffer from low computational density. Second, conventional training algorithms do not scale due to inherent high communication requirements. Finally, co-location may lead to contention requiring mechanisms to prioritize inference over training. In this project, we will address these fundamental challenges in DNN inference/training co-location on servers with integrated FPGAs. Our goals are:
- Redesign training and inference algorithms to take advantage of DNNs' inherent tolerance for low-precision operations.
- Identify good candidates for hard-logic blocks for the next generations of FPGAs.
- Redesign DNN training algorithms to aggressively approximate and compress intermediate results, to target communication bottlenecks and scale the training of single networks to an arbitrary number of nodes.
- Implement FPGA-based load balancing techniques in order to provide latency guarantees for inference tasks under heavy loads and enable the use of idle accelerator cycles to train networks when operating under lower loads.
The task of grouping data according to similarity is a basic computational task with numerous applications. The right notion of similarity often depends on the application, and different measures yield different algorithmic problems. The goal of this project is to design faster and more accurate algorithms for fundamental clustering problems such as the k-means problem, correlation clustering and hierarchical clustering. We propose to perform a fine-grained study of these problems and design algorithms that achieve optimal trade-offs between approximation quality, runtime and space/communication complexity, making our algorithms well-suited for modern data models such as streaming and MapReduce.
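As a point of reference for the k-means problem named above, the following is a minimal sketch of the textbook Lloyd heuristic together with the k-means cost it tries to reduce. It is a baseline illustration only, not one of the algorithms this project proposes; the function names (`squared_distance`, `kmeans_cost`, `lloyd`) and the random seeding strategy are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def lloyd(points, k, iterations=20, seed=0):
    """Textbook Lloyd iteration (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: seed with k random input points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centers.append(mean)
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    found = lloyd(data, k=2)
    print(sorted(found), kmeans_cost(data, found))
```

Lloyd's heuristic only guarantees a local optimum and needs repeated passes over the whole input, which is one reason approximation guarantees and streaming or MapReduce-friendly variants are of interest.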
Several companies are now launching drones that autonomously follow and film their owners, often by tracking a GPS device they are carrying. This holds the promise to fundamentally change the way in which drones are used by allowing them to bring back videos of their owners performing activities, such as playing sports, unimpeded by the need to control the drone. In this project, we propose to go one step further and turn the drone into a personal trainer that will not only film but also analyse the video sequences and provide advice on how to improve performance. For example, a golfer could be followed by such a drone that will detect when he swings and offer advice on how to improve the motion. Similarly, a skier coming down a slope could be given advice on how to better turn and carve. In short, the drone would replace the GoPro-style action cameras that many people now carry when exercising. Instead of recording what they see, it would film them and augment the resulting sequences with useful advice. To make this solution as lightweight as possible, we will strive to achieve this goal using the on-board camera as the sole sensor and free the user from the need to carry a special device that the drone locks onto. This will require:
- Detecting the subject in the video sequences acquired by the drone so as to keep him in the middle of its field of view. This must be done in real-time and integrated into the drone’s control system.
- Recovering the subject’s 3D pose as he moves from the drone’s videos. This can be done with a slight delay since the critique only has to be provided once the motion has been performed.
- Providing feedback. In both the golf and ski cases, this would mean quantifying leg, hips, shoulders, and head position during a swing or a turn, offering practical suggestions on how to change them, and showing how an expert would have performed the same action.
EPFL PI: Babak Falsafi
Microsoft Co-PI: Stavros Volos
Near-memory processing (NMP) is a promising approach to satisfy the performance requirements of modern datacenter services at a fraction of modern infrastructure’s power. NMP leverages emerging die-stacked DRAM technology, which (a) delivers high-bandwidth memory access, and (b) features a logic die, which provides the opportunity for dramatic data movement reduction, and consequently energy savings, by pushing computation closer to the data. In the precursor to this project (the MSR European PhD Scholarship), we evaluated algorithms suitable for database join operators near memory. We showed that, while sort join has conventionally been thought of as inferior to hash join in performance on CPUs, near-memory processing favors sequential over random memory access, making sort join superior in performance and efficiency as a near-memory service. In this project, we propose to answer the following questions:
- What data-specific functionality should be implemented near memory (e.g., data filtering, data reorganization, data fetch)?
- What ubiquitous, yet simple system-level functionality should be implemented near memory (e.g., security, compression, remote memory access)?
- How should the services be integrated with the system (e.g., how does the software use them)?
- How do we employ near-threshold logic in near-memory processing?
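The contrast between hash join and sort join drawn in the project summary above comes down to memory access patterns. The sketch below shows both operators on small in-memory relations of `(key, value)` rows: the hash join probes a table at data-dependent locations, while the sort join scans two sorted inputs sequentially. It is a software illustration under assumed names (`hash_join`, `sort_merge_join`) and does not model the near-memory hardware or the implementation evaluated in the project.

```python
from collections import defaultdict


def hash_join(left, right):
    """Build a hash table on `left`, then probe it once per row of `right`.

    Each probe lands at a data-dependent location, i.e. random memory access.
    """
    table = defaultdict(list)
    for key, value in left:
        table[key].append(value)
    return [(key, lv, rv) for key, rv in right for lv in table.get(key, [])]


def sort_merge_join(left, right):
    """Sort both inputs by key, then merge them with two forward-only cursors.

    After sorting, both relations are read strictly sequentially.
    """
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    orders = [(2, "b"), (1, "a"), (2, "c")]
    users = [(2, "x"), (3, "y"), (1, "z")]
    print(sorted(hash_join(orders, users)))
    print(sort_merge_join(orders, users))
```

Both functions return the same set of joined rows; only the order of memory accesses differs, which is the property the near-memory setting is sensitive to.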
EPFL PIs: Rachid Guerraoui, Georgios Chatzopoulos
Microsoft Co-PI: Aleksandar Dragojevic
Modern hardware trends have changed the way we build systems and applications. Increasing memory (DRAM) capacities at reduced prices make keeping all data in-memory cost-effective, presenting opportunities for high performance applications such as in-memory graphs with billions of edges (e.g. Facebook’s TAO). Non-Volatile RAM (NVRAM) promises durability in the presence of failures, without the high price of disk accesses. Yet, even with this increase in inexpensive memory, storing the data in the memory of one machine is still not possible for applications that operate on TB of data, and systems need to distribute the data and synchronize accesses among machines. This project proposes the design and building of support for high-level transactions on top of modern hardware platforms, using the Structured Query Language (SQL). The important question to be answered is whether transactions can get the maximum benefit of these modern networking and hardware capabilities, while offering a significantly easier interface for developers to work with. This project will require both research in the transactional support to be offered, including the operations that can be efficiently supported, as well as research in the execution plans for transactions in this distributed setting.
The goal of our project is to improve the utilization of server resources in data centers. Our proposed approach was to attain a better understanding of the resource requirements of data-parallel applications and then incorporate this understanding into the design of more informed and efficient data center (cluster) schedulers. While pursuing these directions we have identified two related challenges that we believe hold the key towards significant additional improvements in application performance as well as cluster-wide resource utilization. We will explore these two challenges as a continuation of our project. These two challenges are: Resource inter-dependency and time-varying resource requirements. Resource inter-dependency refers to the impact that a change in the allocation of one server resource (memory, CPU, network bandwidth, disk bandwidth) to an application has on that application’s need for the other resources. Time-varying resource requirements refers to the fact that over the lifetime of an application its resource requirements may vary. Studying these two challenges together holds the potential for improving resource utilization by aggressively but safely collocating applications on servers.
ETH Zurich projects (2017-2018)
ETH Zurich PI: Gustavo Alonso
Microsoft Co-PI: Ken Eguro
While in the first phase of the project we explored the efficient implementation of data processing operators in FPGAs as well as the architectural issues involved in the integration of FPGAs as co-processors in commodity servers, in this new proposal we intend to focus on architectural aspects of in-network data processing. The choice is motivated by the growing gap between the bandwidth and very low latencies that modern networks support and the overhead of ingress and egress from VMs and applications running on conventional CPUs. A first goal is to explore the type of problems and algorithms that can be best run as the data flows through the network so as to be able to exploit the bare wire speed and allow off-loading of expensive computations to the FPGA. A second, but no less important, goal is to explore how to best operate FPGA-based accelerators when directly connected to the network and operating independently from the software part of the application. In terms of applications, the focus will remain on data processing (relational, NoSQL, data warehouses, etc.) with the intention of starting to move towards machine learning algorithms at the end of the two-year project. On the network side, the project will work on developing networking protocols suitable to this new configuration and on how to combine the network stack with the data processing stack.
ETH Zurich PIs: Onur Mutlu, Luca Benini
Microsoft Co-PI: Derek Chiou
Today’s systems are overwhelmingly designed to move data to computation. This design choice goes directly against key trends in systems and technology that cause performance, scalability and energy bottlenecks:
- data access from memory is a key bottleneck as applications become more data-intensive and memory bandwidth and energy do not scale well,
- energy consumption is a key constraint, especially in mobile and server systems,
- data movement is very costly in terms of bandwidth, energy and latency, much more so than computation.
Our goal is to comprehensively examine the premise of adaptively performing computation near where the data resides, when it makes sense to do so, in an implementable manner and considering multiple new memory technologies, including 3D-stacked memory and non-volatile memory (NVM). We will examine practical hardware substrates and software interfaces to accelerate key computational primitives of modern data-intensive applications in memory, as well as runtime and software techniques that can take advantage of such substrates and interfaces. Our special focus will be on key data-intensive applications, including deep learning, neural networks, graph processing, bioinformatics (DNA sequence analysis and assembly), and in-memory data stores. Our approach is software/hardware cooperative, breaking the barriers between the two and melding applications, systems and hardware substrates for extremely efficient execution, while still providing efficient interfaces to the software programmer.
ETH Zurich PI: Otmar Hilliges
Microsoft Co-PI: Marc Pollefeys
Micro-aerial vehicles (MAVs) have been made accessible to end-users via the emergence of simple to use hardware and programmable software platforms and have seen a surge in consumer and research interest as a consequence. Clearly there is a desire to use such platforms in a variety of application scenarios but manually flying quadcopters remains a surprisingly hard task even for expert users. More importantly, state-of-the-art technologies offer only very limited support for users who want to employ MAVs to reach a certain high-level goal. This is maybe best illustrated by the currently most successful application area – that of aerial videography. While manual flight is hard, piloting and controlling a camera simultaneously is practically impossible. An alternative to manual control is offered via waypoint based control of MAVs, shielding novices from the underlying complexities. However, this simplicity comes at the cost of flexibility and existing flight planning tools are not designed with high-level user goals in mind. Building on our own (MSR JRC funded) prior work, we propose an alternative approach to robotic motion planning. The key idea is to let the user work in solution-space – instead of defining trajectories the user would define what the resulting output should be (e.g., shot composition, transitions, area to reconstruct). We propose an optimization-based approach that takes such high-level goals as input and generates the trajectories and control inputs for a gimbal mounted camera automatically. We call this solution-space driven, inverse kinematic motion planning. Defining the problem directly in the solution space removes several layers of indirection and allows users to operate in a more natural way, focusing only on the application specific goals and the quality of the final result, whereas the control aspects are entirely hidden.
ETH Zurich PIs: Thomas Hofmann, Aurélien Lucchi
Microsoft Co-PI: Sebastian Nowozin
The past decade has seen a growth in application of big data and machine learning systems. Probabilistic models of data are theoretically well understood and in principle provide an optimal approach to inference and learning from data. However, for richly structured data domains such as natural language and images, probabilistic models are often computationally intractable and/or have to make strong conditional independence assumptions to retain computational as well as statistical efficiency. As a consequence, they are often inferior in predictive performance, when compared to current state-of-the-art deep learning approaches. It is a natural question to ask, whether one can combine the benefits of deep learning with those of probabilistic models. The major conceptual challenge is to define deep models that are generative, i.e. that can be thought of as models of the underlying data generating mechanism. We thus propose to leverage and extend recent advances in generative neural networks to build rich probabilistic models for structured domains such as text and images. The extension of efficient probabilistic neural models will allow us to represent complex and multimodal uncertainty efficiently. To demonstrate the usefulness of the developed probabilistic neural models we plan to apply them to challenging multimodal applications such as creating textual descriptions for images or database records.
EPFL projects (2014-2016)
EPFL PI: Serge Vaudenay
Microsoft PI: Markulf Kohlweiss
For an encryption scheme to be practically useful, it must deliver on two complementary goals: the confidentiality and integrity of encrypted data. Historically, these goals were achieved by combining separate primitives, one to ensure confidentiality and another to guarantee integrity. This approach is neither the most efficient (for instance, it requires processing the input stream at least twice), nor does it protect against implementation errors. To address these concerns, the notion of Authenticated Encryption (AE), which simultaneously achieves confidentiality and integrity, was put forward as a desirable first-class primitive to be exposed by libraries and APIs to the end developer. Providing direct access to AE rather than requiring developers to orchestrate calls to several lower-level functions is seen as a step towards improving the quality of security-critical code. An indication of both the importance of usable AE and the difficulty of getting it right is the number of standards that were developed over the years. These specified different methods for AE: the CCM method is specified in IEEE 802.11i, IPsec ESP, and IKEv2; the GCM method is specified in NIST SP 800-38D; the EAX method is specified in ANSI C12.22; and ISO/IEC 19772:2009 defines six methods, including five dedicated AE designs and one generic composition method, namely Encrypt-then-MAC. Several security issues have recently arisen and been reported in the (mis)use of symmetric key encryption with authentication in practice. As a result, the cryptographic community has initiated the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR), to boost public discussions towards a better understanding of these issues, and to identify a portfolio of efficient and secure AE schemes. Our project aims to contribute to the design, analysis, evaluation, and classification of the emerging AE schemes during the CAESAR competition.
The project affects many practical security protocols that use AE schemes as indispensable underlying primitives. Our work also has broader implications for the theory of AE as an important research area in symmetric-key cryptography.
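The generic composition mentioned above, Encrypt-then-MAC, can be sketched in a few lines: encrypt first, then compute a MAC over the nonce and ciphertext, and on receipt verify the tag before decrypting anything. The example below is an illustration only. The keystream built from SHA-256 in counter mode is a toy stand-in chosen so that the sketch needs nothing beyond the Python standard library; it is not a vetted cipher, and the names `seal`, `open_sealed`, `NONCE_LEN` and `TAG_LEN` are assumptions of this example rather than any standard API.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # bytes of random nonce prepended to every message (assumed size)
TAG_LEN = 32    # HMAC-SHA256 output length in bytes


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 in counter mode. Illustration only, NOT a vetted cipher."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        blocks.append(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return b"".join(blocks)[:length]


def seal(enc_key, mac_key, plaintext):
    """Encrypt-then-MAC: encrypt, then authenticate nonce || ciphertext."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key, mac_key, message):
    """Verify the tag first; decrypt only if authentication succeeds."""
    if len(message) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    nonce = message[:NONCE_LEN]
    ciphertext = message[NONCE_LEN:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


if __name__ == "__main__":
    enc_key, mac_key = os.urandom(32), os.urandom(32)  # two independent keys
    sealed = seal(enc_key, mac_key, b"attack at dawn")
    print(open_sealed(enc_key, mac_key, sealed))
```

Even in this small sketch the developer has to manage two keys, a nonce, the order of operations and a constant-time tag comparison, which illustrates why a single AE primitive is considered less error-prone than hand-assembled compositions.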
EPFL PIs: Edouard Bugnion, Babak Falsafi
Microsoft PI: Dushyanth Naraya
The goal of the Scale-Out NUMA (soNUMA) project is to deliver energy-efficient, low-latency access to remote memory in datacentre applications, with a focus on rack-scale deployments. Such infrastructure will become critical for both web-scale only applications and scale-out analytics, where the dataset can reside in the collective (but distributed) memory of a cluster of servers. Our approach to the problem layers an RDMA-inspired programming model directly on top of a NUMA fabric via a stateless messaging protocol. To facilitate interactions between the application, the OS, and the fabric, soNUMA relies on the remote memory controller, a new architecturally-exposed hardware block integrated into the node’s local coherence hierarchy.
EPFL PI: Florin Dinu
Microsoft PI: Sergey Legtchenko
Our vision is of resource-efficient datacenters where the compute nodes are fully utilized. We see two challenges to realizing this vision. The first is the increasing use of hardware heterogeneity in datacenters. Heterogeneity, while both unavoidable and desirable, is handled inefficiently by today’s systems and algorithms. The second challenge is the aggressive scale-out of datacenters. Scale-out has made it conveniently easy to disregard inefficiencies at the level of individual compute nodes because it has historically been easy to expand to new resources. However, apart from being unnecessarily costly, such scale-out techniques are now becoming impractical due to the size of the datasets. Moreover, scale-out often adds new inefficiencies. We argue that to meet these challenges, we must start from a thorough understanding of the resource requirements of today’s datacenter jobs. With this understanding, we aim to design new scheduling techniques that efficiently use resources, even in heterogeneous environments. Further, we aim to fundamentally change the way data-parallel processing systems are built and to make efficient compute node resource utilization a cornerstone of their design. Our first goal is to automatically characterize the pattern of memory requirements of data-parallel jobs. Specifically, we want to go beyond current practices, which consider only peak memory usage. To better identify opportunities for efficient memory management, more granular information is necessary. Our second goal is to use knowledge of the pattern of memory requirements to design informed scheduling algorithms that manage memory efficiently. The third goal of the project is to design data-parallel processing systems that are efficient in terms of managing memory, not only by understanding task memory requirements, but also by shaping those memory requirements.
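The difference between peak-only and pattern-aware memory accounting can be illustrated with a small sketch. It is not the project's scheduler: the function names, the memory profiles, and the assumption that all jobs start together and that their per-time-step memory use is known exactly are all hypothetical simplifications.

```python
from typing import Sequence

# Memory use per time step, in arbitrary units (hypothetical data model).
Profile = Sequence[int]


def fits_by_peak(profiles: Sequence[Profile], capacity: int) -> bool:
    """Admission test that reserves each job's peak for its whole lifetime."""
    return sum(max(p) for p in profiles) <= capacity


def fits_by_profile(profiles: Sequence[Profile], capacity: int) -> bool:
    """Admission test that checks the combined demand at every time step."""
    if not profiles:
        return True
    horizon = max(len(p) for p in profiles)
    return all(
        sum(p[t] for p in profiles if t < len(p)) <= capacity
        for t in range(horizon)
    )


# Two jobs whose peaks occur at different times.
job_a = [2, 8, 2, 2]
job_b = [2, 2, 2, 8]
node_capacity = 12

peak_ok = fits_by_peak([job_a, job_b], node_capacity)        # peaks sum to 16
profile_ok = fits_by_profile([job_a, job_b], node_capacity)  # per-step max is 10
```

With these made-up profiles, the peak-based test rejects co-locating the two jobs while the pattern-based test accepts it, which is the kind of opportunity that more granular memory information is meant to expose.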
ETH Zurich projects (2014-2016)
Disk-backed in-memory key/value stores are gaining significance as many industries are moving toward big data analytics. Storage space and query time requirements are challenging, since the analysis has to be performed at the lowest cost to be useful from a business perspective. Despite those cost constraints, today’s systems are heavily overprovisioned when it comes to resiliency. The undifferentiated three-copy approach leads to a potential waste of bandwidth and storage resources, which then makes the overall system less efficient or more expensive. We propose to revisit currently used resiliency schemes, with the help of analytical hardware failure models. We will utilize those models to capture the exact tradeoff between the overhead due to replication and the exact resiliency requirements that are defined in a contract. Our key idea is to model reliability as an explicit resource that the user allocates consciously. In previous work, we have been able to speed up scientific computing applications, as well as a distributed hashtable, on several hundred thousand cores by more than 20 percent, with the use of advanced RDMA programming techniques. We have also demonstrated low-cost resiliency schemes based on erasure coding for RDMA environments. In addition, we propose to apply our experience with large-scale RDMA programming to the design of in-memory databases, a problem very similar to distributed hashtables. To make reliability explicit, we plan to extend the key/value store with explicit reliability attributes that allow the user to specify reliability and availability requirements for each key (or group of keys). Our work may change the perspective on datacenter resiliency. Defining fine-grained, per-object resiliency levels and tuning them to the exact environment may provide large cost benefits and impact industry. For example, changing the standard three-replica scheme to erasure coding can easily save 30 percent of storage expenses.
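The tradeoff between storage overhead and resiliency can be made concrete with a small worked sketch. It is not the project's failure model: it assumes each fragment is lost independently with probability `p` before repair, and the `Scheme` class, the `cheapest_scheme` helper, and the (10, 4) code parameters are hypothetical choices for illustration. A scheme with `k` data and `m` parity fragments stores `(k + m) / k` bytes per user byte and loses data only if more than `m` fragments fail; plain replication with `r` copies is the special case `k = 1`, `m = r - 1`.

```python
from dataclasses import dataclass
from math import comb
from typing import Iterable, Optional


@dataclass(frozen=True)
class Scheme:
    name: str
    data_fragments: int    # k
    parity_fragments: int  # m

    @property
    def overhead(self) -> float:
        """Raw bytes stored per byte of user data."""
        return (self.data_fragments + self.parity_fragments) / self.data_fragments

    def loss_probability(self, p: float) -> float:
        """P(more than m of the k + m fragments fail), assuming independence."""
        n = self.data_fragments + self.parity_fragments
        m = self.parity_fragments
        return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m + 1, n + 1))


def cheapest_scheme(schemes: Iterable[Scheme], p: float, max_loss: float) -> Optional[Scheme]:
    """Pick the lowest-overhead scheme that meets a per-key loss target."""
    ok = [s for s in schemes if s.loss_probability(p) <= max_loss]
    return min(ok, key=lambda s: s.overhead) if ok else None


two_replica = Scheme("2-replica", 1, 1)
three_replica = Scheme("3-replica", 1, 2)
erasure_10_4 = Scheme("erasure(10,4)", 10, 4)

# Fraction of raw storage saved by moving from three replicas to the (10, 4) code.
savings = 1 - erasure_10_4.overhead / three_replica.overhead

# A per-key reliability attribute could then drive the choice of scheme.
choice = cheapest_scheme([two_replica, three_replica, erasure_10_4], p=0.01, max_loss=1e-5)
```

Under these illustrative parameters the (10, 4) code stores 1.4 bytes per user byte instead of 3, so the raw-storage saving is roughly 53 percent. Actual savings depend on the code parameters, repair traffic, and the failure model in use.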
ETH Zurich PI: Gustavo Alonso
Microsoft PI: Ken Eguro
One of the biggest challenges for software these days is to adapt to the rapid changes in hardware and processor architecture. On the one hand, extracting performance from modern hardware requires dealing with increasing levels of parallelism. On the other hand, the wide variety of architectural possibilities and multiplicity of processor types raise many questions in terms of the optimal platform for deploying applications. In this project we will explore the efficient implementation of data processing operators in FPGAs, as well as the architectural issues involved in the integration of FPGAs as co-processors in commodity servers. The target application is big data and data processing engines (relational, NoSQL, data warehouses, etc.). Through this line of work, the project aims to explore architectures that will result in computing nodes with lower energy consumption and smaller physical size, yet capable of providing a performance boost to big data applications. FPGAs should be seen here not as a goal in themselves, but as an enabling platform for the exploration of different architectures and levels of parallelism that will allow us to bypass the inherent restrictions of conventional processors. On the practical side, the project will focus both on the use of FPGAs as co-processors inside existing engines and on developing proof-of-concept data processing engines implemented entirely in FPGAs. In this area, the project complements ongoing efforts at Microsoft Research around Cipherbase, a trusted computing system based on SQL Server deployments in the cloud. On the conceptual side, the project will explore the development of data structures and algorithms capable of exploiting the massive parallelism available in FPGAs, with a view to gaining much-needed insights on how to adapt existing data processing systems to multi- and many-core architectures.
Here, we expect to gain insights on how to redesign both standard relational data operators, as well as data mining and machine learning operators, to better take advantage of the increasing amounts of parallelism available in future processors.
ETH Zurich PIs: Otmar Hilliges, Marc Pollefeys
Microsoft PI: Shahram Izadi
In recent years, robotics research has made tremendous progress, and it is becoming conceivable that robots will be as ubiquitous and irreplaceable in our daily lives as they are within industrial settings. Continued improvements in mechatronics and control, coupled with continued advances in consumer electronics, have made robots ever smaller, more autonomous, and more agile. One area of recent advances in robotics is the notion of micro-aerial vehicles (MAVs) [14, 16]. These are small, flying robots that are very agile, can operate in 3D space, indoors and outdoors, and can carry small payloads, including input and output devices. They can navigate difficult environments, such as stairs, more easily than terrestrial robots, and hence can reach locations that no other robot, or indeed human, can reach. Surprisingly, to date there is little research on such flying robots in an interactive context or on MAVs operating in close proximity to humans. In our project, we explore the opportunities that arise from aerial robots that operate in close proximity to and in collaboration with a human user. In particular, we are interested in developing a robotic platform in which a) the robot is aware of the human user and can navigate relative to the user; b) the robot can recognize various gestures from afar, as well as receive direct, physical manipulations; c) the robot can carry small payloads, in particular input and output devices such as additional cameras or projectors. Finally, we are developing novel algorithms to track and recognize user input, using the onboard cameras, in real time and with very low latency, building on the now substantial body of research on gestural and natural interfaces. Gesture recognition can be used for MAV control (for example, controlling the camera) or to interact with virtual content.
ETH Zurich PI: Roger Wattenhofer
Microsoft PI: Ratul Mahajan
The Internet is designed to be robust, so that it remains usable even when selfish participants are present. As a consequence, a loss in total performance must be accepted. However, if a whole wide-area network (WAN) is controlled by a single entity, why should one use the very techniques designed for the Internet? Large providers such as Microsoft, Amazon, or Google operate their own WANs, which cost them hundreds of millions of dollars per year; yet even their busier links average only 40–60 percent utilization. This gives rise to Software Defined Networks (SDNs), which allow the separation of the data plane and the control plane in a network. A centralized controller can install and update rules all over the WAN to optimize its goals. Despite SDNs receiving a lot of attention in both theory and practice, many questions are still unanswered. Even though the control of the network is centralized, distributing the updates does not happen instantaneously. Numerous problems can occur, such as dropped packets, forwarding loops, exceeding the memory limits of switches or the bandwidth limits of links, and loss of packet coherence. These problems must be solved before SDNs can be broadly deployed. This research project sheds more light on these fundamental issues of SDNs and how they can be tackled. In parallel, we look at SDNs from a game-theoretic perspective.
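The loop problem during non-instantaneous updates can be illustrated with a small sketch. It is a simplified model, not the project's algorithm: it assumes a single destination, one next hop per switch, and updates that take effect one switch at a time in a given order. The names `has_loop` and `first_unsafe_step` and the three-node topology are hypothetical.

```python
from typing import Dict, List, Optional

# Forwarding state for ONE destination: switch -> next hop (hypothetical model).
Rules = Dict[str, str]


def has_loop(rules: Rules, destination: str) -> bool:
    """Follow next hops from every switch; revisiting a switch means a loop."""
    for start in rules:
        seen = set()
        node = start
        while node != destination and node in rules:
            if node in seen:
                return True
            seen.add(node)
            node = rules[node]
    return False


def first_unsafe_step(old: Rules, new: Rules, order: List[str], destination: str) -> Optional[str]:
    """Apply updates one switch at a time; return the first switch whose
    update creates a transient loop, or None if every intermediate state is safe."""
    current = dict(old)
    for switch in order:
        current[switch] = new[switch]
        if has_loop(current, destination):
            return switch
    return None


# Old route: a -> b -> d.  New route: b -> a -> d.
old_rules = {"a": "b", "b": "d"}
new_rules = {"a": "d", "b": "a"}

bad_order = first_unsafe_step(old_rules, new_rules, ["b", "a"], "d")
good_order = first_unsafe_step(old_rules, new_rules, ["a", "b"], "d")
```

Both the old and the new configuration are loop-free, yet updating `b` before `a` briefly sends packets back and forth between the two switches, while the reverse order keeps every intermediate state safe. This shows why the order and timing of updates matter even under centralized control.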