Microsoft Research Summer Workshop 2018: Machine Learning on Constrained Devices

About

The Internet of Things (IoT) is poised to revolutionize our world. Billions of microcontrollers and sensors have already been deployed for predictive maintenance, connected cars, precision agriculture, personalized fitness and wearables, smart housing, cities, healthcare, etc. The dominant paradigm in these applications is that the IoT device is dumb – it just senses its environment and transmits the sensor readings to the cloud where all the intelligence resides and the decision making happens.

We envision an alternative paradigm where even tiny, resource-constrained IoT devices can run machine learning algorithms locally without necessarily connecting to the cloud. This enables a number of critical scenarios, beyond the reach of the traditional paradigm, where it is not desirable to send data to the cloud due to concerns about latency, connectivity, energy, privacy and security.

Microsoft Research (MSR) India has an ongoing effort on developing new algorithms and exploring applications in this area. An overview of our published work can be found at: https://github.com/Microsoft/EdgeML.

We are organizing an approximately month-long summer workshop focusing on “machine learning on constrained devices” such as IoT sensors. The workshop will comprise lectures and hands-on collaboration between participants and MSR India staff. We welcome participants from academia, startups and other industry players to the workshop.

We hope that this workshop will facilitate development of new techniques as well as application of the state-of-the-art to real world scenarios, focusing on tangible outcomes during the period of collaboration.

The workshop will be hosted at the Microsoft Research premises in Bangalore, India.

 

Call For Proposals: Details & Guidelines

 

We invite proposals in the following broad areas:

1. ML applications on edge devices: Proposals that envision new applications and scenarios that are enabled by machine learning models with a tiny footprint that can run on IoT devices, sensors and resource-constrained platforms. Platforms of interest include those that use ARM Cortex M-series microcontrollers (e.g. Arduino boards) and the lower end of ARM Cortex A-series processors (e.g. Raspberry Pi). The proposal can come from any application domain, including healthcare, industrial, speech, security, user analytics, sensor fusion and intrusion detection, but should aim to develop ML models to be deployed on resource-constrained devices. Preference will be given to proposals that aim to build a complete prototype and are able to provide data sets that have been shown to be useful in traditional (non-resource-constrained) settings.
Example 1: Gesture or activity recognition using accelerometer data
Example 2: Wake word or audio command detection
Example 3: Predicting upcoming downtime in factories using sensor data
Example 4: Sensors on sporting equipment to analyze the game in real time
Example 5: Wearable diagnostic devices/implements/monitors that don’t rely on labs
Example 6: Real-time intrusion detection in network hardware
Example 7: Spotting endangered species with stand-alone sensors deployed in the wild
Example 8: Predicting pollution levels based on traffic conditions

2. New algorithmic tools: Proposals that aim to build state-of-the-art algorithms that help derive small machine learning models. These can include: (a) new formulations for specific tasks (e.g. early warning in time-series data), (b) new neural network formulations (e.g. compact cells with less state), (c) new techniques for training and compressing sparse models, (d) new computational frameworks relevant to the resource-constrained setting.

3. Software tools: Many embedded systems lack capabilities common in desktops, e.g., multi-threading, floating point arithmetic, security features. Proposals are invited for developing new software platforms and tools that solve these problems and allow easy deployment of machine learning pipelines on IoT devices. These can include (a) lightweight operating systems, (b) compilers that can reason about precision and target new architectures, (c) software security features. Preference will be given to proposals that aim to build artefacts that drastically improve programming productivity on resource-constrained devices.

Resources from Microsoft: Microsoft Research (MSR) has expertise and an active research agenda in this space, and its researchers will act as advisors. We have existing machine learning algorithms/libraries for this setting available to the public (https://github.com/Microsoft/Edgeml), and we will work with the selected proposals to adapt and apply these algorithms/tools.
Further, MSR India will also provide resources including computing infrastructure, sensors and hardware, and logistical support from summer school student participants. Researchers and engineers from Microsoft will be actively involved in the research as well as the development aspects of each selected proposal.

Participation requirements: Each proposal can have one principal proposer and other co-proposers. We require that at least one of the proposers for each selected proposal is present in MSR India’s office in Bangalore for the duration of the workshop (it is OK to have two proposers who will spend two weeks each). Proposers are welcome to nominate students / team members as part of their proposal (see details below).
All proposers, students and team members selected for the workshop are expected to spend the majority of the time on the proposals during the summer workshop, since this is intended to be an intense 4-week effort leading to publishable work and/or significant progress in developing a prototype.

Intellectual Property (IP): We require academic participants to release project artefacts under a liberal license and publish the findings in a peer-reviewed conference. For proposals from start-ups and other industry participants, IP-related aspects will be discussed and resolved to mutual satisfaction on a case-by-case basis.

Proposal Submission Guidelines

The submitted proposal should contain the following:
1. A precise problem statement including relevance for the resource-constrained or IoT setting.
2. Potential impact and/or business case for the problem.
3. Details about
a. resource constraints in terms of power, latency, memory, compute, bandwidth
b. the machine learning problem formulation (ranking/regression/anomaly detection)
c. sensors involved
d. relevant datasets and computing resources required.
4. Background on related work in the area.
5. An action plan for the period before the summer workshop (e.g. acquiring and curating data), the 4 weeks of the summer workshop, and follow-up after the workshop (e.g. publishing results, releasing code) if necessary.
6. Details of faculty / students OR team members (for startups / industry) who will participate (e.g. CV and/or website), and requirements for extra help (desired skillsets).
7. For academic participants, a plan to publish results and release software and other outcomes of the summer school.
8. For industry and start-up participants, preliminary suggestions on handling IP (to be discussed and mutually agreed later).

The final proposal should be no more than 2500 words (excluding references and supplementary material) and is due by March 31st 2018. For abstracts received before February 28th 2018, Microsoft Research will provide feedback, which we hope will be useful while submitting the final proposal. A few shortlisted proposals will be requested to make a presentation in mid-April and the final decision on selected proposals for the workshop will be communicated by the end of April.
Ambitious proposals with actionable and concrete plans will be given priority in the final selection.

Selection Process
The submitted proposals, along with the nominations for research students, will be selected through a thorough evaluation and revision process by the Program Committee. In parallel, the Program / Organizing Committee will also select students for the workshop through a separate process, and these students will be included with the shortlisted groups at the workshop.

Important dates:

Deadline for submitting proposals: March 31, 2018

Presentation by shortlisted proposers: Mid-April 2018

Notification of selection of shortlisted teams: April 30, 2018

Summer Workshop: June 11 – July 06, 2018

 

Who can participate and How?

Faculty/Post-Docs: We invite proposals from researchers affiliated with universities. Proposers of shortlisted proposals may be asked to present the proposal either in person in Bangalore or via Skype. Proposals can also have collaborators from industry/other academic institutions from India or abroad. Selected proposers should be willing to work with other proposers in the event of their proposals getting merged. Proposers can nominate up to TWO research students in their proposal. Proposers will also be encouraged to give lectures and tutorials in their areas of expertise during the workshop.
The Principal proposer of each selected project will receive an amount of INR 2 Lakhs as remuneration for the research efforts in the project. This amount is also expected to cover the travel of the proposer to/from Bangalore, meals outside of those provided during the workshop and other incidental expenses. Proposers will have to make their own travel arrangements. For the duration of the workshop, Microsoft Research will provide accommodation with breakfast, and meals during working hours. The amount of INR 2 Lakhs will be paid as an Unrestricted Research Grant to the institution that the proposer is affiliated with (note: we will not make individual payments to proposers).

Start-ups and Industry players: We invite proposals from Start-ups and industry players, with an identified Principal proposer. The proposer can also nominate up to TWO other team members for the workshop. Proposers will also be encouraged to give lectures and tutorials in their areas of expertise during the workshop. The Principal proposer of each selected project will receive an amount of INR 2 Lakhs as remuneration for the research efforts in the project. This amount is also expected to cover travel to/from Bangalore, meals outside of those provided during the workshop and other incidental expenses. The proposer(s) will have to make their own travel arrangements. For the duration of the workshop, Microsoft Research will provide accommodation with breakfast, and meals during working hours. The amount of INR 2 Lakhs will be paid as a Research Grant to the company entity that the principal proposer is affiliated with (note: we will not make individual payments to proposers).

Important:
a. The above remuneration will only be paid to the principal proposer. Collaborators, if any, will be provided with accommodation and meals during the workshop duration, as specified.
b. Each proposal team is expected to work with 2-3 additional students nominated by Microsoft Research as part of their group during the workshop. These students will be independently selected by Microsoft Research via a separate process – we will work with leading academic institutions on a process of student nominations from Department Heads/Deans/Directors to select a group of students for the workshop. These students will then be assigned to selected proposals for the workshop.

Students: Students CANNOT apply directly to be part of the workshop. They need to be nominated by the proposer writing the proposal, or – for the students selected independently by Microsoft Research – by the Department Head / Dean / Director of their institution.
The students who are selected for the workshop will be provided with accommodation with breakfast, and meals during working hours for the duration of the workshop. For students from outside Bangalore, we will provide a travel allowance that should cover a large part of the estimated airfare (if booked well in advance), meals outside working hours and any other incidental expenses. They do not need to submit any bills or tickets to get the travel allowance. It will be a flat amount based on the city where their college is located. If they spend less than the allowance, they do not need to return the funds to us and if they spend more, we hope they can put in money from their own sources to cover the balance.

Team members from start-ups/industry: The team members who are nominated by the proposer will be provided with accommodation with breakfast, and meals during working hours for the duration of the workshop. For team members coming from outside Bangalore, we will provide a travel allowance that should cover a large part of the estimated airfare (if booked well in advance), meals outside working hours and any other incidental expenses. They do not need to submit any bills or tickets to get the travel allowance. It will be a flat amount based on the city where their company entity is located. If they spend less than the allowance, they do not need to return the funds to us and if they spend more, we hope they can put in money from their own sources to cover the balance.

The program committee will make the final decision on selection of proposals, students, team members, etc.

 

Computational Infrastructure
Garage-style working spaces will be provided to each team in the Microsoft Research India premises. Team members are expected to work on their own laptops. However, we will provide an Internet connection as well as Azure accounts to each team for running experiments. Access to certain Microsoft proprietary datasets and development environments could also be provided based on need and availability.

Best Project Award
At the end of the workshop, all the project teams will make a presentation, and an appropriate jury will evaluate them to choose ONE winning project. The winning project will be awarded seed funding of INR 750,000/- from Microsoft Research India to continue further work on the project.
Based on their performance in the workshop, the students participating in the workshop will be evaluated for internship positions at MSR, if they are interested in this option.

Contact
For any queries on the proposal / eligibility, etc. please write to: msrsw2018@microsoft.com

People

Program Committee:

Harsha Vardhan Simhadri

Manik Varma

Manohar Swaminathan

Prateek Jain

Rahul Sharma

Organising Committee:

Chris Gould Sandhu

Satish Sangameswaran

 

 

Selected Proposals

The following proposals have been selected for participation at the workshop:

  1. A language-based approach for implementing performant data processing pipelines in constrained embedded devices: Principal Investigator – Jayaraj Poroor, Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, India.
  2. Preventive Maintenance for Distributed Energy Resources: Principal Investigator – Tanuja Ganu, Co-Founder & CTO, DataGlen Technologies
  3. Low Footprint speech keyword spotting system using ML at the edge: Principal Investigator – Prashant Namekar, Gaia Smart Cities Solutions
  4. Deep Learning at Mote Scale: Principal Investigator – Anish Arora, The Ohio State University

Agenda: Talks/Sessions

Week of 11-15 June

Day 1 – 11-June

10:00 AM – 11:30 AM : Welcome / Setting Context

Introduction to Microsoft Research India, the IDE project, the workshop format. Introduction of the teams, committee and Research Fellows working on the IDE project. Goals / expectations from the workshop – to build connections between algorithms and practice; to bring together domain experts working in the embedded ML space.

2:00 PM – 4:30 PM: Priyan Vaithilingam, Microsoft Research – Azure Hands-on Tutorial

Get participants familiar with using Azure subscriptions and the services available on their Azure accounts.

Day 2 – 12-June 

10:00 AM – 11:15 AM : Harsha Vardhan Simhadri, Microsoft Research – The EdgeML library 

In this talk, we will present two supervised learning algorithms, Bonsai (tree-based) and ProtoNN (nearest-neighbors based), suited for resource-constrained devices such as IoT devices, sensors and embedded devices. These algorithms can generate models that are order(s) of magnitude smaller, faster and more power-efficient (for inference) than popular algorithms such as SVM, GBDT, etc. while still providing competitive accuracies. The talk will include the technical details of the algorithms, their availability and use.

2:00 PM – 3:15 PM: Prateek Jain, Microsoft Research – Recent work on FastRNNs, and on multi-instance learning with early stopping in RNNs

In this talk, we discuss recent results on training RNNs with smaller memory and computation footprint for supervised learning tasks on time-series data such as audio wake-word detection, keyword spotting, and gesture recognition. The first part of the talk will discuss new RNN architectures. The second part of the talk will discuss new results on learning from noisy data in which the true time-series signature of a class is a small subsequence of a larger training example. This kind of training data is typical of time-series tasks. For example, the positive examples in a wake-word detection training data set might include occurrences of the wake-word in a noisy background sample much longer than the utterance of the wake-word itself. We also discuss how to stop RNN steps early on instances where no signature is found.
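
As a rough illustration of the early-stopping idea mentioned in the abstract above, the following sketch runs a toy RNN over a sequence and stops stepping as soon as the classifier is confident about its prediction. It is not the method presented in the talk: the cell, the weights (`W_H`, `W_X`, `W_OUT`), the `threshold` parameter and all function names are hypothetical, chosen only to keep the example self-contained.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(h, x, w_h, w_x):
    """Toy cell (an assumption): each state unit is tanh(w_h * h + w_x * x)."""
    return [math.tanh(wh * hi + wx * x) for hi, wh, wx in zip(h, w_h, w_x)]


def classify_with_early_exit(sequence, w_h, w_x, w_out, threshold=0.9):
    """Run the RNN step by step; stop once some class probability >= threshold.

    Returns (predicted_label, steps_used, confidence).
    """
    h = [0.0] * len(w_h)
    probs = softmax([0.0] * len(w_out))  # uniform before any input is seen
    steps = 0
    for steps, x in enumerate(sequence, start=1):
        h = rnn_step(h, x, w_h, w_x)
        scores = [sum(w * hi for w, hi in zip(row, h)) for row in w_out]
        probs = softmax(scores)
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining time steps
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, steps, max(probs)


# Hypothetical hand-picked weights for a 2-unit state and 2 classes.
W_H = [0.5, 0.5]
W_X = [1.0, -1.0]
W_OUT = [[4.0, -4.0], [-4.0, 4.0]]
```

With these toy weights, a strongly positive input sequence triggers an exit after the first step, while an all-zero sequence never reaches the threshold and is processed to the end. On a battery-powered device, the skipped steps translate directly into saved computation.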

Day 3 – 13-June

10:00 AM – 11:15 AM: Applications: Interactive Cane, and other Time-series tasks

In this talk, we will discuss our experience with the interactive cane and other time-series tasks. The interactive cane is an input device for people with visual impairment to do quick tasks on their smart phones. On the hardware side, it consists of a small pod with a microcontroller and IMU sensor on the cane. We trained and deployed a model that predicts gestures (from a pre-defined set) on the cane and communicates the user intent, as detected through gesture, to the smart phone. We will share our experiences in collecting training data, developing the features, and user-studies of the prototype. Separately, we will share our experience working on small audio tasks on Cortex A series chips.

2:00 PM – 5:00 PM: Presentation from the groups on their plans for the workshop (40 minutes each)

Day 4 – 14-June

10:00 AM – 11:15 AM: Anish Arora, Ohio State University – Machine Learning from Limited Signals at Mote-Scale

This talk describes the growing role of machine learning in battery-powered wireless sensor networks being deployed in rural and urban areas for environmental and wildlife protection. Specifically, we will discuss radar-based HornNet, deployed in a South African rhino reservation for anti-poaching surveillance, and microphone-based SONYC, deployed in New York City for sound complaint mitigation. Characteristic of these applications are classification and counting tasks with signals that are limited in the sense of having a low signal-to-noise ratio (SNR), with significant spatio-temporal variation across different background clutters, rejection of clutter which may or may not be accurate, and potential for interference. We will present how our mote-scale implementations have dealt and are dealing with these challenges.

2:00 PM – 3:15 PM: Nagarajan Natarajan, Microsoft Research – Predictive Maintenance in Manufacturing: A Case-Study

Ensuring uptime of machines in manufacturing plants and predicting faults early on in the production pipelines are crucial problems that have a direct impact on productivity and in turn on the revenue of the manufacturing industry. Typically, several sensors are deployed in the factories that help monitor the status of the production, as well as enable fine computerized control of the production machines. The goal in predictive maintenance is to enable the plant to schedule maintenance of equipment (and of maintenance engineers) conveniently ahead of time, predict imminent faults (production defects or machine downtimes), and thereby ensure throughput of production pipelines. In this talk, I’ll present lessons, techniques and results from the case-study we did with Magna Corp, a leading manufacturer of automobile parts: the nature of and issues with data arising in a typical industrial predictive maintenance setting, the type of solutions required (edge vs cloud), challenges in formulating the predictive maintenance problem as a machine learning (anomaly detection, in particular) problem, and our solutions for the same.

Week of 18-22 June

Day 7 – 19-June

10:00 AM – 11:15 AM: Bharadwaj Amrutur – Chairman, Robert Bosch Centre for Cyber Physical Systems (IISc Bangalore)

A case for an open City Data Exchange and Edge Analytics Stack:

The Indian government is attempting an ambitious, aspirational move to “smarten” India’s cities. We do a deep dive analysis of a recent RFP for the same, which has been put out by the city of Agra. From this we extract the contours of a generic but foundational core, a data exchange and edge analytics layer, which we believe should be developed as an open stack. We delve into some details of how such a stack might look, with some examples from an ongoing reference implementation, and point out some interesting technology/research challenges.

Bio: Bharadwaj Amrutur’s recent research is in the area of large scale IoT systems, especially to support AI based autonomous systems. His prior work was in the area of low power VLSI. He is an alumnus of IIT Bombay and Stanford and is currently at IISc Bangalore, as a Professor in the ECE department and Chair of the Robert Bosch Centre for Cyber Physical Systems.

Day 8 – 20-June

10:00 AM – 11:15 AM: Tanuja Ganu, Co-Founder & CTO, DataGlen Technologies

Bits and Joules: Empowering energy consumers through IoT & AI

The Energy vertical is going through a paradigm shift from centralised conventional generation and distributed consumption to non-conventional distributed generation (with renewables and battery technologies) and distributed consumption. This change is posing challenges and driving opportunities and innovations using new age digital technologies such as IoT and machine learning. This talk will describe various open problems and ongoing work using edge analytics and AI in the energy vertical, including decentralised demand response, predictive maintenance for various equipment, renewable energy forecasting and resource optimization.

Bio: Tanuja Ganu is the Co-Founder and CTO of DataGlen Technologies Private Limited, an early-stage startup that focuses on achieving decarbonization and rapid adoption of distributed and renewable energy resources using IoT & Machine Learning technologies. DataGlen was one of the top twelve global energy start-ups selected for the FreeElectrons accelerator program and was also selected to participate in the Cisco Launchpad program.
In the past, she has worked as a Research Engineer at IBM Research, India. At IBM, her work was focused on developing low cost innovative solutions to address energy shortage and peak demand problems that are applicable to developing as well as developed countries. With research interests in machine learning, embedded analytics and data driven optimisation, she has authored more than 20 scientific publications and has 4 granted US patents.
Tanuja was recognized as one of MIT Technology Review’s Innovators Under 35 (MIT TR 35) in 2014. She has also served on the judges committee for MIT TR35 2015, 2016 and 2017. She has won the IBM Eminence and Excellence award and the IBM first invention plateau award. Her work has been covered by top technical media (IEEE Spectrum, MIT Technology Review, IBM Research blog and Innovation 26X26: 26 innovations by 26 IBM women). Recently she was also invited as a guest speaker for the Cisco Women Rock IT TV series, the ACM N2Women Event and the Cisco SecCon 2017 security conference.
Prior to joining IBM Research, she led the SharePoint Center-of-Excellence team at Tata Consultancy Services Ltd. She pursued her Master’s in Computer Science at the Indian Institute of Science (IISc), Bangalore and her Bachelor’s in Computer Science at Walchand College of Engineering, Sangli, India.

Day 9 – 21-June

10:00 AM – 11:15 AM: Sriram Rajamani, Managing Director, Microsoft Research India

Overview of Microsoft Research with specific focus on the research and other initiatives of the lab in India.

 

Week of 02-06 July

Day 18 – 04th July

3:00 PM – 04:15 PM: Vyas Sekar, Carnegie Mellon University

Rethinking IoT Analytics with Universal Monitoring 

Many IoT analytics tasks require accurate estimates of metrics for many applications such as heavy hitters, anomaly detection (e.g., entropy of source addresses), and security (e.g., DoS detection). Obtaining accurate estimates given  CPU,  memory, energy, and bandwidth constraints on IoT devices is a challenging problem. Existing approaches fall in one of two undesirable extremes: (1) low fidelity general purpose approaches such as sampling, or (2) high fidelity but complex sketching algorithms customized to specific application level metrics. Ideally, a solution should be both general (i.e., supports many applications) and provide accuracy comparable to custom algorithms.  In this talk, I will present our recent work on leveraging recent theoretical advances in the area of “universal sketching” to demonstrate that it is possible to achieve both generality and high accuracy.  Our solution called UnivMon uses an application-agnostic data plane monitoring primitive; different (and possibly unforeseen) estimation algorithms run in the control plane, and use the statistics from the data plane to compute application-level metrics. I will describe our experiences  in using this for network-flow monitoring and highlight interesting directions for future research in the IoT analytics domain. 

 I will also provide a brief overview of: (1) a new project effort called CONIX (conix.io) that aims to provide a new middle tier of distributed computing that tightly couples the cloud and edge by pushing increased levels of autonomy and intelligence into the network and (2) interesting applications of machine learning to IoT security  and privacy.

Bio: Vyas Sekar is the Angel Jordan Early Career Chair Associate Professor in the ECE Department at Carnegie Mellon University, with a courtesy appointment in the Computer Science Department. His research is in the area of networking, security, and systems and spans network appliances or middleboxes, network management, network security, Internet video, and datacenter networks. Vyas received a B.Tech from the Indian Institute of Technology, Madras, where he was awarded the President of India Gold Medal, and a Ph.D from Carnegie Mellon University. He is the recipient of the NSF CAREER award and the ACM SIGCOMM Rising Star Award. His work has received best paper awards at ACM SIGCOMM, ACM CoNEXT, and ACM Multimedia, as well as the NSA Science of Security prize and the CSAW Applied Security Research Prize.

 

Day 20 – 06th July

11:00 AM to 01:30 PM : Final presentations from the groups

 
