Microsoft Research Blog

Artificial intelligence

  1. Temporally Correlated Task Scheduling for Sequence Learning 

    July 17, 2021

    Sequence learning has attracted much research attention from the machine learning community in recent years. In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks, which are different in terms of how much input information to use or which…

  2. TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL 

    July 17, 2021 | Clément Romac, Rémy Portelas, Katja Hofmann, and Pierre-Yves Oudeyer

    Training autonomous agents able to generalize to multiple tasks is a key target of Deep Reinforcement Learning (DRL) research. In parallel to improving DRL algorithms themselves, Automatic Curriculum Learning (ACL) studies how teacher algorithms can train DRL agents more efficiently by adapting task selection to…

  3. Online Learning with Optimism and Delay 

    July 17, 2021

    Inspired by the demands of real-time climate and weather forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms -- DORM, DORM+, and AdaHedgeD -- arise from a novel reduction of delayed online…

  4. Accuracy, Interpretability, and Differential Privacy via Explainable Boosting 

    July 17, 2021

    We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. Our experiments on multiple classification and regression datasets show that DP-EBM models suffer surprisingly little accuracy loss even with strong…

  5. Acceleration via Fractal Learning Rate Schedules 

    July 17, 2021 | Naman Agarwal, Surbhi Goel, and Cyril Zhang

    In practical applications of iterative first-order optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune. We demonstrate the presence of these subtleties even in the innocuous case when the objective is a convex quadratic. We reinterpret an iterative algorithm from…

  6. Neural Pharmacodynamic State Space Modeling 

    July 17, 2021 | Zeshan Hussain, Rahul Gopal Krishnan, and David Sontag

    Modeling the time-series of high-dimensional, longitudinal data is important for predicting patient disease progression. However, existing neural network based approaches that learn representations of patient state, while very flexible, are susceptible to overfitting. We propose a deep generative model that makes use of a novel…

  7. Characterizing Fairness Over the Set of Good Models Under Selective Labels 

    July 17, 2021 | Amanda Coston, Ashesh Rambachan, and Alex Chouldechova

    Algorithmic risk assessments are used to inform decisions in a wide variety of high-stakes settings. Often multiple predictive models deliver similar overall performance but differ markedly in their predictions for individual cases, an empirical phenomenon known as the "Rashomon Effect." These models may have different…

  8. Quantum algorithms for reinforcement learning with a generative model 

    July 17, 2021

    Reinforcement learning studies how an agent should interact with an environment to maximize its cumulative reward. A standard way to study this question abstractly is to ask how many samples an agent needs from the environment to learn an optimal policy for a γ-discounted Markov decision…

  9. Interactive Learning from Activity Description 

    July 17, 2021

    We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities. Our protocol gives rise to a new family of interactive learning algorithms that offer complementary advantages over traditional algorithms like imitation learning (IL) and reinforcement learning (RL). We…

  10. Interaction-Grounded Learning 

    July 17, 2021 | Tengyang Xie, John Langford, Paul Mineiro, and Ida Momennejad

    Consider a prosthetic arm, learning to adapt to its user's control signals. We propose Interaction-Grounded Learning for this novel setting, in which a learner's goal is to interact with the environment with no grounding or explicit reward to optimize its policies. Such a problem evades…
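The γ-discounted objective mentioned in the quantum reinforcement learning abstract above (item 8) can be illustrated with a minimal sketch. This is a generic, illustrative computation of a discounted return, not code from the paper; the function name and parameters are our own.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over a finite reward trajectory.

    In a gamma-discounted MDP, an optimal policy maximizes the
    expected value of this quantity over trajectories.
    """
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# A constant reward of 1 approaches 1 / (1 - gamma) as the horizon grows,
# which is why gamma controls the effective planning horizon.
print(discounted_return([1.0] * 1000, gamma=0.9))
```

For γ = 0.9 the geometric series bounds the return of a unit-reward trajectory by 1 / (1 − γ) = 10, which is the sense in which sample-complexity bounds for such MDPs typically scale with powers of 1 / (1 − γ).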