Oral Session 1

Small, n=me, Data – Consider a new kind of cloud-based app that would create a picture of an individual’s behavior over time by continuously, securely, and privately analyzing the digital traces they generate 24×7 by virtue of the fact that they mediate, or at least accompany, their lives with mobile and other digital technologies. The social networks, search engines, mobile operators, online games, and e-commerce sites that they access every hour of nearly every day use these digital traces extensively to tailor service offerings, to improve system performance, and in some cases to target advertisements. Most of these services do not make these individual traces available to the person who generated them, but they might begin to do so if we identify the market, technical, and social mechanisms that would derive value from these traces. Our premise is that this broad but highly personalized data set can be analyzed to draw powerful inferences about an individual, and for that individual. Use of these traces could enhance, and even transform, our experiences as consumers, patients, passengers, customers, and family members, as well as users of online media. These traces might fuel apps that offer individuals personalized, data-driven insights into their habits and habitats. But for this to be realized, the raw data sources will require extensive processing in order to generate an actionable representation of someone’s relevant behaviors, e.g., a personalized “behavioral pulse”.

Scalable Influence Estimation in Continuous-Time Diffusion Networks – If a piece of information is released from a media site, can it spread, in 1 month, to a million web pages? This influence estimation problem is very challenging since both the time-sensitive nature of the problem and the issue of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with $|\mathcal{V}|$ nodes and $|\mathcal{E}|$ edges to an accuracy of $\epsilon$ using $n = O(1/\epsilon^2)$ randomizations and, up to logarithmic factors, $O(n|\mathcal{E}| + n|\mathcal{V}|)$ computations. When used as a subroutine in a greedy influence maximization algorithm, our proposed method is guaranteed to find a set of nodes with an influence of at least $(1 - 1/e)\,\mathrm{OPT} - 2\epsilon$, where OPT is the optimal value. Experiments on both synthetic and real-world data show that the proposed method can easily scale up to networks of millions of nodes while significantly improving over the previous state of the art in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence.
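To make the quantity being estimated concrete, here is a minimal sketch of a naive Monte Carlo baseline. It is not the randomized algorithm proposed in the abstract. It assumes, purely for illustration, that each edge has an independent exponentially distributed transmission delay, and it defines the influence of a source as the expected number of nodes reached within a time horizon. The names `sample_reach` and `estimate_influence` and the adjacency-list layout are hypothetical.

```python
import heapq
import math
import random


def sample_reach(graph, source, horizon, rng):
    """Draw one set of edge delays and count nodes reached within `horizon`.

    `graph` maps a node to a list of (neighbor, rate) pairs. Each delay is
    sampled lazily from Exp(rate) when its tail node is finalized, which is
    equivalent to sampling every edge up front because Dijkstra's algorithm
    relaxes each edge at most once.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        t, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, rate in graph.get(u, []):
            if v in done:
                continue
            arrival = t + rng.expovariate(rate)
            # Keep only arrivals inside the time window.
            if arrival <= horizon and arrival < best.get(v, math.inf):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return len(done)


def estimate_influence(graph, source, horizon, n_samples=1000, seed=0):
    """Average the reach over `n_samples` independent delay samples."""
    rng = random.Random(seed)
    total = sum(sample_reach(graph, source, horizon, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    chain = {"a": [("b", 1.0)], "b": [("c", 1.0)]}
    print(estimate_influence(chain, "a", horizon=1.0))
```

Each sample here costs one shortest-path computation per source node, so estimating the influence of every node this way repeats that work for each node in the network. That repeated cost is what motivates a scalable estimator such as the one the abstract describes.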

Speaker Details

Deborah Estrin is a Professor of Computer Science at Cornell Tech in New York City (http://tech.cornell.edu/deborah-estrin) and a Professor of Public Health at Weill Cornell Medical College. She is a co-founder of the non-profit startup Open mHealth (http://openmhealth.org/). She was previously on the faculty at UCLA and was Founding Director of the NSF Center for Embedded Networked Sensing (CENS). Estrin is a pioneer in networked sensing, which uses mobile and wireless systems to collect and analyze real-time data about the physical world and the people who occupy it. Estrin’s current focus is on mobile health (mHealth), leveraging the programmability, proximity, and pervasiveness of mobile devices and the cloud for health management. She is an elected member of the American Academy of Arts and Sciences and the National Academy of Engineering. She recently presented at TEDMED about small data: https://smalldata.tech.cornell.edu/

Deborah Estrin and Nan Du
Cornell NYC Tech, Georgia Tech
Jeff Running