Emre Kiciman

Senior Researcher

About

I am a senior researcher at Microsoft Research in the information and data sciences group.

I’m broadly interested in using social data to help people find what they want and need. To this end, my work aims to extract from social media useful models of how people behave in the world: their actions, their interactions with each other, and the consequences of their decisions. My research addresses three themes:

Foundations and infrastructure for better social media analysis: I’m building tools and frameworks to make it easier and faster for people to deeply analyze and explore social media.
Connecting social media to the real world: To interpret social media, I work on entity linking and study the systematic biases inherent in social media’s reflection of the world.
Social systems engineering: I’m studying how the affordances and incentives provided by social systems affect the kinds of information we find in social media.

My previous research interests include JavaScript application monitoring and optimization, as well as improving the reliability of Internet service architectures and operations. I received my Ph.D. and my M.S. from Stanford University, and my B.S. in Electrical Engineering and Computer Science from U.C. Berkeley.

Projects

DSoAP – Distributed Social Analytics Platform

Established: June 1, 2015

The Distributed Social Analytics Platform (DSoAP) project focuses on the “Huge Data” problem in social policy research, which is caused by the breadth of data involved. Using aggregate social media data to investigate and validate social issues (such as employment, health and fiscal policy) requires analyzing many months or years of data. DSoAP applies intelligent compaction, pre-indexing and distribution of data across a server cluster to achieve responsive query times for online data exploration.

Discussion Graph Tool

Established: April 25, 2014

Discussion Graph Tool (DGT) is an easy-to-use analysis tool that provides a domain-specific language for extracting co-occurrence relationships from social media, and that automates the tasks of tracking the context of relationships and applying other best practices. DGT provides a single-machine implementation, and also generates map-reduce-like programs for distributed, scalable analyses. DGT simplifies social media analysis by making it easy to extract high-level features and co-occurrence relationships from raw data. With just 3-4 simple lines of script, you…

Online and Social Media Data as a Flawed Continuous Panel Survey

Established: April 9, 2014

If search and Twitter data were treated as a survey, they would follow a very peculiar methodology: participation is a time-varying, demographically biased sample of the population; participants are effectively continuously answering different “survey” questions; and, finally, participants can choose how often they answer a question. In response, we show alternative methods for interpreting and using online and social media data fruitfully. There is a large body of…

Doloto

Established: February 7, 2008

Doloto stands for Download Time Optimizer and is also the Russian word for chisel.

Ajax View

Established: April 30, 2007

Ajax View enables developers to see and control the behaviors of their web applications on users' desktops.

News, April 29, 2009: The technology in Ajax View is now available as a Power Tool: Microsoft Visual Studio AJAX Profiling Extensions. This power tool includes a server-side extension to IIS to add profiling code to your JavaScript web applications, and a Visual Studio add-in to investigate this data with Visual Studio's Performance Explorer.

Projects

Social Computing

Date: May 22, 2014

Speakers: Barbara Poblete, Emre Kiciman, and Fernando Diaz

Affiliation: University of Chile, Microsoft

Downloads

Longitudinal Tweet ID dataset for a selection of Health, Social, and Business Experiences

April 2017

This data set consists of the tweet IDs collected for the propensity-score analysis of longitudinal social media messages posted by people who mention specific health, social and business domains. This data set accompanies the paper, “Distilling the Outcomes of Personal Experiences: A Propensity-scored Analysis of Social Media.”

Election 2012 Tweet ID dataset

January 2016

Discussion Graph Tool

June 2014

Social Web Experience

September 2009

Ajax View JavaScript Instrumentation Proxy

July 2007
