Learning causality from textual data


July 5, 2011


Kira Radinsky


Technion & MSR


Developing systems that can emulate human reasoning has been a long-standing quest of artificial intelligence. Two fundamental capabilities of such intelligent behavior are the ability to understand causality and the ability to predict. These are essential for many artificial intelligence tasks that rely on human common-sense reasoning, such as decision making, planning, question answering, and inferring user intentions and responses.

Much of the causal knowledge that helps humans understand the world is recorded in texts that express people’s beliefs and intuitions. The World Wide Web encapsulates much of this human knowledge through news archives and encyclopedias. This knowledge can serve as the basis for performing true human-like prediction, with the ability to learn, understand language, and possess intuitions and general world knowledge. In this talk I will present Pundit, a learning system that, given an event represented in natural language, predicts a possible future event it can cause. To train it, we constructed a semantically structured causality graph of 30 million fact nodes connected by more than one billion edges, based on a news archive spanning 150 years that was crawled from the web. We devised a machine learning algorithm that infers causality based on this graph. Using common-sense ontologies, it generalizes the events it observes and is thus able to reason about completely new events. We empirically evaluate our system on news from 2010 and compare its predictions to human predictions. The results indicate that our system predicts similarly to the way humans do.
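
The idea of generalizing observed cause-effect pairs through an ontology can be illustrated with a toy sketch. This is not Pundit's actual algorithm or data: the ontology entries, the event tuples, and the function names (generalize, build_causality_graph, predict) are all hypothetical, and simple co-occurrence counting stands in for the learning step.

```python
from collections import defaultdict

# Hypothetical toy ontology: each term maps to a more general parent concept.
IS_A = {
    "earthquake": "natural_disaster",
    "flood": "natural_disaster",
    "tsunami": "natural_disaster",
    "turkey": "country",
    "peru": "country",
    "japan": "country",
}

def generalize(event):
    """Replace each term of an event tuple with its parent concept, if it has one."""
    return tuple(IS_A.get(term, term) for term in event)

def build_causality_graph(pairs):
    """Count how often each generalized cause is followed by each generalized effect."""
    graph = defaultdict(lambda: defaultdict(int))
    for cause, effect in pairs:
        graph[generalize(cause)][generalize(effect)] += 1
    return graph

def predict(graph, event):
    """Return the most frequent generalized effect for the event, or None if unseen."""
    effects = graph.get(generalize(event))
    if not effects:
        return None
    return max(effects, key=effects.get)

# Made-up (cause, effect) pairs standing in for events extracted from news text.
training_pairs = [
    (("earthquake", "turkey"), ("aid", "sent", "turkey")),
    (("flood", "peru"), ("aid", "sent", "peru")),
    (("earthquake", "peru"), ("election", "postponed", "peru")),
]
graph = build_causality_graph(training_pairs)

# ("tsunami", "japan") never appears in training, but it generalizes to the
# same abstract cause as the observed events, so a prediction is still possible.
prediction = predict(graph, ("tsunami", "japan"))
print(prediction)
```

Because both the new event and the training events map to the same abstract cause, the sketch can propose an effect for an event it has never seen, which is the kind of generalization the abstract describes.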


Kira Radinsky

Kira Radinsky is a researcher at Microsoft innovation labs and a PhD student at the Technion, focusing on temporal machine learning and its applications to predictions on the web. She has won several prestigious prizes (e.g., the Google Anita Borg prize for young women researchers), filed over 10 patents, and serves as a reviewer for major AI conferences, including KDD, ICAPS, SIGIR, WWW and AAAI. She has 10 years of varied industry experience: developing large-scale computer security infrastructures, contributing to open-source development, co-founding and serving as the CTO of a CMS startup, and developing semantic recommendation systems at Webshakes. For the last few years she has been conducting research and leading prototype innovation at Microsoft.