NBLyzer

A static analysis framework for data science notebooks

Notebooks provide an interactive environment in which programmers can develop code, analyse data and interleave visualisations. Despite this flexibility, a major pitfall that data scientists encounter is unexpected behaviour caused by the out-of-order execution model of notebooks. As a result, data scientists face challenges ranging from notebook correctness to reproducibility and cleaning. In this paper, we propose a framework that performs static analysis on notebooks, incorporating their unique execution semantics. Our framework is general in the sense that it accommodates a wide range of analyses, useful for various notebook use cases.
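To make the out-of-order pitfall concrete, here is a minimal sketch of one kind of analysis such a framework could host: finding cells that may hold stale results after another cell is re-executed. It is not NBLyzer's implementation. The helper names `defs_and_uses` and `stale_cells` are hypothetical, cells are assumed to be plain Python source strings, and the analysis is a deliberately coarse, flow-insensitive approximation built on the standard `ast` module.

```python
import ast


def defs_and_uses(cell_source):
    """Return (names assigned, names read) for one cell's source code."""
    defs, uses = set(), set()
    for node in ast.walk(ast.parse(cell_source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                uses.add(node.id)
    return defs, uses


def stale_cells(cells, changed_index):
    """Indices of cells that may be stale after cells[changed_index] is re-run.

    A cell is flagged if it reads a name defined by the changed cell, or by
    any cell already flagged (so staleness propagates transitively).
    """
    info = [defs_and_uses(src) for src in cells]
    tainted = set(info[changed_index][0])
    stale = set()
    changed = True
    while changed:  # iterate to a fixpoint, since cell order is not execution order
        changed = False
        for i, (defs, uses) in enumerate(info):
            if i == changed_index or i in stale:
                continue
            if uses & tainted:
                stale.add(i)
                tainted |= defs
                changed = True
    return sorted(stale)


notebook = ["x = 1", "y = x + 1", "print(y)", "z = 10"]
print(stale_cells(notebook, 0))
```

In this example, re-running the first cell flags the second and third cells, because `y` is derived from `x` and the `print` call reads `y`; the fourth cell is independent and is left alone. A real analysis would need to handle attributes, mutation through method calls, imports and function bodies, all of which this sketch ignores.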

People

Pavle Subotić

Snr. Research Software Engineer

Jana Kovacević

Software Engineer

Microsoft