{"id":189914,"date":"2013-09-18T00:00:00","date_gmt":"2013-09-20T15:42:53","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/collaborative-large-scale-data-analytics-and-visualization-with-python\/"},"modified":"2016-08-02T06:11:38","modified_gmt":"2016-08-02T13:11:38","slug":"collaborative-large-scale-data-analytics-and-visualization-with-python","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/collaborative-large-scale-data-analytics-and-visualization-with-python\/","title":{"rendered":"Collaborative, Large-Scale Data Analytics and Visualization with Python"},"content":{"rendered":"<div class=\"asset-content\">\n<p>NumPy and, more recently, Pandas have made Python ubiquitous for scientific computing and data analytics. The Python technical stack works very well for a wide variety of problems that fit in a single address space (the RAM of a single computer). For problems that require larger data sets, current approaches use memory-mapped files, MPI, IPython parallel, and\/or a standard map-reduce system such as Disco (or Hadoop). These techniques typically complicate the software solution significantly compared with the simple array-oriented (or table-oriented) expression that makes NumPy (and Pandas) so powerful and popular. They can also cause significant data movement throughout the memory hierarchy, which is the common bottleneck in data-centric computing today. Blaze is an array \/ table library for Python that can be used to manage and manipulate very large, disjoint data sets in an array-oriented fashion. It is built on a C++ library (dynd) that provides dynamic, multi-dimensional arrays with flexible data types. It also leverages Numba, an array-oriented Python compiler that translates a subset of Python syntax to LLVM IR and optimized machine code. In this talk I will discuss the design and roadmap of Blaze and Numba. 
I will also give an overview and example of Bokeh, which lets Python developers easily produce interactive, web-based visualizations, and then an overview of Wakari, which provides easy access to executable IPython notebooks in the cloud.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>NumPy and, more recently, Pandas have made Python ubiquitous for scientific computing and data analytics. The Python technical stack works very well for a wide variety of problems that fit in a single address space (the RAM of a single computer). For problems that require larger data sets, current approaches use memory-mapped files, MPI, IPython [&hellip;]<\/p>\n","protected":false},"featured_media":197890,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[206954],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-189914","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-video-type-microsoft-research-talks","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/6rlHo2XHfRM\/","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/189914","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/189914\/revisions"}],"wp:featuredmedia":[{"embeddable":tr
ue,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/197890"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=189914"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=189914"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=189914"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=189914"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=189914"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=189914"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=189914"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=189914"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=189914"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=189914"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}