eScience Workshop 2014

About

Photo: ESA; ENVISAT image of southeastern Brazil


eScience refers to the increasingly pervasive role of computation in modern scientific research. Today’s research problems are highly complex and involve massive amounts of data. The challenges of dealing with that data extend throughout a project’s lifecycle, including data acquisition, storage and retrieval, visualization, analysis, and modelling.

The 2014 Microsoft eScience Workshop was held in conjunction with the 10th IEEE International Conference on e-Science in Guarujá, Brazil. The workshop focused on presentations of the various technologies that can help scientists accomplish research projects more quickly and efficiently. In addition to investigating various general areas of computation that are valuable to scientific projects, we also presented case studies that demonstrate how scientists are already using these approaches in the field.

Our goal for this workshop was to explore how technologies can assist researchers throughout the various steps of the research lifecycle, turning data into knowledge.

About the workshop

Each year, the eScience Workshop provides a forum for scientists and researchers to share their experiences and expertise with the academic and research communities. The eScience Workshop fosters collaboration, facilitates the sharing of software components and techniques, and defines rich, open scientific challenges. Microsoft has been actively pursuing research in eScience for more than 11 years; the book The Fourth Paradigm: Data-Intensive Scientific Discovery provides a background on its many areas of focus.

Agenda

Held in conjunction with the IEEE International Conference on e-Science

Monday, October 20: Microsoft Azure Training

Time | Session | Room
8:30–10:00

Microsoft Azure for Research Training – AM

Trainer: Mateus Velloso, Microsoft

Princesa Isabel

10:00–10:30

Coffee Break

10:30–12:30

Microsoft Azure for Research Training – AM

Trainer: Mateus Velloso, Microsoft

Princesa Isabel

12:30–14:00

Lunch

14:00–15:30

Microsoft Azure for Research Training – PM

Trainer: Mateus Velloso, Microsoft

Princesa Isabel

15:30–16:00

Coffee Break

16:00–18:00

Microsoft Azure for Research Training – PM

Trainer: Mateus Velloso, Microsoft

Princesa Isabel

18:00–20:00

Poster Session

 

Tuesday, October 21

Time | Session | Room
8:30–10:00

Welcome

2014 Jim Gray eScience Award Announcement

Chair: Harold Javid, Microsoft Research

Opening Keynote

Princesa Leopoldina & José Bonifácio

10:00–10:30

Coffee Break

10:30–12:30

Data Acquisition

Chair: Chris Mentzel, Gordon and Betty Moore Foundation

Presentations:

Data Analysis in Social Sensing: Perspectives and Opportunities

Antonio A. F. Loureiro, Universidade Federal de Minas Gerais (UFMG)

Building sensing applications with the Owl Platform

Yanyong Zhang, Rutgers University

SensorFly and Beyond: Knowledge Discovery through Ambient Sensing

Pei Zhang, Carnegie Mellon University

Teresa Cristina

12:30–14:00

Lunch

14:00–15:30

Panel: The Strategic Importance of eScience

Moderator: Harold Javid, Microsoft Research

Panelists:

  • Carlos Henrique de Brito Cruz, São Paulo Research Foundation (FAPESP)
  • Jason Rhody, National Endowment for the Humanities
  • Chris Mentzel, Gordon and Betty Moore Foundation

Princesa Leopoldina & José Bonifácio

15:30–16:00

Coffee Break

16:00–18:00

Microsoft Research-FAPESP Joint Research Center Projects

Chair: Daron Green, Microsoft Research

Presentations:

Making Sense of Environmental Data in a Cloud Forest

Antonio A. F. Loureiro, Universidade Federal de Minas Gerais (UFMG)

E-phenology: combining new technologies to monitor plant phenology from leaves to ecosystems

Patricia Morellato, UNESP São Paulo State University at Rio Claro

Advances in Computer Science Towards an Understanding of Tipping Points within Tropical South American Biomes

Ricardo da Silva Torres, University of Campinas

Teresa Cristina

18:00–19:00

Break

19:00–21:00

IEEE Reception Sponsored by Microsoft

DemoFest with Poster Session

Demos:

Large Scale Study of Urban Societies in Near Real Time

Thiago H. Silva, Universidade Federal de Minas Gerais (UFMG)

SandDance

Daron Green, Microsoft Research

Holograph

Dave Brown, Microsoft Research

Cortana

Juliana Salles, Microsoft Research

WorldWide Telescope/Oculus

Jessika Gebauer, Microsoft Research

Tempe: Quick Answers from Large Data

Danyel Fisher, Microsoft Research

CodaLab

Harold Javid, Microsoft Research

Microsoft Cloud Services for Machine Learning

Mateus Velloso, Microsoft

Live Ocean

Rob Fatland, Microsoft Research

Cloud Forest

Anna Izabel Tostes, Universidade Federal de Minas Gerais (UFMG)

Princesa Leopoldina & José Bonifácio

Wednesday, October 22

Time | Session | Room
8:30–10:00

Panel: Going Native

Moderator: Daron Green, Microsoft Research

Panelists:

  • Paul Watson, Newcastle University UK
  • Antony John Williams, Royal Society of Chemistry
  • Steve Kelling, Cornell Lab of Ornithology

Princesa Leopoldina & José Bonifácio

10:00–10:30

Coffee Break

10:30–12:30

Joint Microsoft-IEEE e-Science Keynote

Chair: Claudia Medeiros, University of Campinas

Leveraging Computational (e)Social Science to address Grand Societal Challenges

Noshir Contractor, Northwestern University

Princesa Leopoldina & José Bonifácio

12:30–14:00

Lunch

14:00–15:30

eScience and Environment

Chair: Rob Fatland, Microsoft Research

Presentations:

Bringing the Cloud Down to the Water: Towards Enabling a “Dynamic Information Framework” for Environmental Resource Decisions

Jeffrey Richey, University of Washington and Visiting Professor, University of São Paulo

Estimating the carbon stocks by optimizing LiDAR forest big data

Rosiane de Freitas Rodrigues, Federal University of Amazonas (UFAM)

The Birder Effect: data driven science for biodiversity conservation

Steve Kelling, Cornell Lab of Ornithology

Teresa Cristina

15:30–16:00

Coffee Break

16:00–18:00

Data Visualization

Chair: Danyel Fisher, Microsoft Research

Presentations:

Exploratory Visualization for Big Data

Danyel Fisher, Microsoft Research

“Touching” the Third Dimension—Exploration of Scientific Data on Surfaces

Tobias Isenberg, INRIA

Interactive Network Visualization

Benjamin Bach, INRIA

Teresa Cristina

20:00

Dinner

Princesa Leopoldina & José Bonifácio

Speakers and Abstracts

Anna Izabel Tostes, Universidade Federal de Minas Gerais (UFMG)

Cloud Forest

Abstract: Tropical ecosystems are major contributors to the global environment, as they control significant exchanges of energy, water, and other resources between the atmosphere, land surfaces, and belowground. Cloud forests in particular, in addition to their significant biodiversity, play a key role in the regional water cycle of urban areas and typically occur in areas of high population density. We want to understand how key cloud forest processes are affected by changes in land use and climatic variation, temporally and spatially. During this internship, we designed an ontology of a cloud forest in order to understand how micro-climatic variability impacts ecosystem processes, using data streaming from sensors. We use two services: the Thing Registration Service (TRS), a registration and indexing service (in effect, a DNS for things), and the Observation System (OS), which indexes data from sensors and produces inferences. We then visualize sap flow, vapor pressure deficit, soil moisture, and fog in WorldWide Telescope.
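The division of labor between the two services can be pictured with a toy sketch; the class names, methods, and sensor names below are illustrative stand-ins, not the project’s actual API:

```python
from collections import defaultdict

class ThingRegistry:
    """Toy 'DNS for things': maps a sensor name to its metadata."""
    def __init__(self):
        self._things = {}

    def register(self, name, **metadata):
        self._things[name] = metadata

    def lookup(self, name):
        return self._things[name]  # KeyError for unregistered things

class ObservationSystem:
    """Indexes observations per registered sensor and produces a simple
    inference: the running mean of each stream."""
    def __init__(self, registry):
        self.registry = registry
        self._streams = defaultdict(list)

    def observe(self, name, value):
        self.registry.lookup(name)  # only accept registered things
        self._streams[name].append(value)

    def mean(self, name):
        values = self._streams[name]
        return sum(values) / len(values)

registry = ThingRegistry()
registry.register("sap-flow-01", kind="sap flow", unit="g/h")
obs = ObservationSystem(registry)
for reading in (10.0, 12.0, 14.0):
    obs.observe("sap-flow-01", reading)
print(obs.mean("sap-flow-01"))  # 12.0
```

The point of the split is that registration (what a sensor is) and observation indexing (what it reports) can evolve independently, with the inference layer consuming only registered streams.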

Bio: Anna Izabel João Tostes Ribeiro is a fourth-year PhD student at the Federal University of Minas Gerais. She is supervised by Antonio Loureiro on the topic “Context-aware solutions for traffic congestion using Bing maps data.” From June 30 to October 10, she was an intern at Microsoft Research in Redmond, WA, United States. She won Google’s Brazil Women in Technology Award (2008) before starting her master’s degree. Her research interests are traffic congestion, vehicular networks, big data, and data visualization.

Antonio A. F. Loureiro, Universidade Federal de Minas Gerais (UFMG)

Data Analysis in Social Sensing: Perspectives and Opportunities

Abstract: The ubiquitous availability of computing technology such as smartphones, tablets, and other easily portable devices, and the worldwide adoption of social networking sites, make it increasingly possible for one to be connected and continuously contribute to this massively distributed information publishing process. In this scenario, people act as social sensors, voluntarily providing data that capture their daily life experiences, and offering diverse observations on both the physical world (e.g., location) and the online world (e.g., events). This large amount of social data can provide new forms of valuable information that are currently not available, at this scale, by any traditional data collection methods. In this talk, we will discuss some perspectives on social sensing and some interesting research opportunities.

Making Sense of Environmental Data in a Cloud Forest

Abstract: With the backdrop of concerns over loss of habitat and biodiversity, and disruption to carbon and water cycles and related environmental impacts, it is important to gather fine-grained information on intact versus disturbed mountain forest systems to learn how land management practices can safeguard the functioning of the whole sequence of vegetation types, land forms, and land uses which begin in the delicate upper slopes of mountains. In this context, two important research questions are: 1) How does the mixture of functional traits in plant communities combine to influence the exchange of carbon and water between the biosphere and atmosphere at multiple scales? and 2) How will tropical plants and ecosystems respond to climate change, and what are the effects of these responses on ecosystem functioning? In this talk, we discuss how technology can help us answer those questions.

Bio: Antonio A. F. Loureiro received his B.Sc. and M.Sc. degrees in Computer Science from the Federal University of Minas Gerais (UFMG), Brazil, and his Ph.D. in Computer Science from the University of British Columbia, Canada, in 1995. Currently, he is a full professor of Computer Science at UFMG, where he leads the research group in ubiquitous computing, wireless sensor networks, and embedded systems. In the last 15 years, he has published extensively in international conferences and journals related to those areas, and has also presented keynotes and tutorials at international conferences.

Antony John Williams, Royal Society of Chemistry

Panel: Going Native

Abstract: The topic of discussion in this panel will be “Going Native”—a reference to a quote from Jim Gray along the lines of “…in order to really understand the computing needs of a scientist you have to go native.” Jim himself did this, immersing himself in astronomy to build what would become the WorldWide Telescope. Bridging the gap between experimental scientists and the computing that underpins their discoveries is an ongoing challenge for eScience. The panelists will explore what it means to go native, give examples of where they have seen this work well, and share lessons learned from working in this way.

Bio: Antony Williams is the VP of Strategic Development for the Royal Society of Chemistry and manager of the cheminformatics team for the RSC. His scientific expertise is presently focused in the fields of chemical structure representation, analytical data management and prediction, and computer-assisted structure elucidation. His passion for integrated data management and a vision of aggregating chemical compound data on the internet initiated a hobby project to develop the ChemSpider database, acquired by the Royal Society of Chemistry and now providing access to more than 30 million chemicals online. He is widely published, with more than 150 publications and book chapters, and is known as the ChemConnector on social networks. He has worked on the quality of chemistry content on Wikipedia, is a recipient of the Jim Gray eScience Award from Microsoft, and is particularly focused at this time on helping scientists understand the power of the web for social networking in the sciences.

Benjamin Bach, INRIA

Interactive Network Visualization

Abstract: This talk presents an overview of interactive visualisations for complex networks. Networks are used to model a wide range of phenomena, from computer networks to similarities between genes, brain activity, and social interactions between individuals (social networks) or organizations. Yet, making sense of these complex networks requires more than modeling and statistics. Network visualization has progressed dramatically in recent years and provides novel and effective ways to make sense of complex networks through effective visual encodings and interactions.

This talk will present an overview of important advances in visualising complex networks, with a special focus on networks that change over time. Based on the use case of analysing functional brain activity, we demonstrate techniques from our own research. However, these techniques are not limited to brain connectivity but can be used to visualise other dense networks with changing connection strengths. The talk concludes with an outlook on our ongoing research as well as future challenges and applications.

Bio: Benjamin Bach is a post-doctoral research fellow in information visualisation, currently working on a joint project between Microsoft Research and Inria, France. His research addresses the design and evaluation of interactive visualisations for temporal data and complex networks, with a strong focus on networks changing over time. His current interdisciplinary collaborations involve helping brain scientists analysing functional brain activity, as well as historians exploring historic social networks. Benjamin received his MS from the University of Technology, Dresden, Germany, and his PhD from the University of Paris Sud, France.

Carlos Henrique de Brito Cruz, São Paulo Research Foundation (FAPESP)

Panel: The Strategic Importance of eScience

Abstract: Today’s society faces very complex issues such as climate change and global warming, food production on limited areas for an increasing number of consumers, the functioning of densely populated urban areas, and the cure of diseases that affect large numbers of the population. In order to understand and cope with these issues, we need multidisciplinary teams of researchers capturing vast amounts of data with new instruments on a 24/7 basis and developing new techniques to transform it into knowledge and actionable recommendations.

The goal of this panel is to discuss the complex challenges we face and why they require data intensive research, the necessary changes in scientific education, as well as training and practices to broaden eScience.

Bio: Carlos Henrique de Brito Cruz graduated in Electrical Engineering (Inst. Tecn. de Aeronáutica, ITA, 1978) and holds an MSc and a DSc in Physics (1980 and 1983, Physics Institute, University of Campinas, Unicamp). He was a researcher at the Quantum Optics Laboratory of the University of Rome (1981), a resident visitor at AT&T Bell Laboratories in Holmdel, NJ (1986–87), and a visitor at Bell Labs, Murray Hill, NJ (1990). Brito Cruz served two terms as Director of the Physics Institute at Unicamp, and has been Dean of Research at Unicamp, President of the São Paulo Research Foundation, FAPESP (1996–2002), and Rector of Unicamp (2002–05). Since 2005, he has been the Scientific Director of FAPESP. Brito Cruz is a member of the Brazilian Academy of Sciences and a Fellow of the Royal Society of Chemistry.

Chris Mentzel, Gordon and Betty Moore Foundation

Panel: The Strategic Importance of eScience

Abstract: Today’s society faces very complex issues such as climate change and global warming, food production on limited areas for an increasing number of consumers, the functioning of densely populated urban areas, and the cure of diseases that affect large numbers of the population. In order to understand and cope with these issues, we need multidisciplinary teams of researchers capturing vast amounts of data with new instruments on a 24/7 basis and developing new techniques to transform it into knowledge and actionable recommendations.

The goal of this panel is to discuss the complex challenges we face and why they require data intensive research, the necessary changes in scientific education, as well as training and practices to broaden eScience.

Bio: Chris Mentzel leads the foundation’s Data-Driven Discovery Initiative, a $60 million effort within the Science Program to enable data scientists to turn the scientific data deluge into opportunities to address some of today’s most important research questions. Previously, Chris led the grants administration department and also worked as senior network engineer for the foundation. He has also held positions as a systems engineer and integrator at the University of California, Berkeley, and at various Internet consulting firms in the Bay Area. An active member of the broader big data and open science communities, Chris serves on a number of advisory boards and program committees and speaks frequently at conferences and workshops on topics related to data-driven research. Chris received a B.A. in mathematics from the University of California, Santa Cruz, and is currently pursuing an M.Sc. in management science and engineering at Stanford University.

Danyel Fisher, Microsoft Research

Exploratory Visualization for Big Data

Abstract: We’re increasingly living in a world of very big data—but in doing so, we’re losing out on the ability to flexibly explore that data. Interactive, exploratory visualization counts on rapid responsiveness—but our big clusters don’t provide that today. Adding more machines adds more communications overhead – and you can’t add computers as fast as the data is growing. The ability to ask new questions, quickly, is critical.

In this talk, I will give a broad overview of some of the research challenges in big data analysis. We’re going to do a pass across vast swathes of computer science—from visualization, to database research, to distributed systems—to figure out what the new challenges and opportunities in big data visualization are. We’ll look back to techniques developed in the distant past—the 1970s—when data was big and core memory was small, and the techniques that evolved with the big energy simulations, where thousands of cores work busily for days at a time. I’ll discuss my own research on progressive analysis of big data: for some types of problems, progressive computation can often let a data scientist get as much detail as they need in tractable time. I’ll talk about two different projects exploring progressive big data visualization.
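The progressive-computation idea can be sketched in a few lines (a simplified illustration, not Tempe or Trill itself): a query yields successively refined partial answers as chunks of data arrive, so an analyst can stop as soon as the estimate is good enough rather than wait for the full scan.

```python
import random

def progressive_mean(stream, chunk_size=1000):
    """Yield successively refined estimates of the mean as chunks arrive,
    so the caller can stop early once the answer is 'good enough'."""
    total, count = 0.0, 0
    chunk = []
    for x in stream:
        chunk.append(x)
        if len(chunk) == chunk_size:
            total += sum(chunk)
            count += len(chunk)
            chunk = []
            yield total / count   # partial, best-effort answer
    if chunk:
        total += sum(chunk)
        count += len(chunk)
        yield total / count       # exact answer once the stream is exhausted

random.seed(0)
data = [random.gauss(50, 10) for _ in range(10_000)]
estimates = list(progressive_mean(data, chunk_size=2000))
# The first estimate (20% of the data) is already close to the final one:
print(estimates[0], estimates[-1])
```

A real engine would additionally attach uncertainty bounds to each partial answer so the analyst knows when it is safe to stop.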

DemoFest: Tempe—Quick Answers from Large Data

Abstract: Tempe is an interactive system for exploring large data sets. It accelerates data science by facilitating quick, iterative feature engineering and data understanding. Tempe is based on Trill, a high-speed, temporal, progressive-relational stream-processing engine, and provides progressive queries that return “best effort” partial answers.

Tempe enables users to try and discard queries quickly, enabling much faster exploration of large data sets.

Bio: Danyel Fisher is a researcher in information visualization and human-computer interaction at Microsoft Research’s VIBE group. His research focuses on ways to help users interact with data more easily. His recent work has looked at ways to make big data analytics faster and more interactive with incremental visualization; his papers Trust Me, I’m Partially Right and Interactions with Big Data Analytics outline the research direction. Outside Microsoft, he has helped organize the “Industry and Government” track at the IEEE InfoVis Conference, bringing together practitioners and academics at the premier visualization conference. Danyel received his MS from UC Berkeley, and his PhD from UC Irvine.

Daron Green, Microsoft Research

DemoFest: SandDance

Abstract: SandDance is a browser-based information visualization system that scales to hundreds of thousands of items. Arbitrary data tables can be loaded, and results can be filtered using facets and search and displayed in a variety of layouts. Transitions between views are animated so that users can better maintain context. Multiple linked views allow for associations between the same items in each view, and multiple devices can simultaneously interact with the same dataset.

Bio: Dr. Green is the senior regional manager of Microsoft Research Outreach, responsible for global research investments. Previously he was general manager of Microsoft’s Technology Policy Group, responsible for identifying business opportunities and innovations likely to emerge from potentially disruptive technologies. In that role, he provided oversight for key mechanisms in Microsoft’s internal processes of innovation and ideation, such as ThinkWeek, and external efforts such as Microsoft’s Cloud Research Engagements and Microsoft’s Environmental Sustainability program. Prior to this, he was general manager for Microsoft Research’s external engagement and investment strategy, with a global portfolio that included diverse topics such as Health and Wellbeing, Education and Scholarly Communications, Computer Science, and the Environment. Dr. Green’s initial research background was in molecular modeling and equations of state for fluid mixtures—his BSc is in Chemical Physics (1989, Sheffield) and his PhD in molecular simulation of fluid mixtures (1992, Sheffield). He went on to do post-doctoral research in simulation of polymer and protein folding (1993–94, UCD). This led to application porting and optimization for large-scale parallel and distributed computing in a range of application domains, including computational chemistry (molecular dynamics and quantum mechanical codes), radiography, computational fluid dynamics, and finite element analysis.

Dr. Green then moved more fully into HPC and was responsible for some of Europe’s largest HPC Framework V programs for the European Commission and major HPC procurements in the UK for the UK Research Councils and UK defense clients; he also led detailed investigations into the maturity and adoption of European HPC software tools (published). From there, Dr. Green went to work for SGI/Cray, helping to set up the European Professional Services organization, from which he spun out a small team to establish the European Professional Services for Selectica Inc. Selectica specialized in online configuration/logic-engine technologies offered via web services. Given an HPC/distributed-computing background and familiarity with the then-embryonic area of web services, IBM invited Dr. Green to help establish its Grid Computing Strategy and emerging business opportunity (Grid EBO) team. He subsequently moved to British Telecom to head up its Global Services business incubation and, as part of this, in 2007 he established and launched BT’s Sustainability practice, responsible for BT’s business offerings to commercial customers that help reduce their carbon footprints and establish business practices that are sustainable in terms of their social and economic impact.

Dave Brown, Microsoft Research

DemoFest: Holograph

Abstract: A platform for visualizing and exploring spatial and temporal data using Natural User Interaction and 2D or 3D displays.

Bio: Dave Brown is a senior research software development engineer at Microsoft Research. His current project focus is interactive data visualization of complex data sets using natural user interaction and either 2D or 3D displays.

He has a patent for “Dial-based User Interfaces”.

Dave joined Microsoft in 2001. Prior to joining Microsoft Research, he worked for the Microsoft Technology Centre in the UK, working with customers to design and prototype innovative solutions using the latest Microsoft tools and technologies. He received his bachelor’s degree in Chemistry from the University of Oxford, and his PhD in Organic Chemistry from the University of Reading, UK.

Harold Javid, Microsoft Research

DemoFest: CodaLab

Abstract: CodaLab is an open-source platform that makes life easier for those conducting data- and computation-intensive experiments. Use existing algorithms and datasets or upload your own (any format, any language). All experiments you do are reproducible and sharable with others. These experiments can then be easily copied, re-worked, and edited by other collaborators in order to advance the state of the art in data-driven research and machine learning.

CodaLab also allows communities to create competitions focused on specific tasks; the results can then be provided back to the community as experiments for further development.

Bio: Harold Javid’s career spans industry and academia. After completing a PhD in EE from UIUC, Harold worked for small companies as electronics division manager and general manager, developing real-time embedded controls and industrial optimizers. In between, he worked in large companies, including GE and Boeing, as an application engineer, researcher, and research manager. In 1998, after turning around a small company and then supporting its sale, he followed his heart back to his technical love by joining Microsoft. In Microsoft Research, as director of academic outreach, he leads collaborations between Microsoft Research and universities in North America, Latin America, and Australia. Harold’s team is responsible for events such as the Microsoft Research Faculty Summit and the annual Microsoft eScience Workshop, and for awards programs such as the Microsoft Research Faculty Fellowship, in addition to funded university collaborations. Harold is actively involved in service to the IEEE as a member of the Industry Advisory Board for the Computer Society and assistant treasurer of its Board of Governors.

Jason Rhody, National Endowment for the Humanities

Panel: The Strategic Importance of eScience

Abstract: Today’s society faces very complex issues such as climate change and global warming, food production on limited areas for an increasing number of consumers, the functioning of densely populated urban areas, and the cure of diseases that affect large numbers of the population. In order to understand and cope with these issues, we need multidisciplinary teams of researchers capturing vast amounts of data with new instruments on a 24/7 basis and developing new techniques to transform it into knowledge and actionable recommendations.

The goal of this panel is to discuss the complex challenges we face and why they require data intensive research, the necessary changes in scientific education, as well as training and practices to broaden eScience.

Bio: Jason Rhody is a senior program officer in the Office of Digital Humanities (ODH) at the National Endowment for the Humanities (NEH), where he facilitates the development and funding of projects that harness emerging technologies to advance humanities research, encourage humanistic inquiry of digital culture, and foster collaboration across international and disciplinary boundaries. He has developed joint grant programs with international partners, such as the United Kingdom and Germany, and continues to cultivate shared initiatives with other funding organizations. Jason received his PhD in English from the University of Maryland, and his scholarly research interests include book and interface design in 20th and 21st century literature, narrative theory, and game studies. Prior to joining NEH in 2003, he managed and advised digital humanities projects at the Maryland Institute for Technology in the Humanities (MITH) and taught courses in literature and digital media.

Jeffrey Richey, Professor, University of Washington and Visiting Professor, University of São Paulo

Bringing the Cloud Down to the Water: Towards Enabling a “Dynamic Information Framework” for Environmental Resource Decisions

Abstract: The goal of a “dynamic information framework” (DIF) is to provide a foundation of tools that enable scenario analyses for decisions on the environmental resources of (discrete) regions. Such a framework requires time-series data sets in state-of-the-art models that can be used by staff in national agencies to analyze the resource base and develop predictive scenarios of, for example, climate and landscape changes with appropriate interventions. The application of modern “landscape/hydrology” models of river basins represents a powerful tool for the analysis of coupled landscape properties, water resources, and future change scenarios. But actually doing this involves addressing a series of cyber/technical issues intertwined with (geo)political ones, requiring that domain scientists work within the e-science arena to learn the tools necessary to make the process viable. Projects from Bhutan to the Aral Sea to Espírito Santo call out what is needed to move forward.

Bio: Professor in the School of Oceanography and adjunct professor, Department of Civil and Environmental Engineering, University of Washington, visiting professor and São Paulo Excellence Chair, Universidade de São Paulo. B.A. from Stanford University, MSPH from the University of North Carolina, and PhD from the University of California, Davis. Research involves the biogeochemistry and hydrology of large-scale river basins, how to implement geo-information systems for analysis of complex basins, and “dynamic information frameworks” for international resource management, primarily with the World Bank.

Honors and appointments include the Medalha Ademar Cervellini de Merito Academico, University of São Paulo; the Zayed International Prize for the Environment (Millennium Ecosystem Assessment); membership in the Academia Brasileira de Ciências (Brazilian National Academy of Sciences); Fellow of the American Geophysical Union; the FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) São Paulo Excellence Chair and University of São Paulo Visiting Professorship (2013–); the World Bank’s Hydrology Expert Facility; and Vice-Chairman of IGBP Land-Ocean Interactions in the Coastal Zone.

Jessika Gebauer, Microsoft Research

DemoFest: WorldWide Telescope/Oculus

Abstract: Explore space, ocular neurons of the brain, and San Francisco, all in virtual reality powered by WorldWide Telescope and the Oculus Rift device.

Juliana Salles, Microsoft Research

DemoFest: Cortana

Abstract: Cortana is the world’s first truly personal assistant. Cortana gets to know you, always looks out for you, and keeps you close to the people who matter, all while keeping you in control and being natural, interactive, and easy to use.

Bio: Dr. Juliana Salles is a senior research program manager at Microsoft Research, responsible for academic research partnerships in Brazil. Her primary work is building collaborative projects between Microsoft and academia to better understand tropical ecosystems and their response to climate change. Prior to joining Microsoft Research, Dr. Salles worked for several Microsoft product teams, including Visual Studio, Windows Live, and Windows Live Mobile, as a user experience researcher. Dr. Salles holds a Ph.D. in Human-Computer Interaction, plus a bachelor’s and a master’s in Computer Science. Her interests include user research techniques and methodology and their integration with the software development process.

Mateus Velloso, Microsoft

DemoFest: MAML

Abstract: Microsoft Azure Machine Learning is a browser-based tool in which tasks are graphically represented, providing a flexible and easy-to-use environment for problem solving. Learn how to execute tasks such as importing datasets, transforming data, training, scoring, and evaluating models, visualizing results, and publishing a web service.
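The same import → transform → train → score → evaluate workflow can be sketched outside Azure ML; the toy dataset and the plain-Python least-squares fit below are illustrative stand-ins, not the Azure ML tooling:

```python
# 1. Import a (toy) dataset: hours studied -> exam score
data = [(1, 52), (2, 55), (3, 61), (4, 64), (5, 70)]

# 2. Transform: split into features and labels
xs = [x for x, _ in data]
ys = [y for _, y in data]

# 3. Train: ordinary least squares for y = a * x + b
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx

# 4. Score: predict for a new input
def predict(x):
    return a * x + b

# 5. Evaluate: mean absolute error over the dataset
mae = sum(abs(predict(x) - y) for x, y in data) / n
print(round(a, 2), round(b, 2), round(mae, 2))  # 4.5 46.9 0.72
```

In Azure ML each numbered step is a draggable module on the canvas, and the final pipeline can additionally be published as a web service.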

Bio: Mat (Mateus) is originally from Brazil and has 30 years of software development experience: some in Brazil, some in New Zealand, where he spent six years, and the rest in Redmond, where he worked as an architect in Microsoft IT and now works as a senior developer in a group called TED (Technical Evangelism Development).

Patricia Morellato, UNESP São Paulo State University at Rio Claro

E-phenology: combining new technologies to monitor plant phenology from leaves to ecosystems

Abstract: E-phenology is a multidisciplinary project exploring innovative solutions for plant monitoring in the tropics, combining research in Computer Science, Phenology, Remote Sensing, and Ecology. Phenological observations are a key component of climate change studies, tracking the effects of climate on plant phases such as flowering and leafing. Here we address theoretical and practical problems using a combination of digital and hyperspectral imaging phenology monitoring systems, at three spatial scales: on-the-ground, near-surface, and airborne, the latter using the emerging technology of the Unmanned Aerial System (UAS), or “drone”. On-the-ground phenology precludes observation over large areas and is time-consuming. Near-surface remote phenology using digital cameras, although area-limited, reduces sampling labor. Drones scale phenological processes up to the entire landscape, encompassing multiple scales of observation. We intend to specify and implement novel database, image processing, machine learning, and visualization algorithms to support the acquisition, management, integration, and analysis of data from multiscale phenological monitoring systems.

Bio: Dr. Morellato’s main research focus is the phenology and seasonal changes of natural vegetation. She has investigated the patterns of plant reproduction, pollination, and seed dispersal, the influence of phylogeny on phenology, and methods in phenological research. More recently, Dr. Morellato’s research has focused on the effects of environmental and climatic changes on plant phenology. The e-phenology research group has applied new technologies and computer science tools to monitor plant phenology from leaves to ecosystems, scaling from ground observations to digital cameras and hyperspectral sensors on towers and on airborne unmanned vehicles (drones). She participated as a contributing author in Working Group II (WGII) of the fourth IPCC report (2007); the IPCC was awarded the 2007 Nobel Peace Prize.

Paul Watson, Newcastle University UK

Panel: Going Native

Abstract: The topic of discussion in this panel will be “Going Native”— a reference to a quote from Jim Gray along the lines of “…in order to really understand the computing needs of a scientist you have to go native.” Jim himself did this, immersing himself in astronomy to build what would become the WorldWide Telescope. Bridging the gap between experimental scientists and the computing that underpins their discoveries is an ongoing challenge for eScience. The panel will explore what it means to go native, give examples of where they have seen this work well, and share the lessons learned from working in this way.

Bio: Paul Watson is professor of Computer Science and Director of the Digital Institute at Newcastle University UK, where he also directs the $20M RCUK Digital Economy Hub on Social Inclusion through the Digital Economy. He graduated in 1983 with a BSc in Computer Engineering from Manchester University, followed by a PhD on parallel graph reduction in 1986. In the 1980s, as a lecturer at Manchester University, he was a designer of the Alvey Flagship and Esprit EDS systems. From 1990 to 1995 he worked in industry for ICL as a designer of the Goldrush MegaServer parallel database server.

In August 1995 he moved to Newcastle University, where he has been an investigator on a wide range of e-Science projects. His research interest is in scalable information management, with a current focus on cloud computing. Professor Watson is a Chartered Engineer and a Fellow of the British Computer Society.

Pei Zhang, Carnegie Mellon University

SensorFly and Beyond: Knowledge Discovery through Ambient Sensing

Abstract: In many fast-developing applications (fires, search, and situational awareness), deploying, maintaining, and operating the system becomes difficult and often dangerous. Especially in indoor environments, responders have traditionally relied on robotic systems that are often expensive and difficult to maneuver. The talk will explore semi-controllable sensing systems through the SensorFly system. SensorFly is a low-cost, miniature, controlled-mobile aerial sensor network that aims to be autonomous in deployment, maintenance, and adaptation to the environment. Weighing only 30 grams, each node can carry only a few lightweight, relatively inaccurate sensors. The main focus of the talk will be the challenge of utilizing multiple noisy sensors to discover system information (such as navigation and localization) as well as to sense the environment (such as localizing and sensing humans).

Bio: Pei Zhang is an associate research professor in the ECE department at Carnegie Mellon University. He received his bachelor’s degree with honors from the California Institute of Technology in 2002, and his Ph.D. degree in Electrical Engineering from Princeton University in 2008. While at Princeton University, he developed the ZebraNet system, which is used to track zebras in Kenya and was the first deployed wireless ad-hoc mobile sensor network. His recent work includes SensorFly (groups of autonomous miniature-helicopter-based sensor nodes) and MARS (Muscle Activity Recognition). Beyond research publications, his work has been featured in popular media including CNN, Science Channel, Discovery Channel, CBS News, CNET, Popular Science, BBC Focus, etc. He is also a cofounder of the startup Vibradotech. In addition, he has won several awards, including the NSF CAREER award and the Edith and Martin B. Stein Solar Energy Innovation Award, and he is a member of the Department of Defense Computer Science Studies Panel.

Ricardo da Silva Torres, University of Campinas

Advances in Computer Science Towards an Understanding of Tipping Points within Tropical South American Biomes

Abstract: Terrestrial ecosystems are currently undergoing unprecedented climate and human-induced disturbances, which are likely to push these systems towards changes in their physiognomies, structure, and functioning. It has been hypothesized that these new configurations may be alternative states of systems comprising vegetation-climate-disturbance interactions. The majority of studies reporting ecosystem switches consider vegetation-climate-disturbance systems confined to certain spatial scales (local to continental) without accounting for multi-scale interactions, and they are unable to detect out-of-range changes and/or regime shifts in vegetation because of the difficulty of collecting time series long enough to define the standard behavior of the system. In this talk, we will present ongoing research initiatives that propose novel machine learning and image processing techniques to support the use of multi-scale ecological knowledge in the analysis of vegetation-climate-disturbance systems.

Bio: Ricardo da Silva Torres received a B.Sc. in Computer Engineering from the University of Campinas, Brazil, in 2000, and earned his doctorate in Computer Science at the same university in 2004. He has been director of the Institute of Computing, University of Campinas, since 2013, and is a co-founder and member of the RECOD lab. Dr. Torres is author or co-author of more than 100 articles in refereed journals and conferences, serves as a PC member for several international and national conferences, and has supervised 27 master's and 7 PhD projects. His research interests include Image Analysis, Content-Based Image Retrieval, Databases, Digital Libraries, and Geographic Information Systems.

Rob Fatland, Microsoft Research

DemoFest: Live Ocean

Science to Marine Industry Forecast

Bio: Rob Fatland is a research software development engineer at Microsoft Research. From a background in geophysics and a career built on computer technology, he works on environmental data science and real-world relevance of scientific results; from carbon cycle coupling to marine microbial ecology to predictive modeling that can enable us to restore health to coastal oceans.

Rosiane de Freitas Rodrigues, Federal University of Amazonas

Estimating the carbon stocks by optimizing LiDAR forest big data

Abstract: To estimate carbon stocks in a given forested region, it is important that samples or field plots are placed in locations that have the greatest number of representative trees: dominant (widest) and emergent (tallest). Whereas ground-based inventory focuses on the identification of dominant trees, LiDAR (Light Detection And Ranging) allows for the extraction of tree height, thus enabling more precise identification of emergent trees. We are interested in estimating carbon stocks by means of extrapolation and spatialization based on forest inventory, using remote-sensing LiDAR technology to determine a set of representative trees through the application of pattern recognition, graph theory, image retrieval, machine learning, and combinatorial optimization techniques. We present preliminary results on the problem of choosing the most representative forest plots using the NP-hard Maximal Covering Location Problem (MCLP). This work is being undertaken in partnership among IComp/UFAM (Institute of Computing of the Federal University of Amazonas, Brazil), INPA (the Brazilian National Institute of Amazonian Research), and the University of California, Berkeley, United States.
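The abstract does not describe the project's actual MCLP formulation or solver, but the flavor of the plot-selection problem can be illustrated with the standard greedy heuristic for maximal coverage. Everything below (the plot names, tree IDs, and the `greedy_mclp` helper) is a hypothetical sketch, not the project's method:

```python
# Illustrative greedy heuristic for the Maximal Covering Location Problem:
# pick k plots so that the number of distinct trees covered is (approximately)
# maximized. The greedy choice at each step is the plot covering the most
# not-yet-covered trees.

def greedy_mclp(coverage, k):
    """coverage: dict mapping plot id -> set of tree ids it covers."""
    covered = set()
    chosen = []
    remaining = dict(coverage)
    for _ in range(k):
        # Plot with the largest marginal gain in newly covered trees.
        best = max(remaining, key=lambda p: len(remaining[p] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break  # no plot adds coverage; stop early
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

# Hypothetical candidate plots and the tree IDs each would cover.
coverage = {
    "plot_A": {1, 2, 3},
    "plot_B": {3, 4},
    "plot_C": {5, 6, 7, 8},
    "plot_D": {1, 5},
}
chosen, covered = greedy_mclp(coverage, k=2)
# chosen -> ["plot_C", "plot_A"], covering 7 of the 8 trees
```

The greedy heuristic carries a well-known (1 − 1/e) approximation guarantee for maximal coverage; exact MCLP instances are typically solved with integer programming.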

Bio: Rosiane de Freitas is a computer scientist and researcher-professor at the Institute of Computing of the Federal University of Amazonas (IComp/UFAM), Brazil, with a PhD in Computer Science and Systems Engineering from the Federal University of Rio de Janeiro (UFRJ) and UNICAMP, Brazil. She develops theoretical and applied research, with expertise in combinatorial optimization and scheduling theory, working on algorithms, computational complexity, graph theory, and mathematical programming, with applications in bioinformatics, parallel and distributed systems, networks, software engineering, and operations research in general. She partners with renowned researchers and institutions around the world, trains and coordinates teams for scientific and technological programming contests, and teaches in undergraduate and MSc/PhD graduate courses. She also works to advance the careers and goals of women in STEM, and has been involved in the “International Women’s Hackathon”, sponsored by Microsoft Research, since its first edition. She is a member of major Brazilian and international scientific societies, a reviewer for qualified journals, and currently a guest editor for a special issue of the Discrete Applied Mathematics journal.

Steve Kelling, Cornell Lab of Ornithology

Panel: Going Native

Abstract: The topic of discussion in this panel will be “Going Native”— a reference to a quote from Jim Gray along the lines of “…in order to really understand the computing needs of a scientist you have to go native.” Jim himself did this, immersing himself in astronomy to build what would become the WorldWide Telescope. Bridging the gap between experimental scientists and the computing that underpins their discoveries is an ongoing challenge for eScience. The panel will explore what it means to go native, give examples of where they have seen this work well, and share the lessons learned from working in this way.

The Birder Effect: Data Driven Science for Biodiversity Conservation

Abstract: Technology has transformed biodiversity conservation; it has enabled a new research scenario in which organisms can be studied across broad spatial and temporal scales in high detail. This talk will describe how technology has supported biodiversity conservation in four broad ways. First, numerous organizations are collecting more, and higher quality, earth and organismal observational data. Second, well-curated data access makes these data more transparent and usable at a much higher frequency. Third, novel data-mining and machine learning techniques identify patterns that emerge from the data, making data exploration and visualization a significant part of the scientific process that leads to hypothesis generation and testing. Finally, sophisticated analytic processes substantially improve data-driven decision-making. This talk reviews how one project, eBird, a global bird-monitoring project, has taken advantage of these advances in technology to interpret and conserve biodiversity through the collection, access, visualization, and analytics of bird observations.

Bio: Steve Kelling coordinates a team of ornithologists, computer scientists, statisticians, application developers, data managers and project staff to develop programs, tools, and analyses to gather, understand, and disseminate information on birds and the environments they inhabit. His responsibilities include: the management of eBird, a citizen-science project that gathers hundreds of millions of bird observations from around the globe; using unique statistical and computer science strategies to analyze the distribution and abundance of wild bird populations; and the organization of the rich data resources of the global bird-monitoring community and integrating these resources within existing bioinformatic infrastructures.

Thiago H. Silva, Universidade Federal de Minas Gerais (UFMG)

DemoFest: Large Scale Study of Urban Societies in Near Real Time

Abstract: Making massively distributed data available through smartphones and social networking sites represents a new source of sensing, called a participatory sensor network (PSN). Our project aims to show how to use PSNs to better understand urban societies and, building on such understanding, design smarter services to meet people’s needs.

Bio: Thiago H. Silva graduated with a B.Sc. in Computer Science in 2004. He obtained a M.Sc. (2009) and a Ph.D. (2014) in Computer Science from the Federal University of Minas Gerais (UFMG), where he is a post-doc researcher in Computer Science. During his Ph.D., Thiago was a research intern at Telecom Italia, Venezia, Italy, and a visiting Ph.D. student at the University of Birmingham, Birmingham, UK, and at INRIA, Paris, France. Thiago has experience in the industry and academia in the areas of ubiquitous computing, urban computing, social computing, and workload/user behavior modeling.

Tobias Isenberg, INRIA

“Touching” the Third Dimension—Exploration of Scientific Data on Surfaces

Abstract: Since the size and complexity of scientific datasets are growing at a very high rate, people are working on developing techniques to depict and visualize them effectively. However, it is frequently not sufficient to produce a single static visualization; instead, we have to support scientists in discovering aspects of the data that they did not know about. That means we have to develop effective interactive visualization tools that support scientists in exploring their data.

In my talk I will address the problem of interactively visualizing data that has an inherent mapping to the 3D spatial domain such as MRI scans, physical simulations, or molecular models. Specifically, I use interfaces on large, touch-sensitive displays because they tend to give people the feeling of “being in control of their data.” That means we face the problem of providing input on a two-dimensional surface which needs to be mapped to manipulations of the three-dimensional data space.

I will talk about FI3D, a technique to navigate in 3D datasets and control 7 degrees of freedom with only one or two fingers used simultaneously. Next, I will discuss the problem of spatial data selection, which is fundamental to further data analysis and also requires defining a 3D selection space using only input on a 2D plane.

Finally, I discuss a case study in which we integrated several different interaction techniques into a tool for fluid mechanics experts to explore their data. I will end my talk by pointing out some open problems and research challenges that we are currently facing.

Bio: Tobias Isenberg is a senior research scientist at INRIA Saclay, France. He received his doctoral degree from the University of Magdeburg, Germany. Previously he held positions as a post-doctoral fellow at the University of Calgary, Canada, and as an assistant professor at the University of Groningen, the Netherlands. His research interests comprise topics in scientific visualization, illustrative and non-photorealistic rendering, and interactive visualization techniques.

Yanyong Zhang, Rutgers University

Building sensing applications with the Owl Platform

Abstract: Recent technology advances have greatly lowered the cost of small sensors, but their widespread use has yet to be realized. In this talk I describe how to build sensing applications using the Owl Platform, a software stack designed to lower the barriers to entry for developing sensing applications in the cloud, in contrast to bottom-up approaches centered on a particular hardware platform. I first describe Owl’s design abstractions, technology layers, and network protocols. I then present a few real-world deployment experiences with the Owl Platform, focusing on our deployment in an animal-sciences laboratory to collect data for the scientists.

Bio: Yanyong Zhang is currently an associate professor in the Electrical and Computer Engineering Department at Rutgers University. She is also a member of the Wireless Information Networks Laboratory (Winlab). Her current research interests are in sensor systems and distributed computing. Her research is mainly funded by the National Science Foundation, including an NSF CAREER award.