If you work in the analytics space, it’s a fact that the data you work with will grow. And grow, and grow. As the volume of data grows, so does the complexity of manipulating that data and obtaining real insights from it. Managing when, where, why, and how you move the data, and which tools you use to process and analyze it, is often a partner’s value proposition. Now, there’s a data integration service in the cloud that can make this straightforward. Azure Data Factory is our topic this month for the Data Platform and Advanced Analytics Partner call.
Microsoft partners often work in a learning, demo, or proof of concept environment, where it is fairly easy to move the subset of data you are working on to the right processing steps. Once you win the business and start working with larger data sets, processing gets more complex, and operationalizing it manually becomes unwieldy. You need a way to create, automate, update, monitor, and manage the process remotely. Most enterprises have thousands, or tens of thousands, of these processes running at any one time, and they want human intervention only when there is an exception. An email or text alert when something goes wrong in one of the processes means valuable employees can spend the rest of their time on higher-value tasks instead of watching jobs run.
The beauty of Azure Data Factory is in its simplicity of design, balanced with its ability to call almost any external service to work on remote data sets. Like a modern physical factory, Data Factory moves something through a defined series of steps, stopping along the way to complete a task at each one. That flow is called a pipeline, and the actions performed on the data are called activities. Examples of activities:
- Retrieving data and putting it into a data lake
- Transforming data (schema on read)
- Running a subset of data through a machine learning algorithm
- Reporting anomalies daily in a Power BI dashboard
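To make the pipeline and activity idea concrete, here is a minimal sketch in Python. It is not the Azure Data Factory SDK or its pipeline definition format; the `Pipeline` class, the three activity functions, the sample values, and the `on_failure` hook are all hypothetical names invented for illustration. It only shows the shape of the concept: data passes through an ordered list of activities, and a person is notified only when one of them fails.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional

# Hypothetical types for illustration only; this is NOT the Azure Data Factory API.
Activity = Callable[[Any], Any]


@dataclass
class Pipeline:
    """An ordered series of activities that data flows through."""
    name: str
    activities: List[Activity] = field(default_factory=list)
    # Called with (activity name, error) only when an activity raises.
    on_failure: Optional[Callable[[str, Exception], None]] = None

    def run(self, data: Any = None) -> Any:
        for activity in self.activities:
            try:
                # Each activity receives the previous activity's output.
                data = activity(data)
            except Exception as err:
                if self.on_failure is not None:
                    self.on_failure(activity.__name__, err)
                raise
        return data


def retrieve(_: Any) -> List[int]:
    # Stand-in for copying source data into a data lake (made-up readings).
    return [3, 5, 4, 40, 6]


def transform(rows: List[int]) -> List[dict]:
    # Stand-in for applying a schema to the raw values when they are read.
    return [{"value": v} for v in rows]


def flag_anomalies(records: List[dict]) -> List[dict]:
    # Stand-in for a scoring step; the threshold of 10 is an arbitrary assumption.
    return [r for r in records if r["value"] > 10]


daily = Pipeline("daily-anomaly-report", [retrieve, transform, flag_anomalies])
result = daily.run()
print(result)  # prints [{'value': 40}]
```

In a real deployment the alert hook would send the email or text message, and the service itself would handle scheduling and monitoring; the sketch leaves those parts out on purpose.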
The opportunity to apply these concepts in your customers’ businesses is significant. Azure Data Factory can increase your productivity when working with anything in the data space, and it can help you connect with customers, who will appreciate the simplicity of the solution and understand the value automation brings. By helping them implement data solutions that are resilient and valuable, and by earning their trust, you may be invited to deliver higher-value analytical services.
Learning recommendations for Azure Data Factory
Here are my recommendations for learning more about Azure Data Factory.
Orchestrating Big Data with Azure Data Factory course
This online course is offered on edX. It’s a MOOC (Massive Open Online Course), giving you the flexibility to complete the coursework as it fits your schedule. edX also offers the courses that lead to earning the Microsoft Professional Program Certificate in Data Science.
Introduction to Azure Data Factory Service documentation
The Microsoft Azure team recently published a comprehensive set of documentation that includes an overview of Azure Data Factory and a look at each of its features. Materials include text, tutorials, and videos.
Microsoft Virtual Academy: Orchestrating Data and Services with Azure Data Factory
This online course is a useful starting point. It comprises six short video modules that explain the basic capabilities of Azure Data Factory. You’ll learn how it can help with your big data and machine learning projects, get an overview of advanced analytics, and see how Azure Data Factory fits into the Cortana Intelligence Suite.
This site brings together Microsoft big data, analytics, and data science on-demand training, virtual training, in-person training, and information about data science certifications.
Community call about Azure Data Factory on Tuesday, February 21
Join my colleagues and me on the February 21 community call for an in-depth discussion and details about how to demo Azure Data Factory for your customers.