Welcome back to our Getting Started with Excel and Power BI series.
Today we will continue with our first Excel feature, Power Query, which allows you to discover, shape and transform your data before importing it into Excel or your Power Pivot data model. In Part I we covered mostly how to connect to different data sources and some quick transformations for your newly found data.
In this post we will cover:
To start this tutorial you'll need:
OK, let's get started. Open the provided Excel file at the end of this post. You can follow the getting started steps with the video or with the text on this post. Once you open the file, select Power Query from the ribbon. If you are on one of the imported tables, you will see the Workbook Queries menu on your right. If you don't, just click the Workbook Queries button on the Power Query ribbon.
First let's do an online search to import and edit its query so we can learn how to create and modify formulas:
1. Create and modify a simple formula
1. Select the Power Query tab and perform the following online search: "largest cities south america"
2. Hover over the public data results: 'Largest Cities..." and select edit, which will take you to the Query Editor
3. Click on the fx icon right next to the formula bar
4. Type the following formula = Table.AddColumn(Source, "Country Code", each Text.Start([Country], 3)) and click on the check symbol to the left of the formula bar - This formula will extract the first 3 characters of Country and place them in a new column
5. You can modify this formula directly in the formula bar or selecting in Applied Steps the step you want to edit and clicking the gears icon
2. Filter, sort and group data
Filtering on the Query Editor is very similar to Excel. For any given column you can click on the down arrow and start filtering or sorting right away. In our specific case:
1. First, let's try automatic filtering. Select the Country column, click on the down arrow, deselect (Select All) and choose Brazil, Chile and Ecuador.
2. When you click OK, you'll have the filtered list.
You can also use the same menu to sort your data (ascending or descending).
Now, let's try grouping by country:
3. Right click on the country column header and select "Group By"
4. You will see the Group menu. In this case, we'll just count how many records we have by country. You can also perform other grouping operation such as sum, average, min or max on a field (in this example we could do that with population):
4. Work with columns
You can perform almost any type of operation with columns on the Query Editor to shape and transform your data any way you like it. We did that on step 1 with a simple formula. Let's try other transformations available within Power Query.
1. First let's remove columns that we don't want to use. In this case, holding down the Ctrl key select the Image and Key fields. Then right click on one of them and select Remove Columns on the menu:
2. Now, let's create a Key with our country code and the city name. In Excel, this would be a text operation with the "&" symbol. In Power Query, this is the same. Click on add custom column on the column menu.
3. On the add custom column menu, write the following expression: [Name]&"-"&[Country Code]. We are joining the city name and the country code with a dash in the middle. Rename the column to "Key"
4. You will see the newly added column after you click OK and the function being showed in the formula bar. The Power Query Formula Language provides a lot of flexibility to shape and transform your data. You can find more information and references here. In any case, you can use the Query Editor to save these steps through the UI and Power Query will take care of the formulas for you.
5. Finally, you will notice that Power Query is saving all of these steps to be replicated on your data refresh or modified in the Applied Steps panel
5. Combine data
There are several ways to combine data in Power Query. We will take a look at 2 of them: Append and Merge
1. First, we will merge this query with another one that has the country population on it. The file provided already has another query (found with Online Search) with the population of countries called "Population by Country". Click on the Merge button on the Power Query menu.
2. On the merge menu, select each query and then click on the country field for both. This will be the the field that will define the merge so we can get country population on our original table. Also select "Only include matching rows" so we get population just for the list of countries we have. Before clicking OK Power Query already tells us that we have 40 matching records. Click OK
3. You will get a new Query Editor window with a new column. Click on the corner of the header for this new column and select the fields that you want to merge. In this case, we want population. Then click OK.
4. Now, we have a new table with all the information we had before, plus the population of the country for each record. If we click on close and load, we'll import this data into Excel.
5. Now, let's try to append data to our original table. There's another query in the file called "Largest Cities in Europe". This query already has the same structure as our original table after adding Country Code and Key, including column names. We achieved this using the Query Editor on a similar online search. You can review the steps editing this provided query. We'll append this data to our first table. Click on the Append button on the Power Query menu.
6. As before, select the 2 queries that we want to Append ("Largest Cities in South America" and "Largest Cities in Europe") and then click OK.
7. You will get a new Query Editor window with the appended queries.
6. Share queries
Finally, if you signed up for a Power BI trial, you have the option of sharing any of these queries with the rest of your organization. As long as the user has access to the underlying data (in this case public data) they will be able to search for the query with the Online Search feature within Excel and load the data from the query you shared. Let's share the first table we created: "Largest cities in South America"
1. First of all, you need to be signed in to your Power BI tenant. On the Power Query ribbon, click on the Sign In button
2. Enter your Power BI / Office 365 credentials and sign in
3. Now, let's make sure the name and description of our query is accurate so other users are able to find it. On the Workbook Queries menu, right click on the "Largest cities in South America" query and select edit
4. Once your are on the Query Editor, click on All Properties on the Query Setting menu on the right. The you will see the query name and description
5. Everything looks good, so let's share the query to the Data Catalog. Close the Query Editor and right click again on the query. Then select the Send to Data Catalog option
6. You will get another opportunity to edit the name and description. You can also add a documentation URL in case people want more info or request access to the data. Click on the Sharing menu. Here you can decide who will have access to query you're sharing. In this case we will go with Everyone in the enterprise. Click Send.
7. Now, your query is live on the Data Catalog and discoverable by anyone based on your description. Let's do an online search for it and filter by organizational data.
8. Finally, if you want to edit your shared queries you can click on the My Data Catalog Queries button on the Power Query ribbon. From this menu you can delete, load, update and even view statistics for your query.