One of the prime objectives of a data analyst is to create a flat table with all the required columns and fields from multiple tables. It enables us to do analysis, run codes, and apply suitable machine learning algorithms.
VLOOKUP (and now XLOOKUP) is easily among the most used Excel formula. And anyone who deals with relational databases such as SQL, MS Access, understands the significance of Joins. These tools and techniques help in creating flat files for the analysis.
It takes time to master these techniques as not everyone is savvy with formulas and writing SQL queries. Power Query offers a no-code solution in this regard, and it is easier to learn.
What are JOINS?
A JOIN is a means for combining fields from two tables by using values common to each.
Power Query allows the following types of JOINS:
The following image provides a high-level understanding of these different JOINS
The most used JOINS is Left Outer Join, which is also equivalent to the VLOOKUP in Excel.
Merging on one Column
Consider the following two tables as an example:
Objective 1: Add the name column in Table 1 from Table 2
First Table: Table 1
Second Table: Table 2
Common Column: EmpID
With traditional Excel Functions, we have used the following formula in Table 1:
Power Query offers a no-code solution and requires the following steps:
With Table 1 Selected > Home > Merge Queries
It brings the Merge Queries Option
Select the Table 2 and select the common field in both the table
Click Ok, and it adds one column in Table 1:
Click on the expand icon in the Column to present the list of all the columns from Table 2:
Select the desired columns (in this case Name) from the dropdown, and it adds a new column with Name in Table 1:
Updating Existing Merge
Making changes in the column selection from the second table is a straightforward process.
Click on the gear icon of the Expand Table and select/deselect required columns
Merging on Multiple Columns
Power Query enables merging based on the values of two or more columns in simple steps.
Objective: Add the score in Table 2 for February 2020
First Table: Table 2
Second Table: Table 1
Common Column: EmpID, Month
Achieving this with traditional Excel formulas can be a bit tricky.
There are two minor changes in the process explained above:
· The first table is Table 2, and the second table is Table 1
. With Table 2 selected, click on Merge Queries
· Select Month columns in both the table along with the Emp ID using CTRL key
Notice the order mentioned in the column headers. It takes the sequence of column selection in both the tables. For example, if you select Emp ID first and then Month, follow the same sequence of selection in the second table. If not, then we get the error:
The rest of the steps remains unchanged. Once done, we get the following results:
This technique stands true for merging on more than two columns as well.
The power of no-code Power Query
Related video: BI Simplified Webinar#1, p4: Merge Queries