Picture this: There is a folder of the year 2019. It has 12 files for each month of the year. All the files have the same structure, except the data points for the respective months. Every month we compile all the files in the folder to make a year-to-date file, and then go ahead with the reporting & analytics work. Sound familiar?
Those who are good with VBA macros can write a bunch of codes to automate the process of data compilation. But not all of us are comfortable with writing codes. So, we end up doing this work manually.
This article covers how we can automate the process of compiling multiple files in a folder using Power Query.
Power Query is a very powerful ETL (Extract, Transform & Load) tool. It comes with the unique capability of VBA macros and Excel formulas. And it is super easy to learn and implement.
Note: Steps covered in this article assumes the following:
The table structure is identical in all the files (Number of columns, name of the columns)
The name of the sheet is identical in all the files. For instance, all the 12 files in the folder should have an identical sheet name, say Sheet1, or Data(as in the current example).
We have 6 files in the folder:
Creating a connection
Data > Get Data > From File > From Folder
Navigate to the folder path and click Ok
It shows the metadata about the files present in the folder:
Click on the dropdown arrow on Combine > Combine & Transform Data
Selecting the Sample file for the template
It is the first step towards combining the files. In this step, Power Query is looking for a sample structure of the data table. Since the table structure in all the files is the same, we select the First File as the Sample File. We select the name of the sheet under Parameter.
Once we click Ok, Power Query compiles all the files in the folder where:
The sheet name in each file is Data
The output table has the same structure as in the First File.
We get the Power Query window with a query named Combine.
This table has data from all the 6 files in the folder:
Note: By default, Power Query only loads the first 1000 rows in the Power Query window. To see the entire data table, we load it to a table
Home > Close & Load To
Alternatively, you can also perform additional data transformations. Please refer to the following article for more details:
Five features you should know about Power Query
Close & Load To option gives us a choice to select how and where we want our output.
Since we need the output as an Excel table, select Table and Existing Worksheet.
Also, uncheck the option on Add this data to the Data Model. This option is for loading the data into Power Pivot. Check this option when we want to do some data modeling in Excel.
You may refer to the following article for details:
Data Modeling with Power Pivot
The Automated Process
As mentioned earlier, Power Query is an amalgamation of VBA macros and Excel functions. All the steps performed listed under the Applied Steps (Just like a macro recording)
When we reload the table by refreshing it, the Power Query perform all the listed steps again:
Hence, when a new file added in the folder, refresh the table to include the newly added file in the compiled table:
In conclusion, if the structure of the files is standard, then we take the benefit of this method.
Using this feature, we can significantly reduce our time and effort required in mundane tasks like copy-pasting.
In the next article, we will see a couple of scenarios of folder compilation where the file structure is not standard.