Combining multiple files in a Folder: Power Query

Picture this: There is a folder of the year 2019. It has 12 files for each month of the year. All the files have the same structure, except the data points for the respective months. Every month we compile all the files in the folder to make a year-to-date file, and then go ahead with the reporting & analytics work. Sound familiar?

Those who are good with VBA macros can write a bunch of codes to automate the process of data compilation. But not all of us are comfortable with writing codes. So, we end up doing this work manually.

This article covers how we can automate the process of compiling multiple files in a folder using Power Query.

Power Query is a very powerful ETL (Extract, Transform & Load) tool. It comes with the unique capability of VBA macros and Excel formulas. And it is super easy to learn and implement.

Note: Steps covered in this article assumes the following:

  • The table structure is identical in all the files (Number of columns, name of the columns)

  • The name of the sheet is identical in all the files. For instance, all the 12 files in the folder should have an identical sheet name, say Sheet1, or Data(as in the current example).

Data/Folder structure

We have 6 files in the folder:

Creating a connection

Data > Get Data > From File > From Folder

Navigate to the folder path and click Ok

It shows the metadata about the files present in the folder:

Click on the dropdown arrow on Combine > Combine & Transform Data

Selecting the Sample file for the template

It is the first step towards combining the files. In this step, Power Query is looking for a sample structure of the data table. Since the table structure in all the files is the same, we select the First File as the Sample File. We select the name of the sheet under Parameter.

Once we click Ok, Power Query compiles all the files in the folder where:

  • The sheet name in each file is Data

  • The output table has the same structure as in the First File.

The Output

We get the Power Query window with a query named Combine.

This table has data from all the 6 files in the folder:

Note: By default, Power Query only loads the first 1000 rows in the Power Query window. To see the entire data table, we load it to a table

Home > Close & Load To

Alternatively, you can also perform additional data transformations. Please refer to the following article for more details:

Five features you should know about Power Query

Close & Load To option gives us a choice to select how and where we want our output.

Since we need the output as an Excel table, select Table and Existing Worksheet.

Also, uncheck the option on Add this data to the Data Model. This option is for loading the data into Power Pivot. Check this option when we want to do some data modeling in Excel.

You may refer to the following article for details:

Data Modeling with Power Pivot

The Automated Process

As mentioned earlier, Power Query is an amalgamation of VBA macros and Excel functions. All the steps performed listed under the Applied Steps (Just like a macro recording)

Each step has certain sets of actions performed

When we reload the table by refreshing it, the Power Query perform all the listed steps again:

Hence, when a new file added in the folder, refresh the table to include the newly added file in the compiled table:

Refresh Table to re-run the compilation process

In conclusion, if the structure of the files is standard, then we take the benefit of this method.

Using this feature, we can significantly reduce our time and effort required in mundane tasks like copy-pasting.

In the next article, we will see a couple of scenarios of folder compilation where the file structure is not standard.


Drop Me a Line, Let Me Know What You Think

+91 9871-641-146

Join WhatsApp group: BI Simplified