Aim : This post is specifically for those geeks who wants to learn about SSIS Aggregate Transformation. Microsoft provides a vast list of Data Flow transformations in SSIS, and this (aggregate transformation) is one of the most used transformations in SSIS.
Description : Aggregate transformation is a blocked and Asynchronous transformation. With Asynchronous, I mean to say that Output rows will not be equal to the Input rows. Also, this is categorized as Blocked transformation because aggregation is performed on all the column values. So, it will not release any row until it processes all rows. For more information on Dataflow transformation categories and examples, visit this – Dataflow transformations in SSIS
It will work in the same way as the aggregate functions perform in the SQL server. Generally, we have different types of Aggregate functions (MIN, MAX, SUM, AVG, COUNT and COUNTDISTINCT). The aggregate transformation also provides GROUP BY clause, which you can use to specify groups to aggregate across.
Now we are going to learn the functionality of each function in detail. You can visit, Source – Microsoft’s website to learn more on this.
- SUM: Used as SUM(). This function Sums up the values present in a column. Only columns with numeric data types can be summed.
- AVG: Used as AVG(). This function Returns the average of all the column values present in a column. Only columns with numeric data types can be averaged.
- MAX: Used as MAX(). Returns the maximum value present in a group. In contrast to the Transact-SQL MAX function, this operation can be used only with numeric, date, and time data types.
- MIN: Used as MIN(). Returns the minimum value present in a group. In contrast to the Transact-SQL MIN function, this operation can be used only with numeric, date, and time data types.
- COUNT: Used as COUNT(). Returns the number of items present in a group.
- COUNT DISTINCT:Returns the number of unique nonnull values present in a group.
- GROUP BY: Used as GROUP BY(). Divides data sets into groups. Columns of any data type can be used for grouping.
The SSIS Aggregate transformation will handle null values in the same way as the SQL Server relational database engine handles. The behavior is defined in the SQL-92 standard. Following are the rules which are applied:
- In a GROUP BY clause, nulls are treated like other column values. If the grouping column contains more than one null value, the null values are put into a single group.
- In the COUNT (column name) and COUNT (DISTINCT column name) functions, nulls are ignored and the result excludes rows that contain null values in the named column.
- In the COUNT (*) function, all rows are counted, including rows with null values.
I am sure this information provided by Microsoft must have helped you to clear your basics. Let’s go ahead to my favorite session i.e. Step by Step Example to show working of Aggregate transformation in SSIS.
Step by Step example on SSIS Aggregate transformation
Step 1. SSIS Package Configuration.
- Create a SSIS package with name “Aggregate”.
- In your SQL Server Data tool (abbreviated as SSDT is known as BIDS of 2012), drag and drop Data Flow Task (DFT) to control flow window.
- Double click on Data Flow Task. It will take you to the Data flow tab where you can perform your ETL functions i.e. E – Extract data from heterogeneous source, T – Transform it with Data flow transformations and L – Load transformed data to Data warehouse.
Step 2. OLEDB Source Configuration.
- Drag and drop OLEDB source to Data flow window.
- Now, configure OLEDB source as per your settings. For this example, I am creating a connection manager to OLEDB source. I have given server name as (local). I am choosing database as AdventureWorks2008. Add this connection manager to OLEDB source. I am selecting a table named [Sales].[SalesOrderDetail].
- Here, I have not shown you the configuration steps of OLEDB source. I hope you all are very familiar with it. Trust me it is as simple as shopping from flipkart.
Step 3. Aggregate Dataflow Transformation Configuration.
- Drag and drop Aggregate transformation to Dataflow window.
- Create a connection between OLEDB source and Aggregate transformation. To create connection, hold path (green arrow) from OLEDB source and drag it onto Aggregate transformation.
- To configure Aggregate transformation, double click on it. An Aggregate transformation editor window will pop up asking for settings.
- Select the required Columns and choose appropriate Operation on that selected column.
- In this example, I am going to find total rows in the table by using COUNT function and maximum modified date by using MAX function, the sum of unit price by using SUM function and the minimum line total by using MIN function.
- Below screen shot states that I have implemented all the mentioned operations in my aggregate transformation.
- Finally click on OK button to confirm the end of configuration for “SSIS Aggregate Transformation”.
Step 4. OLEDB Destination Configuration.
- Drag and drop OLEDB destination to the Dataflow window.
- Create a connection between Aggregate transformation and OLEDB destination.
- Configure OLEDB destination as per your settings.
- I am attaching Data viewer to view the output while executing the SSI package.
- Finally it will look like picture shown below.
Note :- Create a “Data Viewer” between Aggregate transformation and OLEDB destination for observing the flow of data as well as Output data too.
Kudos!!! We are done with the creation of SSIS package which will perform aggregate transformation on data. Now, I am going to run our SSIS package to view the results.
Conclusion – By observing about screen shot, we can conclude our SSIS aggregate transformation operations on given table ([Sales].[SalesOrderDetails]) is performed successfully. Final output is – One single row with Four columns. Below is the complete analysis of this result.
- First column tells us the total count of rows in given table.
- Second column is showing the maximum Modified date.
- Third column is showing the sum of Unit price.
- Fourth column gives the minimum Line total.
With this we come to an end of article on Aggregate transformation in SSIS. I hope you all must have enjoyed this step by step example on aggregate transformation. Please drop/leave your valuable feedback below as comments. It is highly appreciated.