Advanced Excel Manipulation in SSIS: Part I

     This is a 2 part post of some advanced techniques that I am hoping to demonstrate this weekend in my IndyTechFest presentations. The premise is that you need to take an Excel workbook and process all of the sheets within the workbook in some way. Now SSIS is normally cool if the naming of the sheets is standardized and the same each time…but ….what if they were not? What if you only knew that you would have n number of sheets per workbook to process and no idea of how they would be named.  It actually turns out to be pretty easy to handle once you know what you are doing. First lets assume that you have the connection to your Excel workbook already made. You would merely do the following.

1. Create a variable to hold the current sheet name of the sheet to be processed.

2. Drop a For Each Loop onto the design surface with a Data Flow object in the center for processing each of the sheets as shown below.

Image 1: Set up For Each Loop

3. Now edit the For Each Loop. Select the Collections tab and pick the ‘Foreach ADO.NAT Schema Rowset Enumerator’. Select your connection and set the schema to Tables. This will make the For Each Loop run through the tables within the workbook. This setup is shown below….

Image 2: For Each Edit Window

4. Lastly, you would want to set a variable mapping in order to capture the table name for processing. This is a little different from the normal file For Each Loop presentation in that you need to select index 2(as shown below) in order to capture the table name instead of 0.

Image 3: Mapping a Variable

That’s it! Now the process will loop through all of the tables and capture the table name in the variable. All that you have to do now is to hook up your Excel source to use the variable as the table name and you are through. In the next post, I will look at how you can use some custom script code to actually perform a kind of custom pivot table on your excel data.