SSIS package optimization for thousands of Excel files, with archival based on business logic
I have developed an SSIS package in which a Foreach Loop container retrieves each Excel file name and checks whether that file name is part of the business logic. If it isn't, the file must be archived, which I have done with a File System Task. The problem is that the Foreach Loop takes approximately 5 seconds per file, and there are thousands of files in the folder. How can I optimize the package to run faster? Concretely, the package first runs a stored procedure; then, inside the Foreach Loop, an Execute SQL Task identifies whether the file name is part of the business logic, setting a flag to 0 if not and 1 if it is. A File System Task then moves the flagged files to the archive folder.
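Outside SSIS, the per-file flow described above looks roughly like the sketch below (the file names and the in-memory set are hypothetical; the set stands in for the per-file stored-procedure call). It makes the cost structure visible: every file pays one lookup and one move.

```python
import shutil
from pathlib import Path

# Hypothetical stand-in for the business-logic lookup. In the real
# package this is a per-file Execute SQL Task calling a stored
# procedure that returns flag = 1 (keep) or 0 (archive).
VALID_NAMES = {"sales_2023.xlsx", "inventory_2023.xlsx"}

def is_part_of_business_logic(filename: str) -> bool:
    return filename in VALID_NAMES

def process_folder(source: Path, archive: Path) -> int:
    """Foreach Loop equivalent: archive every file that fails the check."""
    archive.mkdir(parents=True, exist_ok=True)
    moved = 0
    for xlsx in sorted(source.glob("*.xlsx")):
        if not is_part_of_business_logic(xlsx.name):   # Execute SQL Task
            shutil.move(str(xlsx), str(archive / xlsx.name))  # File System Task
            moved += 1
    return moved
```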
Is this a one-off task that's not urgent? If so, just let it take a while. Who cares? Otherwise, you need to identify which part is taking the time so you can improve that part. You could manually move the biggest file to see how long that takes. You could call the stored procedure on its own and time it. You could run the Foreach Loop with the processing tasks disabled to see whether simply iterating over that many files is the problem. Once you know where the time is going, you will have a much better idea of what to change.
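One way to act on this advice is to time each candidate stage in isolation. The helper below is a minimal sketch (the function and labels are my own, and the dummy workloads stand in for enumerating the folder, the per-file lookup, and the file move); in practice you would point it at each real stage and compare the numbers.

```python
import time

def timed(label: str, fn, *args, **kwargs):
    """Run fn once, print how long it took, and return its result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result

# Example: compare two candidate bottlenecks on dummy workloads.
files = timed("enumerate", lambda: [f"file_{i}.xlsx" for i in range(100_000)])
total = timed("per-item check", lambda: sum(1 for f in files if f.endswith(".xlsx")))
```

Whichever stage dominates (enumeration, the per-file lookup, or the move) tells you whether to attack the query, the loop, or the disk I/O first.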
Another option would be to introduce parallel processing into the package; SSIS is great for that. In other words, if the files can be broken up into chunks in some fashion (e.g. by name, created date, or modified date), you could have multiple Foreach Loops running against their respective collections of files. I don't know whether your files can be segmented like that, but it's an option.
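As a sketch of the chunking idea outside SSIS (the worker count and the round-robin split are my own choices, not anything the package dictates), each chunk gets its own worker, analogous to giving each Foreach Loop its own collection of files:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import shutil

def split_round_robin(files, n):
    """Deal the file list into n roughly equal chunks."""
    chunks = [[] for _ in range(n)]
    for i, f in enumerate(files):
        chunks[i % n].append(f)
    return chunks

def archive_chunk(chunk, archive_dir):
    """One worker's share: move its files into the archive folder."""
    for f in chunk:
        shutil.move(str(f), str(Path(archive_dir) / Path(f).name))
    return len(chunk)

def archive_in_parallel(files, archive_dir, workers=4):
    """Process the chunks concurrently; moves are I/O-bound, so threads help."""
    Path(archive_dir).mkdir(parents=True, exist_ok=True)
    chunks = split_round_robin(list(files), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(archive_chunk, chunks, [archive_dir] * workers))
```

The same structure works with segmenting by name or date instead of round-robin, as long as no two workers ever touch the same file.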