question

red68 asked

Fastest way to load text files SSIS

My team loads source files on a weekly basis. Some are small (20,000 records) and some are up to 30 million records or more in a few instances. What is the fastest way to load them with the least amount of effort? Most of our files are pipe delimited, but they could be comma delimited or sometimes fixed width. The problem with SSIS is that the metadata is static, so you must trick it or recreate it each time. I wish it had a way to read the metadata from Excel or a header record and then load the data. I have tried an approach suggested by the TechBrothersIT website using a script task, but it loads millions of records slowly. Any ideas are greatly appreciated if anyone has run into this or can point me to some really good sites. Thanks!
ssis2012

red68 commented:
[link text][1] [1]: /storage/temp/4316-cscript.txt
cscript.txt (4.8 KiB)

1 Answer

Oleg answered
I believe that using bulk insert will help speed up the load process. If this question is related to the [SSIS 2012 Visual C# Script Task][2] question, then you already have everything in place to change it as follows:

1. Instead of reading the file line by line and then creating and executing an insert statement that inserts just one row into the table, read just the first line of the file (to figure out how it is delimited).
2. Close and dispose the open stream at this point.
3. Using C# code, generate the bulk insert statement. This will be one statement per table, not one per row.
4. Execute the statement.

Here is an example of the code generating the bulk insert statement:

    // "line" here refers to just the first line read; no need to read them all
    string delimiter = line.Contains("|") ? "|" : ",";
    string query = "bulk insert dbo.[" + filenameonly + "] from '" + filePath +
        "' with (fieldterminator = '" + delimiter + "', firstrow = 2)";

Here is an example of a T-SQL solution which reads the first bunch of characters from the file to figure out the delimiter, then executes the bulk insert statement. The C# version is easier to implement, so the script below is just an example showing how to figure out how the data is delimited and then quickly insert it via bulk insert:

    -- Read the first 100 characters or so (change the number if needed), just enough
    -- to figure out whether the file is pipe-delimited or not. If not, then let's
    -- assume that it is comma-delimited for the sake of this example
    declare @contents varchar(100) = (
        select t.BulkColumn
        from openrowset(bulk 'C:\Temp\SomeTable.csv', single_clob) t
    );
    declare @delimiter char(1) =
        case when charindex('|', @contents) > 0 then '|' else ',' end;

    -- This could be a dynamic statement created based on the delimiter, but let it be static.
    -- The assumption here is that the file has column headers in the first row, so that row is excluded
    if @delimiter = '|'
        bulk insert dbo.SomeTable from 'C:\Temp\SomeTable.csv'
        with (fieldterminator = '|', firstrow = 2);
    else
        bulk insert dbo.SomeTable from 'C:\Temp\SomeTable.csv'
        with (fieldterminator = ',', firstrow = 2);
    go

The updated script is attached: [updated cs script][1].

Hope this helps.

Oleg

[1]: /storage/temp/4317-script.txt
[2]: https://ask.sqlservercentral.com/questions/144688/ssis-2012-visual-c-script-task.html
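The steps above boil down to plain string handling: peek at the first line, pick the delimiter, and build one statement for the whole file. Here is a minimal sketch of that logic in Python (the thread's actual implementation is the attached C# script task; the function name and file/table naming convention below are illustrative assumptions):

```python
import os

def build_bulk_insert(file_path):
    """Read only the first line of the file to detect the delimiter,
    then return a single BULK INSERT statement for the whole file."""
    with open(file_path, "r") as f:   # stream is closed and disposed on exit
        first_line = f.readline()
    # Same rule as the answer: prefer pipe, otherwise assume comma
    delimiter = "|" if "|" in first_line else ","
    # Table name taken from the file name without its extension
    table = os.path.splitext(os.path.basename(file_path))[0]
    return (
        "bulk insert dbo.[" + table + "] from '" + file_path + "' "
        "with (fieldterminator = '" + delimiter + "', firstrow = 2)"
    )
```

The point of the design is that only one line is ever read into memory and only one statement is executed per file, which is why it scales to files with tens of millions of rows where row-by-row inserts do not.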

script.txt (7.7 KiB)
10 comments

red68 commented:
I attached my script. Can you show me where to insert the above to make it work? Thanks!
Oleg commented:
@red68 This is done; the updated script is in the answer attachment. All I had to do was comment out the loop through the lines and add the bulk insert piece. The logic otherwise remains as is. It looks like you want processing to stop if any file in the middle of the batch errors out (this is why you have try/catch ***outside*** of the loop through the files in the folder). Please let me know if this works.
red68 commented:
Thanks, I will give it a shot.
red68 commented:
That worked! Is there a way in this script to account for the single quote in the name like the other script that you modified? Thanks!
Oleg commented:
@red68 I don't think there is a need to handle the presence of single quotes with bulk insert; I believe it already handles that. I am not sure why I got downvoted, though.
Show more comments
