
Fastest way to load text files in SSIS

My team loads source files on a weekly basis. Some are small (20,000 records) and some are up to 30 million records or more in a few instances. What is the fastest way to load them with the least amount of effort? Most of our files are pipe-delimited but could be comma-delimited or sometimes fixed-width. The problem with SSIS is that the metadata is static and you must trick it or recreate it each time. I wish it had a way to read the metadata from Excel or a header record and then load the data. I have tried an approach suggested by the TechBrothersIT website using a script task, but it loads millions of records slowly. Any ideas are greatly appreciated if anyone has run into this or can point me to some really good sites. Thanks!


asked Oct 05 at 02:42 AM in Default by red68

cscript.txt (5.0 kB), attached Oct 09 at 02:01 PM by red68

1 answer

I believe that using bulk insert will help to speed up the load process. If this question is related to the SSIS 2012 Visual C# Script Task question, then you already have everything in place to change it based on the following:

  1. Instead of reading the file line by line and then creating and executing an insert statement for every single row, read just the first line of the file (to figure out how it is delimited).

  2. Close and dispose of the open stream at this point.

  3. Use C# code to generate the bulk insert statement. This will be one statement per table, not one per row.

  4. Execute the statement.

Here is an example of the lines of code generating the bulk insert statement:

 // The line here refers to just one line, which could be any line, no need to read them all
 string delimiter = line.Contains("|") ? "|" : ",";
 
 string query = "bulk insert dbo.[" + filenameonly + "] from '" +
     filePath + "' with (fieldterminator = '" + delimiter + "', firstrow = 2)";

Here is an example of a T-SQL solution which reads the first bunch of characters from the file to figure out the delimiter and then executes the bulk insert statement. The C# is easier to implement though, so the script below is just an example showing how to figure out the way the data is delimited and then quickly insert it via bulk insert.

 -- Read the first 100 characters or so (change the number if needed), just enough
 -- to figure out whether the file is pipe-delimited or not. If not, then let's
 -- assume that it is comma-delimited for the sake of this example
 declare @contents varchar(100) = (
     select
         t.BulkColumn
         from openrowset(bulk 'C:\Temp\SomeTable.csv', single_clob) t
 );
 
 declare @delimiter char(1) = case when charindex('|', @contents) > 0 then '|' else ',' end;
 
 -- This could be a dynamic statement created based on the delimiter, but let it be static.
 -- The assumption here is that the file has column headers in the first row, so that row is excluded
 if @delimiter = '|'
     bulk insert dbo.SomeTable 
         from 'C:\Temp\SomeTable.csv'
         with (fieldterminator = '|', firstrow = 2);
 else
     bulk insert dbo.SomeTable 
         from 'C:\Temp\SomeTable.csv'
         with (fieldterminator = ',', firstrow = 2);
 go

The attached script.txt is the updated C# script.

Hope this helps.

Oleg

script.txt (7.9 kB)

answered Oct 05 at 01:14 PM by Oleg

I attached my script. Can you show me where to insert the above to make it work? Thanks!

Oct 09 at 02:01 PM red68

@red68 This is done, the updated script is in the answer attachment. All I had to do was comment out the looping-through-the-lines part and add the bulk insert piece. The rest of the logic remains as is. It looks like you need it so that if any file in the middle of the batch errors out then processing stops at that point (this is why you have the try/catch outside of the loop through the files in the folder). Please let me know if this works.

Oct 09 at 03:42 PM Oleg
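
For reference, here is a minimal sketch of the structure described in the comment above: the try/catch wraps the loop over the files in the folder, so a failure on any one file stops the remaining ones. The folder path and the LoadFile helper are hypothetical stand-ins, not names from the attached script:

 // Hypothetical sketch; LoadFile stands in for the read-first-line
 // and bulk insert logic shown in the answer. Requires "using System.IO;"
 try
 {
     foreach (string file in Directory.GetFiles(@"C:\Temp\Incoming", "*.txt"))
     {
         LoadFile(file); // detect delimiter, build and run the bulk insert
     }
     Dts.TaskResult = (int)ScriptResults.Success;
 }
 catch (Exception ex)
 {
     // Any failure aborts the remaining files in the batch
     Dts.Events.FireError(0, "Load", ex.Message, string.Empty, 0);
     Dts.TaskResult = (int)ScriptResults.Failure;
 }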

Thanks, I will give it a shot.

Oct 09 at 04:48 PM red68

That worked! Is there a way in this script to account for a single quote in the name, like the other script that you modified? Thanks!

Oct 09 at 05:23 PM red68

@red68 I don't think that there is a need for handling the presence of single quotes with bulk insert. I believe that it handles them already. I am not sure why I got downvoted though.

Oct 09 at 05:49 PM Oleg