I am being asked to load data for the data quality assurance team. Someone else is extracting the data from the application data source for me. I receive the data as .xls/.xlsx files or sometimes as a delimited file if there are more than 300k records.
So when anywhere from 20 to 60 files are handed over to me, I am expected to load the data in a 'few hours' to a half day. What would be your first approach the task described above? There are no answers out of scope. I have all the tools you'd probably mention. I am only limited by my experience.... what do you think SSIS? The data quality team likes the data loaded with the appropriate data type (not all just VARCHAR(255)).
Just a few requirements: Each file needs loaded as it's own table(that does not exist) and to use the file's header row as the fields and name the table using the filename. Here are the two approaches I've been using:
1.) I have been using an VB module which converts all xls files in a directory to Access objects then uses an ODBC connection to upload the each file as a table into SQL. The drawback is that it runs slowly from my local machine. Will it run faster if the file is run on the server and possibly change the connection from ODBC to something else. Code below:
2.) Another method I used was BULK INSERT where I a.) use the header row to create the table and then b.) load the file with BULK INSERT
What is a better approach? Is there a SSIS tutorial on how to load a directory of files each as its own table into SQL Server?
I created a package from this tutorial: http://beyondrelational.com/modules/12/tutorials/24/tutorials/9011/getting-started-with-ssis-part-6-import-export-wizard-part-1.aspx
Could a loop be added to repeat the process for multiple files?
I would link to them either as an Excel or Text(CSV) source. Sure, it isn't as fast as BCP, but you are keeping all that import stuff at arms length and don't then need staging tables. You can perform SQL Queries on the linked tables in order to import the data, and do all your transformations in SQL. Robyn and I explained quite a bit about how to do it here http://www.simple-talk.com/sql/t-sql-programming/sql-server-excel-workbench/ and I recently showed how to do CSV format imports via a linked server here http://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-csv-comma-delimited-of-errors/
answered Sep 12 '12 at 04:45 PM
This is what I would do:
BULK INSERT DBO.LIST FROM '\\Server\Data Projects\LIST.txt' WITH (FIELDTERMINATOR = ',', FIRSTROW=2 )
p.s. One option is to use xp_cmdshell when working with OS files