I am being asked to load data for the data quality assurance team. Someone else is extracting the data from the application data source for me. I receive the data as .xls/.xlsx files or sometimes as a delimited file if there are more than 300k records.
So when anywhere from 20 to 60 files are handed over to me, I am expected to load the data in a 'few hours' to a half day. What would be your first approach the task described above? There are no answers out of scope. I have all the tools you'd probably mention. I am only limited by my experience.... what do you think SSIS? The data quality team likes the data loaded with the appropriate data type (not all just VARCHAR(255)).
Just a few requirements: Each file needs loaded as it's own table(that does not exist) and to use the file's header row as the fields and name the table using the filename. Here are the two approaches I've been using:
What is a better approach? Is there a SSIS tutorial on how to load a directory of files each as its own table into SQL Server?
I created a package from this tutorial: http://beyondrelational.com/modules/12/tutorials/24/tutorials/9011/getting-started-with-ssis-part-6-import-export-wizard-part-1.aspx
Could a loop be added to repeat the process for multiple files?
I would link to them either as an Excel or Text(CSV) source. Sure, it isn't as fast as BCP, but you are keeping all that import stuff at arms length and don't then need staging tables. You can perform SQL Queries on the linked tables in order to import the data, and do all your transformations in SQL. Robyn and I explained quite a bit about how to do it here http://www.simple-talk.com/sql/t-sql-programming/sql-server-excel-workbench/ and I recently showed how to do CSV format imports via a linked server here http://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-csv-comma-delimited-of-errors/
answered Sep 12, 2012 at 04:45 PM
This is what I would do:
BULK INSERT DBO.LIST FROM '\\\Server\Data Projects\\LIST.txt' WITH (FIELDTERMINATOR = ',', FIRSTROW=2 )
p.s. One option is to use xp_cmdshell when working with OS files