|
Hi Guys, I have a file say sample.txt or a .csv file and this file contains a header, data and a footer. Based on the information in header i need to put the data into the database. In other words i need to extract the data based on the header FYI, everything will be a single string of data and the calculation of header, data and footer should be done on basis of Which is the best approach for this? Edit
(comments are locked)
|
|
I'd use SSIS. You can read read the file in, put in conditional logic, pretty much everything you'd need to make this work. It should be a lot easier than trying to do it all in TSQL too. hey Grant, but how will i break the file to separate the header, data and footer ... and eventhough if i have them separated then how should i decode the file ... i think i should develop a file parser or so .... please advice me how shd i go abt it
Jan 18 '12 at 07:27 AM
AbhiD
(comments are locked)
|
|
I agree with @Grant Fritchey to use SSIS. However using SSIS can be also tricky, if the same file can come with different header/footer and have different internal structure as SSIS requires that the structures of the sources and destinations are constant. However with right conditional logic and separate data flows and/or packages for different file structure and a control mechanism to launch the right one for particular structure the SSIS way will be probably the easiest one to maintain. hey Pavel, can u suggest me something by which i can put the whole internal structure of file into SSIS and then decode my file based on that.....
Jan 18 '12 at 07:29 AM
AbhiD
Based on sample.. I suggest to develop a single package per file structure and a master package which will decide what child package to execute based on the header row. For the determination a script task can be used to open the flat file, read first row to determine file type. According the file type appropriate child package for processing can be executed. In the child package there could be in general two possibilities. 1) Use only a flat file connection, which skipping first row and defining all the columns for particular file structure with eventual error rows redirection - the footer row should be redirected to error. 2) Use flat file connection, which will skip first header row. Then define a single column which will contain whole row. After that a script component can be used for particular custom splitting. This splitting can be handled by eg. Regular Expression or whatever method necessary to process the row correctly. When a footer row comes to the script component it will be simply ignored by the code.
Jan 18 '12 at 12:29 PM
Pavel Pawlowski
thanks pavel .... i will start working on these lines ..lets c how it goes ...
Jan 19 '12 at 07:04 AM
AbhiD
(comments are locked)
|
|
can you post any part of the file? It is possible that LogParser will be able to help you with this situation but I would need to see the header and the first few lines to confirm. hi all, this the sample file ....all the X's and rest will be constant in header n footer but every time the date will be different .... each record begins with a A or C ....
Jan 18 '12 at 07:02 AM
AbhiD
(comments are locked)
|
|
It depends on how much processing needs to be done. For basic things, I concur with Grant that SSIS is the way to go. For more advanced pre-processing I personally think Python or Perl are more malleable, but SSIS can accomplish just about any prepocessing you might wish with the right conditional logic and separting out the flows. I personally find Python far more intuitive for that sort of advanced conditional processing. [Expanded slightly for clarity]
(comments are locked)
|


let me correct myself .. everything will be based on the number of characters .. 700 characters for each component.