question

Zelenov_Alexander asked

A bug when importing a flat file into a table using DTEXEC, when the table has a float-type field and the file uses '.' as the decimal separator in that field.

Hello! My question is about a bug that appears when I import a flat file containing a float-type field that uses "." as the decimal separator. A sample of the file:

```
appln_id,nace2_code,weight
1,"26.3",0.5
1,"27.33",0.125
1,"28.23",0.375
2,"21",0.7894737
2,"26.5",0.21052632
3,"32.5",1.0
```

When I load data from this flat file using the SQL Server Import and Export Wizard, or save the package and run it later with the Execute Package Utility, there is no problem. But when I run the same package (saved earlier in the Wizard) from SSMS using "exec master..xp_cmdshell 'dtexec...'", I get a message such as:

**Description: Data conversion failed. The data conversion for column "weight" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".**

What is this? Is it a bug in SQL Server? Why can it not convert both ','-delimited and '.'-delimited values to float in every case, regardless of the flat file import method? Is it really so difficult? Sorry for my English, I am not a native speaker. Thank you in advance for your answer. Alexander.
import · dtexec · float
1 comment

Oleg commented ·
@Zelenov_Alexander There is no bug in SQL Server. I am not sure what you mean by stating "**why it cannot convert in float type** ..."; your sample data does not have any numbers which include both "**,**" and "**.**". There is, however, a potential for loss of data in the 0.21052632 value. It all depends on which float you use. If it is float(24), which uses 4 bytes for storage and has a grand total of just 7 digits of precision, then this value cannot possibly fit in it without loss of data (there are 8 digits past the decimal point and float(24) can only accommodate 7). If you mapped it to float(53), which takes 8 bytes for storage and can accommodate up to 15 digits, then the problem is with something else. I am not sure why some people are obsessed with the float type and do not use the proper decimal instead, which gives complete control over both precision and scale. Float was never meant to be used for storage of normal numbers; it is only good for storing ***approximate numeric values*** when the precision is not that important and the numbers are rather big (distances between stars or something). Please try to map your column to the decimal type.
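To make the difference concrete, here is a minimal T-SQL sketch (not part of the original comment) that feeds the 0.21052632 value from the sample file into each of the types mentioned above; the variable names are made up for illustration:

```sql
-- Sketch only: shows how the 0.21052632 value from the sample file
-- behaves when stored as float(24), float(53) and decimal.
declare @f24 float(24)      = 0.21052632; -- real, roughly 7 significant decimal digits
declare @f53 float(53)      = 0.21052632; -- double, up to 15 significant decimal digits
declare @dec decimal(10, 8) = 0.21052632; -- exact type, precision and scale controlled

select
    @f24 as float24_value,  -- stored as approximately 0.2105263 (last digit lost to rounding)
    @f53 as float53_value,  -- fits without visible loss
    @dec as decimal_value;  -- exactly 0.21052632
```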

1 Answer

Oleg answered
I tried 2 different ways of importing the data from the sample file in question and both ways worked fine without any errors. In order to try to duplicate the problem, I first created a table to store the data from the flat file. Here is the script I used to create such a table (please note that in a real-life scenario I would never create a heap table, but because this table is only for demo purposes, I guess it is OK for it to lack a clustered index):

```sql
set ansi_nulls on;
set quoted_identifier on;
go

create table dbo.SampleFromFlatFile
(
    appln_id int not null,
    nace2_code varchar(50) not null,
    -- Not sure about float, should use decimal instead.
    -- Please note that float as such implies float(53), so
    -- it will accommodate up to a grand total of 15 digits.
    -- If there is a need to reduce the precision for whatever reason
    -- then the column should be defined explicitly as float(24)
    [weight] float not null
);
go
```

Now that the table exists, it is possible to import data into it without the need to wheel out any SSIS packages. This script is pretty flaky because, as written, it will not properly handle the double quotes used as the text qualifier, and will just include them as a part of the second column value. There are ways to deal with this problem, but for now, assuming that the values in the second column do not include commas as a part of the value (which is the true reason why the text qualifier is even needed), the bulk insert statement below, coupled with the update, will work:

```sql
-- Method 1, using BULK INSERT
bulk insert dbo.SampleFromFlatFile
from 'c:\temp\sample.txt'
with (fieldterminator = ',', firstrow = 2);

-- extra step to remove the double quotes from nace2_code column
update dbo.SampleFromFlatFile
    set nace2_code = replace(nace2_code, '"', '');
go
```

Running this script against the sample flat file in question produces the following results when ***select \* from dbo.SampleFromFlatFile*** is executed afterwards:

```
appln_id    nace2_code   weight
----------- ------------ ----------------------
1           26.3         0.5
1           27.33        0.125
1           28.23        0.375
2           21           0.7894737
2           26.5         0.21052632
3           32.5         1
```

As can be seen, because the column in question is defined as float (which defaults to float(53)), there is no data loss. Should the column be defined as float(24), the rows would still be inserted, but the value of the weight column in the 5th record would be 0.2105263 (dropping the last digit, or rather rounding to the grand total of 7 digits).

The second method involves going through the Import Data Wizard in SSMS using the following options:

- On the "**Choose a Data Source**" page: **Data Source**: Flat File Source; **Format**: delimited; **Text qualifier**: "; **Header row delimiter**: {CR}{LF}; **Header rows to skip**: 0; the **Column names in the first data row** checkbox is checked
- On the "**Choose a Destination**" page: the instance and database name are specified as needed
- On the "**Select Source Tables and Views**" page: the destination is set to the dbo.SampleFromFlatFile table

If I delete the rows from the table prior to running the package and then run it, the rows are imported back in, without any data loss as expected. Finally, deleting the rows from the table and then invoking the saved package works as well:

```sql
exec master..xp_cmdshell 'dtexec /file "c:\temp\ImportFlatFile.dtsx"';
go
```

I am not sure about the true source of the problem with the column named **weight** after a reserved T-SQL keyword, but please try going through the wizard again and check the **data mapping** page just in case.
I could not reproduce the problem even when deliberately changing the column type to something which for sure does not fit; the rows are still inserted, with some data loss due to rounding taking place. Oleg
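A side note on the text qualifier issue mentioned in the answer: on SQL Server 2017 and later, BULK INSERT can parse quoted CSV fields natively via the FORMAT and FIELDQUOTE options, which would make the follow-up UPDATE that strips the double quotes unnecessary. A minimal sketch, assuming the same file path and table as above:

```sql
-- Sketch only (requires SQL Server 2017+): let BULK INSERT handle the
-- double-quote text qualifier itself instead of stripping it afterwards.
bulk insert dbo.SampleFromFlatFile
from 'c:\temp\sample.txt'
with (
    format          = 'CSV',   -- treat the file as quoted CSV
    fieldquote      = '"',     -- text qualifier used in the sample file
    fieldterminator = ',',
    firstrow        = 2        -- skip the header row
);
```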