Question

cotsjadixon asked

Insert vs Update with Unicode

I have a table where all the fields are declared as nvarchar. If I do an insert into the table, all Unicode characters come into the fields fine. But when I do an update, I am losing some of the characters. Here is my update statement:

```sql
UPDATE TableII
SET LOCATION = CAST((
        COALESCE([CITY], N'') + NCHAR(13) + NCHAR(10)
      + COALESCE([INSTALLATION], N'') + NCHAR(13) + NCHAR(10)
      + LEFT(LOCATION.LATLONG, 2) + N'°'
      + SUBSTRING(LOCATION.LATLONG, 3, 2) + N''''
      + SUBSTRING(LOCATION.LATLONG, 5, 2) + NCHAR(34) + N' '
      + SUBSTRING(LOCATION.LATLONG, 7, 1) + N' '
      + SUBSTRING(LOCATION.LATLONG, 8, 3) + N'°'
      + SUBSTRING(LOCATION.LATLONG, 11, 2) + N''''
      + SUBSTRING(LOCATION.LATLONG, 13, 2) + NCHAR(34) + N' '
      + SUBSTRING(LOCATION.LATLONG, 15, 1)
    ) AS NVARCHAR(MAX))
FROM TABLEII
INNER JOIN LOC ON TABLEII.REC = LOC.REC
INNER JOIN LOCATION ON LOC.LOCRECNUM = LOCATION.LOCRECNUM
WHERE (LOC.LOCSEQ = '1')
```

As you can see, I have tried a variety of things to force the Unicode characters not to be lost, but it does not work. Any ideas why the insert works fine but this update does not?
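To show the symptom in isolation (a minimal sketch; the temp table and the assumption of a Latin1/1252 default collation are illustrative, not taken from the actual schema): an nvarchar column keeps a character like U+0107 (ć), while a varchar column with a code page lacking that character best-fits it to a plain 'c' at storage time.

```sql
-- Illustrative repro, assuming a database with a Latin1 (1252) default collation.
CREATE TABLE #Demo (Good NVARCHAR(10), Bad VARCHAR(10));

INSERT INTO #Demo (Good, Bad) VALUES (N'ć', N'ć');

-- The nvarchar column preserves U+0107; the varchar column
-- has silently best-fitted it to an unaccented 'c'.
SELECT Good, Bad FROM #Demo;

DROP TABLE #Demo;
```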
Tags: update, insert, unicode
2 comments

Can you explain what you mean by "I am losing some of the characters"? Is the string truncated? Are characters being corrupted or translated somehow? What is happening?
1 Like
Hmm. Which version of SQL Server are you using?
0 Likes
cotsjadixon answered
I figured out a way around this issue. I basically drop my target database (the one that holds my data) and restore the empty structure back from a copy. Before I load any data in, I determine the code page of the data (which is stored in the header of the text data file I load) and run ALTER DATABASE [mydatabase] COLLATE with the desired collation. Then I load in my data. The real trick, though, is to execute this ALTER statement while connected to the master database, not the target database. That is (for SQL Server Express), I connect to the Master.mdf file to execute the ALTER DATABASE statement above. If I tried to run it against just the target database, it threw the error "Resetting the connection results in a different state".
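The workaround above can be sketched as follows (the database name and the specific collation are placeholders; the real collation has to match the code page read from the data file's header):

```sql
-- Run while connected to master (for SQL Server Express, the
-- Master.mdf session described above), against the freshly
-- restored, still-empty target database:
USE master;
ALTER DATABASE [mydatabase] COLLATE Cyrillic_General_CI_AS;
-- Only after this succeeds, bulk-load the data file.
```

Re-collating works here only because the database is empty at that point; the collation change applies to new object definitions, not to data already stored.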

Grant Fritchey answered
Is the City column NVARCHAR? If not, the COALESCE will resolve to that column's data type and use it, regardless of what you do with the rest of the expression.
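To illustrate the point above (a minimal sketch; the variable names and the assumption of a Latin1/1252 default collation are mine, not from the question): once a value has passed through a varchar slot whose code page lacks the character, converting the expression to nvarchar afterwards cannot bring the character back.

```sql
-- Assumes a database whose default collation uses the Latin1 (1252)
-- code page. U+0106 (Ć) has no slot there, so the varchar variable
-- stores a best-fit plain 'C' at assignment time.
DECLARE @CityVarchar  VARCHAR(20)  = N'Ćuprija';
DECLARE @CityNvarchar NVARCHAR(20) = N'Ćuprija';

-- Both results are nvarchar, but the varchar leg was damaged
-- before COALESCE ever saw it: 'Cuprija' vs 'Ćuprija'.
SELECT COALESCE(@CityVarchar,  N'') AS DamagedLeg,
       COALESCE(@CityNvarchar, N'') AS IntactLeg;
```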
4 comments

Well, speaking of collation, any chance one of the tables got accidentally updated with a different one? I've seen that happen before.
1 Like
All the columns (input and output) are nvarchar. That is part of the reason I am confused.
0 Likes
Including those in the other tables? (Sorry, just making sure...)
0 Likes
Yes. I am messing with the database collation right now, but since they work in one instance and not the other I am not sure that is the proper solution.
0 Likes
cotsjadixon answered
I have checked the tables: all have the same collation, and the columns are all nvarchar. I am using SQL Server Express 2008.

What I mean by losing the characters is this: I have some data that is in the Central European code page. When I imported the data using BULK INSERT, all the data showed the proper characters. When I ran a SQL statement inserting that data into another table, I also saw the proper characters. But when I do the update, some characters still show up and others are changed. For example, there is a c with an accent that converts to a plain c on the update, but it is there in all other cases.

I also tried working with some Cyrillic data and had similar results. The bulk insert works fine and I can see the data properly. But when I use SQL statements to insert or update (this time it is both), the Cyrillic turns to ???. When I change the database to a Cyrillic collation, I then see all the characters properly, but my SQL statements error out because some of them compare the column values to strings. I would prefer not to change all my statements to put N before the string literals. It also worries me that I would have to change my collation for each different data set. Is there some collation that is "generic" enough to handle all sets? nvarchar itself seems to handle everything fine. Or is there some other property or SQL statement parameter I can use to get around this?
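The N-prefix issue mentioned above can be seen in isolation (a sketch, assuming a database whose default collation is not Cyrillic): a literal without N is parsed as varchar in the database's code page before it ever reaches the nvarchar column, so the damage happens at parse time, regardless of the column's type.

```sql
-- Assumes a non-Cyrillic (e.g. Latin1) default collation.
SELECT 'Привет'  AS WithoutN,  -- varchar literal: becomes '??????'
       N'Привет' AS WithN;     -- nvarchar literal: characters preserved
```

This is why the Cyrillic data turned to ??? only in hand-written statements: BULK INSERT reads the file bytes directly and never goes through a varchar literal.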
