question

ashok2012 avatar image
ashok2012 asked

how to insert without prefix N' in nvarchar value in SQL

Hi, I am using SQL 2008 R2

CREATE TABLE dbo.CollationCheck(PKColumn int NOT NULL IDENTITY (1, 1),
  ID int identity(1,1),
  NVarcharColumn nvarchar(50) NULL,
   )

db has SQL_Latin1_General_CP1_CI_AS collation

Statement 1:

insert into CollationCheck (NVarcharColumn) values (N'えましょう')

it Works

Statement 2:

insert into CollationCheck (NVarcharColumn) values ('えましょう')

it did nt work. throws ????? as result

How can we do database level changes like collation changes to work statement 2 without

prefix N'value ?

Note: I did nt want to do code level changes while write Statement. I want it on database level

collationnvarchar
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

This web site runs by votes. For all the helpful answers below, please indicate this by clicking on the thumbs up next to each of those answers. If any one answer lead to a solution, please indicate this by clicking the check mark next to that answer.
0 Likes 0 ·
srutzky avatar image
srutzky answered

insert into CollationCheck (NVarcharColumn) values ('えましょう') > > it didn't work. shows ????? as result

Yes, that is the expected behavior given the Database's current default Collation of SQL_Latin1_General_CP1_CI_AS.

How can we do database level changes like collation changes to work statement 2 without prefix N'value ?

ORIGINAL ANSWER (not entirely correct):

You can't. If you have a string literal without the `N` prefix, then that string is `VARCHAR`. It doesn't matter what characters you put into that string, the string can only hold an 8-bit character set.

Just try a simple `SELECT` of that same string:

SELECT 'えましょう';

returns:

?????
Note: I didn't want to do code level changes while write Statement. I want it on database level

If you are creating the SQL dynamically in the app code, then you have no choice but to prefix the strings with `N`.

REVISED ANSWER (entirely correct):

Sorry to have previously given less-than-fully-accurate info before, but there was something that I was unaware of at the time. While I was correct that string literals that do not have the upper-case "N" prefix are 8-bit encoded / VARCHAR, I was incorrect that there is no possible way to allow for these characters in a VARCHAR column / literal / variable / parameter.

There are, in fact, a few 8-bit encodings that do support these characters. These encodings are double-byte character sets (DBCS) and are variable-size encodings. For standard ASCII characters they use a single byte. But for going beyond the typical 256 character limit of most 8-bit encodings, they use two bytes in combination for the rest of the characters that they support. There are 4 such encodings supported by Windows:

  • Code Page 932: Japanese
  • Code Page 936: Chinese Simplified *
  • Code Page 949: Korean
  • Code Page 950: Chinese Traditional *

Prior to the release of SQL Server 2000, the only Collations available were what are now the SQL Server Collations (those with names starting with "SQL_"). None of the SQL Server Collations support any of the 4 DBCS encodings / code pages listed above. But the Windows Collations (those with names not starting with "SQL_", which were introduced in SQL Server 2000, do support all 4 of those DBCS code pages. AND, three of those four code pages support all of the characters noted in the question: Japanese (932), Chinese Simplified (936), and Korean (949).

CREATE DATABASE [DBCS_test] COLLATE Korean_100_CI_AS;
GO
USE [DBCS_test];
SELECT 'えましょう';

returns:

えましょう

So, since string literals / variables / parameters use the default Collation of the current / active database, a database change that could be done to fix this situation is:

ALTER DATABASE [DBCS_test] COLLATE Korean_100_CI_AS;

Just keep in mind the following:

  • This might not the best approach since it will affect object name resolution, etc and thus the code change of prefixing all strings with an upper-case "N" is still the best approach
  • This might not even be possible since changing a Database's default Collation has certain requirements such as no schema-bound objects, etc and might require dropping and recreating objects, etc that prove to be a larger burden / project than doing the code change.

-- OR --

Since, again, it is the active / current database (that the query is executing in) that controls the Collation of string literals / variables / parameters, IF your INSERT / UPDATE statement(s) specify a fully-qualified table name (meaning: DatabaseName.SchemaName.TableName ), then an easier approach might be to simply:

  1. Create a Database for the purpose of connecting to that has one of these 3 types of Collations as its default Collation:

    CREATE DATABASE [HandleJapaneseCharacters] COLLATE Japanese_XJIS_100_CI_AS_SC;
    	
  2. Change the Connection String for the application to use:

    Database=HandleJapaneseCharacters;
    	

PLEASE BE AWARE

  1. The two options noted above do not work for ALL Unicode characters. Those options work for the characters noted in the question, and many others. Just keep in mind that there are some (many) characters that exist in Unicode that do not exist in any 8-bit code page. If you are dealing with any characters that do not exist within any of the code pages supported by SQL Server, then you have no choice but to prefix the string literal with "N".
  2. While this question was about SQL Server 2008 R2 specifically, starting in SQL Server 2019 it became possible to work with Unicode characters in VARCHAR columns, variables, and string literals via the new _UTF8 collations. For VARCHAR variables (including parameters) and string literals to hold Unicode-only characters, the default collation of the "current" database needs to be a _UTF8 collation. VARCHAR columns using a _UTF8 collation can accept Unicode-only characters from any NVARCHAR source.

    For more info on the new UTF-8 collations in SQL Server 2019, please see:
    Native UTF-8 Support in SQL Server 2019: Savior or False Prophet?

Please see the following 2-part series of mine regarding exactly how Database-level and Column-level Collations impact non-Unicode string literals and variables / parameters:

  1. Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part A of 2: “Duck”)
  2. Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part B of 2: “Rabbit”)

For more info on working with Collations, please visit: Collations.info

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Alvin Ramard avatar image
Alvin Ramard answered
You're trying to insert 'えましょう' into a VARCHAR column. You cannot do that! The SQL column has to be capable of storing Unicode characters for this, so you need to use a NCHAR or NVARCHAR datatype. You can do whatever you want with collation. 'えましょう' will not fit into a VARCHAR column!
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

oops. I'm sorry. I thought I saw somewhere where it was varchar. I can't see it now, so I guess I misread it.
1 Like 1 ·
Hi Alvin Ramard it is actually nvarchar column data type only my problem is consider statement 1 has N'record it works but when i am going to insert without N'record it did nt work so can i have anything in database level to fix we can insert without put N' prefix on record
0 Likes 0 ·

Regarding the statement of "'えましょう' will not fit into a VARCHAR column": that is incorrect. Those Japanese characters can definitely fit into a CHAR / VARCHAR column, as long as the column is using a Collation that uses one of the 3 double-byte character set code pages that support Japanese characters. Please see my answer for more details.

0 Likes 0 ·
Magnus Ahlkvist avatar image
Magnus Ahlkvist answered

Short answer is: No, you can't write unicode data without the N-prefix on the string. What happens when you do it without the N-prefix is you get a conversion to varchar from nvarchar if you write it without N-prefix.

So you need to update code to be able to write unicode data into the database. This is a common problem and usually comes from using string-concatenation for variable data instead of parametrizing queries in application level.

When using for example SqlCommand object in .NET, with a parameterized query, .NET takes care of adding the N-prefix for us. And as a bonus, we get application code which is more secure (because it's more SQL Injection safe).

1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

This is not entirely correct. The characters in the question are not Unicode-only characters; they are Japanese characters that can be found in 3 of the 4 double-byte character set (DBCS) code pages supported by SQL Server. Please see my answer for more details.

0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.