(SQL Server) The "String or binary data would be truncated" error

1. Problem Description

When synchronizing data, you will often run into this error: "String or binary data would be truncated."

2. Problem Cause

The cause is that the value being inserted is longer than the column defined in the database allows. For example, inserting a 40-byte string into a column defined as varchar(36) raises this error.
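Because the limit is counted in bytes, a value can be checked before it is sent to the database. The following is a minimal Python sketch of such a pre-check, not part of the original article. The helper name `fits_varchar` is hypothetical, and the sketch assumes the column uses code page 936, which Python exposes as the `gbk` codec.

```python
def fits_varchar(value: str, n: int, encoding: str = "gbk") -> bool:
    """Return True if value fits in a varchar(n) column, measured in bytes.

    Assumption: the column's collation uses code page 936 (GBK).
    """
    return len(value.encode(encoding)) <= n


ascii_36 = "a" * 36   # 36 bytes: fits varchar(36)
ascii_40 = "a" * 40   # 40 bytes: would be truncated
chinese_19 = "狮" * 19  # only 19 characters, but 38 bytes in GBK

print(fits_varchar(ascii_36, 36))
print(fits_varchar(ascii_40, 36))
print(fits_varchar(chinese_19, 36))
```

The third case is the one that usually surprises: the string looks short, yet its byte length exceeds the column size.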

3. Related Questions

a. How to calculate a string's character length and its byte length, that is, the difference between len and datalength.
len: returns the number of characters in the string (trailing spaces are not counted)
datalength: returns the number of bytes used to store the string

select len(convert(varchar(50),N'狮lion')) --5; the N'' prefix marks a Unicode string literal
select datalength(convert(varchar(50),N'狮lion')) --6
select len(N'狮lion') --5
select datalength(N'狮lion') --10

Next, we analyze why this is so.
len returns the number of characters in the string: one Chinese character '狮' + the 4 letters of 'lion' = 5 characters. A letter and a Chinese character each count as one character.
datalength returns the number of bytes in the string. How many bytes a character takes depends on the encoding. In a varchar string under a Chinese code page such as 936 (GBK), a Chinese character takes two bytes and an English letter takes one byte. In an nvarchar (Unicode, UTF-16) string, common Chinese characters and English letters both take 2 bytes.
This is the difference between varchar and nvarchar: varchar stores the string in the code page of the collation, while nvarchar stores it as Unicode. For example, the Chinese character '狮' takes 2 bytes in either type, but the letter 'l' takes 1 byte as varchar and 2 bytes as nvarchar.

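len(convert(varchar(50),N'狮lion')) --character count: '狮lion' has 5 characters, as a varchar string
len(N'狮lion') --character count: '狮lion' has 5 characters, as an nvarchar string
datalength(convert(varchar(50),N'狮lion')) --byte count as varchar: '狮' takes 2 bytes and
                                              --each letter of 'lion' takes 1 byte (4 bytes), 6 bytes in total
datalength(N'狮lion') --byte count as Unicode: '狮' takes 2 bytes and
                               --each letter of 'lion' takes 2 bytes (8 bytes), 10 bytes in total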

Reference: https://blog.csdn.net/oncealong/article/details/37573927
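The same four numbers can be reproduced outside the database. This is only an illustrative Python sketch: it assumes the varchar conversion uses code page 936 (Python's `gbk` codec) and that nvarchar stores UTF-16, which Python exposes as `utf-16-le`.

```python
text = "狮lion"

# len(): number of characters, the same for varchar and nvarchar
char_count = len(text)

# datalength() of the varchar value: bytes in the code page (GBK assumed)
varchar_bytes = len(text.encode("gbk"))

# datalength() of the nvarchar value: bytes in UTF-16
nvarchar_bytes = len(text.encode("utf-16-le"))

print(char_count, varchar_bytes, nvarchar_bytes)  # 5 6 10
```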

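b. How to view the encoding (code page) of the database.

--Look up the code page of a given SQL Server collation
SELECT COLLATIONPROPERTY('Chinese_PRC_Stroke_CI_AI_KS_WS', 'CodePage');
--Look up the code page of the current database's default collation
SELECT COLLATIONPROPERTY(CONVERT(nvarchar(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation')), 'CodePage');
Possible results:
936 Simplified Chinese (GBK)
950 Traditional Chinese (Big5)
437 US/Canadian English
932 Japanese
949 Korean
866 Russian
65001 Unicode (UTF-8)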
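Once the code page number is known, it can be used to compute varchar byte lengths on the client side. The sketch below is an assumption-laden illustration: `codec_for_code_page` is a hypothetical helper, and it relies on Python naming its Windows code page codecs `cp<number>`, with 65001 treated as UTF-8.

```python
import codecs


def codec_for_code_page(code_page: int) -> str:
    """Map a code page number to a Python codec name (hypothetical helper)."""
    name = "utf-8" if code_page == 65001 else f"cp{code_page}"
    # codecs.lookup raises LookupError if Python has no such codec
    return codecs.lookup(name).name


codec = codec_for_code_page(936)
byte_length = len("狮lion".encode(codec))
print(codec, byte_length)
```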

c. Differences between char, varchar, and nvarchar

  1. char(n) is fixed length and is generally slightly more efficient to process than varchar. When the value is shorter than the declared length, for example fewer than 8 characters in a char(8) column, it is padded with trailing spaces. When the value is longer than the declared length, CAST/CONVERT and variable assignment silently cut off the excess, while an INSERT into a table column raises the truncation error described above (with the default ANSI_WARNINGS ON setting).
  2. varchar(n) is a variable-length, non-Unicode character type, where n is between 1 and 8000 bytes. An English letter takes one byte and, under a Chinese code page, a Chinese character takes two bytes. Advantage: space is used more efficiently, with little waste.
  3. nvarchar(n) is a variable-length Unicode character type, where n is between 1 and 4000. Characters take two bytes each whether they are English or not (supplementary characters take four).
    Suppose a varchar column and an nvarchar column both hold the value 你好hello (two Chinese characters followed by five letters).
    The varchar column then takes 2 × 2 + 5 = 9 bytes of storage, and the nvarchar column takes 7 × 2 = 14 bytes.
    If the values contain only English, use varchar; if they may contain multi-byte characters (Chinese, Korean, and so on), use nvarchar.
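The storage arithmetic for the three types can be sketched in Python. This is a simplified model, not SQL Server's actual implementation: `stored_bytes` is a hypothetical helper, it assumes code page 936 (GBK) for char and varchar and UTF-16 for nvarchar, it ignores per-row length overhead, and it takes the example value to be two Chinese characters plus five letters (你好hello).

```python
def stored_bytes(value: str, column_type: str, n: int, encoding: str = "gbk") -> int:
    """Approximate data bytes used by value in a char/varchar/nvarchar(n) column."""
    if column_type == "nvarchar":
        size = len(value.encode("utf-16-le"))
        limit = 2 * n  # nvarchar(n) holds n two-byte units
    else:
        size = len(value.encode(encoding))
        limit = n      # char(n) and varchar(n) hold n bytes
    if size > limit:
        raise ValueError("String or binary data would be truncated")
    # char always occupies the full declared length (space padded)
    return n if column_type == "char" else size


value = "你好hello"
print(stored_bytes(value, "varchar", 20))   # 2 * 2 + 5 = 9
print(stored_bytes(value, "nvarchar", 20))  # 7 * 2 = 14
print(stored_bytes(value, "char", 20))      # padded to 20
```

Calling `stored_bytes(value, "varchar", 8)` raises `ValueError`, mirroring the truncation error for a 9-byte value in a varchar(8) column.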

Reference: https://www.cnblogs.com/flqcchblog/p/4560781.html

Origin www.cnblogs.com/littlewu/p/12153368.html