CHAR and VARCHAR in Detail

Preface: 

I previously wrote an article on the int type and have long wanted to write one on string field types, but I kept putting it off because I did not know where to start. Having recently read an article on this topic, I decided to finally write this long-delayed piece. This article explains how the char and varchar string types are used and the main differences between them.

The test environment for this article is MySQL 5.7.23 with the InnoDB storage engine, sql_mode set to strict mode, and the utf8 character set.

▍1.CHAR Type Description

When we define a char column, we usually specify a length M, i.e. char(M). M is a number of characters, that is, the maximum number of characters the column can store. If M is not specified, it defaults to 1, and its valid range is [0,255]. A single letter, digit, or Chinese character each counts as one character. Under the utf8 character set, a Chinese character takes 3 bytes. A brief test follows:

# Assume the test table is created with the following statement
CREATE TABLE `char_tb1` (
  `col1` char DEFAULT NULL,
  `col2` char(5) DEFAULT NULL,
  `col3` char(10) DEFAULT NULL
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

# Query the table definition in the database: M in char(M) may be omitted and defaults to 1
mysql> show create table char_tb1\G
*************************** 1. row ***************************
       Table: char_tb1
Create Table: CREATE TABLE `char_tb1` (
  `col1` char(1) DEFAULT NULL,
  `col2` char(5) DEFAULT NULL,
  `col3` char(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

# Insert data: M is the maximum number of characters stored; a letter, digit, or Chinese character each counts as one character
mysql> insert into char_tb1 (col1) values ('a'),('1'),('王'),(']');
Query OK, 4 rows affected (0.01 sec)
mysql> insert into char_tb1 (col1) values ('aa'),('12');
ERROR 1406 (22001): Data too long for column 'col1' at row 1
mysql> select * from char_tb1;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| a    | NULL | NULL |
| 1    | NULL | NULL |
| 王   | NULL | NULL |
| ]    | NULL | NULL |
+------+------+------+
4 rows in set (0.00 sec)
mysql> insert into char_tb1 (col2) values ('abcd'),('王-123'),('^*123'),('12'),('一二三四五');
Query OK, 5 rows affected (0.01 sec)
mysql> insert into char_tb1 (col2) values ('abcdef');
ERROR 1406 (22001): Data too long for column 'col2' at row 1
mysql> select * from char_tb1;
+------+-----------------+------+
| col1 | col2            | col3 |
+------+-----------------+------+
| a    | NULL            | NULL |
| 1    | NULL            | NULL |
| 王   | NULL            | NULL |
| ]    | NULL            | NULL |
| NULL | abcd            | NULL |
| NULL | 王-123          | NULL |
| NULL | ^*123           | NULL |
| NULL | 12              | NULL |
| NULL | 一二三四五      | NULL |
+------+-----------------+------+
9 rows in set (0.00 sec)

# The tests below show that the range of M is [0,255]
mysql> alter table char_tb1 add column col4 char(0);
Query OK, 0 rows affected (0.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table char_tb1 add column col5 char(255);
Query OK, 0 rows affected (0.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table char_tb1 add column col5 char(256);
ERROR 1074 (42000): Column length too big for column 'col5' (max = 255); use BLOB or TEXT instead
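
The distinction between characters (what M limits) and bytes (what is actually stored) can be checked outside the database. The following is a minimal Python sketch, not MySQL code: it assumes that Python's UTF-8 encoding gives the same byte counts as MySQL's utf8 character set for the sample strings used above, and the helper names `char_and_byte_length` and `fits_char` are made up for illustration.

```python
# Sketch: compare character count (limited by M) with UTF-8 byte count.
# Assumption: Python's UTF-8 encoding matches MySQL utf8 for these samples.

def char_and_byte_length(value):
    """Return (number of characters, number of UTF-8 encoded bytes)."""
    return len(value), len(value.encode("utf-8"))


def fits_char(value, m):
    """True if value has at most m characters, as char(m) requires in strict mode."""
    return len(value) <= m


if __name__ == "__main__":
    for sample in ["abcd", "王-123", "一二三四五", "abcdef"]:
        chars, size = char_and_byte_length(sample)
        print(sample, "chars:", chars, "bytes:", size, "fits char(5):", fits_char(sample, 5))
```

The five-character strings all fit in char(5) even though their byte lengths differ, while the six-character 'abcdef' does not, which mirrors the ERROR 1406 seen in the session above.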

▍2.VARCHAR Type Description

Likewise, M in varchar(M) is the maximum number of characters stored, and a single letter, digit, or Chinese character each counts as one character. A varchar value can be 0 to 65535 bytes long. In addition, varchar needs 1 or 2 extra bytes to record the length of the string: 1 byte if the maximum length of the column is 255 bytes or less, 2 bytes otherwise. With the InnoDB engine and the utf8 character set, a single Chinese character takes 3 bytes, so M in varchar(M) can be at most 21844, i.e. the range of M is [0,21845), and M must be specified. MySQL also limits the size of a single row to 65535 bytes, not counting TEXT and BLOB columns. In other words, the combined length of all columns defined in a table cannot exceed 65535 bytes, so not every varchar(M) column can take M up to 21844. We verify this below:

# Assume the test table is created with the following statement
CREATE TABLE `varchar_tb1` (
  `col1` varchar(0) DEFAULT NULL
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

# View the table definition and add a column: M must be specified
mysql> show create table varchar_tb1\G
*************************** 1. row ***************************
       Table: varchar_tb1
Create Table: CREATE TABLE `varchar_tb1` (
  `col1` varchar(0) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> alter table varchar_tb1 add column col2 varchar;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1

# The tests below show that M can be at most 21844
mysql> CREATE TABLE `varchar_tb2` (col1 varchar(21844));
Query OK, 0 rows affected (0.04 sec)

mysql> CREATE TABLE `varchar_tb3` (col1 varchar(218445));
ERROR 1074 (42000): Column length too big for column 'col1' (max = 21845); use BLOB or TEXT instead

# The tests below show that the maximum row size is 65535 bytes
mysql> CREATE TABLE `varchar_tb3` (col1 varchar(10));
Query OK, 0 rows affected (0.04 sec)

mysql> alter table varchar_tb3 add column col2 varchar(21844);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
mysql> alter table varchar_tb3 add column col2 varchar(21834);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
mysql> alter table varchar_tb3 add column col2 varchar(21833);
Query OK, 0 rows affected (0.09 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table varchar_tb3\G
*************************** 1. row ***************************
       Table: varchar_tb3
Create Table: CREATE TABLE `varchar_tb3` (
  `col1` varchar(10) DEFAULT NULL,
  `col2` varchar(21833) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
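
The limits 21844 and 21833 observed above can be reproduced with simple arithmetic. The Python sketch below is a simplified model, not MySQL's actual implementation. It assumes that every column is a nullable varchar under utf8 (3 bytes per character), that each column costs its maximum data bytes plus a 1- or 2-byte length prefix, that nullable columns share a NULL bitmap of one byte per eight columns, and that the new column is large enough to need a 2-byte prefix. The function names are hypothetical.

```python
# Simplified row-size model for nullable varchar columns under utf8.
# Assumptions (not MySQL source code): 3 bytes per character, a 1- or 2-byte
# length prefix per column, and one NULL-bitmap byte per 8 nullable columns.

ROW_LIMIT = 65535      # maximum row size in bytes, excluding TEXT/BLOB
BYTES_PER_CHAR = 3     # utf8


def varchar_max_bytes(m):
    """Maximum bytes a varchar(m) column can occupy, including its length prefix."""
    data = m * BYTES_PER_CHAR
    prefix = 1 if data <= 255 else 2
    return data + prefix


def null_bitmap_bytes(nullable_columns):
    """Bytes needed to flag NULLs: one bit per nullable column, rounded up."""
    return (nullable_columns + 7) // 8


def max_m_for_new_varchar(existing_ms):
    """Largest M for one more nullable varchar column, assuming a 2-byte prefix."""
    used = sum(varchar_max_bytes(m) for m in existing_ms)
    used += null_bitmap_bytes(len(existing_ms) + 1)
    budget = ROW_LIMIT - used - 2
    return budget // BYTES_PER_CHAR


if __name__ == "__main__":
    print(max_m_for_new_varchar([]))    # single-column table, as in varchar_tb2
    print(max_m_for_new_varchar([10]))  # table that already has varchar(10), as in varchar_tb3
```

Under these assumptions the model yields 21844 for an empty table and 21833 for a table that already has a varchar(10) column, matching the results of the session above.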

▍3.CHAR Compared with VARCHAR

CHAR is a fixed-length type: MySQL always allocates enough space for the defined string length. When CHAR values are stored, they are right-padded with spaces to the specified length; when CHAR values are retrieved, the trailing spaces are removed.
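
The pad-on-store, strip-on-retrieve behavior can be illustrated with a toy model. The Python sketch below only imitates the rule described above and is not how MySQL is implemented; `char_store` and `char_retrieve` are hypothetical names.

```python
# Toy model of CHAR(M): pad with spaces when storing, strip them when retrieving.
# This imitates the behavior described in the text; it is not MySQL code.

def char_store(value, m):
    """Right-pad value with spaces to exactly m characters; reject longer values."""
    if len(value) > m:
        raise ValueError("Data too long for column")
    return value.ljust(m)


def char_retrieve(stored):
    """Remove trailing spaces, as happens when a CHAR value is read back."""
    return stored.rstrip(" ")


if __name__ == "__main__":
    stored = char_store("ab", 5)
    print(repr(stored))                  # padded to 5 characters
    print(repr(char_retrieve(stored)))   # padding removed again
    # A trailing space that was part of the input is also lost on retrieval.
    print(repr(char_retrieve(char_store("ab ", 5))))
```

One consequence worth noting is that a value ending in spaces does not survive a round trip through a CHAR column in this model, because the original trailing spaces are indistinguishable from the padding.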

VARCHAR stores variable-length strings. When a value is stored, if it has fewer characters than the defined length, it is not padded with trailing spaces. However, because the rows are variable-length, an UPDATE may make a row longer than it was, which causes extra work. If a row grows and there is no more room in its page to store it, InnoDB has to split the page so that the row fits, which increases fragmentation.

The following is a brief summary of the application scenarios for the CHAR and VARCHAR column types:

CHAR is suited to storing very short strings, or values that are all close to the same length. For example, CHAR is a good fit for storing the MD5 value of a password, since this value has a fixed length. For frequently changing data, CHAR is also better than VARCHAR, because the fixed-length CHAR type is not prone to fragmentation. For very short columns, CHAR is also more space-efficient than VARCHAR. For example, a CHAR(1) that stores only the values Y and N takes just one byte in a single-byte character set, whereas a VARCHAR(1) takes two bytes, because an extra byte records the length.
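
The byte counts in the paragraph above can be written out as a small calculation. The Python sketch below assumes a single-byte character set and uses the storage rules stated in this article (CHAR takes M bytes; VARCHAR takes the value length plus a 1- or 2-byte prefix). The function names are made up, and the password is only sample input.

```python
# Sketch: storage cost of CHAR vs VARCHAR in a single-byte character set,
# using the rules stated in the article. Function names are hypothetical.
import hashlib


def char_bytes(m, bytes_per_char=1):
    """CHAR(m) always occupies m characters' worth of bytes."""
    return m * bytes_per_char


def varchar_stored_bytes(value_len, m, bytes_per_char=1):
    """VARCHAR(m) occupies the value's bytes plus a 1- or 2-byte length prefix."""
    prefix = 1 if m * bytes_per_char <= 255 else 2
    return value_len * bytes_per_char + prefix


if __name__ == "__main__":
    # An MD5 hex digest always has the same length, so CHAR wastes nothing on it.
    digest = hashlib.md5(b"example-password").hexdigest()
    print(len(digest), char_bytes(len(digest)), varchar_stored_bytes(len(digest), len(digest)))
    # The Y/N flag from the text: 1 byte as CHAR(1), 2 bytes as VARCHAR(1).
    print(char_bytes(1), varchar_stored_bytes(1, 1))
```

For fixed-length values such as the 32-character MD5 digest, the VARCHAR length prefix is pure overhead, which is the reason the text recommends CHAR in that case.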

VARCHAR is suitable in the following cases: the strings are long or vary widely in length; the maximum length of the string column is much greater than the average length; the column is rarely updated, so fragmentation is not a problem.

One more note: when defining the maximum length of a column, we should allocate on demand and estimate it in advance. This matters especially for VARCHAR columns. Some people feel that since VARCHAR allocates space according to the actual length anyway, they might as well make it a bit bigger. That is not the case. Suppose we need to store address information and estimate that 100 characters is enough. We could use VARCHAR(100) or VARCHAR(200); when both store 90 characters of data, they take the same storage space but consume different amounts of memory. Longer columns may consume more memory because MySQL usually allocates fixed-size blocks of memory to hold values internally, which is especially bad when in-memory temporary tables are used for sorting or other operations. So we should not be too generous when sizing VARCHAR columns. Assess the actual or expected length, then set the column to the longest expected character length. To allow some redundancy, leave about 10% extra. Do not assume that, because VARCHAR allocates storage according to the actual length, the length can be set arbitrarily or simply set to the maximum.
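
The address example above can be put into numbers. The Python sketch below is a rough model under stated assumptions, not a measurement of MySQL: it assumes utf8 at 3 bytes per character, that on-disk storage is the value's bytes plus a length prefix, and that an in-memory buffer for a column is sized to the declared maximum (M characters times 3 bytes). The function names and the 10% rounding helper are hypothetical.

```python
# Rough model: same on-disk size, different fixed-size memory buffers.
# Assumptions: utf8 (3 bytes per character) and an in-memory buffer sized to
# the declared maximum length. This is an illustration, not a measurement.

BYTES_PER_CHAR = 3


def stored_bytes(value, m):
    """On-disk bytes for value in varchar(m): data plus 1- or 2-byte prefix."""
    data = len(value.encode("utf-8"))
    prefix = 1 if m * BYTES_PER_CHAR <= 255 else 2
    return data + prefix


def fixed_buffer_bytes(m):
    """Bytes reserved per value if memory is allocated for the full declared length."""
    return m * BYTES_PER_CHAR


def padded_length(expected_max, margin_percent=10):
    """Expected maximum plus a safety margin, rounded up with integer arithmetic."""
    return (expected_max * (100 + margin_percent) + 99) // 100


if __name__ == "__main__":
    address = "x" * 90  # a 90-character value, as in the example above
    for m in (100, 200):
        print(m, stored_bytes(address, m), fixed_buffer_bytes(m))
    print(padded_length(100))
```

In this model both definitions store the 90-character value in the same number of bytes, but the buffer for VARCHAR(200) is twice the size of the one for VARCHAR(100), and the difference is multiplied by the number of rows held in memory.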

Summary:

This article describes how the CHAR and VARCHAR column types are used, compares the two, and gives application scenarios for each. In actual production, the choice depends on the specific situation, and what fits is what is best. I hope this article can serve as a reference for you.

Origin blog.51cto.com/10814168/2450732