Detailed explanation of MySQL field length, characters, and bytes
1. Introduction
Material on this topic is a mixed bag of good and bad, and most people are vague about it. To begin with, here is a diagram of the commonly used MySQL types.
2. Bytes
The first thing to agree on: 1 byte equals 8 bits.
A bit is the smallest unit of information and can represent a 0 or a 1 (that is, binary); from this we can work out that a byte can represent 256 values.
The calculation is as follows:
A byte is an 8-bit binary number, and since it is binary, each bit has only two possible values: 0 and 1.
The values run from [00000000] to [11111111], so by counting combinations there are 2^8 = 256 possible values.
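The count above can be checked with a few lines of Python, used here only as a calculator. The variable names are illustrative and not part of the original walkthrough.

```python
# Each of the 8 bits has 2 possible states, so a byte has 2**8 combinations.
BITS_PER_BYTE = 8
values_per_byte = 2 ** BITS_PER_BYTE

# The smallest and largest bit patterns, parsed as base-2 numbers.
lowest = int("00000000", 2)
highest = int("11111111", 2)

print(values_per_byte)   # number of distinct values in one byte
print(lowest, highest)   # smallest and largest unsigned value
```

Counting every value from the lowest to the highest pattern, inclusive, gives the same total as 2^8.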
3. Characters
Character is a general term. Letters, digits, arithmetic symbols, punctuation marks, other symbols, and some control symbols are all characters. For example: &, 中, A.
So how many bytes does a character take up? That depends on the character encoding used by the database.
Generally speaking, an English letter (A, b) occupies 1 byte. Under UTF-8 encoding, a Chinese character typically occupies 3 bytes; under GBK encoding, a Chinese character occupies 2 bytes. The character count, by contrast, does not depend on the encoding:
完了,gg了! ==> seven characters
hello,世界 ==> eight characters
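The difference between counting characters and counting bytes can be sketched in Python. This is only an illustration: the `describe` helper is a hypothetical name, and the sample string uses an ASCII comma so that the byte counts are unambiguous.

```python
def describe(text: str) -> dict:
    """Return the character count and the encoded size in bytes."""
    return {
        "chars": len(text),                       # number of characters
        "utf8_bytes": len(text.encode("utf-8")),  # size under UTF-8
        "gbk_bytes": len(text.encode("gbk")),     # size under GBK
    }


for sample in ["A", "中", "hello,世界"]:
    print(repr(sample), describe(sample))
```

The character count stays the same under both encodings; only the byte size changes.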
4. Field length
This concept comes up when creating or editing a table, because that is where fields are designed. For example, in int(11) and varchar(255) in the table below, the 11 and 255 in parentheses are the field length.
CREATE TABLE `sys_test` (
`t_id` int(11) NOT NULL COMMENT 'Primary key ID',
`name` varchar(255) DEFAULT NULL COMMENT 'Name',
`age` tinyint(4) DEFAULT NULL COMMENT 'Age',
PRIMARY KEY (`t_id`)
);
Note: strictly speaking, the field length here is the display width, and different data types treat this width differently.
This is where many people get confused, so here are some hands-on examples to illustrate.
4.1 Example: int(1), int(4), and int(11)
For numeric types, take int(1), int(4), and int(11) as examples. All three can store exactly the same range of numbers: -2^31 to 2^31-1, or 0 to 2^32-1 when unsigned.
- Create a test table with the following structure.
- Insert 123456789 into all three fields of this table. It does not exceed the value range of the int type, so the insertion succeeds.
As you can see, whether the column is int(1) or int(4), a 9-digit number is inserted successfully and displayed normally.
- Insert 12345678910 into all three fields of this table. It exceeds the value range of the int type, so the insertion fails.
- Next, modify the table structure and tick the zero fill (ZEROFILL) option. After saving, you will find that the previously unticked unsigned (UNSIGNED) option has been ticked as well. Zero fill means that when you set a length, such as int(4), and insert a value with fewer digits, such as the 3-digit 999, the missing digits are filled with 0 in front, so the query result becomes 0999. In business scenarios, padding with leading zeros only makes sense for natural numbers, so once zero fill is ticked the field automatically becomes an unsigned field that can only store natural numbers, with the value range 0 to 2^32-1.
- With zero fill ticked, only natural numbers can be stored, and inserting a negative number reports an error.
- Insert 999 into all three fields of this table; the insertion succeeds. In the query result, any field with too few digits is padded with 0.
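CREATE TABLE `tb_int_test` (
`int1` int(1) DEFAULT NULL,
`int4` int(4) DEFAULT NULL,
`int11` int(11) DEFAULT NULL
);
-- Insert succeeds
INSERT INTO tb_int_test VALUES (123456789, 123456789, 123456789);
-- Query the result
SELECT * FROM tb_int_test;
-- Insert fails: out of range (under strict SQL mode)
INSERT INTO tb_int_test VALUES (12345678910, 12345678910, 12345678910);
-- SQL equivalent of ticking zero fill in a GUI client; ZEROFILL also makes the columns UNSIGNED
ALTER TABLE tb_int_test
MODIFY `int1` int(1) ZEROFILL DEFAULT NULL,
MODIFY `int4` int(4) ZEROFILL DEFAULT NULL,
MODIFY `int11` int(11) ZEROFILL DEFAULT NULL;
-- Insert fails: out of range, because the columns are now unsigned
INSERT INTO tb_int_test VALUES (-1, -2, -3);
-- Insert succeeds. In the query result, missing digits are padded with 0.
INSERT INTO tb_int_test VALUES (999, 999, 999);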
In summary, we can draw a conclusion:
For numeric types: as long as the value is within the type's range, the field length M is only a display width. When the field is set to zero fill (and is therefore unsigned), values with fewer than M digits are left-padded with zeros up to M digits.
int(1) pads to 1 digit, so no padding is ever visible; int(4) pads to 4 digits, so at most 3 zeros are added in front of a non-zero value, and so on. int(M) is typically used together with zerofill to get zero-padded display.
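The display rule can be mimicked in Python as a rough sketch. `zerofill_display` is a hypothetical helper that only imitates how a zero-filled int(M) column shows a value; it is not MySQL's implementation.

```python
UNSIGNED_INT_MAX = 2**32 - 1  # upper bound of an unsigned int column


def zerofill_display(value: int, width: int) -> str:
    """Imitate how an int(width) ZEROFILL column would display a value."""
    if not 0 <= value <= UNSIGNED_INT_MAX:
        # Zero fill implies unsigned, so negatives and oversized values are rejected.
        raise ValueError("out of range for an unsigned int")
    # zfill only pads on the left; values wider than `width` are returned unchanged.
    return str(value).zfill(width)


for width in (1, 4, 11):
    print(width, zerofill_display(999, width), zerofill_display(123456789, width))
```

The width never truncates or rejects a value; only the range check does.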
4.2 Why int is often set to int(11)
From the conclusion above, we know that M in int(M) only has meaning as long as the value stays within the range of the int type.
The int type can store -2^31 to 2^31-1, or 0 to 2^32-1 when unsigned, that is, -2147483648 to 2147483647, or 0 to 4294967295.
Leaving unsigned aside, -2147483648 is 11 characters wide (the minus sign counts as one), which is the reason for setting int(11). For unsigned values only 10 digits are needed, so int(10) is enough.
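The widths 11 and 10 can be derived directly from the range limits. The following Python snippet is a small sketch of that arithmetic, with illustrative constant names.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
UNSIGNED_INT_MAX = 2**32 - 1

# str() includes the minus sign, so len() counts it as one position.
signed_width = max(len(str(INT_MIN)), len(str(INT_MAX)))
unsigned_width = len(str(UNSIGNED_INT_MAX))

print(INT_MIN, INT_MAX, signed_width)
print(UNSIGNED_INT_MAX, unsigned_width)
```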
4.3 varchar(5) and varchar(255)
- Create a table with the following structure:
- Insert 7 characters into both fields of this table. The insertion fails because the value is too long for the varchar5 field.
- Insert 5 characters into both fields of this table. The insertion succeeds.
CREATE TABLE `tb_varchar_test` (
`varchar255` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`varchar5` varchar(5) COLLATE utf8_unicode_ci DEFAULT NULL
);
-- Inserting 7 characters fails: too long for varchar5
INSERT INTO tb_varchar_test VALUES ("钓鱼岛是中国的", "钓鱼岛是中国的");
-- Inserting 5 characters succeeds
INSERT INTO tb_varchar_test VALUES ("我是中国人", "我是中国人");
In summary, we can draw a conclusion:
For character types such as char(1) and varchar(255), the length is the maximum number of characters (not bytes) that can be inserted, and an error is reported when the inserted data exceeds it.
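That the limit is counted in characters rather than bytes can be illustrated with a short Python sketch. `fits_varchar` is a hypothetical helper, and the 7-character sample is made up for the illustration.

```python
def fits_varchar(text: str, max_chars: int) -> bool:
    """True if text fits a varchar(max_chars) column, counting characters."""
    return len(text) <= max_chars


five = "我是中国人"   # 5 characters, but 15 bytes in UTF-8
seven = five + "ab"   # 7 characters

print(len(five), len(five.encode("utf-8")), fits_varchar(five, 5))
print(len(seven), fits_varchar(seven, 5), fits_varchar(seven, 255))
```

The 5-character string fits varchar(5) even though it takes more than 5 bytes, which matches the successful insert in the test above.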
4.4 char and varchar
The length of char is fixed, while the length of varchar is variable.
That is, define a char(10) and a varchar(10) and store csdn in both: the varchar consumes a length of 4, while the char is fixed at 10, with six spaces appended after the characters csdn. When reading the data back, MySQL strips those trailing spaces from char values by default (unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled), so trim() is only needed when the padding is preserved; varchar needs no such handling.
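The padding behaviour can be pictured with a minimal Python simulation. `store_char` and `store_varchar` are hypothetical helpers that model only the length handling described here, not MySQL's actual storage format.

```python
def store_char(text: str, length: int) -> str:
    """Simulate char(length): right-pad with spaces to a fixed size."""
    if len(text) > length:
        raise ValueError("value too long for column")
    return text.ljust(length)


def store_varchar(text: str, length: int) -> str:
    """Simulate varchar(length): keep the value at its own length."""
    if len(text) > length:
        raise ValueError("value too long for column")
    return text


stored_char = store_char("csdn", 10)
stored_varchar = store_varchar("csdn", 10)

print(len(stored_char), stored_char.count(" "))    # fixed size and padding count
print(len(stored_varchar))                         # only what was stored
print(stored_char.rstrip(" ") == stored_varchar)   # stripping padding recovers the value
```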
Access to char is generally considered faster than access to varchar, because its fixed length is convenient for storage and lookup, while varchar makes better use of space. In other words, char trades space for time: because its length is fixed, padding inevitably takes up extra space.
So char is suitable for fixed-length strings such as ID numbers and mobile phone numbers, while varchar is suitable for variable-length strings such as addresses and personal profiles. Choosing the right type for the right scenario is what matters.