Which column type should a backend management system use to store articles? How do you store long passages of text (more than 1000 Chinese characters) in a MySQL database?

Question

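For a small article website, I chose varchar(8000) as the column that stores the content. I assumed it holds up to 4000 Chinese characters, which should be enough for most users. (In current MySQL versions the length of a VARCHAR column is counted in characters rather than bytes, so varchar(8000) actually allows 8000 characters, subject to the 65,535-byte row size limit.) One problem is that users are also allowed to enter code (HTML, JS, CSS, etc.), which makes the stored content considerably longer.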

So should I use varchar or text? Or something else?

Conclusion

Use TEXT.

Reference material

TEXT

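TEXT comes in 4 types:

  • TINYTEXT: up to 255 bytes (2^8-1)
  • TEXT: up to 65,535 bytes (about 64 KB)
  • MEDIUMTEXT: up to 16,777,215 bytes (about 16 MB)
  • LONGTEXT: up to 4,294,967,295 bytes (about 4 GB)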
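
As a rough illustration of how the limits above translate into a choice of type, the following sketch picks the smallest TEXT type that can hold a given piece of content. The class and method names (TextTypeChooser, smallestTextTypeFor, utf8Length) are hypothetical helpers made up for this example, and it assumes the column uses a UTF-8 based character set such as utf8mb4, so that Java's UTF-8 byte count matches what MySQL stores.

```java
import java.nio.charset.StandardCharsets;

final class TextTypeChooser {
    // Maximum payload sizes in bytes for the four MySQL TEXT types.
    static final long TINYTEXT_MAX = 255L;            // 2^8 - 1
    static final long TEXT_MAX = 65_535L;             // 2^16 - 1
    static final long MEDIUMTEXT_MAX = 16_777_215L;   // 2^24 - 1
    static final long LONGTEXT_MAX = 4_294_967_295L;  // 2^32 - 1

    /** Returns the smallest TEXT type whose byte limit can hold byteLength bytes. */
    static String smallestTextTypeFor(long byteLength) {
        if (byteLength < 0) {
            throw new IllegalArgumentException("byteLength must not be negative");
        }
        if (byteLength <= TINYTEXT_MAX) return "TINYTEXT";
        if (byteLength <= TEXT_MAX) return "TEXT";
        if (byteLength <= MEDIUMTEXT_MAX) return "MEDIUMTEXT";
        if (byteLength <= LONGTEXT_MAX) return "LONGTEXT";
        throw new IllegalArgumentException("content is too large for any TEXT type");
    }

    /** Byte length of the content when encoded as UTF-8 (assumed column encoding). */
    static long utf8Length(String content) {
        return content.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 4000 copies of one Chinese character, which takes 3 bytes in UTF-8.
        String article = "\u4e2d".repeat(4000);
        long bytes = utf8Length(article);
        System.out.println(bytes + " bytes -> " + smallestTextTypeFor(bytes));
        // prints "12000 bytes -> TEXT"
    }
}
```

Under these assumptions, an article of 4000 Chinese characters fits comfortably in a plain TEXT column; MEDIUMTEXT only becomes necessary beyond roughly 21,000 such characters.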

Using JPA annotations

@Lob
@Basic(fetch = FetchType.LAZY) // load the content only when it is accessed
@Column(columnDefinition = "TEXT", nullable = true)
public String getContent() {
    return content;
}

BLOB

BLOB also comes in 4 types:

  • TINYBLOB
  • BLOB
  • MEDIUMBLOB
  • LONGBLOB

BLOB stores binary data, so it can be used to store pictures in the database. TEXT can only store text.
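
A small sketch of what "binary data" means for the application: when text is put into a BLOB, the database keeps only bytes, so the application has to remember which encoding it used. The class BlobBytesDemo and its methods are hypothetical names for this illustration, and no database is involved; the byte array simply stands in for the value of a BLOB column.

```java
import java.nio.charset.StandardCharsets;

final class BlobBytesDemo {
    /** Encodes text into the raw bytes an application would write to a BLOB column. */
    static byte[] toBlobBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes the bytes with the same encoding that was used for writing. */
    static String decodeUtf8(byte[] blob) {
        return new String(blob, StandardCharsets.UTF_8);
    }

    /** Decodes the bytes with a different encoding, which garbles non-ASCII text. */
    static String decodeLatin1(byte[] blob) {
        return new String(blob, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        byte[] blob = toBlobBytes(original);
        System.out.println("bytes stored: " + blob.length);
        System.out.println("round trip ok: " + decodeUtf8(blob).equals(original));
        System.out.println("wrong charset ok: " + decodeLatin1(blob).equals(original));
    }
}
```

With a TEXT column this bookkeeping is done by MySQL, because the column itself carries a character set.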

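Using JPA annotations

@Lob
@Basic(fetch = FetchType.LAZY) // load the content only when it is accessed
@Column(columnDefinition = "BLOB", nullable = true)
public byte[] getContent() {
    // binary data maps to byte[] rather than String
    return content;
}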

TEXT and BLOB

TINYTEXT: stores variable-length character data, with a maximum length of 255 (2^8-1) bytes.

TEXT: stores variable-length character data, with a maximum length of 65,535 (2^16-1) bytes.

MEDIUMTEXT: stores variable-length character data, with a maximum length of 16,777,215 (2^24-1) bytes.

LONGTEXT: stores variable-length character data, with a maximum length of 4,294,967,295 (2^32-1) bytes.

The limits are counted in bytes, so with a multi-byte character set such as utf8mb4 fewer characters fit (a common Chinese character takes 3 bytes).

A data type similar to TEXT is BLOB. The differences are:

  • BLOB stores binary data, and TEXT stores character data.

  • The advantage of BLOB is that both text and images can be stored in the database in binary form. In practice, however, most images are now referenced from the front end with tags and served from image hosting services or a CDN, so only text data ends up in our own database, which makes TEXT the more commonly used type.

  • BLOB columns have no character set, and sorting and comparison are based on the numeric values of the bytes in the column values. TEXT columns have a character set, and values are sorted and compared according to the collation of that character set. Therefore, TEXT is recommended when storing text that contains Chinese characters.
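
The difference between byte-based and collation-based comparison can be imitated in plain Java. This is only an analogy, not MySQL's implementation: Arrays.compareUnsigned plays the role of a BLOB comparison, and String.CASE_INSENSITIVE_ORDER stands in for a case-insensitive collation on a TEXT column. The class name BlobTextOrdering is made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BlobTextOrdering {
    /** BLOB-style comparison: unsigned byte values only, no character set involved. */
    static int compareAsBlob(String a, String b) {
        return Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8));
    }

    /** TEXT-style comparison; a case-insensitive comparator stands in for a collation. */
    static int compareAsText(String a, String b) {
        return String.CASE_INSENSITIVE_ORDER.compare(a, b);
    }

    public static void main(String[] args) {
        // As bytes, 'B' (0x42) sorts before 'a' (0x61); under the collation, "a" sorts first.
        System.out.println("blob order B vs a: " + Integer.signum(compareAsBlob("B", "a")));
        System.out.println("text order B vs a: " + Integer.signum(compareAsText("B", "a")));
        // As bytes, "a" and "A" differ; under the collation they compare as equal.
        System.out.println("blob a vs A equal: " + (compareAsBlob("a", "A") == 0));
        System.out.println("text a vs A equal: " + (compareAsText("a", "A") == 0));
    }
}
```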

What TEXT and BLOB have in common:

  • Each comes in four types with different maximum lengths.

  • Trailing spaces are not removed when values are stored in or retrieved from BLOB and TEXT columns.

  • For indexes on BLOB and TEXT columns, the length of the index prefix must be specified.

  • BLOB and TEXT columns cannot have a literal default value (since MySQL 8.0.13, a default can be given only as an expression).

  • Only the first max_sort_length bytes of the column are used when sorting. The default value of max_sort_length is 1024.

  • When you want bytes beyond max_sort_length to be significant in GROUP BY or ORDER BY on BLOB or TEXT columns containing long values, another approach is to convert the column value into a fixed-length object. The standard way to do this is the SUBSTRING function.

  • The maximum size of a BLOB or TEXT object is determined by its type, but the largest value that can actually be transmitted between client and server is determined by the amount of available memory and the size of the communication buffers. You can change the buffer size by changing the value of the max_allowed_packet variable, but you must do so for both the server and the client program.
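
The effect of max_sort_length can be pictured with a simplified model: if only the first N bytes of each value take part in the comparison, two long values that differ only after byte N look identical to the sort. The sketch below compares raw UTF-8 bytes, which is an assumption made for brevity (a real TEXT column sorts by collation), and SortLengthDemo and compareFirstBytes are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SortLengthDemo {
    /** Compares two values using at most maxSortLength leading bytes of each. */
    static int compareFirstBytes(String a, String b, int maxSortLength) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        return Arrays.compareUnsigned(
                x, 0, Math.min(x.length, maxSortLength),
                y, 0, Math.min(y.length, maxSortLength));
    }

    public static void main(String[] args) {
        String prefix = "x".repeat(1024);
        String first = prefix + "a";
        String second = prefix + "b";
        // With a limit of 1024 bytes the differing last byte is never looked at.
        System.out.println("limit 1024: " + compareFirstBytes(first, second, 1024));
        // With a larger limit the two values are ordered as expected.
        System.out.println("limit 2048: " + Integer.signum(compareFirstBytes(first, second, 2048)));
    }
}
```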

Usage: if you do not need to store binary data (for example, pictures mixed with text), or if you are storing Chinese text, TEXT is recommended.

CHAR and VARCHAR

CHAR is used to store fixed-length data. Indexes on CHAR columns are highly efficient, but CHAR is not suitable for data of uncertain length. For example, if you define char(10), every value occupies the space of 10 characters regardless of whether the stored data is that long; shorter values are padded with spaces.

VARCHAR was designed to solve this problem: it stores variable-length data, at the cost of some processing efficiency. If the possible values of a field are not of fixed length and we only know that they cannot exceed 10 characters, defining it as VARCHAR(10) is the most economical choice.

The space occupied by a VARCHAR value is the actual length of the value + 1. Why "+1"? The extra byte stores the length actually used. (If the column can hold more than 255 bytes, two bytes are used for the length instead.)
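
The length-prefix rule can be written down as a small calculation. This is a sketch of the rule as stated in the MySQL storage-requirements documentation (one length byte if the column can never need more than 255 bytes, two otherwise); individual storage engines may differ in detail. VarcharStorage and its parameters are hypothetical names, and a UTF-8 based column character set is assumed.

```java
import java.nio.charset.StandardCharsets;

final class VarcharStorage {
    /** Size of the length prefix for a VARCHAR(maxChars) column. */
    static int lengthPrefixBytes(int maxChars, int maxBytesPerChar) {
        long maxColumnBytes = (long) maxChars * maxBytesPerChar;
        return maxColumnBytes <= 255 ? 1 : 2;
    }

    /** Bytes needed to store value in a VARCHAR(maxChars) column (UTF-8 assumed). */
    static long storedBytes(String value, int maxChars, int maxBytesPerChar) {
        if (value.codePointCount(0, value.length()) > maxChars) {
            throw new IllegalArgumentException("value is longer than the column allows");
        }
        long dataBytes = value.getBytes(StandardCharsets.UTF_8).length;
        return dataBytes + lengthPrefixBytes(maxChars, maxBytesPerChar);
    }

    public static void main(String[] args) {
        // utf8mb4 uses at most 4 bytes per character.
        System.out.println(storedBytes("abc", 10, 4));            // 3 data bytes + 1
        System.out.println(storedBytes("abc", 8000, 4));          // 3 data bytes + 2
        System.out.println(storedBytes("\u4e2d\u6587", 10, 4));   // 6 data bytes + 1
    }
}
```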

Usage: from a space perspective, varchar is the better fit; from an efficiency perspective, char is. The key is to find the trade-off that suits the actual situation.

NCHAR, NVARCHAR, NTEXT

Adding N in front of the previous types indicates that the column stores Unicode data. These types come from SQL Server; in MySQL, the character set is a property of the column, so an ordinary CHAR, VARCHAR or TEXT column declared with a Unicode character set such as utf8mb4 stores Chinese as well.

  • English generally needs only an encoding table made up of the alphabet and some symbols, so one byte is enough to store a character. A Chinese character, however, is not a combination of letters and needs more storage space, generally two bytes in legacy encodings such as GBK.

  • To be compatible with characters from different languages, the Unicode character set is used. In the N types, characters are generally represented by two bytes each, which means that English characters also take two bytes.

  • So with the nchar and nvarchar data types you do not have to worry about whether the input is English or Chinese, which is more convenient, but some space is wasted when storing English.

Usage: in SQL Server, if the data contains Chinese characters, use nchar/nvarchar; if it contains only English and numbers, use char/varchar.
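
The space trade-off described above is easy to check by encoding the same strings in different ways. The sketch below uses UTF-16 (two bytes for these characters, comparable to the N types) and UTF-8 (comparable to a utf8mb4 column in MySQL); EncodingSizes is a hypothetical helper class written for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSizes {
    /** Number of bytes the string occupies in the given encoding. */
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String english = "abc";
        String chinese = "\u4e2d\u6587"; // two Chinese characters

        // UTF_16LE is used so that no byte-order mark is counted.
        System.out.println("abc in UTF-16: " + byteLength(english, StandardCharsets.UTF_16LE));
        System.out.println("abc in UTF-8: " + byteLength(english, StandardCharsets.UTF_8));
        System.out.println("Chinese in UTF-16: " + byteLength(chinese, StandardCharsets.UTF_16LE));
        System.out.println("Chinese in UTF-8: " + byteLength(chinese, StandardCharsets.UTF_8));
    }
}
```

English text doubles in size under a two-byte encoding, while these Chinese characters take two bytes each in UTF-16 and three in UTF-8.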

The difference between TEXT, MEDIUMTEXT and LONGTEXT in MySQL

1. Overview

In MySQL, text, mediumtext, and longtext are all data types used to store large amounts of text data.

  • TEXT: stores text data with a maximum length of 65,535 (2^16-1) bytes. If the data exceeds this length, MySQL raises an error in strict SQL mode; otherwise the value is truncated and a warning is issued.
  • MEDIUMTEXT: stores text data with a maximum length of 16,777,215 (2^24-1) bytes. Compared to the TEXT type, the MEDIUMTEXT type can store more data.
  • LONGTEXT: stores text data with a maximum length of 4,294,967,295 (2^32-1) bytes. It holds the most data of all the text types.

The larger the type, the more space each value can take up and the more bytes are used to record its length. Therefore, when designing a database, choose the data type based on the actual situation to avoid wasting storage space.

Note also that these types are not tied to one encoding: like CHAR and VARCHAR, a TEXT column uses the character set declared for the column (or inherited from the table or database). To store Chinese, choose a Unicode character set such as utf8mb4.

In addition to the difference in storage capacity, there are some other differences between these text types.

  • Storage space: the stored size is the length of the data plus a length prefix of 2 bytes for TEXT, 3 bytes for MEDIUMTEXT and 4 bytes for LONGTEXT, so for the same data LONGTEXT takes slightly more space than MEDIUMTEXT and TEXT.
  • Performance: the cost of querying and sorting depends mainly on how much data is actually stored. Columns declared as LONGTEXT tend to hold larger values, and larger values take longer to read and sort.
  • Index: because text data is relatively large, indexes need special care. To index text data, use a prefix index or a full-text index to avoid performance problems.
  • Data type: although all of these types store text, they are distinct data types in MySQL. TINYTEXT is the smallest of TINYTEXT, TEXT, MEDIUMTEXT and LONGTEXT, so if you only need to store short text, you can use TINYTEXT.
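
The storage difference between the types comes down to the size of the length prefix, which a short enum can model. This sketch follows the L+1, L+2, L+3 and L+4 rule from the MySQL storage-requirements documentation; the enum TextType and its methods are hypothetical names for illustration and say nothing about how a storage engine lays out pages.

```java
enum TextType {
    TINYTEXT(1), TEXT(2), MEDIUMTEXT(3), LONGTEXT(4);

    // Number of bytes MySQL uses to record the length of a value of this type.
    private final int lengthPrefixBytes;

    TextType(int lengthPrefixBytes) {
        this.lengthPrefixBytes = lengthPrefixBytes;
    }

    /** Largest payload in bytes: 2^(8 * prefix) - 1. */
    long maxBytes() {
        return (1L << (8 * lengthPrefixBytes)) - 1;
    }

    /** Payload plus length prefix, for a payload that fits in this type. */
    long storedBytes(long payloadBytes) {
        if (payloadBytes < 0 || payloadBytes > maxBytes()) {
            throw new IllegalArgumentException("payload does not fit in " + this);
        }
        return payloadBytes + lengthPrefixBytes;
    }

    public static void main(String[] args) {
        for (TextType type : values()) {
            System.out.println(type + ": max " + type.maxBytes()
                    + " bytes, 100 bytes of data occupy " + type.storedBytes(100));
        }
    }
}
```

For the same 100 bytes of data the four types differ by at most three bytes, so the choice between them is mainly about the maximum length you need to allow.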

In short, when designing a database, you should choose the appropriate data type based on actual needs. If you need to store small text data, you can use the TINYTEXT type; if you need to store a large amount of text data, you can use the MEDIUMTEXT or LONGTEXT type. When using indexes, care should be taken to avoid performance issues.

2. Different byte limits

1. Text field type: The byte limit of the text field type is 65,535 bytes.

2. Longtext field type: The byte limit of the longtext field type is 4,294,967,295 bytes.

3. Mediumtext field type: The byte limit of the mediumtext field type is 16,777,215 bytes.

3. I/O is different

1. Text field type: Text field type is less likely to cause redundant I/O than longtext and mediumtext field types.

2. Longtext field type: Longtext field type is more likely to cause redundant I/O than text and mediumtext field types.

3. Mediumtext field type: The mediumtext field type is more likely to cause redundant I/O than the text field type, and is less likely to cause redundant I/O than the longtext field type.

4. Row migration is different

1. Text field type: Rows with a text field are easier to migrate than rows with longtext or mediumtext fields.

2. Longtext field type: Rows with a longtext field are harder to migrate than rows with text or mediumtext fields.

3. Mediumtext field type: Rows with a mediumtext field are harder to migrate than rows with a text field, and easier to migrate than rows with a longtext field.

Acknowledgments

References for the above content:
https://blog.csdn.net/strivenoend/article/details/80462044
https://blog.csdn.net/u011262253/article/details/105587518
https://blog.csdn.net/qyj19920704/article/details/123924672

Origin blog.csdn.net/Gabriel_wei/article/details/130743603