MySQL Alibaba Specification

1. MySQL Alibaba Specification [Reprint]

[Figures: ten images of the reprinted specification]

2. MySQL development specifications

2.1 Design specifications

1. [Recommended] Fields may be stored redundantly to improve query performance, but data consistency must be considered. Redundant fields should be:
• Fields that are not frequently modified.
• Not long varchar fields, let alone text fields.
Positive example: Employee names are used frequently, are short, and rarely change, so they can be stored redundantly in related tables to avoid join queries.

2. [Recommended] Database and table sharding is only recommended when a single table exceeds 5 million rows or 2 GB in size.
Note: If the data volume is not expected to reach this level within three years, do not shard the database or table when creating it.

3. [Recommended] The id must be the primary key. Every table must have a primary key whose values keep an increasing trend. Small systems can rely on MySQL's auto-increment primary key; use the built-in ID generator only for large systems or when database and table sharding is needed.

4. [Mandatory] Unless there is a special requirement, use bigint unsigned for the ID type and avoid int, even if the current data volume is small. A bigint unsigned id always occupies 8 bytes. Reasons:
• Integration with external systems may generate a large amount of junk data.
• An 8-byte range keeps that junk data from affecting the system's IDs.
• If databases and tables are sharded in the future, the automatically generated IDs are generally 8 bytes.

5. [Recommended] Set fields to NOT NULL wherever possible and give each field a default value. For example, the default for character types is the empty string; the default for numeric types is 0; the default for logical types is 0. (The default for date types is still to be determined.)

6. [Mandatory] Clear comments must be provided for each field and table.

7. [Recommended] Unified time format: 'YYYY-MM-DD HH:MM:SS'

8. [Mandatory] When updating data table records, the value of the updated_at field corresponding to the record must also be updated to the current time.

9. [Mandatory] The table storage engine must use InnoDB

10. [Mandatory] The table character set is utf8mb4 by default. utf8mb4 is a superset of utf8 and can store 4-byte characters such as emoji.
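
The design rules above can be sketched as a single table definition. This is only an illustration: the table health_user and its columns are hypothetical, and letting MySQL maintain updated_at through ON UPDATE CURRENT_TIMESTAMP is one possible way to satisfy rule 8, not something the specification prescribes.

```sql
-- Hypothetical table; names and column choices are assumptions for illustration.
CREATE TABLE health_user (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  name        VARCHAR(64)     NOT NULL DEFAULT ''     COMMENT 'user name',
  login_count INT UNSIGNED    NOT NULL DEFAULT 0      COMMENT 'number of logins',
  created_at  TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP
              COMMENT 'creation time',
  -- MySQL refreshes this column whenever another column of the row changes.
  updated_at  TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP
              ON UPDATE CURRENT_TIMESTAMP
              COMMENT 'last modification time',
  PRIMARY KEY (id)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8mb4
  COMMENT = 'users of the health application';
```

Note that ON UPDATE CURRENT_TIMESTAMP only fires when an UPDATE actually changes another column value; code that must always touch updated_at can still set it explicitly.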

2.2 Naming convention

1. [Mandatory] Fields that express the concept of yes or no must be named in the is_xxx manner, and the data type is unsigned tinyint(1) (1 means yes, 0 means no).
Note: Any field must be unsigned if it is a non-negative number.
Positive example: The field name expressing logical deletion is_deleted, 1 means deleted, 0 means not deleted.

2. [Mandatory] Table names and field names must use only lowercase letters, digits, and underscores. They must not start with a digit, and there must not be digits alone between two underscores. Renaming a database field is very costly because the change cannot be pre-released, so field names need to be chosen carefully.
Note: MySQL is case-insensitive on Windows but case-sensitive by default on Linux. Therefore, capital letters are not allowed in database names, table names, or field names, to avoid unnecessary complications.
• Positive example: health_user, rdc_config, level3_name
• Negative example: HealthUser, rdcConfig, level_3_name

3. [Mandatory] Do not use plural nouns in table names.
Note: The table name should only represent the entity stored in the table, not the number of entities. The corresponding DO class name is also singular, which matches the usual naming convention.

4. [Mandatory] Do not use reserved words such as desc, range, match, delayed, asc, and status. Refer to the official MySQL list of keywords and reserved words.

5. [Mandatory] Name the primary key index pk_fieldname, unique indexes uk_fieldname, and ordinary indexes idx_fieldname.
Note: pk stands for primary key, uk for unique key, and idx for index.
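
A minimal sketch of the index naming rule, using a hypothetical demo_member table. One caveat that is a property of MySQL rather than of this specification: MySQL always names the primary key index PRIMARY, so the pk_ prefix cannot actually be applied there; only the uk_ and idx_ names are under your control.

```sql
-- Hypothetical table used only to show the naming convention.
CREATE TABLE demo_member (
  id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  mobile VARCHAR(20)     NOT NULL DEFAULT ''     COMMENT 'mobile number',
  name   VARCHAR(64)     NOT NULL DEFAULT ''     COMMENT 'member name',
  PRIMARY KEY (id),                 -- index name is always PRIMARY in MySQL
  UNIQUE KEY uk_mobile (mobile),    -- unique index: uk_ + field name
  KEY idx_name (name)               -- ordinary index: idx_ + field name
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'index naming demo';

-- The Key_name column lists PRIMARY, uk_mobile and idx_name.
SHOW INDEX FROM demo_member;
```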

6. [Mandatory] Use decimal for fractional numbers; float and double are prohibited.
Note: float and double lose precision when stored, and comparisons can give incorrect results. If the range of the stored data exceeds the range of decimal, split the value into its integer and fractional parts and store them separately.
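
The comparison problem mentioned in the note can be reproduced without any table. In MySQL, a numeric literal with an exponent (0.1E0) is an approximate DOUBLE value, while a plain literal (0.1) is an exact DECIMAL value, so the following sketch contrasts the two:

```sql
SELECT 0.1E0 + 0.2E0 = 0.3E0 AS double_equal,   -- approximate DOUBLE arithmetic
       0.1   + 0.2   = 0.3   AS decimal_equal;  -- exact DECIMAL arithmetic
-- double_equal is 0 (the binary sum is not exactly 0.3), decimal_equal is 1
```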

7. [Mandatory] If the lengths of the stored strings are almost equal, use the char fixed-length string type.

8. [Mandatory] varchar is a variable-length string type for which no storage is allocated in advance. Its length should not exceed 5000. If a longer value must be stored, define the field as text in a separate table and link it by primary key, so that the index efficiency of the other fields is not affected.

9. [Mandatory] Required fields for the table: id, deleted_at, created_at, updated_at.
Note: The id must be the primary key, of type unsigned bigint or int. For a single (unsharded) table it auto-increments with a step of 1. deleted_at, created_at, and updated_at are all of type timestamp.

10. [Mandatory] All names must use full names, except for those with default conventions. If it exceeds 30 characters, use abbreviations. Please try to make the name easy to understand and short, such as information --> info; address --> addr, etc.

11. [Recommended] Table names are best formed as "business name_function of the table".
Positive example: health_user / trade_config

12. [Recommended] The database name should match the application name as closely as possible, for example health.

13. [Recommended] If the meaning of a field changes, or a new status value is added to a field, update the field comment promptly.

2.3 Type specifications

1. Use TINYINT UNSIGNED (0-255) for status fields. Enum types are prohibited. The comment must clearly explain the meaning of each value and whether multiple values can apply at once.

2. Use TINYINT(1) to represent boolean values, because MySQL itself has no boolean type. When code is generated automatically, such a field (for example is_deleted) becomes a boolean field of the DO object. Use TINYINT(4) in all other cases.
Note: The number in parentheses in TINYINT(4) is not the storage size but the maximum display width, and it only has an effect when the field is declared with zerofill (for example id BIGINT ZEROFILL NOT NULL). Without zerofill the (M) is meaningless, so use the default when creating a table and omit the parentheses unless there is a special requirement, such as TINYINT(1) for boolean values.
TINYINT(1) and TINYINT(4) both occupy one byte regardless of the number in parentheses. For example, a TINYINT(4) zerofill field that stores 22 displays 0022: the display width is 4, and shorter values are padded with leading zeros.
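
The display-width behaviour can be checked with a small, hypothetical table. Keep in mind that integer display width and ZEROFILL are deprecated as of MySQL 8.0.17, so on recent servers the statement below raises deprecation warnings and the sketch is mainly of interest for existing schemas.

```sql
-- Hypothetical table: both columns store one byte; only zerofill changes the display.
CREATE TABLE demo_width (
  a TINYINT(4) ZEROFILL NOT NULL COMMENT 'padded to the display width of 4',
  b TINYINT(4)          NOT NULL COMMENT 'display width has no visible effect'
) ENGINE = InnoDB COMMENT = 'display width demo';

INSERT INTO demo_width (a, b) VALUES (22, 22);

SELECT a, b FROM demo_width;  -- a is shown as 0022, b as 22
```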

3. [Reference] Appropriate character storage length not only saves database table space and index storage, but more importantly, improves retrieval speed.

2.4 Index specifications

1. [Mandatory] Fields with unique business characteristics, even if they are a combination of multiple fields, must be built into a unique index.
Note: Do not assume that a unique index slows down inserts noticeably. The loss is negligible, while the gain in lookup speed is obvious. In addition, even with very thorough validation at the application layer, as long as there is no unique index, Murphy's law says dirty data will be generated.

2. [Mandatory] Joins of more than three tables are prohibited. The data types of the joined fields must be exactly the same, and in multi-table queries the joined fields must be indexed. Even when joining only two tables, pay attention to table indexes and SQL performance.

3. [Mandatory] When creating an index on a varchar field, the index length must be specified. It is not necessary to index the entire field; choose the index length according to how well the prefix distinguishes the actual text.
Note: Index length and selectivity pull in opposite directions. For string data, an index of length 20 generally reaches a selectivity above 90%. The selectivity can be measured as count(distinct left(column name, index length)) / count(*).
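
The selectivity formula from the note can be run directly to compare candidate prefix lengths. The table demo_article and the lengths 10 and 20 are assumptions chosen for illustration; the right length depends on the real data.

```sql
-- Hypothetical table with a long string column.
CREATE TABLE demo_article (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  title VARCHAR(255)    NOT NULL DEFAULT ''     COMMENT 'article title',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'prefix index demo';

-- Selectivity of two prefix lengths compared with the full column
-- (values closer to 1 mean the prefix distinguishes rows better).
SELECT COUNT(DISTINCT LEFT(title, 10)) / COUNT(*) AS sel_10,
       COUNT(DISTINCT LEFT(title, 20)) / COUNT(*) AS sel_20,
       COUNT(DISTINCT title)           / COUNT(*) AS sel_full
FROM demo_article;

-- Index only the first 20 characters once that prefix is selective enough.
ALTER TABLE demo_article ADD INDEX idx_title (title(20));
```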

4. [Mandatory] Avoid left-fuzzy or full-fuzzy matching in page searches. If it is required, use a search engine instead.
Note: Index files follow the leftmost-prefix matching rule of the B-Tree. If the left part of the value is not fixed, the index cannot be used.
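
To see the leftmost-prefix rule in practice, the plans of the three LIKE forms can be compared with EXPLAIN. The table demo_product is hypothetical, and the exact plan depends on data volume and server version, so the comments describe what is typically observed rather than guaranteed output.

```sql
-- Hypothetical table with an index on the searched column.
CREATE TABLE demo_product (
  id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  name VARCHAR(64)     NOT NULL DEFAULT ''     COMMENT 'product name',
  PRIMARY KEY (id),
  KEY idx_name (name)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'LIKE demo';

-- Left part fixed: idx_name can narrow the search to a range.
EXPLAIN SELECT id FROM demo_product WHERE name LIKE 'abc%';

-- Left-fuzzy and full-fuzzy: the index cannot narrow the search,
-- so every index entry (or every row) has to be examined.
EXPLAIN SELECT id FROM demo_product WHERE name LIKE '%abc';
EXPLAIN SELECT id FROM demo_product WHERE name LIKE '%abc%';
```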

5. [Recommended] In ORDER BY scenarios, pay attention to the ordering of the index. Make the last ORDER BY field part of the combined index and place it at the end of the index column order, to avoid a filesort that hurts query performance.
Positive example: where a=? and b=? order by c; index: a_b_c
Counterexample: A range condition on an indexed column prevents the index ordering from being used, for example WHERE a > 10 ORDER BY b; index a_b cannot be used for sorting.
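
The positive example and the counterexample can be tried on a hypothetical table demo_sort. The comments state what the Extra column of EXPLAIN typically shows; the optimizer may choose differently on very small tables.

```sql
-- Hypothetical table with the combined index from the positive example.
CREATE TABLE demo_sort (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  a  INT NOT NULL DEFAULT 0 COMMENT 'filter column',
  b  INT NOT NULL DEFAULT 0 COMMENT 'filter or sort column',
  c  INT NOT NULL DEFAULT 0 COMMENT 'sort column',
  PRIMARY KEY (id),
  KEY idx_a_b_c (a, b, c)
) ENGINE = InnoDB COMMENT = 'ORDER BY demo';

-- a and b are fixed, so matching index entries are already ordered by c:
-- no "Using filesort" is expected.
EXPLAIN SELECT id FROM demo_sort WHERE a = 1 AND b = 2 ORDER BY c;

-- Range on a: entries are ordered by a first, not by b,
-- so "Using filesort" is expected.
EXPLAIN SELECT id FROM demo_sort WHERE a > 10 ORDER BY b;
```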

6. [Recommended] Use covering indexes for queries to avoid going back to the table.
Explanation: If you need to know the title of Chapter 11 of a book, do you need to open the page where Chapter 11 starts? Browsing the table of contents is enough. The table of contents acts as a covering index.
Positive example: The index types that can be created are primary key index, unique index, and ordinary index. A covering index is not an index type but an effect of a query: in the explain result, the extra column shows using index.

7. [Recommended] Use a deferred join or a subquery to optimize deep-pagination scenarios.
Note: MySQL does not skip offset rows. It fetches offset+N rows, discards the first offset rows, and returns N rows, which is very inefficient when the offset is large. Either limit the total number of pages that can be returned, or rewrite the SQL for page numbers above a certain threshold.
Positive example: First quickly locate the id segment that needs to be obtained, and then associate:
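
A minimal sketch of such a deferred join, assuming a hypothetical table demo_log filtered by category. The inner query locates the 20 ids of the page using only the secondary index (InnoDB secondary indexes also contain the primary key), and the outer join then fetches just those 20 full rows.

```sql
-- Hypothetical table; names are assumptions for illustration.
CREATE TABLE demo_log (
  id       BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  category TINYINT UNSIGNED NOT NULL DEFAULT 0      COMMENT 'log category',
  content  VARCHAR(255)     NOT NULL DEFAULT ''     COMMENT 'log text',
  PRIMARY KEY (id),
  KEY idx_category (category)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'pagination demo';

-- Plain form: reads 100020 full rows and throws away the first 100000.
SELECT * FROM demo_log
WHERE category = 1
ORDER BY id
LIMIT 100000, 20;

-- Deferred join: the subquery walks only idx_category to find the ids,
-- then the join reads the 20 full rows by primary key.
SELECT l.*
FROM demo_log AS l
JOIN (
  SELECT id FROM demo_log
  WHERE category = 1
  ORDER BY id
  LIMIT 100000, 20
) AS page ON l.id = page.id
ORDER BY l.id;
```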

8. [Recommended] Goal of SQL performance optimization: reach at least the range level, aim for ref, and const is best.
• const: at most one row in a single table matches (primary key or unique index), and the data can be read during the optimization phase.
• ref: an ordinary (non-unique) index is used.
• range: a range search is performed on the index.
Counterexample: An explain result with type=index means a full scan of the index file, which is very slow. This level is below range and is only slightly better than a full table scan.

9. [Recommended] When building a combined index, put the most selective column on the far left.
Positive example: For where a=? and b=?, if column a is almost unique, an index on a alone (idx_a) is enough.
Note: When equality and non-equality conditions are mixed, put the column of the equality condition first when building the index. For example, for where a>? and b=?, b must come first in the index even if a is more selective.
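
The note about mixed conditions can be illustrated with a hypothetical table demo_event. With the equality column first, MySQL can jump to b = 5 and then scan the range of a inside it; with the columns reversed, the range on a stops the index from being used for b.

```sql
-- Hypothetical table; column names follow the a/b example in the text.
CREATE TABLE demo_event (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  a  INT NOT NULL DEFAULT 0 COMMENT 'column used in a range condition',
  b  INT NOT NULL DEFAULT 0 COMMENT 'column used in an equality condition',
  PRIMARY KEY (id),
  KEY idx_b_a (b, a)   -- equality column first, range column second
) ENGINE = InnoDB COMMENT = 'combined index order demo';

-- Both conditions can be resolved inside idx_b_a.
EXPLAIN SELECT id FROM demo_event WHERE a > 100 AND b = 5;
```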

10. [Recommended] Prevent implicit type conversion caused by mismatched field types, which makes the index unusable.
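
A typical case of implicit conversion is comparing a string column with a numeric literal. In the sketch below (hypothetical table demo_contact), MySQL has to convert every mobile value to a number for the first query, so idx_mobile cannot be used for a lookup; quoting the literal keeps the types aligned.

```sql
-- Hypothetical table with an indexed string column.
CREATE TABLE demo_contact (
  id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  mobile VARCHAR(20)     NOT NULL DEFAULT ''     COMMENT 'mobile number',
  PRIMARY KEY (id),
  KEY idx_mobile (mobile)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'implicit conversion demo';

-- String column compared with a number: the column is converted row by row.
EXPLAIN SELECT id FROM demo_contact WHERE mobile = 13800000000;

-- Same type on both sides: idx_mobile can be used for the lookup.
EXPLAIN SELECT id FROM demo_contact WHERE mobile = '13800000000';
```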

11. [Reference] When creating indexes, avoid the following extreme misconceptions:
• Better too many than too few: believing that every query needs its own index.
• Better too few than too many: believing that indexes consume space and seriously slow down updates and inserts.
• Resisting unique indexes: believing that business uniqueness must always be enforced at the application layer by "check first, then insert".

12. [Mandatory] Keep the number of indexes on a single table within 5, and the number of fields in a combined index within 5.

13. Summary
• Indexes occupy disk space; do not create duplicate indexes, and keep them as short as possible
• Only add indexes for commonly used query conditions
• Create indexes on highly selective columns, and do not index columns with a fixed range of values
• Add unique indexes to unique records
• Do not build indexes on frequently updated columns
• Do not perform operations on indexed columns
• For the same filtering effect, keep the index length to a minimum
• Make reasonable use of combined indexes and pay attention to the order of the index fields
• In multi-column combined indexes, put highly selective fields first
• Index ORDER BY fields to avoid filesort
• A combined index cannot be used for sorting when its columns are sorted in different directions
• <> and != cannot use an index

2.5 SQL specifications

1. [Mandatory] Do not use count(column name) or count(constant) in place of count(*). count(*) is the standard row-counting syntax defined by SQL92; it is independent of the database and of NULL and non-NULL values.
Note: count(*) counts rows that contain NULL values, while count(column name) does not count rows where that column is NULL.
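
The difference is easy to verify on a small, hypothetical table demo_score in which one score is missing:

```sql
-- Hypothetical table with a nullable column.
CREATE TABLE demo_score (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  g  INT NULL COMMENT 'score, NULL when not yet graded',
  PRIMARY KEY (id)
) ENGINE = InnoDB COMMENT = 'COUNT demo';

INSERT INTO demo_score (g) VALUES (90), (NULL), (75);

SELECT COUNT(*) AS all_rows,     -- 3: every row is counted
       COUNT(g) AS graded_rows   -- 2: the row with g = NULL is skipped
FROM demo_score;
```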

2. [Mandatory] count(distinct col) counts the number of distinct non-NULL values in the column.
Note: For count(distinct col1, col2), if one of the columns is entirely NULL, the result is 0 even if the other column has different values.
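
A short check of the multi-column case, using a hypothetical table demo_pair in which col1 is always NULL. MySQL ignores any combination that contains a NULL, which is why the pair count drops to 0.

```sql
-- Hypothetical table: col1 is NULL in every row, col2 has three different values.
CREATE TABLE demo_pair (
  col1 INT NULL COMMENT 'always NULL in this demo',
  col2 INT NULL COMMENT 'distinct values'
) ENGINE = InnoDB COMMENT = 'COUNT DISTINCT demo';

INSERT INTO demo_pair (col1, col2) VALUES (NULL, 1), (NULL, 2), (NULL, 3);

SELECT COUNT(DISTINCT col2)       AS distinct_col2,   -- 3
       COUNT(DISTINCT col1, col2) AS distinct_pairs   -- 0
FROM demo_pair;
```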

3. [Mandatory] When every value of a column col is NULL, count(col) returns 0 but sum(col) returns NULL, so watch out for NPE problems when using sum().
Note: The NPE problem of sum can be avoided as follows:
SELECT IF(ISNULL(SUM(g)), 0, SUM(g)) FROM table_name;
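
The same behaviour can be observed on a hypothetical table demo_grade whose only column values are NULL. IFNULL(SUM(g), 0) is shown as a shorter equivalent of the IF(ISNULL(...)) form; it is an alternative spelling, not a different rule.

```sql
-- Hypothetical table in which no row has been graded yet.
CREATE TABLE demo_grade (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  g  INT NULL COMMENT 'score, NULL when not yet graded',
  PRIMARY KEY (id)
) ENGINE = InnoDB COMMENT = 'SUM over NULL demo';

INSERT INTO demo_grade (g) VALUES (NULL), (NULL);

SELECT COUNT(g)          AS graded,    -- 0
       SUM(g)            AS raw_sum,   -- NULL
       IFNULL(SUM(g), 0) AS safe_sum   -- 0
FROM demo_grade;
```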

4. [Mandatory] Use ISNULL() to determine whether it is a NULL value.
Note: A direct comparison of NULL with any value is NULL.
• The return result of NULL<>NULL is NULL, not false.
• The return result of NULL=NULL is NULL, not true.
• The return result of NULL<>1 is NULL, not true.
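
These results can be checked in one statement. The last three columns show the safe ways to test for NULL: ISNULL(), IS NULL, and MySQL's NULL-safe equality operator <=>.

```sql
SELECT NULL = NULL   AS eq_null,       -- NULL
       NULL <> NULL  AS ne_null,       -- NULL
       NULL <> 1     AS ne_one,        -- NULL
       ISNULL(NULL)  AS isnull_fn,     -- 1
       NULL IS NULL  AS is_null_op,    -- 1
       NULL <=> NULL AS null_safe_eq;  -- 1
```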

5. [Mandatory] When writing paging query logic in the code, if the count is 0, it should be returned directly to avoid executing subsequent paging statements.

6. [Mandatory] Foreign keys and cascades are not allowed. All foreign key concepts must be solved at the application layer.
Note: Take the relationship between students and grades as an example. The student_id in the student table is the primary key, and the student_id in the grade table is a foreign key. If you update the student_id in the student table and trigger the update of the student_id in the grades table at the same time, it is a cascade update. Foreign keys and cascade updates are suitable for low concurrency on a single machine, but are not suitable for distributed and high-concurrency clusters; cascade updates are strongly blocking and have the risk of database update storms; foreign keys affect the insertion speed of the database.

7. [Mandatory] The use of stored procedures is prohibited. Stored procedures are difficult to debug and expand, and have no portability.

8. [Mandatory] When correcting data, in particular when deleting or modifying records, run a SELECT first to avoid accidental deletion, and execute the update statement only after confirming that the result is correct.
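
One way to follow this rule is sketched below on a hypothetical table demo_account. Wrapping the steps in a transaction and locking the rows with FOR UPDATE are additions made here for safety; the rule itself only requires the SELECT before the change.

```sql
-- Hypothetical table and data.
CREATE TABLE demo_account (
  id     BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  status TINYINT UNSIGNED NOT NULL DEFAULT 0      COMMENT '0 = active, 1 = frozen',
  PRIMARY KEY (id)
) ENGINE = InnoDB COMMENT = 'data correction demo';

INSERT INTO demo_account (status) VALUES (0), (0), (1);

START TRANSACTION;

-- 1. Run the intended WHERE clause as a SELECT and inspect the matched rows.
SELECT id, status FROM demo_account WHERE status = 1 FOR UPDATE;

-- 2. Apply the change with exactly the same WHERE clause.
UPDATE demo_account SET status = 0 WHERE status = 1;

-- 3. COMMIT if the affected row count matches step 1; otherwise ROLLBACK.
COMMIT;
```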

9. [Recommended] Avoid the IN operation where possible. If it cannot be avoided, carefully evaluate the number of elements in the IN list and keep it within 1,000.

10. [Reference] If internationalization is required, store and represent all characters in UTF-8, and pay attention to the difference between the character counting functions.
Description ("轻松工作" is four Chinese characters, each three bytes long in UTF-8):
SELECT LENGTH("轻松工作"); returns 12
SELECT CHARACTER_LENGTH("轻松工作"); returns 4
If emoji need to be stored, use utf8mb4, and pay attention to how it differs from the utf-8 (utf8) encoding.

11. [Reference] TRUNCATE TABLE is faster than DELETE and uses fewer system and transaction log resources. However, TRUNCATE is not transactional and does not fire triggers, which may cause accidents, so it is not recommended in development code.
Description: TRUNCATE TABLE is functionally equivalent to a DELETE statement without a WHERE clause.

12. [Recommended] Do not write one large, all-purpose data update interface. If a POJO is passed in and the code runs update table set c1=value1,c2=value2,c3=value3; regardless of which fields actually need updating, that is wrong. When executing SQL, do not update fields that have not changed: first, it is error-prone; second, it is inefficient; third, it increases binlog storage.
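
The contrast can be sketched with a hypothetical table demo_profile in which only the nickname needs to change:

```sql
-- Hypothetical table and row.
CREATE TABLE demo_profile (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  nickname VARCHAR(32)     NOT NULL DEFAULT ''     COMMENT 'display name',
  email    VARCHAR(128)    NOT NULL DEFAULT ''     COMMENT 'contact email',
  city     VARCHAR(64)     NOT NULL DEFAULT ''     COMMENT 'home city',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'partial update demo';

INSERT INTO demo_profile (nickname, email, city)
VALUES ('old', 'user@example.com', 'Hangzhou');

-- Counterexample: every column is written back, including unchanged ones,
-- which can silently overwrite a concurrent change to email or city.
UPDATE demo_profile
SET nickname = 'neo', email = 'user@example.com', city = 'Hangzhou'
WHERE id = 1;

-- Preferred: only the column that actually changed is updated.
UPDATE demo_profile SET nickname = 'neo' WHERE id = 1;
```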

13. Summary
• WHERE conditions that can quickly narrow the result set are written first. If there are constant conditions, try to put them first, such as where 1=1
• Avoid using GROUP BY, DISTINCT and other statements, and avoid joint table queries and subqueries
• Try to arrange the fields that can be indexed in an effective and reasonable manner
• Conditions on index fields that use >, >=, =, <, <=, IS NULL, or BETWEEN can use the index. For a LIKE query on an index field, LIKE '%abc%' cannot use the index, while LIKE 'abc%' can
• If MySQL built-in functions are applied to an index field in SQL, the index becomes unusable
• Avoid select *; fetch only the required fields, which increases the chance of using a covering index
• For queries with large amounts of data, try to avoid using order by clauses in SQL statements
• In the case of join table queries, ensure that the data types of associated conditions are consistent and avoid nested subqueries
• For continuous numeric values, use between instead of in
• Try not to use CASE conditions in where statements
• Use LIMIT 1 when there is only one row of data
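
As an illustration of the point about built-in functions on index fields, the sketch below (hypothetical table demo_visit) rewrites a DATE() filter as a range on the bare column. The date values are arbitrary examples.

```sql
-- Hypothetical table with an indexed timestamp column.
CREATE TABLE demo_visit (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  created_at TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'visit time',
  PRIMARY KEY (id),
  KEY idx_created_at (created_at)
) ENGINE = InnoDB COMMENT = 'function on index column demo';

-- Function applied to the column: idx_created_at cannot be used for a range lookup.
EXPLAIN SELECT id FROM demo_visit
WHERE DATE(created_at) = '2024-01-01';

-- Equivalent range on the bare column: idx_created_at can be used.
EXPLAIN SELECT id FROM demo_visit
WHERE created_at >= '2024-01-01 00:00:00'
  AND created_at <  '2024-01-02 00:00:00';
```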

Origin blog.csdn.net/Blue_Pepsi_Cola/article/details/131460311