MySQL database schema design and optimization

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reproducing it, please include the original source link and this statement.
Original link: https://blog.csdn.net/qq_37735385/article/details/89480900

 


None of the specifications in this article is absolutely inviolable, but if you cannot comply with one, confirm with your company's DBA team whether the operation in question is acceptable.

 

Database design specifications

1. Database naming conventions

  • All database object names must use lowercase letters, with words separated by underscores
    MySQL database and table names are case-sensitive (by default on Linux), whereas SQL keywords are not. For example, Dbname and dbname are different databases, and Table and table are different tables.
  • Database object names must not use MySQL reserved keywords
    In the query below, the column list contains the keyword from, so it must be escaped with backticks; otherwise MySQL reports a syntax error.
    select id,username,`from`,age from tb_user;
    Common MySQL reserved words include select, from, order, group, key, and index.
  • Database object names should be self-explanatory, and preferably no longer than 32 characters
    Example: the user account table is named user_account.
  • Temporary databases and tables must be prefixed with tmp and suffixed with the date
  • Backup databases and tables must be prefixed with bak and suffixed with the date
  • Columns that store the same data must have the same name and type in every table
    For example, if a customer_id column appears in two tables, its column name and type must be exactly the same in both.
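
As a rough illustration of the naming rules above, the sketch below uses made-up tables (user_account, customer_order) and an arbitrary date suffix. None of these names come from the original article; they are assumptions chosen only to show the pattern.

```sql
-- Hypothetical schema illustrating the naming rules; all names are made up.
CREATE TABLE user_account (
    customer_id INT UNSIGNED NOT NULL,
    user_name   VARCHAR(32)  NOT NULL,
    PRIMARY KEY (customer_id)
);

-- The same data (customer_id) uses the same column name and type here.
CREATE TABLE customer_order (
    order_id    INT UNSIGNED NOT NULL,
    customer_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (order_id)
);

-- Temporary and backup copies: tmp_/bak_ prefix plus a date suffix.
CREATE TABLE tmp_user_account_20190423 LIKE user_account;
CREATE TABLE bak_user_account_20190423 LIKE user_account;
```

Keeping customer_id identical in both tables means a join on that column needs no type conversion.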

2. Basic database design specifications

  • Unless there are special circumstances, all tables must use the InnoDB engine
    InnoDB has been the default storage engine since MySQL 5.5.
    It supports transactions and row-level locking, offers better crash recovery, and performs better under high concurrency.
  • Databases and tables must use one uniform character set, UTF8
    utf8 has better compatibility.
    The point is not that utf8 is mandatory, but that the character set must be the same across databases and tables; a unified character set avoids garbled text caused by character set conversion.
    In MySQL's utf8 character set, a Chinese character takes 3 bytes (for example, user_name varchar(255) can take up to 765 bytes), while an ASCII character takes 1 byte.
  • All tables and columns should have comments
    Use the COMMENT clause to add comments to tables and columns.
    Maintain the data dictionary from the very beginning. If the dictionary is neglected early in a project and the team changes later, nobody will know what a column means, and the only way to find out is to read the code.
  • Try to keep the amount of data in a single table under control; fewer than 5 million rows is recommended
    5 million rows is not a hard limit of MySQL.
    How much data a table can actually hold depends on the storage settings and the file system.
    Historical data archiving and splitting databases and tables (sharding) can be used to keep the data volume under control.
  • Use MySQL partitioned tables with caution
    A partitioned table is physically stored as multiple files but appears logically as a single table.
    Choose the partition key carefully; cross-partition queries may be less efficient.
    For large data sets, managing separate physical tables is recommended instead.
  • Separate hot and cold data as far as possible to reduce table width (cold data: rarely used data; hot data: frequently used data)
    MySQL limits a table to at most 4,096 columns, and each row cannot exceed 65,535 bytes.
    Narrower tables reduce disk I/O and keep the memory cache hit rate for hot data high.
    They also use the cache more efficiently and avoid reading useless cold data (admittedly, many people like to use select *, which loads a lot of useless cold data).
    Split wide tables vertically by column, keeping columns that are often used together in the same table to avoid excessive join operations.
  • Do not create reserved (placeholder) columns in a table
    It is hard to give a reserved column a self-explanatory name.
    The data type it will eventually store is unknown, so an appropriate type cannot be chosen.
    Changing the type of a reserved column locks the table.
    Modifying a column is more expensive than adding one.
  • Do not store pictures, files, or other binary data in the database
    Files are usually large and greatly increase the load on the database.
  • Do not run stress tests against the production database
    They generate a lot of junk data and cause unnecessary trouble later.
  • Do not connect directly to the production database from development or test environments
    Connecting directly from a development environment is likely to damage the integrity of the production data.
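
The engine, character set, comment, and hot/cold separation rules above could look roughly like this in DDL. This is only a sketch: the product_info and product_detail tables, their columns, and the column sizes are assumptions made for illustration.

```sql
-- Hypothetical tables; names, columns, and sizes are illustrative assumptions.
-- Hot columns: read on almost every request.
CREATE TABLE product_info (
    product_id   INT UNSIGNED  NOT NULL AUTO_INCREMENT COMMENT 'product ID',
    product_name VARCHAR(64)   NOT NULL COMMENT 'product name',
    price        DECIMAL(10,2) NOT NULL COMMENT 'current sale price',
    PRIMARY KEY (product_id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 COMMENT = 'product master data (hot columns)';

-- Cold columns: wide and rarely read, split out vertically.
CREATE TABLE product_detail (
    product_id  INT UNSIGNED  NOT NULL COMMENT 'product ID, same type as in product_info',
    description VARCHAR(2000) NOT NULL COMMENT 'long description',
    PRIMARY KEY (product_id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 COMMENT = 'product details (cold columns)';
```

Queries that only list products never touch product_detail, so the wide description column does not compete with hot rows for cache space.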

3. Index design specifications

  • Limit the number of indexes on each table; no more than five per table is recommended
    The number of indexes generally grows with the number of columns, so limiting the number of columns, as recommended earlier, also helps keep the number of indexes down.
    More indexes are not always better! Indexes can improve query efficiency, but they can also reduce efficiency, because every index has to be maintained on writes.
    Indexes also occupy physical space.
    Do not create a separate index on every column of a table.
  • Every InnoDB table must have a primary key
    InnoDB tables are index-organized tables ordered by primary key. If a table has no primary key, MySQL uses the first non-null unique index as the primary key; if there is none either, it generates a hidden 6-byte primary key, whose performance is not the best.
    Do not use frequently updated columns as the primary key, and do not use multi-column primary keys.
    Do not use UUID, MD5, hash, or string columns as the primary key. Their values are not guaranteed to be sequential, so newly inserted rows may sort before existing ones, which increases I/O and wastes database resources.
    An auto-increment ID is recommended as the primary key.
  • Columns commonly recommended for indexing
    Columns that appear in the WHERE clause of SELECT, UPDATE, and DELETE statements
    Columns included in ORDER BY, GROUP BY, and DISTINCT
    Join columns used in multi-table joins
  • How to choose the column order of an index
    The columns of a composite index are used from left to right.
    Put the most selective column on the left of the composite index.
    Put columns with a small field length on the left as far as possible.
    Put the most frequently used column on the left.
    Take care to choose the column order of a composite index appropriately.
  • Avoid creating redundant and duplicate indexes (redundant index: an index whose columns are a left prefix of another index; duplicate index: indexes on exactly the same columns)
    For example, primary key (id), index (id), unique index (id): all three are duplicate indexes on id.
    For example, index (a, b, c), index (a, b), index (a): all three start with a, so index (a, b) and index (a) are redundant.
  • For frequent queries, prefer covering indexes
    Covering index: an index that contains all the columns the query needs.
    It avoids the second lookup into the table that an InnoDB secondary index otherwise requires.
    It can turn random I/O into sequential I/O and speed up queries.
  • Try to avoid using foreign keys
    Foreign key constraints are not recommended, but the columns that link the tables must still be indexed.
    Foreign keys can guarantee referential integrity, but it is recommended to enforce it in the application layer instead.
    Foreign keys affect writes to both the parent and child tables, which can hurt performance.
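
To make the composite and covering index rules above more concrete, here is a minimal sketch. The order_master table, its columns, and the index name are hypothetical, and the exact EXPLAIN output depends on the MySQL version and data.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE order_master (
    order_id     BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
    customer_id  INT UNSIGNED     NOT NULL,
    order_status TINYINT UNSIGNED NOT NULL,
    create_time  DATETIME         NOT NULL,
    PRIMARY KEY (order_id),
    -- One composite index; a separate index on (customer_id) alone would be redundant.
    KEY idx_customer_create (customer_id, create_time)
) ENGINE = InnoDB;

-- Every column the query touches is in idx_customer_create (order_id is included
-- implicitly because InnoDB secondary indexes store the primary key), so the
-- index can cover the query; EXPLAIN would typically show "Using index" in Extra.
EXPLAIN
SELECT order_id, customer_id, create_time
FROM order_master
WHERE customer_id = 42
ORDER BY create_time;
```

Adding order_status to the select list would force a lookup into the table again, because that column is not part of the index.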

4. Column design specifications

  • Prefer the smallest data type that meets the storage requirements
    Convert strings to numeric types for storage where possible, for example IP addresses:
    INET_ATON('255.255.255.255')=4294967295
    INET_NTOA(4294967295)='255.255.255.255'
    For non-negative integers, prefer unsigned integer types, because an unsigned type has twice the positive range of the signed type for the same storage.
    SIGNED INT -2147483648~2147483647
    UNSIGNED INT 0~4294967295
    Note:
    The N in MySQL's VARCHAR(N) is the number of characters, not bytes. In some other relational databases N is a number of bytes, in which case only about N/3 Chinese characters fit.
    With UTF8, VARCHAR(255) can take up to 765 bytes.
    An excessive length consumes more memory.
  • Avoid the TEXT and BLOB data types
    The TEXT family comprises TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT. A TEXT column can store up to 64 KB.
    It is recommended to move BLOB or TEXT columns into a separate extension table.
    TEXT and BLOB columns can only use prefix indexes, and they cannot have default values.
  • Avoid the ENUM data type
    ENUM stores varchar-like values internally as integers. The reduced storage overhead is its advantage, but the drawbacks outweigh it:
    (1) Modifying an ENUM requires an ALTER statement, a table-level operation that carries the risk of mistakes with unpredictable consequences.
    (2) ORDER BY on an ENUM column is inefficient and needs extra operations, because the values are stored as integers and must be converted back to strings before they can be sorted as text.
    (3) Do not use numbers as ENUM values. ENUM already converts non-integer values to integers for storage; if the values themselves are integers, the logic easily becomes confusing. If what you store is an integer, store it directly in an integer column.
  • Define all columns as NOT NULL wherever possible
    (1) Indexing nullable columns requires extra space to record the NULL state, so they take up more space.
    (2) NULL values need special handling in comparisons and calculations.
  • Do not store dates as strings
    (1) Date functions cannot be used to calculate with them or compare them.
    (2) Strings take more space: a string representation takes at least 16 bytes, whereas DATETIME needs only 8 bytes (5 bytes plus fractional-second storage since MySQL 5.6.4).
  • Use the TIMESTAMP or DATETIME type to store times
    (1) TIMESTAMP covers the range 1970-01-01 00:00:01 ~ 2038-01-19 03:14:07 (UTC).
    (2) TIMESTAMP takes 4 bytes, the same as INT, but is more readable than INT.
    (3) Use DATETIME for values outside the TIMESTAMP range.
  • Financial data involving amounts must use the DECIMAL type
    (1) DECIMAL is an exact fixed-point type (unlike the approximate FLOAT and DOUBLE types) and does not lose precision in calculations.
    (2) The storage it occupies is determined by the declared width.
    (3) It can also store integers larger than BIGINT can hold.
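
The column type recommendations above can be sketched as follows. The payment_log table and its columns are hypothetical, chosen only to show an unsigned integer for an IPv4 address, DATETIME for time, and DECIMAL for money.

```sql
-- Hypothetical table; names are illustrative assumptions.
CREATE TABLE payment_log (
    log_id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    client_ip INT UNSIGNED    NOT NULL COMMENT 'IPv4 address stored via INET_ATON',
    paid_at   DATETIME        NOT NULL COMMENT 'payment time',
    amount    DECIMAL(12,2)   NOT NULL COMMENT 'exact fixed-point amount',
    PRIMARY KEY (log_id)
) ENGINE = InnoDB;

-- Store the address as a 4-byte integer instead of a 15-character string.
INSERT INTO payment_log (client_ip, paid_at, amount)
VALUES (INET_ATON('192.168.0.1'), '2019-04-23 10:00:00', 19.99);

-- Convert back to dotted notation when reading.
SELECT INET_NTOA(client_ip) AS client_ip, paid_at, amount
FROM payment_log;
```

Note that INET_ATON and INET_NTOA handle IPv4 only; IPv6 addresses would need a different approach.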

5. SQL development specifications

  • Prepared statements are recommended for database operations
    (1) Only the parameters are passed, which is more efficient than passing the whole SQL statement.
    (2) They help prevent SQL injection.
    (3) The same statement is parsed once and used many times, which improves processing efficiency.
  • Avoid implicit data type conversion
    (1) Implicit conversion can prevent the index from being used.
  • Make full use of existing indexes instead of blindly adding new ones
    (1) Avoid a % on both sides in query conditions, such as a like '%abc%'. If the leading % is removed (a like 'abc%'), the index can be used.
    (2) A single SQL statement can use a composite index for a range search on only one column; columns to the right of the range column are not used for the lookup.
    (3) Use LEFT JOIN or NOT EXISTS to optimize NOT IN, because NOT IN can prevent the index from being used.
  • Use different accounts to connect to different databases, and prohibit cross-database queries
    (1) This leaves room for database migration and sharding.
    (2) It reduces coupling between businesses.
    (3) It avoids security risks caused by excessive privileges.
  • Do not use SELECT *; always query with SELECT <column list>
    (1) SELECT * consumes more CPU, I/O, and network bandwidth, and fetches more useless data.
    (2) It cannot use a covering index.
    (3) An explicit column list reduces the impact of table structure changes.
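  • Do not use INSERT statements without a column list
    (1) insert into t values('a',1); should state the column list explicitly: insert into t(c1,c2) values('a',1);
    (2) This reduces the impact of table structure changes.
  • Avoid subqueries; a subquery can often be rewritten as a join
    (1) The result set of a subquery cannot use indexes.
    (2) A subquery creates a temporary table; if the subquery returns a lot of data, efficiency suffers badly.
    (3) The temporary table of a subquery has no index, which produces many slow queries and consumes excessive CPU and I/O.
  • Avoid joining too many tables
    (1) Each joined table takes an additional chunk of memory (join_buffer_size).
    (2) Joins create temporary tables, which affects query efficiency.
    (3) MySQL allows at most 61 tables in a join; no more than 5 is recommended.
  • Reduce the number of round trips to the database
    (1) Databases are better suited to batch operations.
    (2) Merging several identical operations into one improves processing efficiency.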
    alter table t1 add column c1 int,change column c2 int ...
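  • Use IN instead of OR
    (1) Keep the number of values in an IN list under 500.
    (2) IN can use the index effectively, whereas OR rarely can in most cases.
  • Do not use ORDER BY RAND() for random ordering
    (1) It loads all matching rows of the table into memory and sorts them.
    (2) It consumes a lot of CPU, I/O, and memory.
    (3) To fetch random rows, generate a random value in the program (an id, for example) and fetch the row with that value from the database; in other words, implement the randomness in application code.
  • Do not apply functions or calculations to columns in the WHERE clause
    (1) Applying a function or calculation to a column prevents the index from being used.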
    where date(createtime)='20190423' -- the index on createtime cannot be used
    where createtime>='20190423' and createtime<'20190424' -- the index can be used
  • When there are obviously no duplicate values, or duplicates are acceptable, use UNION ALL instead of UNION
    (1) UNION puts all rows of the result sets into a temporary table and then deduplicates them.
    (2) UNION ALL does not deduplicate the result set.
  • Split large, complex SQL statements into several smaller ones
    (1) MySQL can use only one CPU to execute a single SQL statement.
    (2) After splitting, the statements can run in parallel to improve processing efficiency.
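
The prepared statement and WHERE clause rules above can be combined in a short sketch using MySQL's SQL-level PREPARE syntax. The user_login table, its columns, and the statement name find_users are assumptions for illustration; applications would normally use the prepared statement API of their database driver instead.

```sql
-- Hypothetical table and values, for illustration only.
CREATE TABLE user_login (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    username   VARCHAR(32)  NOT NULL,
    createtime DATETIME     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_createtime (createtime)
) ENGINE = InnoDB;

-- Parse once; the range condition leaves createtime bare so its index stays usable.
PREPARE find_users FROM
    'SELECT id, username FROM user_login WHERE createtime >= ? AND createtime < ?';

-- Execute many times with different parameters.
SET @day_start = '2019-04-23 00:00:00';
SET @day_end   = '2019-04-24 00:00:00';
EXECUTE find_users USING @day_start, @day_end;

SET @day_start = '2019-04-24 00:00:00';
SET @day_end   = '2019-04-25 00:00:00';
EXECUTE find_users USING @day_start, @day_end;

DEALLOCATE PREPARE find_users;
```

Because the parameters are bound as values and not concatenated into the statement text, user input cannot change the structure of the query.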

6. Database practices

  • Batch write operations on more than one million rows must be split into multiple batches
    (1) Large batch operations may cause serious master-slave replication delay.
    (2) With the row binlog format, they generate a large amount of log.
    (3) Splitting avoids large transactions.
  • Modifying data in large tables must be done with care: it may lock the table for a long time, which is intolerable, especially in production
  • Use the pt-online-schema-change tool to modify the structure of large tables
    (1) It avoids the master-slave delay caused by modifying a large table.
    (2) It avoids locking the table while it is being modified.
  • Do not grant the SUPER privilege to accounts used by programs
    (1) When the number of connections reaches the maximum, MySQL still allows one connection from a user with the SUPER privilege.
    (2) The SUPER privilege should be reserved for the accounts DBAs use to deal with problems.
  • Program database accounts must follow the principle of least privilege
    (1) A program's database account may be used in only one database; cross-database access is not allowed.
    (2) In principle, a program's database account must not have the DROP privilege.
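
Batched writes and least-privilege accounts could be sketched as below. The names order_history, app_db, and app_user, the host pattern, the batch size, and the password are all placeholders, not values from the original article.

```sql
-- Hypothetical names (order_history, app_db, app_user); adjust to your environment.

-- Delete old rows in small batches instead of one huge transaction.
-- Re-run this statement (for example from a script) until it affects 0 rows.
DELETE FROM order_history
WHERE create_time < '2018-01-01'
ORDER BY order_id
LIMIT 5000;

-- Application account limited to one database, with no DROP or SUPER privilege.
CREATE USER 'app_user'@'10.0.0.%' IDENTIFIED BY 'change-me';
GRANT SELECT, INSERT, UPDATE, DELETE ON app_db.* TO 'app_user'@'10.0.0.%';
```

Pausing briefly between batches gives replicas time to catch up, which addresses the replication delay mentioned above.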

Other

  • It is also worth reviewing the three normal forms of database design

Origin www.cnblogs.com/apolloren/p/11985210.html