DBA operations and management specifications

DBA operating specifications

1. Business data may be modified or deleted only after the business owner and the CTO have approved the change by email. Back up the data before executing, so that the change can be rolled back if necessary.


2. All production change requests must go through the work order system; verbal requests are considered invalid.


3. Changing the structure of a large table, for example modifying a column's attributes, locks the table and causes replication lag on the replicas, which affects the online business. Such changes must be run during the off-peak period after 0:00, and must always use the tool pt-online-schema-change, which avoids the table lock and reduces replication lag.

Usage example:

#pt-online-schema-change --alter="add index IX_id_no(id_no)" \
  --no-check-replication-filters --recursion-method=none --user=dba \
  --password=123456 D=test,t=t1 --execute

For MongoDB, create indexes in the background to avoid blocking the collection. (MongoDB 4.2 and later ignore the background option.)

Usage example:

db.t1.createIndex({idCardNum:1},{background:1})

4. All production business databases must be deployed with an MHA high-availability architecture to avoid single points of failure.


5. When granting access to a business team, the password must be MD5-encrypted and at least 16 characters long. Unless there is a special requirement, grant only SELECT privileges, restricted to specific databases and tables.


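6. Delete the default accounts that have an empty user name and an empty password. (In MySQL 5.7 and later, the password column of mysql.user is replaced by authentication_string.)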

delete from mysql.user where user='' and password='';

flush privileges;

7. Enable audit logging on the summary database so that problems can be traced.


Code of Conduct

8. Do not host multiple business databases in a single MySQL instance. This couples the businesses tightly: once one of them has a problem, the others suffer collateral damage, and faults become harder to locate. The usual solution is to run multiple instances, one business database per instance, so that they do not interfere with each other.


9. Do not run back-office management or statistical queries on the primary database. Such complex SQL drives up CPU usage, which affects the business.


10. Batch data cleanup must be reviewed jointly by the developers and the DBA. Avoid running it during peak business hours, and monitor the service status while it runs.


11. For promotional campaigns and similar events, talk to the DBA in person in advance so that the traffic can be estimated and measures such as adding machine memory or scaling out the architecture can be taken a week ahead, preventing database performance bottlenecks.


12. Do not run database stress tests in production.


Basic specification

13. It is forbidden to store plaintext passwords in the database.


14. Use the InnoDB storage engine.

  • It supports transactions and row-level locks, recovers better, and performs better under high concurrency.
  • Avoid COUNT(*) on InnoDB tables. InnoDB keeps no internal row counter, so the rows have to be counted one by one. If a count must be available in real time, maintain it in Memcached or Redis.

15. Use the UTF8 character set for all tables.

  • This avoids garbled characters. (In MySQL, 4-byte characters such as emoji require utf8mb4.)

16. All tables and fields must have comments in Chinese.

  • This helps both others and yourself.

17. Do not store large objects such as pictures and files in the database.

  • Pictures and files are better kept in a distributed file system such as GFS; store only the link in the database.

18. Avoid using stored procedures, views, triggers, and events.

  • MySQL is best suited to OLTP workloads. It excels at simple insert, delete, update, and select operations but is not suited to logic-heavy computation and analysis, so those requirements are best implemented in application code.


19. Avoid using foreign keys. Foreign keys protect referential integrity, which can be enforced on the business side instead.

  • Foreign keys couple the parent table and the child table, which significantly hurts SQL performance and leads to excessive lock waits and even deadlocks.

20. Data that does not require strong transactional consistency, such as log tables, should preferably be stored in MongoDB.

  • MongoDB's built-in sharding improves horizontal scalability, and developers do not need to change much business code.

Database and table design specification

21. Every table must have a primary key, for example an auto-increment primary key.

  • An auto-increment key ensures that rows are written in order. On traditional SAS mechanical hard disks this gives better write performance, joins on the primary key perform better, and extraction into the data warehouse is easier. From a performance point of view, a UUID primary key is the worst choice, because it makes inserts random.
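As a rough illustration of why random keys hurt insert locality, here is a minimal Python sketch. It models the primary-key index as a sorted list (an assumption for illustration only; InnoDB actually uses a B+tree) and counts how many inserts land somewhere other than the tail. The function name and the key generators are hypothetical.

```python
import bisect
import random

def count_non_tail_inserts(keys):
    """Insert keys into a sorted list and count inserts that do not land at the end."""
    index = []
    non_tail = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos != len(index):
            non_tail += 1  # a real B+tree would have to touch a page in the middle
        index.insert(pos, key)
    return non_tail

# Auto-increment style keys: 1, 2, 3, ...
sequential_keys = list(range(1, 1001))

# UUID-like keys: 128-bit random integers (seeded only to make the sketch repeatable)
rng = random.Random(0)
random_keys = [rng.getrandbits(128) for _ in range(1000)]

sequential_misses = count_non_tail_inserts(sequential_keys)
random_misses = count_non_tail_inserts(random_keys)
print(sequential_misses, random_misses)
```

With sequential keys every insert is an append; with random keys almost every insert lands in the middle of the existing data.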

22. Do not use partitioned tables.

  • The advantage of a partitioned table is that developers do not need to modify code: the table can be split through settings on the database side, for example by a time column. The catch is that queries must filter on the partition key; otherwise every partition is scanned and there is no performance gain. In addition, a partitioned table is still physically a single table, so changing its structure is no faster. Split the data into separate tables instead. If the program needs to query historical data, combine the tables with UNION ALL. And when a historical table is no longer needed, simply dump it from the database and move it to a backup machine.

Field design specification

23. Use DECIMAL instead of FLOAT and DOUBLE to store exact decimal numbers.
Floating-point types can lose precision, as the following example shows:

mysql> CREATE TABLE t3 (c1 float(10,2),c2 decimal(10,2));
Query OK, 0 rows affected (0.05 sec)
mysql> insert into t3 values (999998.02, 999998.02);
Query OK, 1 row affected (0.01 sec)
mysql> select * from t3;
+-----------+-----------+
| c1        | c2        |
+-----------+-----------+
| 999998.00 | 999998.02 |
+-----------+-----------+
1 row in set (0.00 sec)

You can see that the value in column c1 changed from 999998.02 to 999998.00, because the float type is imprecise. Data that is sensitive to precision, such as currency, should therefore be stored as fixed-point numbers.
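The same loss can be reproduced outside the database. The sketch below assumes that a FLOAT column behaves like a 4-byte IEEE 754 single-precision value and round-trips the number through that format with Python's struct module; the helper name as_float32 is hypothetical.

```python
import struct
from decimal import Decimal

def as_float32(value):
    """Round-trip a Python float through a 4-byte IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

stored_float = as_float32(999998.02)
stored_decimal = Decimal("999998.02")

print(f"{stored_float:.2f}")  # 999998.00
print(stored_decimal)         # 999998.02
```

Near one million, adjacent single-precision values are 0.0625 apart, so the .02 cannot be represented and is rounded away.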


24. Use TINYINT instead of the ENUM type.

  • ENUM is hard to extend. Take a user's online status: adding new values later, say 5 for do not disturb, 6 for in a meeting, and 7 for invisible but visible to friends, requires a DDL operation to modify the table structure.
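One way to keep TINYINT codes readable is to hold the mapping in application code. This is only a sketch: the class name OnlineStatus and the member names are hypothetical, codes 5 to 7 follow the example above, and codes 0 to 4 follow the online-status example given under rule 25.

```python
from enum import IntEnum

class OnlineStatus(IntEnum):
    OFFLINE = 0
    ONLINE = 1
    AWAY = 2
    BUSY = 3
    INVISIBLE = 4
    # New states are added here, in code; the TINYINT column needs no DDL change.
    DO_NOT_DISTURB = 5
    IN_MEETING = 6
    INVISIBLE_BUT_VISIBLE_TO_FRIENDS = 7

def decode_status(raw_value):
    """Turn the TINYINT value read from the database into a named status."""
    return OnlineStatus(raw_value)

print(decode_status(5).name)    # DO_NOT_DISTURB
print(int(OnlineStatus.BUSY))   # 3
```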

25. Allocate field length according to actual need; do not allocate a large capacity arbitrarily.

The general principle when choosing a field type is to prefer the smallest type that fits: do not use a large type where a small one will do. For example, an int primary key is strongly recommended over a UUID. Why? It saves space, and space is efficiency. Locating a record by a 4-byte key is obviously faster than by a 32-byte key, and the difference grows when several tables are joined. Smaller field types use less memory, less disk space and disk I/O, and less network bandwidth.

Many developers use int for every numeric column, which is not always appropriate. Take a user's age: most ages fall between 1 and 100, at most 3 digits, so tinyint is a better fit than int. Another example is a user's online status: 0 means offline, 1 means online, 2 means away, 3 means busy, 4 means invisible, and so on. Using int here wastes space; tinyint fully meets the need, since int occupies 4 bytes while tinyint occupies 1 byte.

The maximum value of a signed int is 2147483647, and the maximum value of an unsigned int is 4294967295. If you never need to store negative numbers, declare the column unsigned to extend its upper range.
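The limits follow directly from the storage size. A small Python sketch that derives them from the byte widths of the MySQL integer types (the helper name integer_range is hypothetical):

```python
# Storage size in bytes of the MySQL integer types
INTEGER_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

def integer_range(type_name, unsigned=False):
    """Return (min, max) storable in the given integer type."""
    bits = INTEGER_BYTES[type_name] * 8
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for name in INTEGER_BYTES:
    print(name, integer_range(name), integer_range(name, unsigned=True))
```

An unsigned tinyint, for instance, covers 0 to 255, which is more than enough for an age or a status code.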

There is no difference in storage between int(10) and int(1). The 10 and the 1 are only the display width, which matters only when the zerofill attribute is set, for example:

root@localhost(test)10:39>create table test(id int(10) zerofill,id2 int(1));
Query OK, 0 rows affected (0.13 sec)
root@localhost(test)10:39>insert into test values(1,1);
Query OK, 1 row affected (0.04 sec)
root@localhost(test)10:56>insert into test values(1000000000,1000000000);
Query OK, 1 row affected (0.05 sec)
root@localhost(test)10:56>select * from test;
+------------+------------+
| id         | id2        |
+------------+------------+
| 0000000001 |          1 |
| 1000000000 | 1000000000 |
+------------+------------+
2 rows in set (0.01 sec)

26. Define fields as NOT NULL and provide a default value.
From the application's perspective this reduces checking code. Without a default value, code that reads a record must first check whether the field is set and, if not, assign a value in the program (for example in Java). With a default value, that check can be skipped.

NULL values also make queries harder to optimize, complicate index statistics, and require special handling inside MySQL.
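The effect on application code can be sketched as follows. The example uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so that the snippet runs anywhere), and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_profile ("
    " id INTEGER PRIMARY KEY,"
    " nickname TEXT NOT NULL DEFAULT '',"
    " login_count INTEGER NOT NULL DEFAULT 0)"
)

# The INSERT names only id; the other columns fall back to their defaults.
conn.execute("INSERT INTO user_profile (id) VALUES (1)")
nickname, login_count = conn.execute(
    "SELECT nickname, login_count FROM user_profile WHERE id = 1"
).fetchone()

# No "is None" check is needed before using the values.
print(repr(nickname), login_count + 1)
```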


27. Avoid TEXT and BLOB types where possible.

  • They take up more storage space and are slow to read.

Index specification

28. More indexes are not always better. Create them according to actual need.

  • Indexes are a double-edged sword: they speed up queries, but they slow down inserts and updates and take up disk space. Appropriate indexes are critical to application performance, and lookups through an index in MySQL are extremely fast. But indexes carry overhead: on every write to a table (INSERT, UPDATE, or DELETE), MySQL must also update each of the table's indexes, so every index adds to the cost of writing. An index pays off only when its column is actually used in queries, for example in a WHERE clause. An index that is never used has no value and only adds maintenance overhead.

29. Columns used in queries must be indexed.

  • For example: 1. columns in the WHERE clause of SELECT, UPDATE, and DELETE statements; 2. join columns in multi-table JOINs.

30. Do not apply arithmetic or functions to an indexed column.
The index cannot be used, resulting in a full table scan.
Example: SELECT * FROM t WHERE YEAR(d) >= 2016;
MySQL versions before 8.0.13 do not support functional indexes as Oracle does, so even if column d is indexed, this query scans the whole table.
It should be changed to ----->
SELECT * FROM t WHERE d >= '2016-01-01';
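The difference shows up in the query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; in MySQL you would run EXPLAIN on the same statements), with strftime playing the role of YEAR(). The helper name query_plan and the extra note column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, d TEXT, note TEXT)")
conn.execute("CREATE INDEX ix_d ON t (d)")

def query_plan(sql):
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Wrapping the column in a function hides it from the index.
plan_with_function = query_plan("SELECT * FROM t WHERE strftime('%Y', d) >= '2016'")
# Comparing the bare column lets the planner do a range search on ix_d.
plan_with_range = query_plan("SELECT * FROM t WHERE d >= '2016-01-01'")

print(plan_with_function)
print(plan_with_range)
```

The first plan is a full scan of t, while the second searches the index ix_d.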


31. Do not create indexes on low-cardinality columns, such as 'gender'.
Sometimes a full table scan is faster than reading both the index and the table, especially when the indexed values are evenly distributed. Gender is the typical example: it has two evenly distributed values (male and female), so filtering by gender reads about half of the rows. In this case a full table scan is faster.


32. Do not use LIKE patterns with a leading %, such as LIKE '%xxx'.
The index cannot be used, resulting in a full table scan.

Inefficient query
SELECT * FROM t WHERE name LIKE '%de%';
----->
Efficient query
SELECT * FROM t WHERE name LIKE 'de%';
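Why a prefix pattern can use an index while a leading wildcard cannot is easy to see with a sorted list standing in for the index (a simplification; the sample names and function names are hypothetical). A prefix maps to one contiguous range that binary search can find, whereas a substring can match anywhere, so every entry has to be checked.

```python
import bisect

# A sorted list plays the role of an index on the name column.
name_index = sorted(["abel", "dean", "debby", "dennis", "eddie", "jade", "madeline"])

def prefix_search(sorted_names, prefix):
    """LIKE 'prefix%': binary-search the two ends of the matching range."""
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_left(sorted_names, prefix + chr(0x10FFFF))
    return sorted_names[lo:hi]

def substring_search(names, needle):
    """LIKE '%needle%': sort order does not help, so every entry is checked."""
    return [name for name in names if needle in name]

print(prefix_search(name_index, "de"))     # ['dean', 'debby', 'dennis']
print(substring_search(name_index, "de"))  # ['dean', 'debby', 'dennis', 'jade', 'madeline']
```

Note that the two patterns do not return the same rows, so the rewrite is only valid when the business requirement really is a prefix match.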

33. Avoid negative conditions such as NOT IN / NOT LIKE.
The index usually cannot be used, resulting in a full table scan.


34. Avoid redundant or duplicate indexes.
A composite index IX_a_b_c(a,b,c) also serves queries on (a) and on (a,b), so separate indexes on (a) and (a,b) are redundant.
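The leftmost-prefix rule can be checked mechanically. A minimal sketch, assuming the index definitions have already been collected into a dictionary (the function name and the sample data are hypothetical); it ignores uniqueness, so a UNIQUE index on (a) would still need manual review before being dropped.

```python
def find_redundant_indexes(indexes):
    """Return names of indexes whose columns are a leftmost prefix of another index.

    indexes maps an index name to its ordered tuple of column names.
    """
    redundant = []
    for name, cols in indexes.items():
        for other_name, other_cols in indexes.items():
            if name == other_name:
                continue
            is_prefix = other_cols[: len(cols)] == cols
            # For exact duplicates, flag only one of the two (the later name).
            if is_prefix and (len(cols) < len(other_cols) or name > other_name):
                redundant.append(name)
                break
    return sorted(redundant)

table_indexes = {
    "IX_a": ("a",),
    "IX_a_b": ("a", "b"),
    "IX_a_b_c": ("a", "b", "c"),
    "IX_b": ("b",),
}
print(find_redundant_indexes(table_indexes))  # ['IX_a', 'IX_a_b']
```

IX_b is not flagged, because (b) is not a leftmost prefix of (a,b,c).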


SQL design specification

35. Do not use SELECT *; fetch only the fields you need.
SELECT * consumes extra CPU, I/O, and network bandwidth, and it prevents the use of a covering index.
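The covering-index effect can be observed in a query plan. This sketch again uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; in MySQL the equivalent sign is "Using index" in the Extra column of EXPLAIN). The table, index, and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
conn.execute("CREATE INDEX ix_name ON t (name)")

def query_plan(sql):
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# SELECT * needs the bio column, so each match requires a lookup in the table.
plan_star = query_plan("SELECT * FROM t WHERE name = 'Bea'")
# id and name are both stored in the index, so the table is never touched.
plan_columns = query_plan("SELECT id, name FROM t WHERE name = 'Bea'")

print(plan_star)
print(plan_columns)
```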


36. Replace OR with IN.

Inefficient query
SELECT * FROM t WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30;
----->
Efficient query
SELECT * FROM t WHERE LOC_ID IN (10,20,30);

37. Avoid inconsistent data types.

SELECT * FROM t WHERE id = '19';
----->
SELECT * FROM t WHERE id = 19;

38. Reduce the number of interactions with the database.

INSERT INTO t (id, name) VALUES(1,'Bea');
INSERT INTO t (id, name) VALUES(2,'Belle');
INSERT INTO t (id, name) VALUES(3,'Bernice');
----->
INSERT INTO t (id, name) VALUES(1,'Bea'), (2,'Belle'),(3,'Bernice');

Update … where id in (1,2,3,4);

Alter table tbl_name add column col1, add column col2;
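From application code, the multi-row INSERT can be built with placeholders so that all rows travel in a single statement. A minimal sketch using Python's built-in sqlite3 module as a stand-in for a MySQL driver (an assumption; the helper name insert_rows is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

def insert_rows(connection, rows):
    """Insert all rows with a single multi-row INSERT statement."""
    if not rows:
        return
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    flat_params = [value for row in rows for value in row]
    connection.execute(f"INSERT INTO t (id, name) VALUES {placeholders}", flat_params)

insert_rows(conn, [(1, "Bea"), (2, "Belle"), (3, "Bernice")])
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(row_count)  # 3
```

For very large batches it is safer to send the rows in chunks, since databases limit the size of a single statement.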

39. Avoid big SQL statements; split them into smaller ones.

Inefficient query
SELECT * FROM tag
JOIN tag_post ON tag_post.tag_id = tag.id
JOIN post ON tag_post.post_id = post.id
WHERE tag.tag = 'mysql';
It can be decomposed into the following queries instead
----->
Efficient queries
SELECT * FROM tag WHERE tag = 'mysql';
SELECT * FROM tag_post WHERE tag_id = 1234;
SELECT * FROM post WHERE id in (123, 456, 567, 9098, 8904);
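In the application, the decomposed queries are chained by feeding the ids from one result into the next. A minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption); the sample rows and the function name posts_for_tag are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag_post (tag_id INTEGER, post_id INTEGER);
    INSERT INTO tag VALUES (1234, 'mysql'), (1235, 'redis');
    INSERT INTO post VALUES (123, 'Indexes'), (456, 'Locks'), (789, 'Caching');
    INSERT INTO tag_post VALUES (1234, 123), (1234, 456), (1235, 789);
""")

def posts_for_tag(connection, tag_name):
    """Run three small single-table queries and combine them in the application."""
    tag_row = connection.execute(
        "SELECT id FROM tag WHERE tag = ?", (tag_name,)
    ).fetchone()
    if tag_row is None:
        return []
    post_ids = [row[0] for row in connection.execute(
        "SELECT post_id FROM tag_post WHERE tag_id = ?", (tag_row[0],)
    )]
    if not post_ids:
        return []
    placeholders = ", ".join("?" * len(post_ids))
    return connection.execute(
        f"SELECT id, title FROM post WHERE id IN ({placeholders}) ORDER BY id", post_ids
    ).fetchall()

print(posts_for_tag(conn, "mysql"))  # [(123, 'Indexes'), (456, 'Locks')]
```

Each small query is simple to cache and holds locks only briefly, at the cost of extra round trips to the database.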

40. Do not use ORDER BY RAND(). It has to generate a random value for every row and sort the whole table just to return a few rows.

SELECT * FROM t1 WHERE 1=1 ORDER BY RAND() LIMIT 4;
---->
SELECT * FROM t1 WHERE id >= CEIL(RAND()*1000) LIMIT 4;
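Note that the rewritten query returns four consecutive rows starting at a random id. If independently chosen rows are needed, one alternative (not the approach shown above) is to pick the ids in the application and fetch them by primary key. The sketch below assumes the ids are contiguous with no gaps; with gaps, fewer rows may come back. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the function name random_rows is hypothetical.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 1001)])

def random_rows(connection, how_many, rng=random):
    """Pick random primary keys in the application, then fetch them by key."""
    min_id, max_id = connection.execute("SELECT MIN(id), MAX(id) FROM t1").fetchone()
    wanted = rng.sample(range(min_id, max_id + 1), how_many)
    placeholders = ", ".join("?" * how_many)
    return connection.execute(
        f"SELECT id, name FROM t1 WHERE id IN ({placeholders})", wanted
    ).fetchall()

rows = random_rows(conn, 4)
print(rows)
```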

Origin blog.csdn.net/ytp552200ytp/article/details/107971968