[Reprint] Database Performance Optimization Strategies

Original blog post: https://www.cnblogs.com/studynote/p/8079154.html

 

First, the six normal forms of database design

We have all heard that database design has several normal forms, the best known being third normal form.
1. First normal form (1NF): attributes are atomic
First normal form is the most basic one. If every field in a table holds an atomic value that cannot be decomposed further, the table satisfies first normal form.

2. Second normal form (2NF): satisfies 1NF, with full functional dependency
Second normal form requires every column in a table to depend on the whole primary key, not on just one part of it (this matters mainly for composite primary keys). In other words, one table should hold one kind of data; several kinds of data should not be stored in the same table.

3. Third normal form (3NF): satisfies 2NF and eliminates transitive dependencies
Third normal form requires every column in a table to depend on the primary key directly, not indirectly through another column.

4. Boyce-Codd normal form (BCNF): satisfies 3NF, and no prime attribute depends on another prime attribute
If a relation is in third normal form and either has only one candidate key or every candidate key is a single attribute, the relation is automatically in BCNF.

5. Fourth normal form (4NF): satisfies BCNF and requires removing many-to-many relationships (multivalued dependencies) held within a single table.

6. Fifth normal form (5NF): satisfies 4NF; tables are split into the smallest possible pieces to eliminate all redundant data in them.
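
To make the third normal form above more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database (the article itself targets MySQL), and the table and column names (orders_flat, customers, orders, customer_city) are hypothetical. The flat table carries a transitive dependency (order_id determines customer_id, which determines customer_city); the decomposition stores each city once per customer and recovers the original rows with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Not in 3NF: customer_city depends on customer_id, which in turn depends
# on the primary key order_id (a transitive dependency).
conn.execute("""
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_city TEXT    NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, 10, "Beijing"), (2, 10, "Beijing"), (3, 20, "Shanghai")],
)

# 3NF decomposition: the city is stored once per customer.
conn.execute("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        customer_city TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    )
""")
conn.execute(
    "INSERT INTO customers SELECT DISTINCT customer_id, customer_city FROM orders_flat"
)
conn.execute("INSERT INTO orders SELECT order_id, customer_id FROM orders_flat")

flat_rows = conn.execute(
    "SELECT order_id, customer_id, customer_city FROM orders_flat ORDER BY order_id"
).fetchall()

# Joining the two normalized tables reproduces the flat rows.
joined_rows = conn.execute("""
    SELECT o.order_id, o.customer_id, c.customer_city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

After decomposition, changing a customer's city touches one row in customers instead of every one of that customer's orders.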

Second, normalization and denormalization

There is no best design, only the most suitable one, so do not put too much weight on theory.
In database design, data should be organized into two categories: frequently accessed data and frequently modified data.
For data that is accessed often but modified rarely, the table design should be denormalized.
For data that is modified often but accessed rarely, the table design should be normalized.
Sometimes the normalized tables serve as the basis of the logical database design, and the data is then denormalized physically according to the needs of the whole application.
Normalization and denormalization are both constraints grounded in actual operation; detached from practice, neither means anything. Only a sensible combination of the two lets them complement each other and play to their respective strengths.

Third, denormalization methods

The following denormalization techniques can be applied when designing tables:

The first is splitting tables.
A table can be split in two ways, horizontally or vertically:
Horizontal splitting divides one table into several tables by rows. This speeds up lookups in each table, but queries and updates have to select the right table, and statistics have to be aggregated across several tables, so the application becomes more complex.
Vertical splitting suits tables with many columns. If some columns are accessed far more often than the others, the primary key and those columns can form one table, and the primary key and the remaining columns another. Narrower rows mean more rows per data page, so a single I/O scans more rows, which speeds up access to each table. However, the split introduces multi-table joins, so it should be used only when columns in the different split tables are rarely queried or updated together.

The second is keeping redundant columns. When two or more tables often have to be joined in queries, a few redundant columns can be added to one of the tables to avoid joining too frequently. Redundant columns are typically used for data that changes infrequently.

The third is adding derived columns. A derived column is computed from several other columns in the table. Adding derived columns reduces the work done at aggregation time and can greatly shorten computation when summarizing data.
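
The redundant-column and derived-column techniques above can be sketched as follows. This is an illustration only, using Python's built-in sqlite3 module as a stand-in for MySQL; the schema (customers, orders, order_items) and the helper functions add_order and add_item are hypothetical. The write path keeps the extra columns up to date so that the read path needs neither a join nor an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_name TEXT    NOT NULL,           -- redundant copy of customers.name
        total_cents   INTEGER NOT NULL DEFAULT 0  -- derived: sum of the order's items
    );
    CREATE TABLE order_items (order_id INTEGER NOT NULL, price_cents INTEGER NOT NULL);
""")

def add_order(order_id, customer_id):
    """Insert an order and copy the customer's name into the redundant column."""
    name = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO orders (id, customer_id, customer_name) VALUES (?, ?, ?)",
        (order_id, customer_id, name),
    )

def add_item(order_id, price_cents):
    """Insert an item and keep the derived total on the order in step."""
    conn.execute("INSERT INTO order_items VALUES (?, ?)", (order_id, price_cents))
    conn.execute(
        "UPDATE orders SET total_cents = total_cents + ? WHERE id = ?",
        (price_cents, order_id),
    )

conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
add_order(100, 1)
add_item(100, 250)
add_item(100, 400)

# Read path on the denormalized table: no join, no aggregation.
denormalized = conn.execute(
    "SELECT customer_name, total_cents FROM orders WHERE id = 100"
).fetchone()

# Equivalent query against the normalized data, for comparison.
normalized = conn.execute("""
    SELECT c.name, SUM(i.price_cents)
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    JOIN order_items AS i ON i.order_id = o.id
    WHERE o.id = 100
    GROUP BY c.name
""").fetchone()
```

The trade-off is visible in the helpers: every write now does extra work, and a later change to a customer's name would also have to update the copies in orders.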

 

Fourth, optimizing field types

Optimization principles:
1. Prefer fixed-length types, because fixed-length tables are faster.
If every field in a table is fixed-length, the whole table is considered "static" or "fixed-length" (this distinction applies to the MyISAM storage engine's static row format). Such a table contains no fields of these types: VARCHAR, TEXT, BLOB. As soon as the table includes one of them, it is no longer a fixed-length static table, and the MySQL engine handles it in a different way.

Fixed-length tables improve performance because MySQL can search them faster: with a fixed row length, the offset of the next row is easy to calculate, so reads are naturally quick. If rows are not fixed-length, every lookup of the next row has to go through the primary key.

Fixed-length tables are also easier to cache and to rebuild. The one side effect is that fixed-length fields waste some space, because a fixed-length field is allocated its full size whether or not you use it.

With the "vertical split" technique, you can split a table in two: one fixed-length and one variable-length.
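
The offset argument above can be illustrated with a small sketch. It shows the general principle only, not MySQL's actual on-disk format: the row layout (a 4-byte unsigned id plus a 16-byte padded name) and the helpers pack_rows and read_row are hypothetical. Because every row has the same size, row i starts at byte i * row size and can be read without scanning the rows before it.

```python
import struct

# Hypothetical fixed-length row: 4-byte unsigned id + 16-byte name padded
# with NUL bytes. Illustration only, not MySQL's real storage format.
ROW = struct.Struct("<I16s")

def pack_rows(rows):
    """Serialize (id, name) pairs into one contiguous byte string."""
    return b"".join(ROW.pack(row_id, name.encode("ascii")) for row_id, name in rows)

def read_row(data, index):
    """Jump straight to row `index`: its offset is index * ROW.size."""
    offset = index * ROW.size
    row_id, raw_name = ROW.unpack_from(data, offset)
    return row_id, raw_name.rstrip(b"\x00").decode("ascii")

data = pack_rows([(1, "alice"), (2, "bob"), (3, "carol")])
third = read_row(data, 2)
```

The same sketch also shows the cost mentioned above: each name occupies 16 bytes even though "bob" needs only 3.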

2. Use small types, because they save storage space, and the smaller the column, the faster it is.
For example:
a. Where possible, use MEDIUMINT, SMALLINT, or the even smaller TINYINT instead of INT.
b. For data that only needs to be accurate to the day, DATE is better than DATETIME.
c. Use TIMESTAMP instead of DATETIME, because it needs only half the storage space of DATETIME (4 bytes versus 8 in older MySQL versions; the gap is smaller from MySQL 5.6.4 on).
d. Store IP addresses as INT UNSIGNED.
e. Use ENUM instead of VARCHAR. The ENUM type is very fast and compact. It is actually stored as a TINYINT but displayed as a string, which makes it an excellent fit for fields that hold a fixed list of options.
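
For the IP-address tip above, here is a minimal sketch of the conversion an application might do before writing to an INT UNSIGNED column. It uses Python's standard ipaddress module; the helper names ip_to_int and int_to_ip are hypothetical. MySQL also offers the built-in functions INET_ATON() and INET_NTOA() for the same conversion on the server side.

```python
import ipaddress

def ip_to_int(address):
    """Convert a dotted-quad IPv4 string to its 32-bit unsigned integer."""
    return int(ipaddress.IPv4Address(address))

def int_to_ip(value):
    """Convert a 32-bit unsigned integer back to a dotted-quad string."""
    return str(ipaddress.IPv4Address(value))

as_int = ip_to_int("192.168.1.1")   # 192*2**24 + 168*2**16 + 1*2**8 + 1
round_trip = int_to_ip(as_int)
```

The integer form fits in 4 bytes, whereas the text form needs up to 15 characters, and range checks on address blocks become simple numeric comparisons.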

 

Fifth, designing tables sensibly

1. Always give every table an ID
Every table in the database should have an ID column as its primary key, preferably of an INT type (UNSIGNED recommended) with the AUTO_INCREMENT flag set.
  
2. Use indexes sensibly
To optimize queries and avoid full table scans, first consider indexing the columns used in WHERE and ORDER BY. More indexes are not always better: an index certainly speeds up the corresponding SELECT, but it also slows down INSERT and UPDATE, because the index may have to be rebuilt on each insert or update. How to build indexes therefore needs careful thought, case by case.
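
The effect of an index on a WHERE column can be observed with a small experiment. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in; MySQL's EXPLAIN output and optimizer differ, so treat this as an illustration of the idea only. The table t, the index name idx_t_num, and the helper plan are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO t (num, name) VALUES (?, ?)",
    [(i % 100, f"user{i}") for i in range(1000)],
)

def plan(sql):
    """Return the plan steps SQLite reports for a query (4th column of each row)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id FROM t WHERE num = 10"

plan_before = plan(query)                          # no index on num yet
conn.execute("CREATE INDEX idx_t_num ON t (num)")
plan_after = plan(query)                           # index is now available
```

In SQLite, the first plan reports a SCAN of the table and the second a SEARCH using idx_t_num. The cost side of the trade-off is that every later INSERT or UPDATE on num must also maintain the index.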

3. Use NOT NULL wherever possible
NULL is a rather special type that is hard for SQL to optimize, for example when indexing. Storing NULL may also take more space than storing an empty value.

Sixth, using SQL statements sensibly

1. Avoid SELECT *; return only the fields you need.

2. Use LIMIT 1 when you only need one row
Sometimes when you query a table you already know there will be only one result, but you might still fetch it through a cursor or check the number of rows returned.
In this case, adding LIMIT 1 can improve performance: the MySQL engine stops searching as soon as it finds one matching row, instead of continuing to look for further matches.

3. Avoid testing a field for NULL in the WHERE clause, since this can cause the engine to give up the index and scan the whole table, for example:
SELECT id FROM t WHERE num IS NULL

4. Avoid the != or <> operators and NOT IN in the WHERE clause; otherwise the engine may fall back to a full table scan.

5. If a field used in the WHERE clause has no index, the engine may give up using indexes and perform a full table scan.

6. A string search such as column LIKE '%a' cannot take advantage of an index, whereas column LIKE 'a%' can.
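
Why a leading wildcard defeats an index can be seen with a simplified model. The sketch below treats an index as nothing more than a sorted list of values, which is an assumption made for illustration (real B-tree indexes are more involved), and the helper names prefix_search and suffix_search are hypothetical. A known prefix marks out one contiguous range of the sorted order, which binary search can locate; a known suffix does not, so every entry has to be checked.

```python
import bisect

# Simplified stand-in for an index on a name column: just a sorted list.
names = sorted(["adam", "alice", "bob", "carla", "dana", "alan"])

def prefix_search(sorted_names, prefix):
    """LIKE 'prefix%': matches form one contiguous range, found by binary search."""
    lo = bisect.bisect_left(sorted_names, prefix)
    # Upper bound: the prefix followed by the largest code point
    # (assumes no stored value contains that character).
    hi = bisect.bisect_left(sorted_names, prefix + "\U0010ffff")
    return sorted_names[lo:hi]

def suffix_search(sorted_names, suffix):
    """LIKE '%suffix': the sort order does not help, so every entry is examined."""
    return [name for name in sorted_names if name.endswith(suffix)]

starts_with_al = prefix_search(names, "al")
ends_with_a = suffix_search(names, "a")
```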

7. Avoid performing expression operations on a field in the WHERE clause, since this can cause the engine to give up the index and scan the whole table. For example:
SELECT id FROM t WHERE num / 2 = 100
should be written as:
SELECT id FROM t WHERE num = 100 * 2

8. Avoid applying functions to a field in the WHERE clause, since this can cause the engine to give up the index and scan the whole table. For example:
SELECT id FROM t WHERE substring(name, 1, 3) = 'abc' -- ids whose name begins with abc
SELECT id FROM t WHERE datediff(day, createdate, '2005-11-30') = 0 -- ids created on '2005-11-30'
should be written as:
SELECT id FROM t WHERE name LIKE 'abc%'
SELECT id FROM t WHERE createdate >= '2005-11-30' AND createdate < '2005-12-1'

9. Do not apply functions, arithmetic, or other expressions to the left of "=" in the WHERE clause; otherwise the system may be unable to use the index properly.
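
The difference between an expression on the column and a bare column can be checked with a query plan. As a hedged illustration, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in; MySQL's optimizer and EXPLAIN output differ, and the names t, idx_t_num, and plan are hypothetical. Only the plans are compared here, because SQLite divides integers differently from MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_t_num ON t (num)")
conn.executemany("INSERT INTO t (num) VALUES (?)", [(i,) for i in range(1000)])

def plan(sql):
    """Return the plan steps SQLite reports for a query (4th column of each row)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Expression applied to the indexed column: the index cannot be used
# to look up matching rows directly.
plan_expression = plan("SELECT id FROM t WHERE num / 2 = 100")

# Arithmetic moved to the constant side: the bare column can be
# matched against the index.
plan_rewritten = plan("SELECT id FROM t WHERE num = 100 * 2")
```

In SQLite, the first plan reports a SCAN and the second a SEARCH using idx_t_num.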

10. Avoid using the = operator to compare real numbers or date/time values with values of other types, because the results may not be what you expect.

11. Avoid OR where possible. For example:
SELECT id FROM t WHERE num = 10 OR name = 'admin'
can be rewritten as:
SELECT id FROM t WHERE num = 10
UNION ALL
SELECT id FROM t WHERE name = 'admin'
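
One thing to watch with this rewrite is that UNION ALL does not remove duplicates, so a row that satisfies both conditions is returned twice. The sketch below shows the difference on a tiny hypothetical table, using Python's built-in sqlite3 module as a stand-in for MySQL; the set semantics of OR, UNION, and UNION ALL shown here are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 10, "admin"), (2, 10, "guest"), (3, 20, "admin"), (4, 30, "guest")],
)

def ids(sql):
    """Run a query and return the selected ids in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

with_or = ids("SELECT id FROM t WHERE num = 10 OR name = 'admin'")

# Row 1 matches both branches, so UNION ALL returns it twice.
with_union_all = ids(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL "
    "SELECT id FROM t WHERE name = 'admin'"
)

# UNION removes the duplicate, at the cost of an extra de-duplication step.
with_union = ids(
    "SELECT id FROM t WHERE num = 10 "
    "UNION "
    "SELECT id FROM t WHERE name = 'admin'"
)
```

If the results must match the OR query exactly, either use UNION or exclude the overlap in the second branch (for example by adding AND num <> 10 when num is NOT NULL).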



Author: Jingle Guo
Source: http://www.cnblogs.com/studynote/
If the title is marked "reprint", this article belongs to its original author. If the title is not marked "reprint", this article belongs to the author. Reprinting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original must be given in a prominent position on the article page; otherwise the author reserves the right to pursue legal liability.

Origin www.cnblogs.com/twuxian/p/11357235.html