Study notes on "Those Things About Database Design"

Video link:
those things about database design

Q: What is database design?
Database design is the process of building an optimal data storage model for a business system, based on that system's specific requirements and on the chosen DBMS (database management system). It includes defining the table structures in the database and the relationships between tables, so that the application's data can be stored effectively and accessed efficiently.

Q: What distinguishes a good database design from a bad one?
Good:
1. Reduces data redundancy
2. Avoids data maintenance anomalies
3. Saves storage space
4. Provides efficient access

Bad:
1. Contains a large amount of redundant data
2. Suffers from insertion, update, and deletion anomalies
3. Wastes a lot of storage space
4. Makes data access inefficient

Q: What are the four steps of database design?
1. Requirements analysis
2. Logical design
3. Physical design
4. Maintenance and optimization

Q: What does database requirements analysis cover?
1. What the data is
2. What attributes and storage characteristics the data has
3. What the characteristics of the data and its attributes are
4. The life cycle of the data

Q: What does database logical design do?
1. Transforms the requirements into a logical model of the database
2. Presents the logical model in the form of an ER diagram
3. Is independent of the specific DBMS selected

Q: What are the considerations for database physical design?
1. Select the appropriate database management system
2. Define the naming conventions for databases, tables and fields
3. Select appropriate field types according to the chosen DBMS
4. Denormalize the design where appropriate

Q: What does database maintenance and optimization involve?
1. Maintain the data dictionary
2. Maintain the index
3. Maintain the table structure
4. When appropriate, split the table horizontally or vertically

Q: What characteristics of the data need to be clarified in the database requirements analysis?
1. The entities and the relationships between them (one-to-one, one-to-many, many-to-many)
2. The attributes each entity contains
3. Which attribute or combination of attributes uniquely identifies an entity

Terminology:
Relation: a relation corresponds to what is commonly called a table
Tuple: a row in the table is a tuple
Attribute: a column in the table is an attribute; each attribute has a name, called the attribute name
Candidate key: an attribute or group of attributes in the table that uniquely determines a tuple
Primary key: when a relation has multiple candidate keys, one of them is selected as the primary key
Domain: the range of values an attribute can take
Component: the value of one attribute within a tuple

ER diagram legend:
Rectangle: represents an entity set; the name of the entity set is written inside the rectangle.
Diamond: represents a relationship set.
Ellipse: represents an attribute of an entity.
Line segment: connects an attribute to its entity set, or an entity set to a relationship set.

Q: What is an insertion anomaly?
An insertion anomaly occurs when an entity can only be recorded if another entity exists, so it cannot be represented in the table in the absence of that other entity.

Q: What is an update anomaly?
An update anomaly occurs when changing a single attribute of one entity instance requires updating multiple rows of the table.

Q: What is a deletion anomaly?
A deletion anomaly occurs when deleting rows of the table to remove one entity instance also causes the information about another entity instance to be lost.

Q: What is data redundancy?
The same data is stored in multiple places, or a column in a table can be computed from other columns.

Q: What is first normal form (1NF)?
Every field in a table holds a single, indivisible (atomic) attribute made up of a basic data type such as an integer, floating-point number, or string.
In other words, first normal form requires that every table in the database be a two-dimensional table.

Q: What is second normal form (2NF)?
A table is in second normal form if no non-key field has a partial functional dependency on any candidate key. A partial functional dependency exists when only some of the fields of a composite key are enough to determine a non-key field.
It follows that every table whose key is a single field complies with second normal form.
Example of non-compliance:

Consider a table that records products together with their suppliers. Because the relationship between suppliers and products is many-to-many, only the combination of product name and supplier name can uniquely identify a row. In other words, product name and supplier name together form a composite key. The table therefore contains the following partial functional dependencies:
(product name) -> (price, description, weight, product validity period)
(supplier name) -> (supplier phone)
Problems: insertion anomalies, deletion anomalies, update anomalies, and data redundancy.
Solution:
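A minimal sketch of one common way to remove partial dependencies like these: move each dependent group of fields into its own table and keep the many-to-many link in an association table. It uses Python's built-in sqlite3 module as a stand-in DBMS; all table and column names are hypothetical and chosen only to mirror the dependencies listed above.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the course.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_name TEXT PRIMARY KEY,
        price        NUMERIC,
        description  TEXT,
        weight       NUMERIC,
        valid_until  TEXT
    );
    CREATE TABLE supplier (
        supplier_name  TEXT PRIMARY KEY,
        supplier_phone TEXT
    );
    -- Association table that carries the many-to-many relationship.
    CREATE TABLE product_supplier (
        product_name  TEXT NOT NULL REFERENCES product(product_name),
        supplier_name TEXT NOT NULL REFERENCES supplier(supplier_name),
        PRIMARY KEY (product_name, supplier_name)
    );
""")

conn.execute("INSERT INTO supplier VALUES ('Acme', '555-0100')")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?)",
    [("cola", 3.5, "soft drink", 0.33, "2030-01-01"),
     ("chips", 2.0, "snack", 0.10, "2030-06-01")],
)
conn.executemany(
    "INSERT INTO product_supplier VALUES (?, ?)",
    [("cola", "Acme"), ("chips", "Acme")],
)

# The supplier phone now lives in exactly one row, so one UPDATE is enough.
updated = conn.execute(
    "UPDATE supplier SET supplier_phone = '555-0199' WHERE supplier_name = 'Acme'"
).rowcount

# Every product supplied by Acme sees the same, single phone number.
phones = conn.execute("""
    SELECT DISTINCT s.supplier_phone
    FROM product_supplier ps JOIN supplier s USING (supplier_name)
""").fetchall()
print(updated, phones)
```

With this layout a supplier can be recorded before it supplies any product, and deleting a product no longer removes the supplier's phone number.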

Q: What is third normal form (3NF)?
Third normal form builds on second normal form: a table conforms to 3NF if no non-key field has a transitive functional dependency on any candidate key.
Example of non-compliance:

The following transitive dependency exists:
(product name) -> (category) -> (category description)
That is, the non-key field "category description" is transitively dependent on the key field "product name".
Problems:
The (category, category description) pair is recorded again for every product, so the data is redundant, and insertion, update, and deletion anomalies also occur.
Solution:
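A minimal sketch of one usual way to break a transitive dependency like this: give the category its own table and let each product refer to it by key. It uses Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The category description is stored once per category.
    CREATE TABLE category (
        category_name        TEXT PRIMARY KEY,
        category_description TEXT
    );
    -- A product only points at its category.
    CREATE TABLE product (
        product_name  TEXT PRIMARY KEY,
        category_name TEXT REFERENCES category(category_name)
    );
""")

conn.execute("INSERT INTO category VALUES ('drinks', 'bottled and canned beverages')")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("cola", "drinks"), ("soda water", "drinks")],
)

# A category can be recorded before any product uses it (no insertion anomaly).
conn.execute("INSERT INTO category VALUES ('snacks', 'packaged snacks')")

# Deleting every product leaves the category descriptions intact (no deletion anomaly).
conn.execute("DELETE FROM product")
remaining = conn.execute("SELECT COUNT(*) FROM category").fetchone()[0]
print(remaining)
```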

Q: What is Boyce-Codd normal form (BCNF)?
Building on third normal form, a table conforms to BCNF if no field has a transitive functional dependency on any candidate key.
That is, when the key is composite, there must be no functional dependency between the fields that make up the composite key.
Example of non-compliance:

Assuming that a supplier contact can only be employed by one supplier, and that each supplier can supply multiple products, the following functional dependencies hold:
(supplier, product ID) -> (contact, product quantity)
(contact, product ID) -> (supplier, product quantity)
The following dependencies also hold between fields of the candidate keys, so the table does not meet the BCNF requirements:
(supplier) -> (supplier contact)
(supplier contact) -> (supplier)
As a result, operation anomalies and data redundancy are present.
Solution:

Q: What are the storage engines commonly used in MySQL?

Q: What principles should be followed for object naming?
1. Readability: use upper and lower case to format database object names so that they are easy to read
2. Expressiveness: the name of an object should describe what it identifies. For example, a table name should reflect the data stored in the table, and a stored procedure name should reflect what the procedure does
3. Long names: use abbreviations as little as possible, or not at all

Principles for selecting field types
The data type of a column affects both the storage space it consumes and the performance of queries against it. When several data types could be used for a column, prefer numeric types first, date or binary types second, and character types last. Among data types of the same class, prefer the one that takes up less space.

This ordering is based mainly on two considerations:
1. When data is compared (in query conditions, join conditions, and sorting), processing characters is often slower than processing numbers for the same data
2. A database processes data in units of pages, so the shorter the column, the more rows fit in a page and the better the performance

Q: How to choose between char and varchar?
1. If the values stored in the column are all of nearly the same length, consider char; otherwise, consider varchar
2. If the maximum length of the data in the column is less than 50 bytes, char is generally a reasonable choice (although if the column is rarely used, varchar may still be preferable to save space and reduce I/O)
3. Defining a char column larger than 50 bytes is generally not advisable
Note that these lengths are in bytes, and a single UTF-8 character can occupy more than one byte (up to 3 bytes in MySQL's utf8 character set).
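Because the thresholds above are stated in bytes, it helps to check how many bytes a value really occupies once encoded. The sketch below is only an illustration in Python, not part of the course material; the sample characters and the `fits_in_bytes` helper are hypothetical.

```python
# Sample characters: ASCII letter, accented Latin letter, CJK character, emoji.
samples = ["a", "\u00e9", "\u4e2d", "\U0001F600"]

# Number of bytes each character needs when encoded as UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}


def fits_in_bytes(text: str, limit: int) -> bool:
    """Return True if the UTF-8 encoding of text needs at most limit bytes."""
    return len(text.encode("utf-8")) <= limit


print(byte_lengths)
print(fits_in_bytes("\u4e2d" * 20, 50))  # 20 CJK characters need 60 bytes
```

Twenty CJK characters already exceed a 50-byte budget, so a limit counted in characters and a limit counted in bytes can lead to different choices.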

Q: How to choose between decimal and float?
1. decimal stores exact values, while float can only store approximate values. Data that must be exact should therefore use decimal
2. float generally has lower storage overhead than decimal (4 bytes give roughly 7 significant digits of precision, and 8 bytes give roughly 15), so float is preferred for data that does not need to be exact
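The difference between exact and approximate storage can be seen outside the database too. The sketch below uses Python's `float` (an 8-byte binary floating-point number) and `decimal.Decimal` as stand-ins for the SQL float and decimal types; it is an analogy, not a description of how any particular DBMS stores these values.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is only approximately 0.3.
float_total = 0.1 + 0.2

# Decimal arithmetic keeps the values exact.
decimal_total = Decimal("0.1") + Decimal("0.2")

print(float_total == 0.3)               # False
print(decimal_total == Decimal("0.3"))  # True
```

This is why amounts such as prices or account balances are usually given a decimal type, while measurements that tolerate small errors can use float.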

Q: How to store the time type?
1. Advantages and disadvantages of storing a time field as an int
Advantages: the field is smaller than a datetime
Disadvantages: inconvenient to use, because conversion functions are required
Restrictions: a signed 32-bit int can only store times up to 2038-01-19 03:14:07 UTC (11:14:07 in UTC+8), which corresponds to 2^31 - 1 = 2147483647 seconds since the Unix epoch
2. The time granularity that needs to be stored
Year, month, day, hour, minute, second, or week
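The 2038 limit on int-based timestamps can be checked with a little arithmetic. The sketch below assumes the int column holds seconds since the Unix epoch in a signed 32-bit integer; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

# Unix epoch plus that many seconds gives the last representable moment.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
limit_utc = epoch + timedelta(seconds=INT32_MAX)

# The same instant shown in UTC+8.
limit_utc8 = limit_utc.astimezone(timezone(timedelta(hours=8)))

print(INT32_MAX)               # 2147483647
print(limit_utc.isoformat())   # 2038-01-19T03:14:07+00:00
print(limit_utc8.isoformat())  # 2038-01-19T11:14:07+08:00
```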

Q: How to choose the primary key?
1. Distinguish between the business primary key and the database primary key: the business primary key identifies business data and is used to associate tables with one another, while the database primary key exists to optimize data storage (InnoDB generates a 6-byte hidden primary key when none is defined)
2. Depending on the type of database, consider whether the primary key should grow sequentially: some databases store rows logically in primary key order
3. The field type of the primary key should occupy as little space as possible: for tables stored using a clustered index, the primary key is appended to every secondary index entry

Q: Why avoid using foreign key constraints?
1. They reduce the efficiency of data import
2. They increase maintenance costs
3. Even though foreign key constraints are not recommended, indexes must still be created on the associated columns

Q: Why avoid using triggers?
1. They reduce the efficiency of data import
2. They may cause unexpected data anomalies
3. They make the business logic more complicated

Q: What about reserved fields?
1. The type a reserved field will need cannot be known accurately in advance
2. The content that will be stored in a reserved field cannot be known accurately in advance
3. The cost of maintaining a reserved field later is the same as the cost of adding a new field
4. Reserved fields are strictly prohibited

Q: What is denormalization?
Denormalization is defined relative to normalization and the third normal form of database design introduced above. It means deliberately and moderately violating the requirements of third normal form for the sake of performance and read efficiency, allowing a small amount of data redundancy.
In other words, denormalization trades space for time.

Q: Why denormalize?
1. To reduce the number of table joins
2. To increase the efficiency of reading data
3. Denormalization must be applied in moderation

Q: How to maintain the data dictionary?
1. Use third-party tools to maintain the data dictionary
2. Use the database's own comment fields to maintain the data dictionary. Take MySQL as an example

3. Export the data dictionary

Q: How to choose appropriate columns for an index?
1. Columns that appear in the WHERE, GROUP BY, and ORDER BY clauses
2. Columns with high selectivity should be placed at the front of the index
3. Do not include overly long data types in the index
Note:
1. More indexes are not always better: too many indexes reduce not only write efficiency but also read efficiency
2. Regularly maintain index fragmentation
3. Do not use forced-index keywords in SQL statements
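To see the effect of indexing a column used in a WHERE clause, one can compare the query plan before and after creating the index. The sketch below uses SQLite through Python's sqlite3 module as a stand-in; the `orders` table, its columns, and the index name are hypothetical, and the exact wording of the plan differs between DBMSs and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, note TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, note) VALUES (?, ?)",
    [(i % 100, "n") for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return SQLite's query plan for sql as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


query = "SELECT id FROM orders WHERE customer_id = 42"

before = plan(query)  # no index on customer_id yet: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # the optimizer can now search the new index

print(before)
print(after)
```

The optimizer picks the index on its own once it exists, which fits the advice above to avoid forcing an index in the SQL statement.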

Q: How to maintain the table structure?
1. Use online schema change tools to modify the table structure
2. Maintain the data dictionary at the same time
3. Control the width and size of the table

Q: Which operations are suitable in the database?
1. Batch operations are better than row-by-row operations
2. Do not use queries such as SELECT *
3. Limit the use of user-defined functions
4. Do not use full-text indexes in the database
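The first two points can be sketched with Python's sqlite3 module: rows are inserted as one batch inside a single transaction, and queries name the columns they need instead of using SELECT *. The `event` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, f"payload-{i}") for i in range(1000)]

# Batch operation: one executemany call inside one transaction,
# instead of 1000 separately committed INSERT statements.
with conn:
    conn.executemany("INSERT INTO event (id, payload) VALUES (?, ?)", rows)

# Name the needed columns explicitly rather than writing SELECT *.
count = conn.execute("SELECT COUNT(id) FROM event").fetchone()[0]
first = conn.execute(
    "SELECT id, payload FROM event ORDER BY id LIMIT 1"
).fetchone()

print(count, first)
```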

Q: What are the principles for splitting a table vertically?
1. Columns that are often queried together are kept together
2. Large fields such as text and blob are split off into an additional table

To control the width of a table, split it vertically.
To control the size of a table, split it horizontally.

Q: How to split the table horizontally?
Hash the primary key and use the result to decide which table each row is stored in.
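A minimal sketch of hash-based horizontal splitting, assuming a fixed number of shard tables and CRC32 as the hash function. Both choices, along with the `orders_N` naming scheme, are illustrative assumptions and not something the notes prescribe.

```python
import zlib

SHARD_COUNT = 4  # hypothetical number of horizontal partitions


def shard_for(primary_key) -> int:
    """Map a primary key to a shard number in the range 0..SHARD_COUNT-1."""
    digest = zlib.crc32(str(primary_key).encode("utf-8"))
    return digest % SHARD_COUNT


def table_for(primary_key) -> str:
    """Name of the table that holds the row with this primary key."""
    return f"orders_{shard_for(primary_key)}"


for key in (1, 2, 12345, "A-77"):
    print(key, "->", table_for(key))
```

The same key always maps to the same table, so reads and writes for a given row agree on where it lives. Changing `SHARD_COUNT` later remaps most keys, which is worth planning for before the data grows.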

Origin blog.csdn.net/u011703187/article/details/105202924