MySQL Advanced - Database Design Specifications (2)

8. ER model

There are three elements in the ER model, namely entities, attributes and relationships.

Entities can be regarded as data objects, which often correspond to real individuals in real life. In the ER model, they are represented by rectangles. Entities fall into two categories: strong entities and weak entities. A strong entity does not depend on any other entity; a weak entity depends strongly on another entity.

Attributes are the characteristics of entities, for example a supermarket's address, contact number, and number of employees. In the ER model, they are represented by ellipses.

Relationships are the connections between entities. For example, when a supermarket sells goods to customers, that sale is a connection between the supermarket and the customer. In the ER model, relationships are represented by diamonds.

Note: Entities and attributes are not always easy to tell apart. Here is a guiding principle: look at them from the perspective of the system as a whole. Entities can exist independently, whereas attributes are indivisible; that is, an attribute cannot contain other attributes.

8.1 Types of relationships

Among the three elements of the ER model, relationships can be divided into three types: one-to-one, one-to-many, and many-to-many.

One-to-one: each entity on one side corresponds to exactly one entity on the other side. For example, the relationship between a person and ID card information is one-to-one: a person can have only one ID card record, and one ID card record belongs to only one person.

One-to-many: an entity on one side can correspond to multiple entities on the other side, while each entity on the other side corresponds to only one entity on the first side. For example, suppose we create a class table: each class has multiple students, and each student belongs to one class, so the relationship between classes and students is one-to-many.

Many-to-many: entities on both sides of the relationship can each correspond to multiple entities on the other side. For example, in a purchasing module, the relationship between suppliers and supermarkets is many-to-many: one supplier can supply multiple supermarkets, and one supermarket can purchase goods from multiple suppliers. Another example is course selection: there are many courses, each course can be chosen by many students, and each student can choose multiple courses.
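
As a rough illustration of how the three relationship types tend to look once they reach a schema, here is a minimal sketch. All table and column names (person, id_card, class_info, student, course, student_course) are hypothetical and are not part of the original case; the column types are assumptions as well.

```sql
-- Hypothetical tables for illustration only; names and types are assumptions.

-- One-to-one: a UNIQUE foreign key allows at most one ID card row per person.
CREATE TABLE person (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  full_name VARCHAR(45) NOT NULL COMMENT 'person name',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'person';

CREATE TABLE id_card (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  person_id BIGINT UNSIGNED NOT NULL COMMENT 'owner of the card',
  card_number VARCHAR(18) NOT NULL COMMENT 'ID card number',
  PRIMARY KEY (id),
  UNIQUE KEY uk_person_id (person_id),
  CONSTRAINT fk_id_card_person FOREIGN KEY (person_id) REFERENCES person (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'ID card information';

-- One-to-many: many student rows may point at the same class row.
CREATE TABLE class_info (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  class_name VARCHAR(45) NOT NULL COMMENT 'class name',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'class';

CREATE TABLE student (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  class_id BIGINT UNSIGNED NOT NULL COMMENT 'class the student belongs to',
  student_name VARCHAR(45) NOT NULL COMMENT 'student name',
  PRIMARY KEY (id),
  KEY idx_class_id (class_id),
  CONSTRAINT fk_student_class FOREIGN KEY (class_id) REFERENCES class_info (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'student';

-- Many-to-many: a separate table holds one row per (student, course) pair.
CREATE TABLE course (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  course_name VARCHAR(45) NOT NULL COMMENT 'course name',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'course';

CREATE TABLE student_course (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  student_id BIGINT UNSIGNED NOT NULL COMMENT 'student who chose the course',
  course_id BIGINT UNSIGNED NOT NULL COMMENT 'chosen course',
  PRIMARY KEY (id),
  UNIQUE KEY uk_student_course (student_id, course_id),
  KEY idx_course_id (course_id),
  CONSTRAINT fk_sc_student FOREIGN KEY (student_id) REFERENCES student (id),
  CONSTRAINT fk_sc_course FOREIGN KEY (course_id) REFERENCES course (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'course selection';
```

The only structural difference between the one-to-one and one-to-many cases in this sketch is whether the foreign key column carries a unique index.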

8.2 Modeling analysis

The ER model may seem cumbersome, but it is very important for keeping control of the project as a whole. If you are only developing a small application, simply designing a few tables may be enough. Once you want to design an application of a certain scale, however, establishing a complete ER model in the initial stage of the project is critical. The essence of developing an application is, in fact, modeling.

The case we design here is an e-commerce business. Because a real e-commerce business is too large and complex, we have simplified it. For example, of the two concepts SKU (Stock Keeping Unit) and SPU (Standard Product Unit), we use SKU directly and do not cover SPU. This e-commerce design has a total of 8 entities, as shown below.

  • Address entity
  • User entity
  • Shopping cart entity
  • Comment entity
  • Product entity
  • Product category entity
  • Order entity
  • Order details entity

Among them, the user and the product category are strong entities because they do not depend on any other entity. The others are weak entities: although they can exist independently, they all depend on the user entity. Knowing these elements, we can create an ER model for the e-commerce business, as shown in the figure:

[Figure: ER model of the e-commerce business]

In this figure, the "add" relationship between users and addresses is one-to-many, products and product details are in a one-to-one relationship, and products and orders are in a many-to-many relationship. This ER model covers 8 relationships among the 8 entities:

(1) Users can add multiple addresses on the e-commerce platform;
(2) Users can only have one shopping cart;
(3) Users can generate multiple orders;
(4) Users can post multiple comments;
(5) A product can have multiple comments;
(6) Each product category contains multiple products;
(7) An order can contain multiple products, and a product can appear in multiple orders;
(8) An order contains multiple order details, because an order may contain different types of goods.

8.3 Refinement of the ER model

With this ER model, we can understand the e-commerce business as a whole. However, the model so far only shows the framework of the business: it contains the eight entities (order, address, user, shopping cart, comment, product, product category, and order details) and the relationships between them, but it cannot yet be mapped to specific tables and the relationships between those tables. We need to add attributes, represented by ellipses, to make the ER model more complete.

Therefore, we need to further design each part of this ER model, that is, refine the specific business processes of e-commerce, and then integrate them together to form a complete ER model. This can help us clarify the design ideas of the database.

(1) The address entity includes user number, province, city, region, recipient, contact number, and whether it is the default address.
(2) The user entity includes user number, user name, nickname, user password, mobile phone number, email, avatar, and user level.
(3) The shopping cart entity includes shopping cart number, user number, product number, product quantity, and image file URL.
(4) The order entity includes order number, consignee, consignee phone number, total amount, user number, payment method, shipping address, and order time.
(5) The order details entity includes order details number, order number, product name, product number, and product quantity.
(6) The product entity includes product number, price, product name, category number, whether it is for sale, specifications, and color.
(7) The comment entity includes comment id, comment content, comment time, user number, and product number.
(8) The product category entity includes category number, category name, and parent category number.

[Figure: refined ER model with entity attributes]

8.4 Converting the ER model diagram into data tables

By drawing the ER model, we have clarified the business logic. Now, we are about to take a very important step: convert the drawn ER model into a specific data table. The principles of conversion are introduced below:

(1) An entity is usually converted into a data table;
(2) A many-to-many relationship is usually converted into a data table;
(3) A 1-to-1 or 1-to-many relationship is usually expressed through a foreign key on the table rather than by designing a new data table;
(4) Convert attributes into table fields.
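
The four conversion rules above can be sketched on part of the e-commerce case. This is only one possible mapping: the shop_ prefix, the table and column names, and the data types are assumptions made for illustration, and the user table is left out to keep the example short.

```sql
-- Hypothetical mapping; the shop_ prefix, names, and types are assumptions.

-- Rules (1) and (4): an entity becomes a table, its attributes become fields.
CREATE TABLE shop_product_category (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'category number',
  category_name VARCHAR(45) NOT NULL COMMENT 'category name',
  parent_id BIGINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'parent category number, 0 = top level (assumed convention)',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'product category';

-- Rule (3): the one-to-many link category -> product is a foreign key column.
CREATE TABLE shop_product (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'product number',
  product_name VARCHAR(100) NOT NULL COMMENT 'product name',
  price DECIMAL(10, 2) NOT NULL COMMENT 'price',
  category_id BIGINT UNSIGNED NOT NULL COMMENT 'category number',
  PRIMARY KEY (id),
  KEY idx_category_id (category_id),
  CONSTRAINT fk_product_category FOREIGN KEY (category_id)
    REFERENCES shop_product_category (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'product';

CREATE TABLE shop_order (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'order number',
  user_id BIGINT UNSIGNED NOT NULL COMMENT 'user number (user table omitted in this sketch)',
  total_amount DECIMAL(12, 2) NOT NULL COMMENT 'total amount',
  create_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'order time',
  PRIMARY KEY (id),
  KEY idx_user_id (user_id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'order';

-- Rule (2): the many-to-many link order <-> product becomes its own table.
CREATE TABLE shop_order_detail (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'order details number',
  order_id BIGINT UNSIGNED NOT NULL COMMENT 'order number',
  product_id BIGINT UNSIGNED NOT NULL COMMENT 'product number',
  product_name VARCHAR(100) NOT NULL COMMENT 'product name copied at order time',
  product_quantity INT UNSIGNED NOT NULL COMMENT 'product quantity',
  PRIMARY KEY (id),
  UNIQUE KEY uk_order_product (order_id, product_id),
  KEY idx_product_id (product_id),
  CONSTRAINT fk_detail_order FOREIGN KEY (order_id) REFERENCES shop_order (id),
  CONSTRAINT fk_detail_product FOREIGN KEY (product_id) REFERENCES shop_product (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'order details';
```

In this sketch the order details table plays both roles from the entity list: it is the order details entity and it resolves the many-to-many relationship between orders and products.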

In fact, any database-based application project can complete the database design work by first establishing an ER model and then converting it into a data table. Creating an ER model is not the purpose. The purpose is to sort out the business logic and design an excellent database. I suggest that you don't model for the sake of modeling, but use the process of creating an ER model to organize your thoughts so that creating an ER model makes sense.


9. Design principles of data tables

Based on the above, the general principles of data table design can be summarized as "three fewer, one more":

  1. The fewer data tables, the better.
  2. The fewer fields in a data table, the better.
  3. The fewer fields in a joint (composite) primary key, the better.
  4. The more primary and foreign keys are used, the better.

Note: These principles are not absolute. Sometimes we need to accept some data redundancy in exchange for data processing efficiency.

10. Suggestions for writing database objects

10.1 About databases

  1. [Mandatory] Database names must be within 32 characters and may contain only English letters, digits, and underscores. Starting with an English letter is recommended.
  2. [Mandatory] Database names must be all lowercase, with words separated by underscores. The name should be self-explanatory.
  3. [Mandatory] Database name format: business system name_subsystem name.
  4. [Mandatory] Keywords (such as type, order, etc.) are prohibited in database names.
  5. [Mandatory] The character set must be explicitly specified when creating a database, and it can only be utf8 or utf8mb4. SQL example for creating a database: CREATE DATABASE crm_fund DEFAULT CHARACTER SET 'utf8';
  6. [Recommendation] Database accounts used by programs should follow the principle of least privilege: an account may be used under only one database, and cross-database access is not allowed. In principle, accounts used by programs must not have the DROP privilege.
  7. [Recommendation] Temporary databases use the prefix tmp_ and a date suffix; backup databases use the prefix bak_ and a date suffix.
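
A minimal sketch of rules 5 to 7 follows. It picks utf8mb4 because utf8 in MySQL is an alias for the 3-byte utf8mb3 and cannot store every Unicode character. The account name, host pattern, password, and date suffix are placeholders invented for this example.

```sql
-- Explicit character set; utf8mb4 is one of the two allowed choices.
CREATE DATABASE IF NOT EXISTS crm_fund DEFAULT CHARACTER SET utf8mb4;

-- Placeholder application account (name, host, and password are assumptions).
CREATE USER 'crm_fund_app'@'10.0.0.%' IDENTIFIED BY 'change-me';

-- Least privilege: data access on a single database, no DROP privilege.
GRANT SELECT, INSERT, UPDATE, DELETE ON crm_fund.* TO 'crm_fund_app'@'10.0.0.%';

-- Temporary and backup databases: tmp_ / bak_ prefix plus a date suffix.
CREATE DATABASE tmp_crm_fund_20240101 DEFAULT CHARACTER SET utf8mb4;
CREATE DATABASE bak_crm_fund_20240101 DEFAULT CHARACTER SET utf8mb4;
```

SHOW GRANTS FOR 'crm_fund_app'@'10.0.0.%'; is one way to confirm that the account holds nothing beyond these four privileges.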

10.2 About tables and columns

  1. [Mandatory] Table and column names must be within 32 characters. Table names may contain only English letters, digits, and underscores; starting with an English letter is recommended.

  2. [Mandatory] Table names and column names must be lowercase, with words separated by underscores. Names should be self-explanatory.

  3. [Mandatory] Table names must be closely tied to the module name, and tables in the same module should share a common prefix wherever possible. For example: crm_fund_item

  4. [Mandatory] The character set must be explicitly specified as utf8 or utf8mb4 when creating a table.

  5. [Mandatory] Keywords (such as type, order, etc.) are prohibited in table names and column names.

  6. [Mandatory] The storage engine must be explicitly specified when creating a table. Unless there are special requirements, always use InnoDB.

  7. [Mandatory] Comments are required when creating a table.

  8. [Mandatory] Field names should use English words or abbreviations that express the actual meaning wherever possible. For example, for a company ID, use corp_id rather than corporation_id.

  9. [Mandatory] Boolean fields are named is_<description>. For example, the field on the member table that indicates whether a member is enabled is named is_enabled.

  10. [Mandatory] Storing large binary data such as images and files in the database is prohibited. Such files are usually very large, so the data volume grows rapidly in a short time, and reading them causes a large number of time-consuming random I/O operations. Store the files on a file server and keep only the file address in the database.

  11. [Recommendation] About the primary key when creating a table: the table must have a primary key. (1) The primary key must be named id, of type int or bigint, and auto_increment; the unsigned type is recommended. (2) The field that identifies the subject of each row should not be set as the primary key; instead, define it as a separate field such as user_id or order_id and create a unique key index on it. If such a field were the primary key and its values were inserted in random order, InnoDB would suffer internal page splits and a large amount of random I/O, degrading performance.

  12. [Recommendation] Core tables (such as the user table) must have a creation time field (create_time) and a last update time field (update_time) for each row to facilitate troubleshooting.

  13. [Recommendation] All fields in the table should be NOT NULL wherever possible, with DEFAULT values defined as the business requires. NULL values cause problems: each row takes additional storage space, data migration becomes error-prone, and the results of aggregate functions can be skewed.

  14. [Recommendation] Columns that store the same data must have the same name and type across tables. Such columns are generally used as join columns; if their types differ, the values are implicitly converted during the query, which invalidates the index on the column and reduces query efficiency.

  15. [Recommendation] Intermediate tables (or temporary tables) hold intermediate result sets, and their names start with tmp_. Backup tables hold a backup or snapshot of a source table, and their names start with bak_. Intermediate tables and backup tables should be cleaned up regularly.

  16. [Example] A reasonably standardized table creation statement:

CREATE TABLE user_info (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
	`user_id` BIGINT NOT NULL COMMENT 'user id',
	`username` VARCHAR(45) NOT NULL COMMENT 'real name',
	`email` VARCHAR(30) NOT NULL COMMENT 'user email',
	`nickname` VARCHAR(45) NOT NULL COMMENT 'nickname',
	`birthday` DATE NOT NULL COMMENT 'birthday',
	`sex` TINYINT DEFAULT 0 COMMENT 'gender',
	`short_introduce` VARCHAR(150) DEFAULT NULL COMMENT 'one-sentence self-introduction, up to 50 Chinese characters',
	`user_resume` VARCHAR(300) NOT NULL COMMENT 'storage address of the resume submitted by the user',
	-- UNSIGNED so that any IPv4 address converted with INET_ATON() fits
	`user_register_ip` INT UNSIGNED NOT NULL COMMENT 'source IP at user registration',
	`create_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
	`update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'modification time',
	`user_review_status` TINYINT NOT NULL COMMENT 'profile review status: 1 = approved, 2 = under review, 3 = rejected, 4 = not yet submitted',
	PRIMARY KEY (`id`),
	UNIQUE KEY `uniq_user_id` (`user_id`),
	KEY `idx_username` (`username`),
	KEY `idx_create_time_status` (`create_time`, `user_review_status`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 COMMENT = 'basic information of website users';

  17. [Recommendation] When creating tables, you can use a visual tool. This helps ensure that all the conventions for tables and fields are actually applied.
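
Rule 13 mentions that NULL values can skew aggregate results. The scratch table below is a small way to see this; tmp_null_demo and its columns are made up for the example and do not belong to the case study.

```sql
-- Hypothetical scratch table; the tmp_ prefix follows rule 15.
CREATE TABLE tmp_null_demo (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  score INT DEFAULT NULL COMMENT 'nullable on purpose for this demo',
  PRIMARY KEY (id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'NULL aggregation demo';

INSERT INTO tmp_null_demo (score) VALUES (10), (NULL), (20);

-- COUNT(*) counts rows; COUNT(score) and AVG(score) skip the NULL row.
SELECT COUNT(*)     AS row_count,        -- 3
       COUNT(score) AS non_null_scores,  -- 2
       AVG(score)   AS avg_score         -- 15, not 10
FROM tmp_null_demo;

DROP TABLE tmp_null_demo;
```

If a missing score should count as 0, declaring the column NOT NULL DEFAULT 0 makes that intent explicit and keeps the three figures consistent with each other.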

10.3 About indexes

  1. [Mandatory] The primary key of an InnoDB table must be id int/bigint auto_increment, and updating primary key values is prohibited.
  2. [Mandatory] For tables using the InnoDB or MyISAM storage engine, the index type must be BTREE.
  3. [Recommendation] Primary key names start with pk_, unique keys with uni_ or uk_, and ordinary indexes with idx_. Always use lowercase, with the field name or its abbreviation as the suffix.
  4. [Recommendation] For a column name made up of multiple words, take the first letters of the leading words and append the last word to form the abbreviation. For example, the index on member_id in the sample table is idx_sample_mid.
  5. [Recommendation] The number of indexes on a single table should not exceed 6.
  6. [Recommendation] When building an index, consider a joint (composite) index and put the most selective field first.
  7. [Recommendation] In a multi-table JOIN, make sure the join column of the driven table is indexed, so that the JOIN executes as efficiently as possible.
  8. [Recommendation] When creating a table or adding an index, make sure the table has no redundant indexes. For example, if key(a,b) already exists, key(a) is redundant and should be deleted.
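
The naming, ordering, and redundancy rules above might look like this in practice. The sample table reuses the member_id example from rule 4; its other columns, and the assumption that member_id is more selective than review_status, are invented for the sketch.

```sql
-- Hypothetical table built around the member_id example from rule 4.
CREATE TABLE sample (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  member_id BIGINT UNSIGNED NOT NULL COMMENT 'member id',
  review_status TINYINT NOT NULL DEFAULT 0 COMMENT 'review status',
  create_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  PRIMARY KEY (id),
  KEY idx_sample_mid (member_id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COMMENT = 'index naming demo';

-- Add a joint index with the (assumed) more selective column first.
-- idx_sample_mid is a left prefix of the new index, so it becomes redundant
-- and is dropped in the same ALTER TABLE statement.
ALTER TABLE sample
  ADD INDEX idx_sample_mid_rstatus (member_id, review_status) USING BTREE,
  DROP INDEX idx_sample_mid;

-- A lookup on member_id alone can still use the joint index.
EXPLAIN SELECT id, member_id FROM sample WHERE member_id = 42;
```

SHOW INDEX FROM sample; lists the remaining indexes, which is a quick way to check for redundancy and for the limit in rule 5.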

10.4 SQL writing

  1. [Mandatory] SELECT statements issued by the application must name specific fields; writing * is prohibited.
  2. [Recommendation] INSERT statements issued by the application should name specific fields. Do not write INSERT INTO t1 VALUES(…).
  3. [Recommendation] Except for static tables or small tables (within 100 rows), DML statements must have a WHERE condition that uses an index lookup.
  4. [Recommendation] In INSERT INTO ... VALUES(XX),(XX),(XX)..., the number of XX value groups should not exceed 5000. A larger batch finishes faster, but it causes master-slave replication delay.
  5. [Recommendation] Prefer UNION ALL to UNION in SELECT statements, and limit the number of UNION clauses to 5.
  6. [Recommendation] In the online (production) environment, a multi-table JOIN should not involve more than 5 tables.
  7. [Recommendation] Reduce the use of ORDER BY: agree with the business side to skip sorting where it is not needed, or do the sorting in the program. ORDER BY, GROUP BY, and DISTINCT are relatively CPU-intensive, and database CPU resources are extremely valuable.
  8. [Recommendation] For queries that include ORDER BY, GROUP BY, or DISTINCT, keep the result set filtered by the WHERE condition within 1,000 rows; otherwise the SQL will be very slow.
  9. [Recommendation] Multiple ALTER operations on a single table must be merged into one statement. ALTER TABLE on a large table with more than 1 million rows must be reviewed by the DBA and executed during off-peak hours. Because ALTER TABLE can take table locks, writes to the table may be blocked while it runs, which can have a great impact on the business.
  10. [Recommendation] When operating on data in batches, control the interval between transactions and sleep as necessary.
  11. [Recommendation] A transaction should contain no more than 5 SQL statements, because overly long transactions hold locks on data for a long time and consume excessive MySQL internal cache and connections.
  12. [Recommendation] UPDATE statements in a transaction should locate rows by primary key or UNIQUE KEY wherever possible, such as UPDATE ... WHERE id=XX; otherwise gap locks are generated and the lock range expands internally, which reduces system performance and can cause deadlocks.
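
Several of these rules can be illustrated against the user_info table from section 10.2. The literal values, the is_enabled column, and the idx_email index are invented for the example; treat this as a sketch rather than a recommended workload.

```sql
-- Assumes the user_info table from section 10.2; all literal values are made up.

-- Rule 1: name the fields instead of writing SELECT *.
SELECT user_id, username, email
FROM user_info
WHERE user_id = 10001;

-- Rules 2 and 4: name the fields and insert several rows per statement.
INSERT INTO user_info
  (user_id, username, email, nickname, birthday,
   user_resume, user_register_ip, user_review_status)
VALUES
  (10002, 'Li Lei', 'lilei@example.com', 'lei', '1995-03-12',
   '/resume/10002.pdf', INET_ATON('10.0.0.10'), 4),
  (10003, 'Han Meimei', 'han@example.com', 'mei', '1996-07-01',
   '/resume/10003.pdf', INET_ATON('10.0.0.11'), 4);

-- Rule 5: UNION ALL skips the duplicate-removal step that UNION performs.
SELECT user_id FROM user_info WHERE username = 'Li Lei'
UNION ALL
SELECT user_id FROM user_info WHERE username = 'Han Meimei';

-- Rules 11 and 12: a short transaction whose UPDATE uses the unique key.
START TRANSACTION;
UPDATE user_info SET user_review_status = 2 WHERE user_id = 10002;
COMMIT;

-- Rule 9: several changes to one table merged into a single ALTER TABLE.
-- is_enabled and idx_email are hypothetical additions.
ALTER TABLE user_info
  ADD COLUMN is_enabled TINYINT UNSIGNED NOT NULL DEFAULT 1 COMMENT 'whether the account is enabled',
  ADD INDEX idx_email (email);
```

Running EXPLAIN on the SELECT and UPDATE statements is a simple way to check that they are resolved through uniq_user_id or idx_username rather than a full table scan.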

Origin blog.csdn.net/qq_51495235/article/details/133208728