3.6.2 Database systems, normal form determination: normal form classification, first normal form, second normal form, third normal form, BC normal form

normal form classification

Normalization proceeds step by step to solve three problems: insertion anomalies, deletion anomalies, and data redundancy.

  • 1NF: every attribute value is atomic and cannot be divided further
  • 2NF: eliminates partial dependencies of non-prime attributes on candidate keys
  • 3NF: eliminates transitive dependencies of non-prime attributes on candidate keys
  • BCNF: eliminates partial and transitive dependencies of prime attributes on candidate keys

Exam questions mainly ask you to identify which normal form a relation satisfies, and sometimes to carry out the normalization (decomposition) itself. For BCNF, only identification is examined, not decomposition.

first normal form

First Normal Form (1NF): a relational schema R is in first normal form if and only if every field contains only atomic values, that is, every attribute is an indivisible data item.

With respect to whether an attribute can be subdivided, attributes are classified as follows:
simple and composite attributes, single-valued and multi-valued attributes, NULL attributes, and derived attributes.

Simple attributes and composite attributes
A simple attribute is atomic and indivisible; a composite attribute can be divided into smaller parts. Unless stated otherwise, an attribute is assumed to be indivisible. Take a name: normally it is stored as a single value, but if the requirements say to store the last name and the first name separately, it maps to two columns in the table, first_name and last_name, and is therefore a composite attribute. A commonly recorded address is similar: it can be subdivided into province, city, district, and street.

Single-valued attributes and multi-valued attributes
This distinction concerns the values. A single-valued attribute holds exactly one value; a multi-valued attribute can hold several, such as phone numbers, emergency contacts, or family members. When several such values are stored in a single field, the attribute is called a multi-valued attribute: one attribute corresponds to multiple values.

NULL attribute
A NULL attribute has no value: either the attribute has no meaning for this entity, or its value is unknown.

Derived attributes
A derived attribute can be calculated from other attributes. For example, age can be calculated from the date of birth, so age is a derived attribute. Similarly, sales revenue can be calculated from sales volume and unit price, so sales revenue is a derived attribute.

example

Does the relational schema R (department name, number of senior professional titles) satisfy 1NF? If not, how should it be adjusted?
[Figure: the table for R, with "number of senior professional titles" subdivided by title]
As the figure shows, the number of senior professional titles can be divided further by title, so R does not satisfy 1NF; it does not even qualify as a proper two-dimensional table. A relation that is not in 1NF cannot be created as a table. The fix is to take the leaf nodes of the attribute tree, the ones that cannot be subdivided further, as the actual attributes.
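The adjustment can be sketched in Python as follows. This is only an illustration: the department names, the counts, and the sub-attribute names `professor` and `associate_professor` are assumptions, since the original figure is not available.

```python
# Hypothetical data: the sub-attribute names "professor" and
# "associate_professor" are assumptions, not taken from the figure.
non_1nf_rows = [
    {"department": "Computer Science",
     "senior_titles": {"professor": 6, "associate_professor": 10}},
    {"department": "Electronics",
     "senior_titles": {"professor": 3, "associate_professor": 5}},
]


def to_1nf(rows):
    """Promote each sub-attribute of the composite column to its own column."""
    flat = []
    for row in rows:
        new_row = {"department": row["department"]}
        # Every leaf of the composite attribute becomes an atomic attribute.
        for title, count in row["senior_titles"].items():
            new_row[f"{title}_count"] = count
        flat.append(new_row)
    return flat


rows_1nf = to_1nf(non_1nf_rows)

# After flattening, no value is a nested dict or list any more.
all_atomic = all(
    not isinstance(value, (dict, list))
    for row in rows_1nf
    for value in row.values()
)
print(rows_1nf[0])
print(all_atomic)
```

Each leaf of the composite attribute ends up as its own column, which is what "taking the leaf nodes as the actual attributes" means in practice.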

second normal form

Second Normal Form (2NF): an entity E is in second normal form if and only if it is in first normal form (1NF) and every non-prime attribute is fully functionally dependent on every candidate key (there are no partial dependencies).

example

For the relation shown in the figure below, what problems exist from the perspective of the relational model (consider data redundancy, update anomalies, insertion anomalies, and deletion anomalies), and how can they be solved?

[Figure: a relation with the attributes student number, course number, grade, and credits]

The figure shows that student number and course number together determine the grade, while the course number alone determines the credits. The credits are therefore stored redundantly, which leads to update anomalies: it is easy to update only some of the rows, leaving the data inconsistent. If no student has enrolled in a course, its credits cannot be inserted, which is an insertion anomaly. If a course is no longer offered and its rows are deleted, the student numbers recorded with it are deleted as well, which is a deletion anomaly.
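A small Python sketch can make these anomalies concrete. The rows below are invented for illustration and are not taken from the figure; each tuple is (student number, course number, grade, credits).

```python
# Hypothetical rows of (student_no, course_no, grade, credits).
rows = [
    ("S1", "C1", 90, 4),
    ("S2", "C1", 85, 4),
    ("S3", "C1", 70, 4),
    ("S1", "C2", 88, 2),
]

# Redundancy: the credits of C1 are stored once per enrolled student.
c1_copies = sum(1 for r in rows if r[1] == "C1")

# Update anomaly: changing the credits in only one row leaves
# two different credit values for the same course.
partially_updated = [("S1", "C1", 90, 5)] + rows[1:]
credits_seen = {r[3] for r in partially_updated if r[1] == "C1"}

# Deletion anomaly: removing the only enrolment in C2 also
# removes the only record of C2's credits.
after_delete = [r for r in rows if not (r[0] == "S1" and r[1] == "C2")]
c2_known = any(r[1] == "C2" for r in after_delete)

print(c1_copies)     # 3
print(credits_seen)  # the set containing 4 and 5
print(c2_known)      # False
```

The insertion anomaly is the mirror image of the last case: a course with no enrolled student has no row in which its credits could be stored.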

How to optimize?
① Determine which normal form the relational schema already satisfies
All attribute values are indivisible atomic values, so at least 1NF is satisfied.
② Find the candidate key
Starting from 1NF, find the candidate key of the schema through its functional dependencies. Student number and course number together determine the single grade of that student in that course, and neither can be dropped, so the candidate key is (student number, course number).
③ From the candidate key, identify the prime and non-prime attributes
Prime attributes: student number, course number
Non-prime attributes: grade, credits
④ Check whether any non-prime attribute is partially functionally dependent on the candidate key
If the candidate key is a single attribute, partial functional dependency is impossible, so 2NF only needs checking when the candidate key is composite.
Credits depend only on the course number, not on the student number, so a non-prime attribute is partially dependent on the candidate key and the schema reaches only 1NF.
To reach 2NF, split the relation, taking care to preserve the original functional dependencies:
(course number, credits)
(student number, course number, grade)
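The check in step ④ can be sketched with an attribute-closure function. This is a minimal illustration, not a general normalization tool; the identifiers `student_no`, `course_no`, `grade`, and `credits` are names chosen here for the attributes in the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, non_prime, fds):
    """List (subset, attribute) pairs where a proper subset of the key
    already determines a non-prime attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((subset, attr))
    return found


fds = [
    (frozenset({"student_no", "course_no"}), frozenset({"grade"})),
    (frozenset({"course_no"}), frozenset({"credits"})),
]
key = {"student_no", "course_no"}

# Original relation: credits depends on course_no alone, so 2NF fails.
violations = partial_dependencies(key, {"grade", "credits"}, fds)
print(violations)  # [(('course_no',), 'credits')]

# After the split, (student_no, course_no, grade) has no partial dependency,
# and (course_no, credits) has a single-attribute key, so none is possible.
after_split = partial_dependencies(key, {"grade"}, fds)
print(after_split)  # []
```

The loop only inspects proper subsets of the key, which is why a single-attribute key can never produce a partial dependency.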

third normal form

Third Normal Form (3NF): an entity E is in third normal form if and only if it is in second normal form (2NF) and no non-prime attribute in E is transitively dependent on the key (that is, on a candidate key).

Transitive dependency of a non-prime attribute on the key: if A → B and B → C, where B does not determine A, then C is transitively dependent on A. When A is a candidate key and C is a non-prime attribute, the relation does not satisfy third normal form.

Only once third normal form is reached are the problems of data redundancy, update anomalies, insertion anomalies, and deletion anomalies largely resolved.

When judging the degree of normalization, a relation with no non-prime attributes is always at least in 3NF.

example

For the relation shown in the figure below, what problems exist from the perspective of the relational model (consider data redundancy, update anomalies, insertion anomalies, and deletion anomalies), and how can they be solved?
[Figure: a relation with the attributes student number, name, department number, department name, and department location]
As the figure shows: student number → name, student number → department number, department number → department name, department number → department location.
Determine which normal form the schema belongs to:
① Determine which normal form the relational schema already satisfies
All attribute values are indivisible atomic values, so at least 1NF is satisfied.
② Find the candidate key
Here it is the student number.
③ From the candidate key, identify the prime and non-prime attributes
Prime attribute: student number
Non-prime attributes: name, department number, department name, department location
④ Check whether any non-prime attribute is partially functionally dependent on the candidate key
If the candidate key is a single attribute, partial functional dependency is impossible, so 2NF only needs checking when the candidate key is composite.
The candidate key has only one attribute, student number, so there is no partial functional dependency and 2NF is satisfied.
⑤ Check whether any non-prime attribute is transitively dependent on the candidate key
Student number → department number, department number → department name, and department number → department location. The department name and department location are determined through the department number, so they are transitively dependent on the student number.

The relation therefore reaches only second normal form. To bring it to third normal form, split off the attributes that depend on the department number into a separate relation, and keep the department number in both relations as the link between them:

(Student number, name, department number)
(Department number, department name, department location)
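The 3NF check can be sketched with the standard equivalent test: for every dependency X → A where A is a non-prime attribute, X must be a superkey. For a relation already in 2NF, every violation found this way is a transitive dependency. The attribute identifiers below are names chosen for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, non_prime, fds):
    """List (determinant, attribute) pairs where a non-superkey
    determines a non-prime attribute."""
    found = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue  # lhs is a superkey, so this dependency is allowed
        for attr in sorted((rhs - lhs) & non_prime):
            found.append((tuple(sorted(lhs)), attr))
    return found


student_fds = [
    (frozenset({"student_no"}), frozenset({"name"})),
    (frozenset({"student_no"}), frozenset({"dept_no"})),
]
dept_fds = [
    (frozenset({"dept_no"}), frozenset({"dept_name"})),
    (frozenset({"dept_no"}), frozenset({"dept_location"})),
]

# Original relation: dept_no is not a superkey but determines two attributes.
original = third_nf_violations(
    {"student_no", "name", "dept_no", "dept_name", "dept_location"},
    {"name", "dept_no", "dept_name", "dept_location"},
    student_fds + dept_fds,
)
print(original)

# The two relations obtained by the split have no violations.
student_ok = third_nf_violations(
    {"student_no", "name", "dept_no"}, {"name", "dept_no"}, student_fds
)
dept_ok = third_nf_violations(
    {"dept_no", "dept_name", "dept_location"},
    {"dept_name", "dept_location"},
    dept_fds,
)
print(student_ok, dept_ok)
```

Keeping `dept_no` in the student relation is what allows the two relations to be joined back together without losing information.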

BC normal form

BC Normal Form (BCNF): let R be a relational schema and F its set of functional dependencies. R is in BCNF if and only if the determinant (left-hand side) of every dependency in F contains a candidate key of R.

BCNF is a very high degree of normalization. It eliminates the partial and transitive functional dependencies of prime attributes on candidate keys. If any such dependency remains, BCNF is not satisfied; once they are all eliminated, it is satisfied, and every determinant on the left-hand side contains a candidate key.

In practice, reaching 3NF is usually enough. Decomposing all the way to BCNF splits the data into more tables, so queries need multi-table joins, which are less efficient. This is why denormalization is sometimes used to improve query performance.

example

In the relational schema STJ (S, T, J), S is a student, T is a teacher, and J is a course. Each teacher teaches only one course, each course can be taught by several teachers, and once a student chooses a course, that student has one fixed teacher for it.

The functional dependency set is: F = {T → J, SJ → T}
① Check whether 1NF is satisfied
The attributes cannot be divided further, so 1NF is satisfied.
② Find the candidate keys
An attribute that appears only on the left-hand side of the dependencies (in-degree 0 in the dependency graph) must belong to every candidate key. Here that attribute is S, but S alone determines neither T nor J, so it has to be combined with another attribute.
The candidate keys are the two composite keys SJ and ST.
③ Identify the prime and non-prime attributes
The candidate keys are SJ and ST, so S, T, and J are all prime attributes and there are no non-prime attributes.
④ Consider partial and transitive functional dependencies
When the candidate key is composite, partial functional dependencies can occur. However, 2NF only concerns partial dependencies of non-prime attributes on a candidate key. With no non-prime attributes there can be no such dependency, so 2NF is satisfied.
Likewise, 3NF requires that no non-prime attribute be transitively dependent on a key (candidate key). With no non-prime attributes, 3NF is satisfied as well: a relation without non-prime attributes is always at least in 3NF.
⑤ Check BCNF against the definition
The functional dependency set is F = {T → J, SJ → T} and the candidate keys are SJ and ST.
In T → J, the determinant T does not contain a candidate key.
In SJ → T, the determinant SJ contains a candidate key.

BCNF requires the determinant of every dependency in F to contain a candidate key of R. T → J fails this requirement, so BCNF is not satisfied and the relation reaches only 3NF.
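The whole judgment for STJ can be reproduced with a short Python sketch. It finds candidate keys by brute force over attribute subsets, which is fine for three attributes but is an assumption of this illustration rather than an efficient general method.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


def bcnf_violations(attributes, fds):
    """Return the nontrivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


attributes = {"S", "T", "J"}
fds = [
    (frozenset("T"), frozenset("J")),           # T -> J
    (frozenset({"S", "J"}), frozenset("T")),    # SJ -> T
]

keys = candidate_keys(attributes, fds)
non_prime = attributes - set().union(*keys)
violations = bcnf_violations(attributes, fds)

print(["".join(sorted(k)) for k in keys])  # ['JS', 'ST']
print(sorted(non_prime))                   # []
print([("".join(sorted(l)), "".join(sorted(r))) for l, r in violations])  # [('T', 'J')]
```

The empty set of non-prime attributes corresponds to the 2NF and 3NF conclusions in step ④, and the single violation T → J corresponds to the BCNF conclusion in step ⑤.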

Origin blog.csdn.net/qq_41929714/article/details/129797402