Database design guidelines: the three normal forms

This article is excerpted from Cnblogs (博客园):
Database design guidelines (a description of the first, second, and third normal forms)
I. Introduction to the normal forms of relational database design
1.1 First normal form (1NF): no repeating columns

  • First normal form (1NF) means that every column of a database table holds an atomic, indivisible data item. A column cannot hold multiple values; that is, an attribute of an entity cannot have multiple values, and an entity cannot have repeated attributes. If repeated attributes exist, you may need to define a new entity made up of those attributes, with a one-to-many relationship between the new entity and the original entity. In 1NF, each row of the table contains only one instance of information. In short, first normal form means no repeating columns.

Note: In any relational database, the first normal form (1NF) is the basic requirement for the relational model, and a database that does not meet the first normal form (1NF) is not a relational database.
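
As a rough illustration of the rule above, the following sketch uses Python's built-in sqlite3 module to move a repeated attribute into a new entity. The tables and columns (person_flat, person, person_phone, phone1, phone2) and the sample rows are invented for this example and do not come from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with a repeated attribute (phone1, phone2).
    CREATE TABLE person_flat (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        phone1    TEXT,
        phone2    TEXT
    );
    INSERT INTO person_flat VALUES (1, 'Ann', '555-0100', '555-0101');
    INSERT INTO person_flat VALUES (2, 'Bob', '555-0200', NULL);

    -- The repeated attribute becomes a new entity: one row per phone value.
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE person_phone (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        phone     TEXT NOT NULL,
        PRIMARY KEY (person_id, phone)
    );
""")

# Copy each person once, then one phone row per non-NULL phone column.
conn.execute("INSERT INTO person SELECT person_id, name FROM person_flat")
for col in ("phone1", "phone2"):  # fixed column names, not user input
    conn.execute(
        f"INSERT INTO person_phone "
        f"SELECT person_id, {col} FROM person_flat WHERE {col} IS NOT NULL"
    )

# One person now relates to many phone rows.
phones = conn.execute(
    "SELECT p.name, ph.phone FROM person p "
    "JOIN person_phone ph ON ph.person_id = p.person_id "
    "ORDER BY p.name, ph.phone"
).fetchall()
print(phones)
```

With this layout a third phone number is just another row in person_phone rather than a new column.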

1.2 Second normal form (2NF): attributes depend completely on the primary key [eliminate partial functional dependencies]

  • Second normal form (2NF) builds on first normal form (1NF): to satisfy 2NF, a table must first satisfy 1NF. 2NF requires that every instance, or row, in the table be uniquely distinguishable. To achieve this, it is usually necessary to add a column that stores a unique identifier for each instance. For example, an employee ID (emp_id) column is added to the employee information table; because each employee's ID is unique, each employee can be uniquely distinguished. This unique attribute column is called the primary key.
    2NF also requires that the attributes of an entity depend completely on the primary key. Complete dependence means that no attribute depends on only part of the primary key. If such an attribute exists, it and that part of the primary key should be split off into a new entity, with a one-to-many relationship between the new entity and the original entity. In short, second normal form means that attributes depend completely on the primary key.
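
To make "depends on only part of the primary key" concrete, here is a minimal sketch of a functional-dependency check over sample rows. The helper fd_holds and the employee/project data are hypothetical; only emp_id appears in the article. A check like this can refute a dependency on the given data, but it cannot prove that the dependency holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if rows that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value per lhs value; any mismatch breaks the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical table with the composite primary key (emp_id, project_id).
rows = [
    {"emp_id": 1, "project_id": "P1", "emp_name": "Ann", "hours": 10},
    {"emp_id": 1, "project_id": "P2", "emp_name": "Ann", "hours": 4},
    {"emp_id": 2, "project_id": "P1", "emp_name": "Bob", "hours": 7},
]

# hours needs the whole key: it is not determined by emp_id alone.
hours_on_key = fd_holds(rows, ("emp_id", "project_id"), ("hours",))
hours_on_part = fd_holds(rows, ("emp_id",), ("hours",))

# emp_name is determined by emp_id alone: a partial dependency, so not 2NF.
name_on_part = fd_holds(rows, ("emp_id",), ("emp_name",))

print(hours_on_key, hours_on_part, name_on_part)
```

Here emp_name would move, together with emp_id, into its own employee table.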

1.3 Third normal form (3NF): attributes do not depend on other non-key attributes [eliminate transitive dependencies]

  • To satisfy third normal form (3NF), a table must first satisfy second normal form (2NF). 3NF requires that a table not contain non-key information that is already held in another table. For example, suppose there is a department information table in which each department has a department number (dept_id), a department name, a department profile, and so on. Once the department number is listed in the employee information table, department details such as the name and profile must not be added to the employee information table as well. If no department information table exists, one should be created as 3NF requires; otherwise there will be a great deal of redundant data. In short, third normal form means that attributes do not depend on other non-key attributes.
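
The employee and department example can be sketched with Python's sqlite3 module as follows. Only emp_id and dept_id are named in the article; the other column names (emp_name, dept_name, dept_profile) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id      INTEGER PRIMARY KEY,
        dept_name    TEXT NOT NULL,
        dept_profile TEXT
    );
    -- The employee table stores only the department number,
    -- not the department name or profile.
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Research', 'Builds prototypes');
    INSERT INTO employee VALUES (1, 'Ann', 10);
    INSERT INTO employee VALUES (2, 'Bob', 10);
""")

# Renaming the department touches exactly one row.
conn.execute("UPDATE department SET dept_name = 'R&D' WHERE dept_id = 10")

# Every employee sees the new name through the join.
result = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM employee e "
    "JOIN department d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()
print(result)
```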

II. Worked examples of applying the normal forms

  • The following uses a school's student system as an example to illustrate how these normal forms are applied. Start with first normal form (1NF): every field in a table holds a single, indivisible attribute built from a basic type such as integer, real, character, logical, or date. In any current relational database management system (DBMS), it is hard to create a table that violates first normal form even by mistake, because the DBMS does not let you divide one column of a table into two or more sub-columns. A design built in an existing DBMS therefore already satisfies first normal form in this sense.
    First, we decide what the design needs to cover: student ID, student name, age, gender, course, course credits, department, subject score, department office address, department office phone number, and so on. For simplicity, we consider only these fields for now. For this information, we are concerned with the following questions:
    What basic information does the student have?
    Which courses did the student choose, and what are the grades?
    What are the credits for each course?
    Which department does the student belong to, and what is the basic information about the department?

2.1 Example analysis of the second normal form (2NF)

  • First, consider putting all of this information in a single table (student ID, student name, age, gender, course, course credits, department, subject score, department office address, department office phone number). The following dependencies hold:
    (student ID) → (name, age, gender, department, department office address, department office phone number)
    (course name) → (credits)
    (student ID, course name) → (subject score)
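
One way to see which attributes form the key of this single table is to compute an attribute closure under the three dependencies just listed. The sketch below is a generic textbook closure loop, not something taken from the article, and the snake_case attribute names are my own renderings of the field names.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# The three dependencies listed above (attribute names are assumed spellings).
fds = [
    (frozenset({"student_id"}),
     frozenset({"name", "age", "gender", "department",
                "dept_office_address", "dept_office_phone"})),
    (frozenset({"course_name"}), frozenset({"credits"})),
    (frozenset({"student_id", "course_name"}), frozenset({"grade"})),
]
all_attrs = set().union(*(lhs | rhs for lhs, rhs in fds))

# (student_id, course_name) determines everything, so it is a key.
pair_is_key = closure({"student_id", "course_name"}, fds) == all_attrs

# Neither half is a key on its own, yet each half determines some
# non-key attributes: these are the partial dependencies.
student_only = closure({"student_id"}, fds)
course_only = closure({"course_name"}, fds)

print(pair_is_key, sorted(course_only))
```

Because credits sits in the closure of course_name alone, it depends on only part of the key.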

2.1.1 Problem analysis

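  • This table does not satisfy second normal form: its key is (student ID, course name), yet credits depend only on the course name and the student's details depend only on the student ID. As a result, the following problems arise: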

    • Data redundancy: If the same course is taken by n students, "credits" will be repeated n-1 times; if the same student has taken m courses, the name and age will be repeated m-1 times.

    • Update anomaly: if the credits of a course are adjusted, the "credits" value in every row for that course must be updated; otherwise the same course will show different credits.

    • Insertion anomaly: suppose a new course is to be offered and no one has taken it yet. Because there is no "student ID" value to complete the key, the course name and credits cannot be recorded in the database.

    • Deletion anomaly: suppose a group of students has completed their elective courses and these course selection records are deleted from the table. The course name and credit information is deleted at the same time. This in turn causes an insertion anomaly.
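
The three anomalies can be reproduced on a small single-table version of the data. This is only a sketch: the table name select_course, the reduced column set, and the sample rows are assumptions, and the key columns are declared NOT NULL explicitly because SQLite would otherwise accept NULL in a composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single table, reduced to a few of the article's fields.
    CREATE TABLE select_course (
        student_id  INTEGER NOT NULL,
        name        TEXT NOT NULL,
        course_name TEXT NOT NULL,
        credits     INTEGER NOT NULL,
        grade       INTEGER,
        PRIMARY KEY (student_id, course_name)
    );
    INSERT INTO select_course VALUES (1, 'Ann', 'Databases', 3, 90);
    INSERT INTO select_course VALUES (2, 'Bob', 'Databases', 3, 85);
    INSERT INTO select_course VALUES (1, 'Ann', 'Networks', 2, 78);
""")

# Update anomaly: changing credits in only one row leaves two values for one course.
conn.execute(
    "UPDATE select_course SET credits = 4 "
    "WHERE student_id = 1 AND course_name = 'Databases'"
)
credit_values = [row[0] for row in conn.execute(
    "SELECT DISTINCT credits FROM select_course "
    "WHERE course_name = 'Databases' ORDER BY credits"
)]

# Insertion anomaly: a course nobody has taken has no student_id for the key.
try:
    conn.execute(
        "INSERT INTO select_course (course_name, credits) VALUES ('Compilers', 3)"
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Deletion anomaly: removing the last enrollment also removes the course's credits.
conn.execute("DELETE FROM select_course WHERE course_name = 'Networks'")
networks_rows = conn.execute(
    "SELECT COUNT(*) FROM select_course WHERE course_name = 'Networks'"
).fetchone()[0]

print(credit_values, insert_rejected, networks_rows)
```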

2.1.2 Solution

    Split the course selection table SelectCourse into the following three tables:
    Student: Student (student ID, name, age, gender, department, department office address, department office phone number);
    Course: Course (course name, credits);
    Course selection: SelectCourse (student ID, course name, grade).
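
A possible rendering of these three tables in SQLite, driven from Python, is sketched below. The column types, the snake_case names, and the sample rows are assumptions; the article gives only the attribute lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (
        student_id          INTEGER PRIMARY KEY,
        name                TEXT NOT NULL,
        age                 INTEGER,
        gender              TEXT,
        department          TEXT,
        dept_office_address TEXT,
        dept_office_phone   TEXT
    );
    CREATE TABLE course (
        course_name TEXT PRIMARY KEY,
        credits     INTEGER NOT NULL
    );
    CREATE TABLE select_course (
        student_id  INTEGER NOT NULL REFERENCES student(student_id),
        course_name TEXT NOT NULL REFERENCES course(course_name),
        grade       INTEGER,
        PRIMARY KEY (student_id, course_name)
    );
    INSERT INTO student VALUES (1, 'Ann', 20, 'F', 'CS', 'Room 101', '555-0100');
    INSERT INTO course VALUES ('Databases', 3);
    INSERT INTO select_course VALUES (1, 'Databases', 90);
""")

# A course with no enrollments can now be recorded.
conn.execute("INSERT INTO course VALUES ('Compilers', 3)")

# Credits are stored once per course, so one UPDATE changes them everywhere.
conn.execute("UPDATE course SET credits = 4 WHERE course_name = 'Databases'")

# Deleting enrollment records no longer deletes the course itself.
conn.execute("DELETE FROM select_course WHERE course_name = 'Databases'")

courses = conn.execute(
    "SELECT course_name, credits FROM course ORDER BY course_name"
).fetchall()
print(courses)
```

Each operation that caused an anomaly in the single table is now an ordinary single-row change.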

2.2 Example analysis of the third normal form (3NF)

  • Now look at the student table above, Student (student ID, name, age, gender, department, department office address, department office phone number). Its key is the single attribute "student ID", because the following dependency holds:

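     (student ID) → (name, age, gender, department, department office address, department office phone number)
      However, the following chain of dependencies also holds:
     (student ID) → (department) → (department office address, department office phone number)
      That is, the non-key fields "department office address" and "department office phone number" are transitively dependent on the key field "student ID".
      This also leads to data redundancy and to update, insertion, and deletion anomalies. (The update and deletion anomalies are not analyzed here; they can be worked through in the same way as in 2.1.1.)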
    
  • To satisfy third normal form, the student table can be split into the following two tables:

      Student: (student ID, name, age, gender, department);
      Department: (department, department office address, department office phone number).
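
The split can be sketched as a small migration in SQLite. The wide table student_wide, the column spellings, and the sample rows are assumptions for illustration. The final join rebuilds the original rows, which suggests that no information is lost by the split on this sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical starting point: the 2NF student table with department details inline.
    CREATE TABLE student_wide (
        student_id          INTEGER PRIMARY KEY,
        name                TEXT,
        age                 INTEGER,
        gender              TEXT,
        department          TEXT,
        dept_office_address TEXT,
        dept_office_phone   TEXT
    );
    INSERT INTO student_wide VALUES (1, 'Ann', 20, 'F', 'CS',   'Room 101', '555-0100');
    INSERT INTO student_wide VALUES (2, 'Bob', 21, 'M', 'CS',   'Room 101', '555-0100');
    INSERT INTO student_wide VALUES (3, 'Cid', 19, 'M', 'Math', 'Room 202', '555-0200');

    CREATE TABLE department (
        department          TEXT PRIMARY KEY,
        dept_office_address TEXT,
        dept_office_phone   TEXT
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT,
        age        INTEGER,
        gender     TEXT,
        department TEXT REFERENCES department(department)
    );

    -- Each department's details are stored once, however many students it has.
    INSERT INTO department
        SELECT DISTINCT department, dept_office_address, dept_office_phone
        FROM student_wide;
    INSERT INTO student
        SELECT student_id, name, age, gender, department FROM student_wide;
""")

dept_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

# Joining the two tables reproduces the original wide rows.
rebuilt = conn.execute(
    "SELECT s.student_id, s.name, s.age, s.gender, s.department, "
    "       d.dept_office_address, d.dept_office_phone "
    "FROM student s JOIN department d ON d.department = s.department "
    "ORDER BY s.student_id"
).fetchall()
original = conn.execute("SELECT * FROM student_wide ORDER BY student_id").fetchall()
lossless = rebuilt == original

print(dept_count, lossless)
```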
    

Summary

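   The tables above conform to the first, second, and third normal forms and eliminate data redundancy as well as update, insertion, and deletion anomalies.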

Origin blog.csdn.net/qq_40084325/article/details/110727388