Database Normal Forms (1NF, 2NF, 3NF, BCNF)

Normal forms are the rules that a relational database design should satisfy. A database that meets them has a concise, clear structure and avoids anomalies in insert, delete, and update operations. A database that ignores them tends to be messy: it makes life harder for the programmers who work with it and may store a large amount of redundant information.

 

Normal Form Descriptions

 

1.1 First Normal Form (1NF): no repeating columns

 

First normal form (1NF) requires that each column of a table hold an indivisible basic data item. A column cannot contain multiple values, which means that an attribute of an entity cannot have multiple values and the entity cannot have repeating attributes. If repeating attributes exist, it may be necessary to define a new entity made up of those attributes, related to the original entity in a one-to-many relationship. Each row of a table in 1NF contains information about exactly one instance. In short, first normal form means no repeating columns.

 

Note: In any relational database, first normal form (1NF) is the basic requirement on the relational schema; a database that does not satisfy 1NF is not a relational database.

 

For example, the following database table is in first normal form:

 

 

field 1 | field 2 | field 3 | field 4

 

And such a database table is not in first normal form:

 

 

field 1 | field 2 | field 3               | field 4
        |         | field 3.1 | field 3.2 |

 

         

 

Every field in such a table holds a single attribute that cannot be divided further. That attribute is of a basic type such as integer, real, character, logical, or date. In current relational database management systems (DBMS), a table structure that breaks first normal form in this way cannot be created, because the DBMS does not let you split one column of a table into two or more sub-columns. Note that this guarantee covers only the column structure: packing several values into one field, for example a comma-separated list in a text column, is still possible and still goes against 1NF.
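As a small illustration of the repeating-attribute rule in this section, the following Python sketch moves a multi-valued attribute into a new entity that has a one-to-many relationship with the original one. The `employees` records, the `phones` field, and the function name are hypothetical and are not taken from the article.

```python
# Hypothetical un-normalized records: "phones" packs several values into one field.
employees = [
    {"emp_id": 1, "name": "Alice", "phones": "555-0100,555-0101"},
    {"emp_id": 2, "name": "Bob", "phones": "555-0200"},
]

def to_first_normal_form(records):
    """Split the multi-valued 'phones' field into its own one-to-many table."""
    employee_rows = []
    phone_rows = []
    for rec in records:
        # The original entity keeps only single-valued attributes.
        employee_rows.append({"emp_id": rec["emp_id"], "name": rec["name"]})
        # Each phone number becomes one row that points back to its employee.
        for phone in rec["phones"].split(","):
            phone_rows.append({"emp_id": rec["emp_id"], "phone": phone.strip()})
    return employee_rows, phone_rows

employee_rows, phone_rows = to_first_normal_form(employees)
```

After the split, every field in both tables holds exactly one value, and one employee row can be referenced by any number of phone rows.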

 

1.2 Second Normal Form (2NF): attributes depend fully on the primary key [eliminates partial functional dependencies]

 

If a relational schema R is in first normal form and every non-prime attribute of R is fully functionally dependent on every candidate key of R, then R is in second normal form.

Second normal form (2NF) builds on first normal form (1NF): to satisfy 2NF, a table must first satisfy 1NF. 2NF also requires that every instance, or row, in a table be uniquely distinguishable. This usually means adding a column that stores a unique identifier for each instance. That column is called the primary key.

 

For example, an employee ID (emp_id) column is added to the employee information table. Because each employee's ID is unique, every employee can be uniquely distinguished.

In a nutshell, second normal form (2NF) means that non-prime attributes depend fully on the primary key.

 

Full dependency means that no attribute may depend on only part of the primary key. Formally, given a functional dependency W→A, if there is a proper subset X ⊂ W such that X→A also holds, then W→A is a partial dependency; otherwise W→A is a full functional dependency. If a partial dependency exists, the attribute and the part of the primary key it depends on should be split off into a new entity, which has a one-to-many relationship with the original entity.

 

Suppose the course selection table is SelectCourse(student number, name, age, course name, grade, credits), and its key is the composite key (student number, course name), because the following dependency holds:

(student number, course name) → (name, age, grade, credits)

 

This table does not satisfy second normal form, because the following dependencies also hold:

(course name) → (credits)

(student number) → (name, age)

That is, fields that are only part of the composite key determine non-key fields.
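The partial dependencies in SelectCourse can also be found mechanically. The sketch below is one possible approach rather than a definitive implementation: it computes the attribute closure of each proper subset of the composite key and reports any non-key attribute that the subset determines. The English attribute names (`student_no`, `course_name`, and so on) and the function names are assumptions made for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, non_key_attrs, fds):
    """List (subset, attribute) pairs where a proper subset of the key
    determines a non-key attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds)
            for attr in sorted(non_key_attrs & determined):
                found.append((subset, attr))
    return found

# Functional dependencies of SelectCourse (attribute names are assumed).
fds = [
    (frozenset({"student_no", "course_name"}),
     frozenset({"name", "age", "grade", "credits"})),
    (frozenset({"course_name"}), frozenset({"credits"})),
    (frozenset({"student_no"}), frozenset({"name", "age"})),
]
key = {"student_no", "course_name"}
non_key = {"name", "age", "grade", "credits"}

violations = partial_dependencies(key, non_key, fds)
for subset, attr in violations:
    print(subset, "->", attr)
```

Only `grade` needs the whole composite key; `credits`, `name`, and `age` are each determined by a single key column, which is exactly what breaks 2NF.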

 

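Because it is not in 2NF, this course selection table has the following problems:

(1) Data redundancy:

If n students select the same course, its credits value is repeated n-1 times; if one student selects m courses, the name and age are repeated m-1 times.

(2) Update anomaly:

If the credits of a course are changed, the credits value in every row for that course must be updated; otherwise the same course ends up with different credits.

(3) Insertion anomaly:

Suppose a new course is offered and nobody has selected it yet. Because there is no student number to complete the key, the course name and credits cannot be recorded in the database.

(4) Deletion anomaly:

Suppose a group of students has finished their courses, so their selection records should be deleted from the table. The course names and credits are deleted along with them. Clearly, this also leads to the insertion anomaly above.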

 

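Split the SelectCourse table into the following three tables:

Student: Student(student number, name, age)

Course: Course(course name, credits)

Course selection: SelectCourse(student number, course name, grade)

These tables satisfy second normal form and eliminate the data redundancy, update anomaly, insertion anomaly, and deletion anomaly.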
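To make the effect of this decomposition concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The column names, column types, and sample rows are assumptions chosen for illustration; the article only gives the table and attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    student_no TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    age        INTEGER
);
CREATE TABLE Course (
    course_name TEXT PRIMARY KEY,
    credits     INTEGER NOT NULL
);
CREATE TABLE SelectCourse (
    student_no  TEXT REFERENCES Student(student_no),
    course_name TEXT REFERENCES Course(course_name),
    grade       INTEGER,
    PRIMARY KEY (student_no, course_name)
);
""")

# No insertion anomaly: a course can be recorded before anyone selects it.
conn.execute("INSERT INTO Course VALUES ('Databases', 3)")

conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [("s1", "Ann", 20), ("s2", "Ben", 21)])
conn.executemany("INSERT INTO SelectCourse VALUES (?, ?, ?)",
                 [("s1", "Databases", 90), ("s2", "Databases", 85)])

# No update anomaly: changing the credits touches exactly one row.
cur = conn.execute("UPDATE Course SET credits = 4 WHERE course_name = 'Databases'")
rows_updated = cur.rowcount

# No deletion anomaly: removing every selection keeps the course itself.
conn.execute("DELETE FROM SelectCourse")
remaining_courses = conn.execute("SELECT course_name, credits FROM Course").fetchall()
```

Each fact now lives in one place: credits in Course, names and ages in Student, and grades in SelectCourse.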

 

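In addition, every table with a single-column key satisfies second normal form, because without a composite key no partial dependency is possible.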

 

1.3 Third Normal Form (3NF): attributes do not depend on other non-prime attributes [eliminates transitive dependencies]

 

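If a relational schema R is in second normal form and no non-prime attribute is transitively dependent on a candidate key of R, then R is in third normal form.

To satisfy third normal form (3NF), a table must first satisfy second normal form (2NF). 3NF requires that a table not contain non-key information that is already held in another table.

For example, suppose there is a department table in which each department has a department ID (dept_id), a department name, a department profile, and so on. Once the employee table lists the department ID, the department name, profile, and other department-related information must not be added to the employee table as well. If no department table exists, third normal form says it should be created; otherwise there will be a great deal of redundant data.

Third normal form (3NF): on top of second normal form, a table is in 3NF if no non-key field is transitively functionally dependent on any candidate key. In short, third normal form means that attributes do not depend on other non-prime attributes.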

 

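A transitive functional dependency means that if the dependency chain A → B → C exists, then C is transitively functionally dependent on A.

Therefore, a table in third normal form must not contain a dependency of the following kind:

key field → non-key field x → non-key field y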

 

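Suppose the student table is Student(student number, name, age, college, college location, college phone), with the single-column key student number, because the following dependency holds:

(student number) → (name, age, college, college location, college phone)

This table satisfies 2NF but not 3NF, because the following dependency chain exists:

(student number) → (college) → (college location, college phone)

That is, the non-key fields college location and college phone are transitively functionally dependent on the key field student number.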
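The chain key → non-key x → non-key y can be searched for directly in a list of functional dependencies. The following is a simplified sketch under stated assumptions: it inspects only the dependencies exactly as listed (it does not compute closures), and the English attribute names and the function name are invented for the example.

```python
def transitive_dependencies(key, fds):
    """Find chains key -> mid -> attr where mid is a non-key determinant.

    fds maps each determinant (a frozenset) to the frozenset it determines.
    """
    key = frozenset(key)
    direct = fds.get(key, frozenset())
    chains = []
    for mid, targets in fds.items():
        # mid must be a non-key determinant that the key determines,
        # and mid must not determine the key in return.
        if mid == key or mid & key or not mid <= direct or key <= targets:
            continue
        for attr in sorted(targets - mid - key):
            chains.append((tuple(sorted(key)), tuple(sorted(mid)), attr))
    return chains

# Dependencies of the Student table (attribute names are assumed).
student_fds = {
    frozenset({"student_no"}): frozenset(
        {"name", "age", "college", "college_location", "college_phone"}),
    frozenset({"college"}): frozenset({"college_location", "college_phone"}),
}

chains = transitive_dependencies({"student_no"}, student_fds)
for key, mid, attr in chains:
    print(key, "->", mid, "->", attr)
```

Both reported chains pass through `college`, which suggests that the college attributes belong in a table of their own.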

 

It also suffers from data redundancy and from update, insertion, and deletion anomalies; readers can work through these for themselves.

 

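Split the student table into the following two tables:

Student: (student number, name, age, college)

College: (college, location, phone)

These tables satisfy third normal form and eliminate the data redundancy, update anomaly, insertion anomaly, and deletion anomaly.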
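One way to check that such a split loses no information is to project sample rows onto the two tables and join them back. The sketch below does this with plain Python data structures; the sample rows are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical rows for Student(student_no, name, age, college, location, phone).
students = [
    ("s1", "Ann", 20, "Science", "North Campus", "555-0001"),
    ("s2", "Ben", 21, "Science", "North Campus", "555-0001"),
    ("s3", "Cho", 19, "Arts", "South Campus", "555-0002"),
]

# Project onto the two 3NF tables; the dict keeps one entry per college.
student_table = [(no, name, age, college)
                 for no, name, age, college, _, _ in students]
college_table = {college: (location, phone)
                 for _, _, _, college, location, phone in students}

# Joining on the college column reproduces the original rows.
rejoined = [(no, name, age, college) + college_table[college]
            for no, name, age, college in student_table]
```

The location and phone of each college are stored once in `college_table` instead of once per student, and the join still recovers every original row.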

 

1.4 Boyce-Codd Normal Form (BCNF, a refinement of 3NF)

 

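If a relational schema R is in first normal form and no attribute is transitively dependent on a candidate key of R, then R is in BCNF. In other words, on top of third normal form, a table is in Boyce-Codd normal form if no field at all, key or non-key, is transitively functionally dependent on any candidate key. Equivalently, the determinant of every nontrivial functional dependency must be a superkey.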

 

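Suppose the warehouse management table is StorehouseManage(warehouse ID, item ID, manager ID, quantity), where one manager works in only one warehouse and one warehouse can store many kinds of items. The following dependencies hold in this table:

(warehouse ID, item ID) → (manager ID, quantity)

(manager ID, item ID) → (warehouse ID, quantity)

So (warehouse ID, item ID) and (manager ID, item ID) are both candidate keys of StorehouseManage. The only non-key field in the table is quantity, so the table satisfies third normal form. However, the following dependencies also hold:

(warehouse ID) → (manager ID)

(manager ID) → (warehouse ID)

That is, a key field determines another key field, so the table does not satisfy BCNF. It exhibits the following anomalies: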
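The BCNF condition can be tested by checking whether the left-hand side of each nontrivial dependency is a superkey. The sketch below applies that check to StorehouseManage using attribute closure; the English attribute names and the function names are assumptions, and the code is an illustration rather than a complete normalization tool.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != all_attrs:
            violations.append((lhs, rhs))
    return violations

# StorehouseManage and its dependencies (attribute names are assumed).
attrs = {"warehouse_id", "item_id", "manager_id", "quantity"}
fds = [
    ({"warehouse_id", "item_id"}, {"manager_id", "quantity"}),
    ({"manager_id", "item_id"}, {"warehouse_id", "quantity"}),
    ({"warehouse_id"}, {"manager_id"}),
    ({"manager_id"}, {"warehouse_id"}),
]

violations = bcnf_violations(attrs, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

The two composite determinants are superkeys and pass the check; the two single-column determinants are not, which is why the table falls short of BCNF.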

 

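(1) Deletion anomaly:

When a warehouse is emptied, deleting all of its item ID and quantity information also deletes the warehouse ID and manager ID information.

(2) Insertion anomaly:

While a warehouse stores no items, no manager can be assigned to it.

(3) Update anomaly:

If a warehouse changes its manager, the manager ID must be modified in every row for that warehouse.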

 

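Decompose the warehouse management table into two tables:

Warehouse management: StorehouseManage(warehouse ID, manager ID)

Warehouse: Storehouse(warehouse ID, item ID, quantity)

These tables satisfy BCNF and eliminate the deletion, insertion, and update anomalies.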

 

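The four normal forms are related as follows:

1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF

That is, every table in BCNF is also in 3NF, every table in 3NF is also in 2NF, and every table in 2NF is also in 1NF.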
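Because each normal form is a stricter version of the one before it, a relation can be labeled with the strictest form it satisfies. The sketch below is a rough classifier under several assumptions: 1NF is taken for granted, only the listed functional dependencies are considered, candidate keys are found by brute force (fine for small examples only), and all attribute and function names are invented for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

def highest_normal_form(all_attrs, fds):
    """Return '1NF', '2NF', '3NF' or 'BCNF' (1NF itself is assumed)."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    non_prime = all_attrs - prime
    # 2NF: no proper subset of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                if closure(combo, fds) & non_prime:
                    return "1NF"
    # 3NF: each nontrivial FD has a superkey on the left or only prime
    # attributes on the right. BCNF: the left side is always a superkey.
    is_bcnf = True
    for lhs, rhs in fds:
        extra = rhs - lhs
        if not extra or closure(lhs, fds) == all_attrs:
            continue
        is_bcnf = False
        if not extra <= prime:
            return "2NF"
    return "BCNF" if is_bcnf else "3NF"

select_course = highest_normal_form(
    {"student_no", "course_name", "name", "age", "grade", "credits"},
    [({"student_no", "course_name"}, {"grade"}),
     ({"course_name"}, {"credits"}),
     ({"student_no"}, {"name", "age"})])
student = highest_normal_form(
    {"student_no", "name", "age", "college", "location", "phone"},
    [({"student_no"}, {"name", "age", "college"}),
     ({"college"}, {"location", "phone"})])
storehouse_manage = highest_normal_form(
    {"warehouse_id", "item_id", "manager_id", "quantity"},
    [({"warehouse_id", "item_id"}, {"quantity"}),
     ({"warehouse_id"}, {"manager_id"}),
     ({"manager_id"}, {"warehouse_id"})])
storehouse = highest_normal_form(
    {"warehouse_id", "item_id", "quantity"},
    [({"warehouse_id", "item_id"}, {"quantity"})])
```

Applied to the four example relations from this article, the classifier places the original SelectCourse, Student, and StorehouseManage tables one level apart, and the decomposed Storehouse table at the strictest level.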

 

      
 
                    
