[Database Principle] Overview of Database Design

Database Design.

  • Database design is the process of developing the structure and behavior of a database according to user needs.
  • For a given application environment, the goal is to construct the best possible database schema, build the database and its application system, and store data efficiently so that users' information requirements and processing requirements are met.
  • Database design is divided into structural design and behavioral design. The former includes conceptual design, logical design, and physical design; the latter covers the operations users perform on the database.
  • The characteristics of database design:
    ① Structure derives from behavior: structural design and behavioral design must be carried out together;
    ② Behavior keeps changing: design proceeds by "repeated exploration, gradual refinement".

1. Database design methods.

  • Intuitive design method
  • Normative design method
  • Computer-aided design method
  • Modern database design methods

1.1 Intuitive design method.

  • The intuitive design method, also called the manual trial-and-error method, is the earliest database design method.
  • It relies on the designer's experience and skill and lacks the support of scientific theory and engineering principles, so design quality is hard to guarantee. Problems are often discovered only after the database has been running for some time and must then be fixed by modifying the design, which raises the cost of system maintenance.

1.2 Normative design method.

  • [Database design method based on the ER model] Use ER diagrams to build a conceptual model that reflects the relationships between real-world entities, where E stands for Entity and R stands for Relationship.
  • [3NF-based database design method] Determine all attributes in the database and the dependencies between them, identify the relation schemas that violate 3NF, and decompose those schemas by projection so that the result is a set of relation schemas in 3NF.
  • [View-based database design method] Analyze the data used by each application, establish a separate view for each application, and then integrate these views into a conceptual model of the entire database.
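The projection step of the 3NF-based method can be illustrated with a small sketch. The relation, its sample rows, and the functional dependencies (student_no determines name and dept_no; dept_no determines dept_name) are assumptions made up for this illustration. The helper functions project and natural_join are likewise hypothetical names, not part of any library.

```python
# Hypothetical relation StudentDept(student_no, name, dept_no, dept_name).
# Assumed dependencies: student_no -> name, dept_no and dept_no -> dept_name,
# so dept_name depends transitively on the key and the relation is not in 3NF.
rows = [
    {"student_no": "S1", "name": "Ann", "dept_no": "D1", "dept_name": "Computer Science"},
    {"student_no": "S2", "name": "Bob", "dept_no": "D1", "dept_name": "Computer Science"},
    {"student_no": "S3", "name": "Cid", "dept_no": "D2", "dept_name": "Mathematics"},
]


def project(relation, attributes):
    """Project a relation onto the given attributes, dropping duplicate tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join two relations on the attributes they have in common."""
    joined = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined


# Decompose by projection: each resulting relation has no transitive dependency.
student = project(rows, ["student_no", "name", "dept_no"])
department = project(rows, ["dept_no", "dept_name"])

# The decomposition loses no information: joining the parts restores the original.
print(natural_join(student, department) == rows)
```

After the decomposition each department name is stored once per department rather than once per student, which is the redundancy the normalization is meant to remove.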

1.3 Computer-aided design method.

  • The computer-aided design method uses software to simulate a normative design method in certain steps of database design. It is guided by human knowledge and experience, and completes parts of the design through human-computer interaction.

1.4 Modern database design methods.

  • Modern methods are organized around software engineering ideas. ER diagram design usually forms the core, supplemented by 3NF design and view design to evaluate and optimize the resulting schemas, so that the strengths of the various design methods are combined.
  • To improve collaboration efficiency and the degree of standardization, the modern database design process also relies on computer-aided design tools to produce standardized design results.

2. Database design steps.

  • [System requirements analysis] Collect and analyze the information content and processing requirements;
  • [Conceptual structure design] Build a conceptual model that expresses user needs;
  • [Logical structure design] Derive a data model from the conceptual model;
  • [Physical structure design] Determine the storage structure and access methods;
  • [Database implementation] Load the data and write the database access programs;
  • [Database operation and maintenance] Collect and record data about the actual operation of the system.
  • The conceptual, logical, and physical structures are discussed in Database System Overview (III).
  • The conceptual model does not depend on any specific computer system and does not address how information is represented or processed inside the computer; it only describes the information of interest in a specific application scenario.
  • The logical model describes data and information from the computer's point of view. It is a model in the computer world with a strict formal definition, and is easy to implement on a computer. Every DBMS is designed around a particular logical model, that is, a database is organized and built according to the data model prescribed by its DBMS, so the logical model is mainly used in implementing the DBMS.
  • The physical model is the lowest-level abstraction of data. It describes how data is stored on disk and how it is accessed, and is oriented toward the computer system. Implementing the physical model is the task of the DBMS, and the conversion from the logical model to the physical model is also performed by the DBMS.

2.1 System requirements analysis.

  • System requirements analysis is the starting point of database design and lays the groundwork for the design work that follows. Whether its results accurately reflect the users' actual requirements directly affects the design at every later stage, and determines whether the final design is reasonable and practical.
  • If the requirements are analyzed incorrectly or misunderstood, many errors will not be discovered until system testing, and correcting them at that point is very costly.
  • The tasks of system requirements analysis: ① Investigate and analyze user activities, and clarify user requirements and goals; ② Collect and analyze requirements data, and determine the system boundary; ③ Write the requirements analysis report and organize an expert review.
  • The methods of system requirements analysis: ① Top-down requirements analysis; ② Bottom-up requirements analysis.

  • The top-down analysis method (also called Structured Analysis, SA) is a simple and practical analysis method. It starts from the top level, decomposes the system layer by layer, and describes the system with data flow diagrams (DFD) and a data dictionary (DD).
  • [Data flow diagram, DFD] Named arrows represent data flows, circles represent processes, open-ended rectangles (or other shapes) represent data stores, and closed rectangles represent the sources and destinations of data.
  • [Data dictionary, DD] The data dictionary is a detailed description of the data in the system: a catalog of its data structures and attributes. In the requirements analysis stage it usually contains five parts: data items, data structures, data flows, data stores, and processes. The final data flow diagrams and data dictionary make up the main content of the system analysis report and form the basis for the next step, conceptual structure design.

[Example] The core businesses in undergraduate teaching are course scheduling and course selection. The course scheduling business is concerned only with which teacher teaches which course; the course selection business mainly records which students choose which courses and the scores they obtain.

  • [Data flow diagram: course scheduling business] When scheduling courses, teaching administrators assign the courses in the syllabus to the appropriate teachers, based on the course information in the syllabus and the information about the instructors, and save the result as class information.
  • [Data flow diagram: course selection business] When selecting courses, students choose according to the current semester's course schedule and their own basic information (grade, timetable, and so on). The selections are saved as student course selection information. After the semester's courses end, the instructor grades the students based on the course selections, and the final scores are saved as course score information.
  • [Data dictionary]
    ① Student basic information: student ID, name, age, department, etc.
    ② Course information: course number, name, instructor, etc.
    ③ Teacher information: teacher number, name, gender, professional title, courses taught, etc.
    ④ Class information: course name and teacher name.
    ⑤ Student course selection information: student name, course name, teacher name, etc.
    ⑥ Course score information: student name, course name, score, etc.
  • [Hidden data structure] Beyond the information above, the system should be analyzed further for hidden data structures. In practice, colleges and universities are usually managed by department. Without departments, the information about the students and teachers of all departments would be mixed together, which is inconvenient for every business process. The data items of the department therefore need to be defined as well.
    Department: department number, name, teachers in the department, students in the department.
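One way to keep such a data dictionary in machine-readable form is sketched below. This is only an illustration: the DataStructure class and the structures_containing helper are hypothetical names, and the item lists copy only the items named in the example (the original lists end with "etc.", so they are incomplete).

```python
from dataclasses import dataclass, field


@dataclass
class DataStructure:
    """One data-dictionary entry: a named data structure and its data items."""
    name: str
    items: list = field(default_factory=list)


# Entries follow the example above; item lists are partial by design.
data_dictionary = [
    DataStructure("Student", ["student ID", "name", "age", "department"]),
    DataStructure("Course", ["course number", "name", "instructor"]),
    DataStructure("Teacher", ["teacher number", "name", "gender",
                              "professional title", "courses taught"]),
    DataStructure("Class", ["course name", "teacher name"]),
    DataStructure("Course selection", ["student name", "course name", "teacher name"]),
    DataStructure("Course score", ["student name", "course name", "score"]),
    DataStructure("Department", ["department number", "name",
                                 "teachers in the department",
                                 "students in the department"]),
]


def structures_containing(item):
    """List the data structures in which a given data item appears."""
    return [s.name for s in data_dictionary if item in s.items]


print(structures_containing("course name"))
```

A lookup like this makes it easy to see which data items are shared between structures, which is useful input when entities and relationships are identified in the conceptual design.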

2.2 Conceptual structure design.

  • Conceptual structure design abstracts the user needs obtained from requirements analysis into an information structure, that is, a conceptual model.
  • Separating conceptual design from logical design makes the task of each stage simpler, greatly reduces design complexity, and makes the work easier to organize and manage.
  • The conceptual model is not constrained by any specific DBMS and is independent of storage arrangements and efficiency considerations, so it is more stable than the logical model.
  • The conceptual model contains no technical details tied to a specific DBMS, so users understand it more easily and it is more likely to reflect their information needs accurately.
  • The characteristics of the conceptual model: rich semantic expressiveness, easy to communicate and understand, easy to modify and extend, and easy to convert to a data model.
  • The ER model, proposed by P. P. Chen in 1976, is the best-known and most practical conceptual model.
  • [Entity type] Represented by a rectangle with the entity name written inside.
  • [Attribute] Represented by an ellipse with the attribute name written inside, connected to its entity by an undirected edge.
  • [Relationship] Represented by a diamond with the relationship name written inside, connected to the participating entities by undirected edges; the relationship type (1:1, 1:n, or m:n) is marked beside the edges.
    [Figure: complete ER diagram of the relationship between students and courses]

2.2.1 Conceptual structure design method.

  • Top-down: refine step by step from the whole
  • Bottom-up: build up from the details to the whole
  • Gradual expansion: start from the core and expand outward step by step
  • Mixed strategy: combine top-down with bottom-up

2.2.1.1 Bottom-up design method.

  • Perform data abstraction and design the local ER diagrams, that is, design the user views;
  • Integrate the local ER models into a global ER model, that is, perform view integration.
  • [Local ER model design] ① An attribute must be an indivisible data item and cannot itself be composed of other attributes; ② An attribute cannot have relationships with other entities; relationships exist only between entities.
  • [Figure: department modeled as an attribute versus as an entity]
  • Merge the local ER diagrams and resolve the conflicts between them to produce a preliminary ER diagram.
  • Eliminate unnecessary redundancy to produce the basic ER diagram. Redundancy covers redundant data and redundant relationships between entities: redundant data is data that can be derived from basic data, and a redundant relationship is one that can be derived from other relationships.

[Example]
① One student can choose multiple courses, and one course can be chosen by multiple students. Therefore, students and courses have a many-to-many relationship.
② One teacher can teach multiple courses, and one course can be taught by multiple teachers. Therefore, teachers and courses have a many-to-many relationship.
③ A department can have multiple teachers, and a teacher can belong to only one department. Therefore, the relationship between departments and teachers is one-to-many, and the relationship between departments and students is also one-to-many.

2.3 Logical structure design.

  • To build the database the user needs, the conceptual model obtained in the previous step must be converted into a data model supported by a specific DBMS. A conceptual model expressed as an ER diagram can be converted into any data model a DBMS supports, such as the network model, the hierarchical model, or the relational model. The relational model is discussed below.
  • [Transformation principles] ① Entity: an entity is converted into a relation schema; the attributes of the entity become the attributes of the relation, and the key of the entity becomes the primary key of the relation. ② Relationship: a relationship is converted into a relation schema; the primary keys of the participating entities, together with the relationship's own attributes, become the attributes of the relation, and the primary key of the relation depends on the type of the relationship.
  • If the relationship is 1:1, the primary key of each participating entity is a candidate key of the relation;
  • If the relationship is 1:n, the primary key of the n-side entity is the primary key of the relation;
  • If the relationship is m:n, the combination of the primary keys of the participating entities is the primary key of the relation.
  • [Special case] For a multi-way relationship among three or more entities, the primary keys of all participating entities and the relationship's own attributes become the attributes of the relation, and the primary key is the combination of the entity keys.
    Example: a Supply relationship among vendors, items, and parts is converted to the relation Supply (vendor number, item number, part number, quantity), with primary key (vendor number, item number, part number).
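The key-selection rules above can be written down as a small function. This is a minimal sketch, not a complete conversion tool: the function name relationship_to_relation, its parameters, and the dictionary it returns are all assumptions chosen for this illustration. A multi-way relationship is treated like the m:n case, as the special case describes.

```python
def relationship_to_relation(participants, own_attributes=(), kind="m:n", n_side=None):
    """Convert a relationship into a relation schema.

    participants maps each entity name to the tuple of its primary key attributes.
    Returns the attribute list and the candidate keys of the resulting relation.
    """
    entity_keys = [tuple(key) for key in participants.values()]
    all_key_attributes = tuple(a for key in entity_keys for a in key)
    attributes = list(all_key_attributes) + list(own_attributes)

    if kind == "1:1":
        # Each participating entity's key is a candidate key.
        candidate_keys = entity_keys
    elif kind == "1:n":
        # The key of the n-side entity identifies a tuple.
        if n_side not in participants:
            raise ValueError("n_side must name one of the participating entities")
        candidate_keys = [tuple(participants[n_side])]
    elif kind == "m:n":
        # The combination of all entity keys is the key (also used for multi-way).
        candidate_keys = [all_key_attributes]
    else:
        raise ValueError(f"unknown relationship type: {kind}")
    return {"attributes": attributes, "candidate_keys": candidate_keys}


supply = relationship_to_relation(
    {"Vendor": ("vendor number",), "Item": ("item number",), "Part": ("part number",)},
    own_attributes=["quantity"],
    kind="m:n",
)
belongs_to = relationship_to_relation(
    {"Teacher": ("teacher number",), "Department": ("department number",)},
    kind="1:n",
    n_side="Teacher",
)
print(supply)
print(belongs_to)
```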

[Example] Convert the ER diagram obtained in the previous step (primary key attributes are marked [PK]).

Entities:

  • Student (student number [PK], name, gender, age)
  • Department (department number [PK], department name, telephone)
  • Course (course number [PK], course name)
  • Teacher (teacher number [PK], name, gender, job title)

Relationships:

  • Belongs to (teacher number [PK], department number)
  • Lecture (teacher number [PK], course number [PK])
  • Elective (student number [PK], course number [PK], grade)
  • Has (student number [PK], department number)

2.3.1 Model evaluation and improvement.

  • [Functional evaluation] Check the normalized set of relation schemas against the results of the requirements analysis to confirm that it supports all of the users' application requirements; when a problem is found, trace it back to the stage where it was introduced.
  • [Performance evaluation] Estimate the actual performance of the schema in execution, including the number of logical record accesses, the volume of data transferred, and the model used by the physical structure design algorithm.
  • If some applications cannot be supported because of omissions in the system requirements analysis or the conceptual structure design, add new relation schemas or attributes.
  • If improvement is needed for performance reasons, relation schemas can be merged or decomposed.
  • ★ Merging: merge relation schemas that have the same primary key.
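The merging rule can be sketched in a few lines. The function merge_same_key and the tuple layout (name, primary key, attributes) are assumptions made for this illustration; the input is the set of relation schemas from the conversion example earlier in this section. When two schemas share a key, the sketch keeps the name of the one that appears first.

```python
def merge_same_key(relations):
    """Merge relation schemas that share a primary key.

    relations is a list of (name, primary_key, attributes) tuples.
    The first schema seen for a key keeps its name and absorbs the others' attributes.
    """
    merged = {}
    for name, key, attributes in relations:
        key = tuple(key)
        if key not in merged:
            merged[key] = (name, list(attributes))
        else:
            kept_attributes = merged[key][1]
            for attribute in attributes:
                if attribute not in kept_attributes:
                    kept_attributes.append(attribute)
    return [(name, key, attrs) for key, (name, attrs) in merged.items()]


relations = [
    ("Student", ("student number",), ["student number", "name", "gender", "age"]),
    ("Department", ("department number",), ["department number", "department name", "telephone"]),
    ("Course", ("course number",), ["course number", "course name"]),
    ("Teacher", ("teacher number",), ["teacher number", "name", "gender", "job title"]),
    ("Belongs to", ("teacher number",), ["teacher number", "department number"]),
    ("Lecture", ("teacher number", "course number"), ["teacher number", "course number"]),
    ("Elective", ("student number", "course number"), ["student number", "course number", "grade"]),
    ("Has", ("student number",), ["student number", "department number"]),
]

merged_relations = merge_same_key(relations)
for name, key, attributes in merged_relations:
    print(name, key, attributes)
```

Here the two 1:n relationship schemas are absorbed into the entity schemas on their n side, so Teacher and Student each gain a department number attribute.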

[Example] Merge and improve the relation schemas obtained above.
Result (primary key attributes are marked [PK]):

  • Student (student number [PK], name, gender, age, department number)
  • Course (course number [PK], course name)
  • Teacher (teacher number [PK], name, gender, job title, department number)
  • Lecture (teacher number [PK], course number [PK])
  • Elective (student number [PK], course number [PK], grade)
  • Department (department number [PK], department name, telephone)
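As a rough illustration, the final relation schemas could be declared in SQL as follows, here run against an in-memory SQLite database through Python's sqlite3 module. The table and column names, the column types, and the foreign key constraints are assumptions added for this sketch; the original schemas specify only attribute names and primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE department (
    dept_no    TEXT PRIMARY KEY,
    dept_name  TEXT NOT NULL,
    telephone  TEXT
);
CREATE TABLE student (
    student_no TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    gender     TEXT,
    age        INTEGER,
    dept_no    TEXT REFERENCES department (dept_no)
);
CREATE TABLE course (
    course_no   TEXT PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE teacher (
    teacher_no TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    gender     TEXT,
    job_title  TEXT,
    dept_no    TEXT REFERENCES department (dept_no)
);
-- m:n relationships keep their own tables with composite primary keys.
CREATE TABLE lecture (
    teacher_no TEXT REFERENCES teacher (teacher_no),
    course_no  TEXT REFERENCES course (course_no),
    PRIMARY KEY (teacher_no, course_no)
);
CREATE TABLE elective (
    student_no TEXT REFERENCES student (student_no),
    course_no  TEXT REFERENCES course (course_no),
    grade      REAL,
    PRIMARY KEY (student_no, course_no)
);
""")

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The department number columns in student and teacher are what remain of the merged one-to-many relationships, while the two many-to-many relationships stay as separate tables.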

2.4 Physical structure design.

  • Physical structure design is the process of selecting, for a given logical model, the physical structure best suited to the application environment, so that the logical schema is implemented efficiently and the access strategy is determined.
  • [Determining the physical structure] In relational databases, this mainly means choosing the access methods and the storage structure.
  • [Evaluating the physical structure] The evaluation focuses on time and space efficiency.
  • A database system is shared by many users, so several access paths must be established on the same relation to meet their different application requirements. One task of physical structure design is to decide which access methods to use from among those supported by the database management system.
  • [Clustering] Aimed at attribute values that are stored repeatedly and queried frequently. To speed up queries, tuples that have the same value on an attribute or group of attributes are stored in the same physical block. That attribute or group of attributes is called the cluster key.
  • Clusters can be established for relations that are frequently joined together.
  • If a group of attributes of a relation often appears in equality comparison conditions, a cluster can be established on that single relation.
  • If the values on a group of attributes of a relation repeat very often, a cluster can be established on that single relation. In other words, the average number of tuples per cluster key value must not be too small; otherwise clustering brings little benefit.
  • [Index] Indexes help enforce data integrity and improve query efficiency, but they carry a maintenance cost.
  • If a group of attributes often appears in query conditions, consider building an index (or a composite index) on it.
  • If an attribute is often used as the argument of aggregate functions such as MAX and MIN, consider building an index on it.
  • If a group of attributes often appears in the join conditions of join operations, consider building an index on it.
  • [Data storage location] To improve system performance, the volatile and stable parts of the data, and the frequently and rarely accessed parts, should be stored separately according to how the application uses them. When several disks are available, store tables separately from indexes, and logs separately from database objects.
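The indexing guideline for attributes that often appear in query conditions can be tried out with SQLite. The sketch below is illustrative only: the elective table layout, the index name idx_elective_course, and the query_plan helper are assumptions, and the exact wording of the plan text varies between SQLite versions, so the code only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elective ("
    " student_no TEXT, course_no TEXT, grade REAL,"
    " PRIMARY KEY (student_no, course_no))"
)
# course_no is assumed to appear often in query conditions, so it gets an index.
conn.execute("CREATE INDEX idx_elective_course ON elective (course_no)")


def query_plan(sql, params=()):
    """Return the detail text of each step of EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


plan = query_plan(
    "SELECT student_no, grade FROM elective WHERE course_no = ?", ("C1",)
)
uses_index = any("idx_elective_course" in step for step in plan)
print(plan)
print(uses_index)
```

Every index also has to be updated on each insert, update, and delete, which is the maintenance cost mentioned above, so indexes are worth adding only on attributes that queries actually use.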

Origin blog.csdn.net/weixin_44246009/article/details/108164124