Introduction to Database System (Chapter 7)-Database Design

1. Overview of database design

  • Broadly speaking: database design covers the design of the database and its application system, that is, the design of the entire database application system.
  • In a narrow sense: database design means designing the database itself, that is, designing the database schemas at every level and building the database. This is one part of designing the database application system.

The database design mentioned here refers to the narrow sense.

Definition of database design:

Database design means constructing (designing) an optimized logical schema and physical structure for a given application environment, and building the database and its application system on that basis, so that the system can store and manage data effectively and meet the requirements of its various applications, including information management requirements and data operation requirements.

Features of database design

  • Three parts technology, seven parts management, and twelve parts basic data. This is a basic rule of database construction.
    "Twelve parts basic data" emphasizes that collecting, sorting, organizing, and continuously updating data is an important part of database construction.
  • Structure (data) design and behavior (processing) design are combined. Early database design focused mainly on structural characteristics.

Database design method

Large-scale database design is a comprehensive technology that spans multiple disciplines, and it is also a large engineering project. It requires database designers to have a broad range of knowledge and skills, including:

  • Basic knowledge of computer
  • Principles and Methods of Software Engineering
  • Methods and techniques of programming
  • Basic knowledge of database
  • Database design technology
  • Application domain knowledge

Early design method: manual trial and error, relying on the designer's experience.
Current design methods: the New Orleans method, the E-R model-based design method, the 3NF (third normal form) design method, the object-oriented database design method, and the Unified Modeling Language (UML) method.

Database design steps

In the database design process, requirements analysis and conceptual structure design can be carried out independently of any database management system. Logical structure design and physical structure design are closely tied to the selected database management system.

1. Requirements analysis stage

How thoroughly and accurately requirements analysis is done determines the speed and quality of building the database.

2. Conceptual structure design stage

Through the integration, induction and abstraction of user needs, a conceptual model independent of the specific database management system is formed

3. Logical structure design stage

Convert the conceptual structure into a data model supported by a database management system and optimize it

4. Physical structure design stage

Select a physical structure most suitable for the application environment for the logical data structure, including storage structure and access method

5. Database implementation phase

Build a database based on the results of logical design and physical design
Write and debug application programs
Organize data storage and test run

6. Database operation and maintenance phase

After trial operation, it can be put into formal operation, and it must be continuously evaluated, adjusted and modified during operation.


The following figure gives the design description of data characteristics at each stage of the design process:
[Figure: design descriptions of data characteristics at each design stage]

Schemas at each level in the database design process

The following figure shows the database schemas formed at the different stages of database design.
[Figure: database schemas formed at each design stage]

  • Requirements analysis stage: integrate the application needs of each user.
  • Conceptual design stage: form a conceptual schema (ER diagram) that is independent of machine characteristics and of any particular database management system product.
  • Logical design stage:
    1. First, convert the ER diagram into a data model supported by a specific database product, such as the relational model, to form the logical schema of the database.
    2. Then, based on user processing requirements and security considerations, define the necessary views on top of the base tables to form the external schemas.
  • Physical design stage: according to the characteristics of the database management system and the processing needs, arrange physical storage and build indexes to form the internal schema of the database.
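
The three schema levels listed above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `student` table, the `cs_student` view, and the `idx_student_dept` index are hypothetical names that do not come from the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical schema: a base table derived from a hypothetical Student entity
    CREATE TABLE student (
        sno   TEXT PRIMARY KEY,
        sname TEXT NOT NULL,
        dept  TEXT NOT NULL,
        phone TEXT
    );
    -- External schema: a view exposing only what one user group needs
    CREATE VIEW cs_student AS
        SELECT sno, sname FROM student WHERE dept = 'CS';
    -- Internal (physical) schema: an index chosen for frequent lookups by dept
    CREATE INDEX idx_student_dept ON student(dept);
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("S1", "Li", "CS", "111"), ("S2", "Wang", "MA", "222")],
)

# Users of the view see only CS students and only two columns
cs_rows = conn.execute("SELECT sno, sname FROM cs_student").fetchall()

# List the user-defined schema objects and their kinds
objects = dict(
    conn.execute(
        "SELECT name, type FROM sqlite_master WHERE name NOT LIKE 'sqlite_%'"
    ).fetchall()
)
```

Here `objects` maps each name to `table`, `view`, or `index`, which lines up with the logical, external, and internal levels respectively.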

2. Requirements analysis

Requirements analysis means, simply, analyzing what users need. It is the starting point of database design.

Requirements analysis task

The task of requirements analysis is to investigate in detail the real-world objects to be handled, fully understand how the existing system (manual or computerized) works, clarify the users' various needs, and on that basis determine the functions of the new system. The new system must take future expansion and change into account.

Methods of requirements analysis

To carry out requirements analysis, first investigate the users' actual needs and reach a consensus with them, then analyze and express those requirements.

Steps to investigate user requirements:
(1) Investigate the organizational structure
(2) Investigate the business activities of each department
(3) Help users clarify their requirements for the new system, including information requirements, processing requirements, and security and integrity requirements
(4) Determine the boundaries of the new system

Common survey methods:
(1) Work alongside users
Learn about business activities by personally taking part in the business work.
(2) Hold survey meetings
Understand business activities and user needs through discussions with users.
(3) Ask a designated person to give an introduction
(4) Make inquiries
For specific questions that come up during the survey, ask a designated person.
(5) Design a questionnaire for users to fill in
If the questionnaire is well designed, this method is very effective.
(6) Consult records
Review the data records related to the original system.

Analysis method:
Structured Analysis (the SA method) is a common choice. The SA method starts from the top-level system organization and uses top-down, layer-by-layer decomposition to analyze and express user needs. The analysis report must be submitted to the users and approved by them.

[Figure: requirements analysis process]

Data Dictionary

The data dictionary is the main result of detailed data collection and analysis. It describes the data in the database, that is, it holds metadata rather than the data itself. The data dictionary is created during the requirements analysis stage and is continually revised, enriched, and refined throughout the database design process.

A data dictionary usually includes data items, data structures, data flows, data stores, and processes. The data item is the smallest unit of data, and several data items can form a data structure. The data dictionary describes the logical content of data flows and data stores through the definitions of data items and data structures.

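Data item

A data item is a unit of data that cannot be divided further. A data item description usually includes the following.

Data item description = {data item name, meaning of the data item, alias,
                         data type, length, value range, meaning of values,
                         logical relationships with other data items,
                         connections between data items}

Put simply, a data item corresponds to a field (column) in a table.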
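
As a rough sketch, one data dictionary entry for a data item could be recorded as a small Python dataclass. The `DataItem` class, its field names, and the `age` entry are hypothetical and only mirror part of the description template above; the relationship fields are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """One data dictionary entry for a data item (illustrative subset of fields)."""
    name: str
    meaning: str
    data_type: type
    length: int          # maximum number of characters when written out
    value_range: tuple   # inclusive (low, high)
    alias: str = ""

    def accepts(self, value) -> bool:
        """Check a candidate value against the type, length, and value range."""
        if not isinstance(value, self.data_type):
            return False
        if len(str(value)) > self.length:
            return False
        low, high = self.value_range
        return low <= value <= high


# Hypothetical entry: a student's age in years
age = DataItem(
    name="age",
    meaning="student age in years",
    data_type=int,
    length=3,
    value_range=(0, 150),
    alias="student_age",
)
```

With this entry, `age.accepts(20)` is accepted while `age.accepts(200)` and `age.accepts("20")` are rejected, which is the kind of check the value range and data type fields make possible later in the design.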

Data structure

A data structure reflects how data is combined. A data structure can consist of several data items, of several data structures, or of a mixture of data items and data structures.

Data structure description = {data structure name, meaning, composition: {data items or data structures}}

Data flow

A data flow is the path along which a data structure is transmitted within the system.

Data flow description = {data flow name, description, source of the data flow,
                         destination of the data flow, composition: {data structures},
                         average flow, peak flow}

  • Data flow source: the process the data flow comes from
  • Data flow destination: the process the data flow goes to
  • Average flow: the number of transmissions per unit of time (per day, week, month, and so on)
  • Peak flow: the data flow during peak periods

Data storage

A data store is where a data structure rests or is kept, and it is also one of the sources and destinations of data flows.

Data store description = {data store name, description, number,
                          input data flows, output data flows,
                          composition: {data structures}, data volume,
                          access frequency, access method}

  • Access frequency: the number of accesses per hour, day, or week, the amount of data accessed each time, and similar information
  • Access method: batch or online processing; retrieval or update; sequential or random retrieval
  • Input data flows: where the data comes from
  • Output data flows: where the data goes

Process

The detailed processing logic of a process is generally described with a decision table or a decision tree. The data dictionary only needs to hold descriptive information about the process.

Process description = {process name, description, input: {data flows},
                       output: {data flows}, processing: {brief description}}

Brief description: explains the function of the process and its processing requirements.

  • Function: what the process is used for
  • Processing requirements: requirements on processing frequency, such as how many transactions are processed per unit of time, how much data is handled, and the required response time
  • Processing requirements are the input to the later physical design and the criteria for evaluating its performance

3. Conceptual structure design

Conceptual structure design is the process of abstracting the user needs obtained from requirements analysis into an information structure (a conceptual model).

Conceptual model

The application requirements obtained in the requirements analysis stage should first be abstracted into a structure in the information world. Only then can they be implemented well and accurately with a particular database management system.

The main features of the conceptual model:

  • It faithfully and fully reflects the real world, including things and the relationships between things, and it can meet the users' data processing requirements.
  • It is easy to understand, so it can be used to exchange ideas with users who are not familiar with computers.
  • It is easy to change: when the application environment or the application requirements change, the conceptual model is easy to modify and extend.
  • It is easy to convert into the relational, network, hierarchical, and other data models.

ER model

The ER model is a powerful tool for describing conceptual models.

Relationships between entities

Relationships between entities include relationships between two entity types (binary relationships), relationships among more than two entity types (n-ary relationships), and relationships within a single entity type (for example, parent and child).

ER diagram

The ER diagram provides a way to represent entity types, attributes, and relationships.

  • An entity type is represented by a rectangle, with the entity name written inside it.
  • An attribute is represented by an ellipse, connected to its entity by an undirected edge.
  • A relationship is represented by a diamond, with the relationship name written inside it. The type of the relationship (1:1, 1:n, or m:n) is marked beside the undirected edges that connect it to the entities.

Conceptual structure design

The first step of conceptual structure design is to classify and organize the data collected in the requirements analysis stage, and to determine the entities, the attributes of the entities, and the types of relationships between entities.

1. Rules for dividing entities and attributes

To keep ER diagrams simple, anything in the real world that can be treated as an attribute should be treated as an attribute.

When can something be treated as an attribute? Two rules apply:

  • As an attribute, it must not have properties of its own that need describing. That is, an attribute must be an indivisible data item (1NF) and cannot contain other attributes.
  • An attribute must not have relationships with other entities. That is, the relationships shown in an ER diagram are relationships between entities, not between an attribute and other entities or attributes.

2. ER diagram integration

When developing a large information system, the most common strategy is top-down requirements analysis followed by bottom-up conceptual structure design. That is, first design the sub-ER diagram of each subsystem (possibly by different teams), then integrate them into the global ER diagram. Integration generally takes two steps: first merge the diagrams and resolve conflicts, then modify and restructure to eliminate unnecessary redundancy.

Step 1: Merging and resolving conflicts
There are three main types of conflict between the ER diagrams of the subsystems: attribute conflicts, naming conflicts, and structural conflicts.

① Attribute conflicts
  • Attribute domain conflicts: the type, value range, or value set of an attribute differs.
  • Attribute unit conflicts: for example, the weight of a part may be given in kilograms in one place and in grams in another.

② Naming conflicts
  • Homonyms (same name, different meaning): objects with different meanings have the same name in different local applications.
  • Synonyms (different names, same meaning): objects with the same meaning have different names in different local applications.

③ Structural conflicts
  • The same object is abstracted differently in different applications. For example, an employee is treated as an entity in one local application but as an attribute in another.
  • The same entity has a different number of attributes, or a different attribute order, in the ER diagrams of different subsystems.
  • The relationship between entities has a different type in different ER diagrams. For example, E1 and E2 have a many-to-many relationship in one ER diagram, while in another the many-to-many relationship is among E1, E2, and E3.

Step 2: Eliminating unnecessary redundancy
Redundancy here means data that can be derived from basic data. Sometimes redundancy is kept deliberately to improve efficiency.
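
Some of the conflicts described above can be found mechanically once each sub-ER diagram is written down in a uniform way. The sketch below assumes a hypothetical representation, `{entity: {attribute: (type, unit)}}`, and reports domain, unit, and attribute-set differences. Naming conflicts depend on meaning, so they still need human review.

```python
def find_conflicts(diagram_a: dict, diagram_b: dict) -> list:
    """Compare two sub-ER diagrams given as {entity: {attribute: (type, unit)}}."""
    conflicts = []
    for entity in sorted(diagram_a.keys() & diagram_b.keys()):
        attrs_a, attrs_b = diagram_a[entity], diagram_b[entity]
        # Structural conflict: the same entity has different attribute sets
        if attrs_a.keys() != attrs_b.keys():
            only_one_side = sorted(attrs_a.keys() ^ attrs_b.keys())
            conflicts.append(("structural", entity, only_one_side))
        for attr in sorted(attrs_a.keys() & attrs_b.keys()):
            type_a, unit_a = attrs_a[attr]
            type_b, unit_b = attrs_b[attr]
            if type_a != type_b:   # attribute domain conflict
                conflicts.append(("domain", entity, attr))
            if unit_a != unit_b:   # attribute unit conflict
                conflicts.append(("unit", entity, attr))
    return conflicts


# Two hypothetical subsystems that both model a Part entity
warehouse = {"Part": {"pno": ("int", None), "weight": ("float", "kg")}}
purchasing = {
    "Part": {"pno": ("str", None), "weight": ("float", "g"), "price": ("float", "CNY")}
}
report = find_conflicts(warehouse, purchasing)
```

For these two inputs the report flags the extra `price` attribute, the differing type of `pno`, and the kilogram versus gram unit of `weight`, which are then resolved by negotiation between the teams.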

4. Logical structure design

The logical structure design is to convert the basic ER diagram designed in the conceptual structure design stage into a logical structure that matches the data model supported by the database management system product.

Conversion of ER diagram to relational model

Converting an ER diagram to the relational model means solving two problems: how to convert entity types and the relationships between them into relation schemas, and how to determine the attributes and keys of those relation schemas. The general principle is that each entity type is converted into one relation schema: the attributes of the entity become the attributes of the relation, and the key of the entity becomes the key of the relation.
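
A minimal sketch of such a conversion, using `sqlite3` and a hypothetical university ER diagram (Department, Student, Course). It assumes the commonly taught mapping in which a 1:n relationship is merged into the relation on the n side as a foreign key, and an m:n relationship becomes its own relation whose key combines the keys of both entities. Other mappings are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- Each entity type becomes one relation; the entity key becomes the primary key
    CREATE TABLE department (dno TEXT PRIMARY KEY, dname TEXT NOT NULL);
    CREATE TABLE course     (cno TEXT PRIMARY KEY, cname TEXT NOT NULL);
    -- 1:n relationship (department has students) merged into the n side
    CREATE TABLE student (
        sno   TEXT PRIMARY KEY,
        sname TEXT NOT NULL,
        dno   TEXT NOT NULL REFERENCES department(dno)
    );
    -- m:n relationship (student takes course) becomes its own relation
    CREATE TABLE enrollment (
        sno   TEXT REFERENCES student(sno),
        cno   TEXT REFERENCES course(cno),
        grade INTEGER,
        PRIMARY KEY (sno, cno)
    );
""")
conn.execute("INSERT INTO department VALUES ('D1', 'Computer Science')")
conn.execute("INSERT INTO student VALUES ('S1', 'Li', 'D1')")
conn.execute("INSERT INTO course VALUES ('C1', 'Databases')")
conn.execute("INSERT INTO enrollment VALUES ('S1', 'C1', 90)")


def rejected(sql: str) -> bool:
    """Return True if the statement violates a key or foreign key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_rejected = rejected("INSERT INTO enrollment VALUES ('S1', 'C1', 75)")
orphan_rejected = rejected("INSERT INTO student VALUES ('S2', 'Wang', 'D9')")
```

The composite key on `enrollment` rejects a second row for the same student and course, and the foreign key on `student.dno` rejects a student in a department that does not exist.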

Data model optimization

The result of logical database design is not unique. To further improve the performance of the database application system, the structure of the data model should be modified and adjusted as the application requires. Optimization of a relational data model is usually guided by normalization theory, for example the third normal form. Note, however, that a higher normal form does not necessarily make a better relation: queries over a more heavily decomposed schema need more joins, which can cost efficiency.
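
The trade-off can be seen in a small `sqlite3` sketch. The `student_flat` table below is a hypothetical relation with a transitive dependency (sno → dno → dname), and `student` plus `department` are one possible 3NF decomposition of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: dname depends on dno, which depends on the key sno
    CREATE TABLE student_flat (sno TEXT PRIMARY KEY, sname TEXT, dno TEXT, dname TEXT);
    INSERT INTO student_flat VALUES
        ('S1', 'Li',   'D1', 'Computer Science'),
        ('S2', 'Wang', 'D1', 'Computer Science'),
        ('S3', 'Zhao', 'D2', 'Mathematics');

    -- One possible 3NF decomposition
    CREATE TABLE department AS SELECT DISTINCT dno, dname FROM student_flat;
    CREATE TABLE student    AS SELECT sno, sname, dno FROM student_flat;
""")

# Renaming a department touches every student row in the flat design...
flat_updates = conn.execute(
    "UPDATE student_flat SET dname = 'CS' WHERE dno = 'D1'"
).rowcount
# ...but only one row in the normalized design
norm_updates = conn.execute(
    "UPDATE department SET dname = 'CS' WHERE dno = 'D1'"
).rowcount

# The price of normalization: reading the full picture now needs a join
rejoined = conn.execute("""
    SELECT s.sno, s.sname, s.dno, d.dname
    FROM student AS s JOIN department AS d ON s.dno = d.dno
    ORDER BY s.sno
""").fetchall()
flat = conn.execute("SELECT * FROM student_flat ORDER BY sno").fetchall()
```

The join reproduces exactly the rows of the flat table, so no information is lost. Whether the cheaper updates are worth the extra join depends on the application's mix of queries and updates.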

Designing user subschemas

After the conceptual model has been converted into the global logical schema, the users' external schemas should be designed according to local application requirements and the features of the specific relational database management system. For example, in an SQL relational database, views can be used to design external schemas that better fit the needs of local users.

When defining external schemas, attention can be paid to the users' habits and convenience, in particular:

  • Use aliases that better match what users are used to.
  • Define different views for users at different levels to keep the system secure.
  • Simplify the users' use of the system.
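
As an illustration of these points, the `sqlite3` sketch below defines a view with friendlier column aliases that also hides a sensitive column. The `employee` table and the `staff_directory` view are hypothetical. SQLite itself has no user permissions, so in a multi-user DBMS this would be paired with granting access on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (eno TEXT PRIMARY KEY, ename TEXT, dno TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('E1', 'Li', 'D1', 9000), ('E2', 'Wang', 'D2', 8000);

    -- View for general staff: readable aliases, salary column left out
    CREATE VIEW staff_directory AS
        SELECT eno AS employee_id, ename AS employee_name, dno AS department
        FROM employee;
""")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY employee_id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```

Users of `staff_directory` see only the three aliased columns, and a query against the view stays valid if the base table later gains more columns.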

5. Physical structure design

The storage structure and access methods of a database on physical devices are called the physical structure of the database, and they depend on the selected database management system. Physical database design is the process of choosing, for a given logical data model, the physical structure best suited to the application requirements.
It is divided into two main steps:

  • Determine the physical structure of the database. In a relational database, this mainly means the access methods and the storage structure.
  • Evaluate the physical structure, focusing on time and space efficiency.

Contents and methods of database physical design

The physical environment, access methods, and storage structures offered by different database products differ greatly, so there is no universal physical design method to follow; only general design content and principles can be given.

In general, the physical design of a relational database mainly covers two things: choosing access methods for the relation schemas, and designing the physical storage structure of database files such as relations and indexes.

Choosing access methods for relation schemas

A database system is shared by many users, so several access paths must be established on the same relation to serve the different applications of different users. One task of physical structure design is to decide which access methods to use, based on what the database management system supports.

The commonly used access methods are indexing and clustering. B+ tree indexes and hash indexes are the classic access methods in databases and the most widely used.

1. Choosing the B+ tree index access method

Choosing the index access method means deciding, according to application requirements, which attribute columns of a relation to index, which to cover with composite indexes, and which indexes to define as unique. General guidelines:

  • If an attribute (or a group of attributes) often appears in query conditions, consider building an index (or a composite index) on it.
  • If an attribute is often used as the argument of aggregate functions such as maximum and minimum, consider building an index on it.
  • If an attribute (or a group of attributes) often appears in the join condition of join operations, consider building an index on it.

More indexes are not always better. The system pays a cost to maintain every index, so attributes that are updated very frequently are poor candidates for heavy indexing.
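
The effect of indexing an attribute that appears in query conditions can be observed with SQLite's `EXPLAIN QUERY PLAN`. This is only a sketch: SQLite's indexes are B-tree based and stand in here for the B+ tree indexes discussed above, and the `student` table and `idx_student_dept` index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sno INTEGER PRIMARY KEY, sname TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO student (sname, dept) VALUES (?, ?)",
    [(f"name{i}", f"D{i % 20}") for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the plan detail


query = "SELECT sname FROM student WHERE dept = 'D7'"
plan_before = plan(query)   # no index on dept yet: the whole table is scanned
conn.execute("CREATE INDEX idx_student_dept ON student(dept)")
plan_after = plan(query)    # the planner can now search via the index
```

Comparing `plan_before` and `plan_after` shows the plan switching from a table scan to a search that names `idx_student_dept`. The exact wording of the plan text varies between SQLite versions, so it is best treated as diagnostic output rather than a stable format.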

2. Choosing the hash access method

The rule for choosing the hash access method is as follows. If the attributes of a relation appear mainly in equi-join conditions or in equality comparison selections, and one of the following two conditions holds, the hash access method can be chosen for this relation:

  • The size of the relation is predictable and does not change.
  • The size of the relation changes dynamically, but the database management system provides a dynamic hash access method.
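
Why hashing suits equality conditions can be sketched with a Python dictionary, which is itself a hash table that grows dynamically. The `build_hash_index` helper and the sample rows are hypothetical and only model the idea, not how any particular DBMS stores its hash files.

```python
from collections import defaultdict


def build_hash_index(rows, key):
    """Map each value of `key` to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[key]].append(position)
    return index


rows = [
    {"sno": "S1", "dept": "CS"},
    {"sno": "S2", "dept": "MA"},
    {"sno": "S3", "dept": "CS"},
]
dept_index = build_hash_index(rows, "dept")

# Equality selection: one hash lookup instead of scanning every row
cs_students = [rows[i]["sno"] for i in dept_index.get("CS", [])]
```

The index answers `dept = 'CS'` directly, but it keeps no ordering among keys, so a range condition such as `dept > 'CS'` would still have to examine every key. That is why the rule above restricts hashing to equality comparisons and equi-joins.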

3. Choosing the cluster access method

To speed up queries on an attribute (or group of attributes), tuples that have the same value on that attribute or group are stored together in contiguous physical blocks. This is called clustering, and the attribute (or group of attributes) is called the cluster key.
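
The benefit can be estimated by counting how many physical blocks a query has to read. The sketch below is a simplified model with hypothetical data: rows are stored in fixed-size blocks in list order, and `blocks_touched` counts the distinct blocks that contain matching rows.

```python
def blocks_touched(rows, key, value, block_size):
    """Count the distinct fixed-size blocks holding rows where row[key] == value."""
    return len(
        {position // block_size for position, row in enumerate(rows) if row[key] == value}
    )


# 12 hypothetical student rows, with departments interleaved in arrival order
heap = [{"sno": i, "dept": ("CS", "MA", "PH")[i % 3]} for i in range(12)]
# Clustered on dept: rows with the same cluster key value are stored together
clustered = sorted(heap, key=lambda row: row["dept"])

heap_blocks = blocks_touched(heap, "dept", "CS", block_size=4)
clustered_blocks = blocks_touched(clustered, "dept", "CS", block_size=4)
```

In this model the four CS rows are spread over three blocks in arrival order but fit in a single block once clustered, which is the saving in I/O that clustering aims for. The cost is that keeping rows in cluster order makes inserts and cluster key updates more expensive.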

Determine the storage structure of the database

Determining the physical structure of the database mainly means determining where and how the data is stored, including the storage arrangement and storage structure of relations, indexes, clusters, logs, and backups, and determining the system configuration.

Evaluate the physical structure

Physical database design has to weigh time efficiency, space efficiency, maintenance cost, and the various user requirements against each other. This can produce several candidate designs, which are then evaluated to choose the best one.

6. Database implementation and maintenance

After the physical design is complete, the designer uses the data definition language and other utilities provided by the relational database management system to describe the results of logical and physical design precisely, producing source code that the relational database management system can accept. After debugging, this produces the target schema, and the data can then be organized and loaded into the database. This is the database implementation stage.

In the database operation stage, the regular maintenance of the database is mainly completed by the database administrator, which mainly includes the following aspects:

  • Database dump and restore
  • Database security and integrity control
  • Monitoring, analysis and modification of database performance
  • Reorganization and restructuring of the database.
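
Dump and restore, the first maintenance task listed above, can be sketched with `sqlite3`: `Connection.iterdump()` serializes the schema and data as SQL text, and replaying that text rebuilds the database. The `student` table is hypothetical, and a production DBMS would normally use its own backup utilities together with log files.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE student (sno TEXT PRIMARY KEY, sname TEXT);
    INSERT INTO student VALUES ('S1', 'Li'), ('S2', 'Wang');
""")

# Dump: serialize the schema and data as SQL statements
dump_sql = "\n".join(source.iterdump())

# Restore: replay the dump into a fresh, empty database
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)
restored_rows = restored.execute(
    "SELECT sno, sname FROM student ORDER BY sno"
).fetchall()
```

In practice `dump_sql` would be written to a file on separate storage, so that the database can be rebuilt after a failure.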

Origin blog.csdn.net/qq_41262903/article/details/106228753