With the development of science and technology, we have entered the era of the "information explosion". Large amounts of data and information are constantly being produced, which raises the question of how to store, retrieve and manage them safely and effectively.
Effective storage, efficient access, convenient sharing and security control of data have all become pressing issues.
This is why databases are necessary.
Databases store data efficiently and in a well-organized way, enabling people to manage it more quickly and conveniently.
To sum up, a database has the following characteristics:
It can store a large amount of data in a structured manner, so that users can retrieve and access it effectively
It can effectively maintain the consistency and integrity of data and reduce data redundancy
It can meet application sharing and security requirements
I. Basic concepts of databases
1. Data
Symbolic records that describe things are called data.
Data is not limited to numbers: text, graphics, images, sounds, file records, and so on are all data.
In a database, data is stored in a unified format in the form of "records" rather than in a disorderly way.
This keeps the stored data organized.
Each row of data in the table below is called a "record" in the database, and each item within a record belongs to a "column". Number, name, gender, and age are all column names.
| Number | Name | Gender | Age |
| --- | --- | --- | --- |
| 1 | Wang Yi | male | 12 |
| 2 | Wang Er | female | 14 |
| 3 | Wang San | male | 16 |
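As a rough illustration of records and columns, the sketch below stores a few rows in an in-memory SQLite database using Python's standard `sqlite3` module. The table name `person`, the column names, and the sample values are assumptions chosen for the example.

```python
import sqlite3

# In-memory database; the table name, column names and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be read by column name
conn.execute("CREATE TABLE person (number INTEGER, name TEXT, gender TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [(1, "Wang Yi", "male", 12), (2, "Wang Er", "female", 14), (3, "Wang San", "male", 16)],
)

rows = conn.execute("SELECT * FROM person ORDER BY number").fetchall()
column_names = rows[0].keys()   # the columns shared by every record
first_record = dict(rows[0])    # one record: one value per column
print(column_names)
print(first_record)
```

Every record has the same set of columns, which is what the "unified format" of a table amounts to in practice.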
2. Database tables and databases (DB)
Different records organized together form a "table" in the database.
A table is used to store specific data.
A database (DB) is a collection of tables: a collection of related data stored in an organized way.
Usually a database does not simply store data but also expresses the relationships within it. For example, a book has an author, so the "relationship" between the book and the person needs to be established.
This relationship also has to be represented in the database, so the description of relationships is part of the database as well.
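One common way to store such a relationship is a separate link table. The following is a minimal sketch under that assumption; the tables `person`, `book`, and `authorship`, the helper `authors_of`, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT);
    -- The relationship "person wrote book" is itself stored as a table.
    CREATE TABLE authorship (person_id INTEGER, book_id INTEGER);
    INSERT INTO person VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO book   VALUES (10, 'Book X'), (11, 'Book Y');
    INSERT INTO authorship VALUES (1, 10), (2, 10), (2, 11);
""")

def authors_of(title):
    """Return the names of everyone recorded as an author of the given book."""
    rows = conn.execute(
        """SELECT person.name FROM person
           JOIN authorship ON authorship.person_id = person.person_id
           JOIN book       ON book.book_id = authorship.book_id
           WHERE book.title = ? ORDER BY person.name""",
        (title,),
    ).fetchall()
    return [name for (name,) in rows]

print(authors_of("Book X"))
```

Because the relationship lives in the database, a book with several authors needs no change to the `book` or `person` tables, only extra rows in `authorship`.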
3. Database Management System (DBMS)
A Database Management System (DBMS) is system software that organizes, manages, and provides access to database resources effectively.
Running on top of the operating system, it supports the user's various operations on the database:
| Function | Description |
| --- | --- |
| Database establishment and maintenance | Establishing the database structure, data entry and conversion, database dump and recovery, database reorganization and performance monitoring, etc. |
| Data definition | Defining the global data structure, local logical data structures, storage structure, confidentiality mode, information format, etc., so that the data stored in the database is correct, valid, and compatible, and incorrect data is neither input nor output. |
| Data manipulation | Covers two aspects: data query and statistics, and data update. |
| Database operation management | The core of the database management system, including concurrency control, access control, and internal database maintenance. |
| Communication | Communication between the DBMS and other software; for example, Access can exchange data with other Office components. |
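To make a few of these functions concrete, here is a small sketch that uses SQLite (through Python's `sqlite3` module) as a stand-in DBMS. It touches data definition, data manipulation, and dump and recovery; the `person` table and its rows are assumptions made for the example, and a production DBMS offers far more than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure and what counts as valid data.
conn.execute("CREATE TABLE person (number INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)")

# Data manipulation, part 1: updates.
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [(1, "A", 12), (2, "B", 14), (3, "C", 16)])
conn.execute("UPDATE person SET age = age + 1 WHERE number = 1")
conn.commit()

# Data manipulation, part 2: query with statistics.
count, average_age = conn.execute("SELECT COUNT(*), AVG(age) FROM person").fetchone()

# Maintenance: dump the database as SQL text, then recover it elsewhere.
dump_sql = "\n".join(conn.iterdump())
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)
restored_count = restored.execute("SELECT COUNT(*) FROM person").fetchone()[0]

print(count, average_age, restored_count)
```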
4. Database System (DBS)
A Database System (DBS) is a human-machine system, generally composed of hardware, an operating system, a database, a DBMS, application software, and database users (including database administrators).
Users can operate on the database through the DBMS or through applications.
An application is an ordered collection of commands written with a DBMS to solve a specific management or data processing task.
The Database Administrator (DBA) is responsible for database updates and backups, database system maintenance, user management, and so on, to ensure the normal operation of the database system.
Note: Terms such as database, database system, database management system, and even database table are often not strictly distinguished in everyday discussion. You can judge what is actually meant from the context.
II. History of database development
1. Initial stage: first-generation databases
The first generation of database systems appeared in the 1960s.
They were database systems based on the hierarchical model and the network model, and they provided strong support for the unified management and sharing of data.
Representative systems of this stage are IMS (Information Management System), the hierarchical-model database management system developed by IBM in 1969, and the network model proposed in the 1970s by the Database Task Group (DBTG) of the Conference on Data Systems Languages (CODASYL).
2. Intermediate stage: second-generation databases
In the early 1970s, the second generation of databases, relational databases, began to appear.
After IBM researcher E. F. Codd set out the concept of the relational model in 1970, IBM invested heavily in relational database research.
The relational model is relatively easy to implement at the lower level, so it was quickly adopted and entered the research and development plans of many commercial databases. Oracle, a company specializing in (relational) databases, was founded at that time in response to the emergence of the relational data model.
In the early 1980s, IBM's relational database system DB2 came out, and Oracle ported its Oracle database to desktop computers.
At this point relational databases, as the second generation of database systems, began to gradually replace hierarchical and network model databases and became the industry mainstream.
To this day, relational database systems still hold the dominant position in database applications.
Relational database systems use Structured Query Language (SQL) as both the data definition language (DDL) and the data manipulation language (DML), and SQL has become the standard language of relational databases.
SQL makes it possible to query tables in a relational database in a simple, declarative way, which greatly simplifies the programmer's work.
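The difference between an imperative and a declarative query can be sketched as follows. The data and the function names `names_older_than_loop` and `names_older_than_sql` are assumptions for illustration: the first function spells out how to scan and filter, while the SQL version only states which rows are wanted.

```python
import sqlite3

people = [(1, "Wang Yi", 12), (2, "Wang Er", 14), (3, "Wang San", 16)]

def names_older_than_loop(min_age):
    """Imperative version: describe step by step how to find the rows."""
    result = []
    for _number, name, age in people:
        if age > min_age:
            result.append(name)
    return sorted(result)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (number INTEGER, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", people)

def names_older_than_sql(min_age):
    """Declarative version: state what is wanted and let the DBMS decide how."""
    rows = conn.execute(
        "SELECT name FROM person WHERE age > ? ORDER BY name", (min_age,)
    )
    return [name for (name,) in rows]

print(names_older_than_loop(13))
print(names_older_than_sql(13))
```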
3. Advanced stage: third-generation databases
Traditional relational databases were developed against a background of commercial applications and transaction processing, so new database systems were needed to meet the requirements of new and different fields.
Since the 1980s, new database systems suited to different fields have continuously emerged.
Object-oriented database systems in particular are highly practical and widely applicable.
By the late 1990s, applications were commonly supported by several kinds of database systems working together.
In business applications, relational databases still dominate. However, mainstream database systems have gained some new elements, such as the "object-relational" database model supported by Oracle.
III. Introduction to current mainstream databases
1. Relational database
SQL Server (Microsoft product): mainly for Windows systems; simple and easy to use
Oracle (Oracle Corporation product): for all major platforms; safe and complete; complex to operate
DB2 (IBM product): for all major platforms; large, safe, and complete
MySQL (acquired by Oracle): free, open source, and small in size
2. Non-relational database
Non-relational databases are also known as NoSQL (Not Only SQL). The data they store is not based on the relational model and does not require a fixed table format.
As a supplement to relational databases, non-relational databases deliver high efficiency and high performance as websites grow ever more rapidly.
Advantages of non-relational databases:
They meet the need for highly concurrent reads and writes
They store and access massive amounts of data efficiently
They meet the need for high scalability and high availability
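The idea of "no fixed table format" can be pictured with a toy key-value store in which each value is a JSON document. This is only a sketch of the data model, not the API of any real NoSQL product; the `store` dictionary, the `put` and `get` helpers, and the sample documents are all hypothetical.

```python
import json

# A toy document store: each key maps to a JSON document (hypothetical design).
store = {}

def put(key, document):
    """Save a document under a key; documents need not share the same fields."""
    store[key] = json.dumps(document)

def get(key):
    """Load the document stored under a key."""
    return json.loads(store[key])

# Two records with different shapes; no table schema has to be changed first.
put("user:1", {"name": "Wang Er", "age": 14})
put("user:2", {"name": "Wang San", "hobbies": ["singing", "street dance"]})

print(get("user:1"))
print(get("user:2"))
```

In a relational table, the second record would require adding a column or a separate table; here each document simply carries its own fields.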
IV. Basic concepts of relational databases
A relational database system is a database system based on the relational model; it is the application of the relational model to the database field.
Its basic concepts come from the relational model.
1. Basic structure of relational database
The relational model uses a simple, easy-to-understand two-dimensional table as its data structure; that is, the description of things and the connections between them is expressed as a flat table.
In each two-dimensional table, each row is called a record and describes one object; each column is called a field and describes one attribute of the object.
Data tables in a database are associated with one another, and these associations are used to query related data.
A relational database is made up of data tables and the associations between them, which can be represented by a simple "entity-relationship" (ER) diagram.
An ER diagram contains three elements: entities (data objects), relationships, and attributes.
Entity: also called an instance; it corresponds to an "event" or "thing" in the real world that can be distinguished from other objects, such as a bank customer or a bank account.
Attribute: a characteristic of an entity. An entity can have multiple attributes; for example, each entity in the bank customer entity set has attributes such as name, address, and phone number.
Relationship: the correspondence between entity sets, also called a connection; for example, there is a "savings" relationship between bank customers and bank accounts.
The collection of all entities and the relationships between them constitutes a relational database.
Although bank customers regard their own accounts as unique and completely different from everyone else's, the bank itself often uses internal codes to tell them apart and to manage its different services.
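The three ER elements from the bank example can be sketched with plain Python dataclasses. The class names, the fields `customer_id` and `account_id` (standing in for the bank's internal codes), and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:            # entity
    customer_id: int       # internal code (assumed field)
    name: str              # attributes of the entity
    address: str
    phone: str

@dataclass(frozen=True)
class Account:             # entity
    account_id: int        # internal code (assumed field)
    balance: float

@dataclass(frozen=True)
class Savings:             # relationship between a customer and an account
    customer_id: int
    account_id: int

customers = [Customer(1, "Customer A", "Address A", "000-0001")]
accounts = [Account(100, 50.0), Account(101, 75.0)]
savings = [Savings(1, 100), Savings(1, 101)]

def accounts_of(customer_id):
    """Follow the 'savings' relationship from a customer to their accounts."""
    wanted = {s.account_id for s in savings if s.customer_id == customer_id}
    return [a for a in accounts if a.account_id in wanted]

print(accounts_of(1))
```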
2. Primary key
Each record in a data table must be unique; identical records are not allowed. Defining a primary key (Primary Key) guarantees the uniqueness of each record (entity).
The key, also called the keyword, is a very important element of the relational model.
The primary key uniquely identifies a row of data in the table: each primary key value corresponds to exactly one row. The primary key consists of one or more fields, its value is unique, and it must not be NULL.
A table can have only one primary key.
If a set of attributes uniquely identifies a row of the table and contains no redundant attributes, that set is called a candidate key.
A table can have several candidate keys, but only one of them can be chosen as the primary key; all the others are called alternate keys.
For example, in the sample table below, "Number" and "Name" each identify a row uniquely, so both can serve as candidate keys, whereas "Gender", "Age", and "Professional ID" contain repeated values and cannot. "Number" is the natural choice for the primary key.
| Number | Name | Gender | Age | Professional ID |
| --- | --- | --- | --- | --- |
| 1 | Xu Yi | male | 21 | 1 |
| 2 | Xu Er | male | 22 | 3 |
| 3 | Xu San | male | 18 | 3 |
| 4 | Xu Si | female | 18 | 5 |
3. Foreign keys
A relational database usually contains multiple tables, which can be related to one another through foreign keys (Foreign Key).
A foreign key is one or more columns used to establish and enforce a link between the data in two tables. The link is created by adding the column or columns that hold one table's primary key values to another table.
These columns are then the foreign key of the second table.
In the table below, the field "Professional ID" is the primary key. The same field "Professional ID" also appears in the table from the previous section, where it is a foreign key.
| Professional ID | Profession |
| --- | --- |
| 1 | Cloud computing operation and maintenance |
| 3 | Big data development |
| 5 | Artificial intelligence |
The table that holds the primary key is called the "master table", and the table that holds the foreign key is called the "slave table".
The master table and the slave table always appear in pairs and are associated with each other through the foreign key.
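The link between the two tables can be sketched in SQLite as follows. The table names `profession` and `person` and the column name `professional_id` are assumptions; the rows follow the two sample tables. The join uses the foreign key to look up each person's profession in the master table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Master table: holds the primary key.
    CREATE TABLE profession (professional_id INTEGER PRIMARY KEY, profession TEXT);
    -- Slave table: professional_id is a foreign key pointing at the master table.
    CREATE TABLE person (
        number INTEGER PRIMARY KEY,
        name TEXT,
        professional_id INTEGER REFERENCES profession(professional_id)
    );
    INSERT INTO profession VALUES
        (1, 'Cloud computing operation and maintenance'),
        (3, 'Big data development'),
        (5, 'Artificial intelligence');
    INSERT INTO person VALUES (1, 'Xu Yi', 1), (2, 'Xu Er', 3), (3, 'Xu San', 3), (4, 'Xu Si', 5);
""")

# Follow the foreign key from each person to the matching profession row.
person_professions = conn.execute("""
    SELECT person.name, profession.profession
    FROM person JOIN profession
      ON person.professional_id = profession.professional_id
    ORDER BY person.number
""").fetchall()

for name, profession in person_professions:
    print(name, "->", profession)
```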
4. Data integrity rules
To keep the data in the database consistent with the real world, the data and update operations of a relational database must follow four types of integrity rules:
Entity integrity rules
Entity integrity rules require that no tuple in a relation has a null value in any attribute of the primary key.
If a null value were present, the primary key value could not uniquely identify the tuple.
For example, in the table from the primary key section, each person has their own "number", which uniquely identifies that person's record. This "number" is set as the primary key of the table so that other database tables can refer to it.
According to the entity integrity rules, the "number" field must not be empty.
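A minimal sketch of entity integrity in SQLite is shown below; the table name `person` and the helper `insert_person` are assumptions. One SQLite-specific detail matters here: in an ordinary table, inserting NULL into an `INTEGER PRIMARY KEY` column makes SQLite generate a value automatically, so the example declares the table `WITHOUT ROWID`, where a NULL primary key is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# WITHOUT ROWID disables SQLite's automatic numbering for INTEGER PRIMARY KEY,
# so a NULL primary key value is rejected instead of being filled in.
conn.execute("""CREATE TABLE person (
    number INTEGER NOT NULL PRIMARY KEY,
    name TEXT
) WITHOUT ROWID""")

def insert_person(number, name):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO person VALUES (?, ?)", (number, name))
        return True
    except sqlite3.IntegrityError:
        return False

with_number = insert_person(1, "Xu Yi")
without_number = insert_person(None, "Nobody")  # None is stored as NULL
print(with_number, without_number)
```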
Domain integrity rules
Domain integrity is also called column integrity. It specifies which values are valid for a given column and whether null values are allowed.
If, in the table from the primary key section, the "gender" field is defined to accept only "male" or "female", then no other, invalid value can be entered in that column.
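One way to express such a rule is a `CHECK` constraint together with `NOT NULL`. The sketch below assumes a table named `person` and a helper `try_insert`; it restricts "gender" to the two values named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE person (
    number INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    -- Domain rule: only these two values are valid, and NULL is not allowed.
    gender TEXT NOT NULL CHECK (gender IN ('male', 'female'))
)""")

def try_insert(number, name, gender):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO person VALUES (?, ?, ?)", (number, name, gender))
        return True
    except sqlite3.IntegrityError:
        return False

valid = try_insert(1, "Xu Yi", "male")
invalid = try_insert(2, "Xu Er", "unknown")  # outside the allowed domain
missing = try_insert(3, "Xu San", None)      # NULL is not allowed either
print(valid, invalid, missing)
```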
Referential integrity rules
If two tables are related to each other, referential integrity rules require that a record must not refer to a tuple that does not exist.
For example, everyone's information is recorded in the table from the primary key section, and people's interests and hobbies are recorded in the table below.
"Zheng San" does not exist in the primary key table, but the table below contains a record of his hobbies. This situation is not allowed.
| Serial number | Name | Interest | Hobby |
| --- | --- | --- | --- |
| 1 | Xu Yi | Playing ball | Street dance |
| 2 | Xu Er | Singing | Shouting |
| 3 | Zheng San | Foot bath | Massage |
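The "Zheng San" case can be reproduced with a foreign key constraint. This is a sketch with assumed table names (`person`, `hobby`) and an assumed helper `try_add_hobby`; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
    CREATE TABLE person (number INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE hobby (
        serial_number INTEGER PRIMARY KEY,
        name TEXT NOT NULL REFERENCES person(name),  -- must match an existing person
        interest TEXT,
        hobby TEXT
    );
    INSERT INTO person VALUES (1, 'Xu Yi'), (2, 'Xu Er'), (3, 'Xu San'), (4, 'Xu Si');
""")

def try_add_hobby(row):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO hobby VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

known_person = try_add_hobby((1, "Xu Yi", "Playing ball", "Street dance"))
unknown_person = try_add_hobby((3, "Zheng San", "Foot bath", "Massage"))
print(known_person, unknown_person)
```

With the constraint in place, the DBMS itself refuses the dangling record, so the inconsistent state described above cannot be stored.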
User-defined integrity rules
User-defined integrity rules are constraints on specific data, determined by the application environment.
They reflect the semantic requirements that the data involved in a specific application must meet.
The system provides a mechanism for defining and checking this type of integrity so that it is handled in a uniform, systematic way and the application no longer has to do this work itself.
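As one possible illustration, the sketch below enforces a made-up business rule, "at most two people per profession", with a SQLite trigger, so the check runs inside the database rather than in application code. The rule, the trigger name `limit_profession_size`, the table layout, and the helper `try_insert` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (number INTEGER PRIMARY KEY, name TEXT, professional_id INTEGER);
    -- Hypothetical application rule: a profession may have at most two people.
    CREATE TRIGGER limit_profession_size BEFORE INSERT ON person
    WHEN (SELECT COUNT(*) FROM person WHERE professional_id = NEW.professional_id) >= 2
    BEGIN
        SELECT RAISE(ABORT, 'profession is full');
    END;
""")

def try_insert(number, name, professional_id):
    """Return True if the row is stored, False if the database rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO person VALUES (?, ?, ?)", (number, name, professional_id))
        return True
    except sqlite3.DatabaseError:  # base class that also covers constraint failures
        return False

results = [
    try_insert(2, "Xu Er", 3),
    try_insert(3, "Xu San", 3),
    try_insert(5, "Extra Person", 3),  # third person in profession 3: rejected
    try_insert(4, "Xu Si", 5),
]
print(results)
```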