Do you know these 25 must-master database interview questions?

Table of contents

1. Why does using a database index improve efficiency?

2. What is the difference between a B+ tree index and a hash index?

3. What are the advantages of a hash index?

4. In which scenarios is a hash index not applicable?

5. What is table partitioning?

6. What is the difference between table partitioning and table splitting?

7. What are the benefits of table partitioning?

8. In MVCC concurrency control, what are the two types of read operations?

9. What are the advantages of row-level locking?

10. What are the disadvantages of row-level locking?

11. How can MySQL be optimized?

12. What is the difference between a key and an index?

13. What are the differences between MyISAM and InnoDB in MySQL?

14. What are the precautions for database table creation?

  1>, Field naming and sensible field configuration

  2>, Handling of special system fields and suggestions

  3>, Sensible table structure design

  4>, Other suggestions

15. What is a stored procedure? What are the pros and cons?

16. What is an index? What are its role, advantages, and disadvantages?

17. What is a transaction?

18. What are optimistic locks and pessimistic locks in the database?

19. Can querying through an index improve query performance? Why?

20. Briefly talk about the difference between drop, delete and truncate?

21. In what scenarios are drop, delete, and truncate used?

22. What are super keys, candidate keys, primary keys, and foreign keys?

23. What is a view? And what are the usage scenarios of the view?

24. What are the three normal forms?

25. What are the four isolation levels?


1.  Why does using a database index improve efficiency?

  • An index stores its entries in sorted order;

  • Because the entries are sorted, looking up a record through the index does not require traversing every index record;

  • In the best case, an index lookup is as efficient as a binary search, which takes about log2(N) comparisons for N records.
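The gap between scanning and searching sorted data can be made concrete with a small sketch. The code below is not how a database engine works internally; it only assumes a sorted Python list standing in for the index, and the function names are made up for illustration.

```python
def linear_lookup(keys, target):
    """Scan the keys one by one; returns (position, comparisons)."""
    comparisons = 0
    for pos, key in enumerate(keys):
        comparisons += 1
        if key == target:
            return pos, comparisons
    return -1, comparisons


def binary_lookup(keys, target):
    """Binary search over sorted keys; returns (position, loop iterations)."""
    lo, hi, steps = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if keys[mid] < target:
            lo = mid + 1      # target can only be to the right of mid
        else:
            hi = mid          # target is at mid or to its left
    found = lo < len(keys) and keys[lo] == target
    return (lo if found else -1), steps


keys = list(range(0, 2_000_000, 2))   # 1,000,000 sorted keys
target = keys[-1]                     # last key: worst case for the scan

scan_pos, scan_steps = linear_lookup(keys, target)
search_pos, search_steps = binary_lookup(keys, target)
print(scan_steps, search_steps)
```

With a million keys the scan touches every key, while the binary search needs at most 20 iterations, which is the log2(N) behaviour described above.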

2.  What is the difference between a B+ tree index and a hash index?

The B+ tree is a balanced multi-way tree. Every leaf node is at the same depth from the root, and the leaf nodes are linked to their neighbours by pointers and kept in key order.

A hash index applies a hash function to the key value to compute a hash value. A lookup does not have to descend level by level from the root node to a leaf node as in a B+ tree; a single hash computation locates the entry. The entries are unordered.

3.  What are the advantages of a hash index?

For equality queries, a hash index has a clear advantage, provided there are not many duplicate key values. With a large number of duplicate key values, hash collisions make the hash index very inefficient.

4.  In which scenarios is a hash index not applicable?

  • Range queries are not supported;

  • The index cannot be used for sorting;

  • The leftmost-prefix matching rule of composite (multi-column) indexes is not supported.
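A rough way to see why ordering matters is to compare a Python dict (hash-based) with a sorted list (ordered, like the leaf level of a B+ tree). This is only an analogy under that assumption, with made-up sample rows, not a model of any real storage engine.

```python
from bisect import bisect_left, bisect_right

rows = [(17, "q"), (3, "a"), (42, "z"), (8, "c"), (25, "m")]  # (key, payload)

# Hash-style index: key -> payload, with no ordering between keys.
hash_index = {key: payload for key, payload in rows}

# Ordered index: keys kept sorted.
ordered_keys = sorted(key for key, _ in rows)


def hash_equal(key):
    """Equality lookup: one hash computation."""
    return hash_index.get(key)


def ordered_range(low, high):
    """All keys in [low, high], found with two binary searches."""
    start = bisect_left(ordered_keys, low)
    end = bisect_right(ordered_keys, high)
    return ordered_keys[start:end]


def hash_range(low, high):
    """The hash index gives no help: every key is checked, then sorted."""
    return sorted(k for k in hash_index if low <= k <= high)


print(hash_equal(42), ordered_range(5, 30), hash_range(5, 30))
```

Both range functions return the same keys, but only the ordered version avoids looking at every entry and gets the result already sorted.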

5.  What is a table partition?

Table partitioning means splitting a table in the database into multiple smaller, more manageable parts according to certain rules. Logically there is still only one table, but underneath it consists of multiple physical partitions.

6.  What is the difference between table partitioning and table splitting?

Table splitting: decomposing one table into multiple separate tables according to certain rules. For example, user orders are recorded in different tables according to time.

The difference is that a partitioned table is still logically a single table, while table splitting decomposes a table into multiple independent tables.
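When one table is broken up into several real tables, as in the orders-by-time example, the application itself has to decide which table to touch. A minimal sketch of that routing, assuming a hypothetical monthly naming scheme `orders_YYYY_MM`:

```python
from datetime import date


def order_table_for(order_date):
    """Name of the monthly table that holds an order (hypothetical naming scheme)."""
    return f"orders_{order_date.year:04d}_{order_date.month:02d}"


def order_tables_between(start, end):
    """Every monthly table a query over [start, end] would have to visit."""
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tables.append(f"orders_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return tables


print(order_table_for(date(2023, 3, 30)))
print(order_tables_between(date(2022, 11, 15), date(2023, 2, 1)))
```

With a partitioned table this routing is done by the database, and queries still address a single table name.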

7.  What are the benefits of table partitioning?

  • Store more data. The data of a partitioned table can be distributed across different physical devices, making efficient use of multiple hardware devices and storing more data than a single disk or file system can hold.
  • Optimize queries. When the WHERE clause contains a partition condition, only the relevant partition or partitions need to be scanned, which improves query efficiency. Queries involving SUM and COUNT can also be processed on multiple partitions in parallel, with the results combined at the end.
  • Easier maintenance. For example, to delete a large amount of data in bulk, you can clear an entire partition.
  • Avoid certain special bottlenecks, such as mutually exclusive access to a single InnoDB index or inode lock contention in the file system.

8.  In MVCC concurrency control, what are the two types of read operations?

Snapshot read: reads a visible version of the record (possibly a historical version) without locking. No shared (S) lock is taken, so writes by other transactions are not blocked;

Current read: reads the latest version of the record and locks the records it returns, so that other transactions cannot modify them concurrently. In InnoDB, for example, a plain SELECT under the default isolation level is a snapshot read, while SELECT ... FOR UPDATE, UPDATE, and DELETE are current reads.
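The two kinds of read can be illustrated with a toy versioned row. This is a deliberately simplified model, assuming integer commit timestamps and an in-memory version list; a real engine such as InnoDB uses undo logs and read views, and a current read would also take a lock, which is not modelled here.

```python
class VersionedRow:
    """Toy MVCC row: keeps every committed version with its commit timestamp."""

    def __init__(self, value, commit_ts):
        self.versions = [(commit_ts, value)]   # kept in ascending commit order

    def commit(self, value, commit_ts):
        """Record a new committed version of the row."""
        self.versions.append((commit_ts, value))

    def snapshot_read(self, snapshot_ts):
        """Newest version committed at or before the reader's snapshot."""
        visible = [value for ts, value in self.versions if ts <= snapshot_ts]
        return visible[-1]

    def current_read(self):
        """Latest committed version (a real engine would also lock the row)."""
        return self.versions[-1][1]


row = VersionedRow("balance=100", commit_ts=1)
reader_snapshot = 1                      # the reader's transaction started at ts 1
row.commit("balance=80", commit_ts=2)    # another transaction commits afterwards

print(row.snapshot_read(reader_snapshot))
print(row.current_read())
```

The snapshot read still sees the older version that was visible when the reader started, while the current read sees the latest committed value.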

9.  What are the advantages of row-level locking?

  • Few lock conflicts when many threads access different rows;
  • Fewer changes to undo when rolling back;
  • A single row can be locked for a long time.

For more content, please pay attention to the official account [Programmer Style] to get more exciting content!

10.  What are the disadvantages of row-level locking?

  • It uses more memory than page-level or table-level locking;
  • It is slower than page-level or table-level locking when used on a large part of the table, because many more locks must be acquired;
  • It is significantly slower than other locks if you frequently run GROUP BY operations on most of the data or frequently scan the entire table;
  • With higher-level locking, it is also easier to tune the application by supporting different lock types, because the locking overhead is lower than with row-level locking.

11.  How can MySQL be optimized?

  • Enable the query cache to speed up repeated queries (this applies to MySQL 5.7 and earlier; the query cache was removed in MySQL 8.0);

  • Run EXPLAIN on your SELECT queries to analyze performance bottlenecks in the query or the table structure. The EXPLAIN output shows how indexes and primary keys are used and how the table is searched and sorted;

  • Use LIMIT 1 when only one row is needed, so that the MySQL engine stops searching after finding the first matching row instead of continuing to look for further matches;

  • Index the columns you search on;

  • Use ENUM instead of VARCHAR for columns with a small, fixed set of values;

  • Use prepared statements. Much like a stored procedure, a prepared statement is SQL that is parsed once by the server and then executed with bound parameters. Prepared statements bring benefits for both performance and security: because bound variables are passed as data, they protect your program from "SQL injection" attacks;

  • Split tables vertically, moving rarely used or large columns into separate tables;

  • Choose the correct storage engine.
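Three of these tips (checking the query plan, indexing the search column, and binding parameters) can be tried out in a few lines. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, so the statement is SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the plan text differs between engines and versions. The table, column, and index names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the plan description SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column is the detail text


query = "SELECT name FROM users WHERE email = ? LIMIT 1"

plan_before = plan(query, ("user500@example.com",))    # no index yet: table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, ("user500@example.com",))     # now searches via the index

# Parameter binding: the input is passed as data, never spliced into the SQL text.
malicious = "' OR '1'='1"
injected_rows = conn.execute(
    "SELECT name FROM users WHERE email = ?", (malicious,)
).fetchall()

print(plan_before)
print(plan_after)
print(injected_rows)
```

Before the index exists the plan reports a scan of the table; afterwards it reports a search using `idx_users_email`. The injection attempt matches no rows because the whole string is compared as an email value.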

12.  What is the difference between a key and an index?

A key belongs to the logical design of the database and carries two layers of meaning: it is a constraint (enforcing and standardizing the structural integrity of the database) and it is also an index (used to assist queries). Keys include the primary key, unique key, foreign key, and so on;

An index is a physical structure of the database that only assists queries. When it is created, it is stored in a tablespace (the InnoDB tablespace in MySQL) in a directory-like structure. Indexes can be classified into prefix indexes, full-text indexes, and so on.

13.  What are the differences between MyISAM and InnoDB in MySQL?

  • InnoDB supports transactions, but MyISAM does not;
  • InnoDB supports foreign keys, while MyISAM does not. Converting an InnoDB table that contains foreign keys to MyISAM will fail;
  • InnoDB uses a clustered index: the data file is tied to the primary key index, so lookups through the primary key are efficient. If no primary key is defined, InnoDB falls back to the first unique NOT NULL index or generates a hidden row ID;
  • InnoDB does not store the exact number of rows in a table, so select count(*) from table requires a scan, whereas MyISAM keeps the row count and can return it directly;
  • Before MySQL 5.6, InnoDB did not support full-text indexes while MyISAM did; InnoDB has supported them since MySQL 5.6. MyISAM can be faster for read-mostly workloads.

14.  What are the precautions for database table creation?

  1>, Field naming and sensible field configuration

  • Eliminate fields that are not closely related to the table; field names should follow consistent rules and carry a clear meaning (do not mix English with pinyin, and avoid meaningless names such as abc);
  • Try not to abbreviate field names (most abbreviations do not make the meaning of the field clear);
  • Do not use mixed case in field names (for readability, join multiple English words with underscores);
  • Do not use reserved words or keywords as field names;
  • Keep field names and types consistent;
  • Choose numeric types carefully, and leave enough margin in the length of text fields;

  2>, Handling of special system fields and suggestions

  • Add deletion markers (such as the operator and the deletion time);
  • Establish a versioning mechanism;

  3>, Sensible table structure design

  • Multi-part fields: check whether any field in the table can be decomposed into smaller independent parts (for example: people can be divided into men and women);

  • Multi-value fields: the table can be split into three tables, which makes retrieval and sorting more orderly and ensures data integrity;

  4>, Other suggestions

  • Store large data fields (for example: a profile field) in a separate table so that they do not affect performance;

  • Use varchar instead of char for variable-length text, because varchar allocates length dynamically while the length of char is fixed;

  • Create a primary key for every table; a table without a primary key has some negative impact on queries and index definitions;

  • Avoid nullable fields and set default values instead (for example: a default of 0 for an int field); the benefit for indexed queries is noticeable;

  • Create indexes preferably on unique, non-null fields. Too many indexes slow down later inserts and updates, so create them according to actual needs.

15.  What is a stored procedure? What are the pros and cons?

A stored procedure is a set of precompiled SQL statements.

Put more plainly, a stored procedure is a named code block made up of T-SQL statements. Like a method, these statements implement some function (inserting, deleting, updating, or querying one or more tables); the code block is given a name and called whenever that function is needed.

  • A stored procedure is a precompiled code block, so it executes efficiently;

  • One stored procedure call replaces a large number of T-SQL statements, which reduces network traffic and speeds up communication;

  • Data security can be ensured to a certain extent;

  • On the downside, stored procedures are written in a vendor-specific dialect, so they are hard to port between database systems.


16.  What is an index? What are its role, advantages, and disadvantages?

An index is a structure that sorts the values of one or more columns in a database table; it is a data structure that helps MySQL retrieve data efficiently.

You can also understand it this way: an index is a way to speed up the retrieval of data in a table. A database index is similar to the index of a book. In a book, the index lets readers quickly find the information they need without flipping through the entire book. In a database, the index likewise lets the database program quickly find data in a table without scanning the entire table.

The basic index types in MySQL are: ordinary index, unique index, primary key index, and full-text index.

  • Indexes speed up data retrieval;

  • Indexes slow down maintenance operations such as inserts, deletes, and updates;

  • A unique index ensures the uniqueness of each row of data;

  • With indexes available, the query optimizer can be used during query processing to improve system performance;

  • Indexes take up additional physical storage space.
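The uniqueness guarantee of a unique index is easy to observe. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (the behaviour shown, rejecting a duplicate value, is the same idea in both), with a made-up `accounts` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_accounts_username ON accounts (username)")
conn.execute("INSERT INTO accounts (username) VALUES ('alice')")

try:
    # Second row with the same username violates the unique index.
    conn.execute("INSERT INTO accounts (username) VALUES ('alice')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(duplicate_rejected, row_count)
```

The second insert is refused by the database itself, so the table keeps a single row without any check in application code.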

17.  What is a transaction?

A transaction is the basic unit of concurrency control. A transaction is a sequence of operations that are either all executed or not executed at all; it is an indivisible unit of work. A transaction is also the unit in which the database maintains data consistency: at the end of each transaction, the data remains consistent.
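The all-or-nothing property can be demonstrated with a money transfer. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the `accounts` table, the CHECK constraint, and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money atomically: both updates commit together or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer("alice", "bob", 30)        # both updates succeed and are committed
failed = transfer("alice", "bob", 500)   # debit violates the CHECK, credit is undone too
print(ok, failed, balances())
```

In the failed transfer the credit to bob has already been applied when the debit is rejected, yet after the rollback neither change is visible.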

18.  What are optimistic locks and pessimistic locks in the database?

The task of concurrency control in a database management system (DBMS) is to ensure that when multiple transactions access the same data at the same time, the isolation and consistency of the transactions and the consistency of the database are not violated.

Optimistic concurrency control (optimistic locking) and pessimistic concurrency control (pessimistic locking) are the main techniques used in concurrency control.

  • Pessimistic locking: assumes that concurrency conflicts will occur and blocks every operation that might violate data integrity;

  • Optimistic locking: assumes that concurrency conflicts will not occur and checks for data integrity violations only when the operation is committed.
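One common way to implement optimistic locking is a version column that is checked at write time. The sketch below shows that pattern with Python's built-in sqlite3 module; the `stock` table, its `version` column, and the helper names are assumptions for illustration, not something the article prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, quantity INTEGER, version INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10, 1)")
conn.commit()


def read_stock(item):
    """Read the row without taking any lock; returns (quantity, version)."""
    return conn.execute(
        "SELECT quantity, version FROM stock WHERE item = ?", (item,)
    ).fetchone()


def save_stock(item, new_quantity, expected_version):
    """Write only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE stock SET quantity = ?, version = version + 1 "
        "WHERE item = ? AND version = ?",
        (new_quantity, item, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1   # 0 rows updated means a conflicting write won


# Two clients read the same version, then both try to write.
qty_a, ver_a = read_stock("widget")
qty_b, ver_b = read_stock("widget")
first = save_stock("widget", qty_a - 1, ver_a)    # succeeds, version becomes 2
second = save_stock("widget", qty_b - 1, ver_b)   # stale version, rejected
print(first, second, read_stock("widget"))
```

The second writer detects the conflict only when it tries to save and would normally re-read and retry. A pessimistic approach would instead lock the row up front, for example with SELECT ... FOR UPDATE in MySQL.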

19.  Can querying through an index improve query performance? Why?

Usually, querying data through an index is faster than a full table scan. But we must also pay attention to its cost.

An index needs storage space and regular maintenance. Whenever a record is added to or deleted from the table, or an indexed column is modified, the index itself is modified as well. This means that each INSERT, DELETE, and UPDATE of a record costs an extra 4 or 5 disk I/O operations. Because indexes require additional storage space and processing, unnecessary indexes slow down query response time. Querying through an index does not necessarily improve query performance; an index range scan (INDEX RANGE SCAN) is appropriate in two cases:

  • Range-based retrieval, where the query generally returns a result set smaller than 30% of the rows in the table;

  • Retrieval based on a non-unique index.

20.  Briefly talk about the difference between drop, delete and truncate?

drop, delete, and truncate in SQL all perform deletion, but there are some differences between the three:

  • delete and truncate only remove the data in the table and keep the table structure, while drop removes the table structure as well;

  • In terms of speed, in general: drop > truncate > delete;

  • delete is a DML statement: the operation is recorded in the rollback segment and takes effect only after the transaction is committed; if there is a corresponding trigger, it fires when the statement is executed. truncate and drop are DDL statements: the operation takes effect immediately, the original data is not placed in the rollback segment, so it cannot be rolled back, and no trigger is fired.

21.  In what scenarios are drop, delete, and truncate used?

  • When a table is no longer needed, use drop;

  • When you want to delete some data rows, use delete with a where clause;

  • Use truncate when you want to keep the table but delete all of its data.
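The scenarios above can be walked through with Python's built-in sqlite3 module. Note the assumption: SQLite has no TRUNCATE statement, so emptying the table is shown with an unqualified DELETE, and the comment marks where MySQL would use TRUNCATE TABLE. The `logs` table and helper names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT)")
conn.executemany("INSERT INTO logs (level) VALUES (?)", [("info",), ("debug",), ("error",)])
conn.commit()


def table_exists(name):
    """True if a table with this name is present in the schema."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def count_logs():
    return conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]


# delete with a where clause: removes selected rows; as DML it can be rolled back.
conn.execute("DELETE FROM logs WHERE level = 'debug'")
after_delete = count_logs()
conn.rollback()
after_rollback = count_logs()

# Emptying the table keeps its structure (MySQL: TRUNCATE TABLE logs).
conn.execute("DELETE FROM logs")
conn.commit()
empty_but_present = table_exists("logs") and count_logs() == 0

# drop: removes the data and the table definition.
conn.execute("DROP TABLE logs")
dropped = not table_exists("logs")
print(after_delete, after_rollback, empty_but_present, dropped)
```

The rollback restores the deleted row, the emptied table still exists, and after the drop the table is gone from the schema.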

22.  What are super keys, candidate keys, primary keys, and foreign keys?

Superkey: A set of attributes that can uniquely identify a tuple in a relation is called a superkey of the relation schema. A single attribute can be a superkey, and so can a combination of multiple attributes. Superkeys include candidate keys and primary keys.

Candidate key: A minimal superkey, that is, a superkey with no redundant attributes.

Primary key: A column or combination of columns that uniquely and completely identifies a stored data object in a database table. A table can have only one primary key, and the primary key value cannot be missing, that is, it cannot be NULL.

Foreign key: A column in one table that holds the primary key of another table is called a foreign key of this table.
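The relationship between superkeys and candidate keys can be explored on a small sample relation. Keep in mind the assumption behind this sketch: keys are properties of the schema, so sample rows can only suggest them (or rule attribute sets out). The rows and attribute names are invented for illustration.

```python
from itertools import combinations

rows = [
    {"student_id": 1, "email": "a@x.org", "name": "Ann", "dept": "CS"},
    {"student_id": 2, "email": "b@x.org", "name": "Bob", "dept": "CS"},
    {"student_id": 3, "email": "c@x.org", "name": "Ann", "dept": "CS"},
    {"student_id": 4, "email": "d@x.org", "name": "Bob", "dept": "EE"},
]
attributes = sorted(rows[0])   # ['dept', 'email', 'name', 'student_id']


def is_superkey(attrs):
    """True if no two sample rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


# Every non-empty attribute set that identifies each row uniquely.
superkeys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_superkey(combo)
]

# A candidate key is a superkey with no proper subset that is also a superkey.
candidate_keys = [k for k in superkeys if not any(other < k for other in superkeys)]
print(len(superkeys), candidate_keys)
```

Here any attribute set containing `student_id` or `email` is a superkey, but only those two single attributes are minimal. Either could be chosen as the primary key.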

23.  What is a view? And what are the usage scenarios of the view?

A view is a virtual table that can be queried like a physical table. Rows can be inserted, updated, queried, and deleted through a view, subject to the database's rules for updatable views. A view is usually a subset of the rows or columns of one or more tables. Modifications made through an updatable view are applied to the underlying base tables. Compared with writing multi-table queries, a view makes it easier to get data.

  • Only some fields should be exposed to visitors, so a virtual table containing just those fields, that is, a view, is created.

  • The queried data comes from different tables, and the person querying wants a unified way to query it. A view can combine the query results of multiple tables, so the person querying only needs to read from the view and does not have to deal with the differences caused by the data coming from different tables.
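Both usage scenarios fit in one small example: a view that hides a sensitive column and also hides a join. The sketch uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables, columns, and the `employee_directory` view are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES (1, 'Ann', 9000, 1), (2, 'Bob', 7000, 2);

    -- Expose only non-sensitive columns and hide the join from callers.
    CREATE VIEW employee_directory AS
        SELECT e.name AS name, d.title AS department
        FROM employees e JOIN departments d ON d.id = e.dept_id;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
columns = [col[0] for col in cursor.description]   # column names the view exposes
directory = cursor.fetchall()
print(columns, directory)
```

Callers query `employee_directory` like a single table; the salary column is not reachable through it, and they never need to know the data comes from two tables.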

24.  What are the three normal forms?

First normal form (1NF): Every field in the database table is a single, atomic attribute that cannot be divided further. Such an attribute is of a basic type, such as integer, real, character, logical, or date.

Second normal form (2NF): No non-key field in the table is partially functionally dependent on any candidate key (partial functional dependency means that part of a composite key determines a non-key field); that is, every non-key field depends on the whole of every candidate key.

Third normal form (3NF): On top of second normal form, a table is in third normal form if no non-key field is transitively functionally dependent on any candidate key. Transitive functional dependency means that if there is a chain of determination "A → B → C", then C is transitively dependent on A. A table that satisfies third normal form must therefore not contain the dependency: key field → non-key field x → non-key field y.

25.  What are the four isolation levels?

  • Serializable: prevents dirty reads, non-repeatable reads, and phantom reads.

  • Repeatable read: prevents dirty reads and non-repeatable reads.

  • Read committed: prevents dirty reads.

  • Read uncommitted: the lowest level; none of the above is prevented.

Well, that's all for this article. Feel free to leave a comment and share the database interview questions you have encountered!



Origin blog.csdn.net/dreaming317/article/details/129812161