MySQL: Common Interview Questions and Answers

1. What are the characteristics of database transactions?

  • Atomicity: a transaction is indivisible; either all of its operations are executed, or none are.
  • Consistency: executing a transaction moves the database from one correct state to another.
  • Isolation: changes made by a transaction are not visible to any other transaction until the transaction has committed.
  • Durability: once a transaction has committed, its result is saved permanently in the database and is preserved even if a failure occurs afterwards.
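
Atomicity is the easiest of these properties to see in code. The following is a minimal sketch, assuming an in-memory SQLite database (Python's standard sqlite3 module) as a stand-in for MySQL; the account table, the CHECK constraint, and the transfer helper are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates are applied or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, 1, 2, 30)       # fits within the balance of account 1
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
print(ok, failed, balances(conn))
```

The second transfer credits account 2 before the debit fails, yet the rollback leaves both balances exactly as they were after the first transfer.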

2. What are the three normal forms of a database?

  • First normal form (1NF): every column is atomic and cannot be split further.
  • Second normal form (2NF): on top of 1NF, every non-key column depends on the whole primary key, not on just part of it (this matters for composite primary keys).
  • Third normal form (3NF): on top of 2NF, every non-key column depends directly on the primary key and not on other non-key columns (no transitive dependencies).

When designing a database schema, try to comply with the three normal forms. If you deviate, there should be a good reason, such as performance. In practice, database designs are often compromised for performance.

3. What are the classifications of SQL statements?

  • DDL: data definition language (CREATE, ALTER, DROP)
  • DML: data manipulation language (INSERT, UPDATE, DELETE)
  • DTL: data transaction language, also commonly called TCL (COMMIT, ROLLBACK, SAVEPOINT)
  • DCL: data control language (GRANT, REVOKE)

4. What is the difference between DELETE, DROP, and TRUNCATE when removing data?

Answer: When the table is no longer needed, use DROP to delete the table itself.
When you want to keep the table but delete all of its records, use TRUNCATE.
When you want to delete only some records (usually selected with a WHERE clause), use DELETE.

5. Is the leaf node linked list of the B+ tree unidirectional or bidirectional?

Answer: doubly linked list

6. What is MVCC? What is its underlying principle?

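Answer: MVCC (multi-version concurrency control) is a mechanism that reduces conflicts between concurrent transactions by letting reads see historical versions of the data, which improves concurrency. In InnoDB it is built on the following pieces:

  • Transaction ID: each transaction is assigned an increasing version number.
  • Hidden columns: each row carries hidden fields, such as the ID of the transaction that last modified it (DB_TRX_ID) and a rollback pointer (DB_ROLL_PTR) to the previous version.
  • Undo log: stores the older versions of a row, linked into a version chain.
  • Read view: a snapshot of which transactions were active when the view was created, used to decide which version a read may see.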
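
To make the interaction of these pieces more concrete, here is a deliberately simplified toy model in Python. It is not InnoDB's actual implementation: the Version and ReadView classes, their field names, and the visibility rule as written are assumptions chosen to illustrate the idea of walking a version chain until a visible version is found.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Version:
    value: str
    trx_id: int                        # transaction that wrote this version
    prev: Optional["Version"] = None   # older version, as reached via the undo log


@dataclass
class ReadView:
    creator_trx_id: int   # transaction that owns this view
    active_ids: Set[int]  # transactions still uncommitted when the view was created
    next_trx_id: int      # id that the next new transaction would receive

    def can_see(self, trx_id: int) -> bool:
        if trx_id == self.creator_trx_id:
            return True                        # a transaction sees its own changes
        if trx_id >= self.next_trx_id:
            return False                       # writer started after the view was created
        return trx_id not in self.active_ids   # visible only if already committed


def read(latest: Version, view: ReadView) -> Optional[str]:
    """Walk the version chain from newest to oldest; return the first visible value."""
    v = latest
    while v is not None:
        if view.can_see(v.trx_id):
            return v.value
        v = v.prev
    return None


# Hypothetical history of one row: written by transactions 10, 20, and 30.
v1 = Version("a", trx_id=10)
v2 = Version("b", trx_id=20, prev=v1)
v3 = Version("c", trx_id=30, prev=v2)

# View taken by transaction 25 while transaction 20 was still uncommitted.
old_view = ReadView(creator_trx_id=25, active_ids={20, 25}, next_trx_id=26)
# View taken later by transaction 40, after 20 and 30 had committed.
new_view = ReadView(creator_trx_id=40, active_ids={40}, next_trx_id=41)

print(read(v3, old_view), read(v3, new_view))
```

The older view skips the versions written by transactions 30 and 20 and falls back to the value written by transaction 10, while the newer view reads the latest version directly.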

7. How do you get the current MySQL version?

Answer: Call the VERSION() function:

SELECT VERSION();

8. What is the difference between CHAR and VARCHAR?

Answer: CHAR and VARCHAR differ in how values are stored and retrieved:

  • The length of a CHAR column is fixed to the length declared when the table is created; the declared length can range from 0 to 255.
  • When CHAR values are stored, they are right-padded with spaces to the declared length; the trailing spaces are removed when the values are retrieved.
  • VARCHAR values are variable-length: they are stored without padding, together with a 1-byte or 2-byte length prefix.

9. What is the difference between how MyISAM and InnoDB implement B+ tree indexes?

  • InnoDB storage engine: the leaf nodes of the primary key (clustered) B+ tree index store the row data itself, so the data file is itself the index file.
  • MyISAM storage engine: the leaf nodes of the B+ tree index store the physical address of the data record, and the index file and data file are separate.

10. Why does InnoDB use B+ tree indexes?

  • Two considerations:
    InnoDB's typical workloads require strong performance on specific kinds of queries.
    Loading data from disk into memory is slow, so the number of disk I/O operations should be kept small.
  • Why choose the B+ tree:
    A hash index offers O(1) lookups, but it cannot support range queries or sorting well, which ends in a full table scan. A B-tree can store data in non-leaf nodes, but that may cause more random I/O when reading consecutive data. In a B+ tree, all leaf nodes are linked to each other through pointers, which reduces the random I/O of sequential traversal.
  • Ordinary index or unique index?
    Because a unique index cannot use the change buffer optimization, prefer a non-unique index from a performance perspective when the business logic allows it.

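11. Briefly describe the four transaction isolation levels supported by InnoDB in MySQL

The four isolation levels defined by the SQL standard are:

  • READ UNCOMMITTED: a transaction can read data that other transactions have not yet committed (dirty reads are possible).
  • READ COMMITTED: only committed data can be read, which prevents dirty reads; non-repeatable reads are still possible.
  • REPEATABLE READ: repeated reads of the same rows within a transaction return the same result, which prevents dirty reads and non-repeatable reads.
  • SERIALIZABLE: transactions behave as if they were executed one after another.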

12. Features of the InnoDB storage engine

Answer: Since MySQL 5.5, InnoDB has been the default storage engine. Compared with MyISAM, InnoDB differs in major ways. Its main features are:

  • It supports transactions with ACID properties. The default isolation level is REPEATABLE READ, implemented through MVCC (multi-version concurrency control), which solves the problems of dirty reads and non-repeatable reads. InnoDB supports foreign keys. Its default lock granularity is the row-level lock, which gives better concurrency, although deadlocks may occur.
  • Like MyISAM, InnoDB stores the table structure definition in an .frm file (in versions before MySQL 8.0). The difference is that InnoDB stores table data and index data together, on the leaf nodes of a B+ tree, while MyISAM stores table data and index data separately.
  • InnoDB has a crash-safe log file (the redo log), which is used to recover from data loss caused by database crashes and similar failures, ensuring data consistency.
  • The index types supported by InnoDB and MyISAM are largely the same, but the implementations differ considerably because of the different file structures.
  • In terms of CRUD performance, InnoDB is recommended if you perform a large number of insert, update, and delete operations; when deleting, it removes rows rather than rebuilding the table.

13. Which string types can a column have?

String types are:

  • SET
  • BLOB
  • ENUM
  • CHAR
  • TEXT
  • VARCHAR

14. What join types does MySQL offer for multi-table queries? How do you use them, and how do they differ?

Answer: Join types: left join, right join, inner join.

Usage:

Left join: select * from A LEFT JOIN B on A.id=B.id;
Right join: select * from A RIGHT JOIN B on A.id=B.id;
Inner join: select * from A INNER JOIN B on A.id=B.id; (INNER can be omitted)

Differences:

  • Inner join: only the rows that match in both tables are kept.
  • Left join: all rows of the left table are returned, even if there is no matching record in the right table.
  • Right join: all rows of the right table are returned, even if there is no matching record in the left table.
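
The difference between the join types can be checked on two tiny tables. This is a sketch that assumes an in-memory SQLite database as a stand-in for MySQL; the name and city columns and the sample rows are invented. Because older SQLite versions lack RIGHT JOIN, the right join is written as the equivalent LEFT JOIN with the tables swapped.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables A and B hold made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER, city TEXT);
    INSERT INTO A VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO B VALUES (2, 'Oslo'), (3, 'Rome');
    """
)

# Inner join: only ids present in both tables survive.
inner_rows = conn.execute(
    "SELECT A.id, A.name, B.city FROM A INNER JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

# Left join: every row of A, with NULL (None) where B has no match.
left_rows = conn.execute(
    "SELECT A.id, A.name, B.city FROM A LEFT JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

# "A RIGHT JOIN B" expressed as "B LEFT JOIN A": every row of B is kept.
right_rows = conn.execute(
    "SELECT B.id, A.name, B.city FROM B LEFT JOIN A ON A.id = B.id ORDER BY B.id"
).fetchall()

print(inner_rows)
print(left_rows)
print(right_rows)
```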

15. What is the difference between UNION and UNION ALL?

  • UNION: combines two result sets and removes duplicate rows. The order of the result is not guaranteed unless you add ORDER BY.
  • UNION ALL: combines two result sets and keeps duplicate rows.
    UNION ALL is more efficient than UNION, because it skips the duplicate-removal step.
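
A quick way to see the duplicate handling is to run both operators over two overlapping tables. The sketch below assumes an in-memory SQLite database as a stand-in for MySQL; the tables t1 and t2 are hypothetical, and an explicit ORDER BY is added so that the output order is well defined.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; t1 and t2 are made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (n INTEGER);
    CREATE TABLE t2 (n INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (2), (3);
    """
)

# UNION removes the duplicate value 2.
union_rows = [
    row[0]
    for row in conn.execute("SELECT n FROM t1 UNION SELECT n FROM t2 ORDER BY n")
]

# UNION ALL keeps every row from both inputs.
union_all_rows = [
    row[0]
    for row in conn.execute("SELECT n FROM t1 UNION ALL SELECT n FROM t2 ORDER BY n")
]

print(union_rows, union_all_rows)
```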

16. What kinds of locks are there in MySQL?

  • Table-level locks: small overhead, fast locking; no deadlocks; large locking granularity, the highest probability of lock conflicts, and the lowest concurrency.
  • Row-level locks: high overhead and slow locking; deadlocks may occur; the locking granularity is the smallest, the probability of lock conflicts is the lowest, and the concurrency is the highest.
  • Page lock: The overhead and locking time are between table locks and row locks; deadlocks may occur; the locking granularity is between table locks and row locks, and the concurrency is average.

17. What are the types of indexes?

  • Primary key index: values in the column must be unique and cannot be NULL. A table can have only one primary key.
  • Unique index: values in the column must be unique, but NULL values are allowed. A table can have several unique indexes.
    ALTER TABLE table_name ADD UNIQUE (column);
    ALTER TABLE table_name ADD UNIQUE (column1, column2);
  • Ordinary index: the basic index type, with no uniqueness restriction; NULL values are allowed.
    ALTER TABLE table_name ADD INDEX index_name (column);
    ALTER TABLE table_name ADD INDEX index_name (column1, column2, column3);
  • Full-text index: used for keyword search in text columns; it is a key technology used by search engines.
    ALTER TABLE table_name ADD FULLTEXT (column);
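
The behavioral difference between a unique index and an ordinary index can be sketched as follows. This example assumes an in-memory SQLite database as a stand-in for MySQL and uses the CREATE INDEX syntax rather than the ALTER TABLE form, since SQLite does not support the latter; the users table and index names are invented.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the users table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE UNIQUE INDEX uk_users_email ON users (email)")  # unique index
conn.execute("CREATE INDEX idx_users_name ON users (name)")          # ordinary index

conn.execute("INSERT INTO users (email, name) VALUES ('a@example.com', 'ann')")
# Ordinary index: a repeated name is accepted.
conn.execute("INSERT INTO users (email, name) VALUES (NULL, 'ann')")
# Unique index: NULL may appear more than once.
conn.execute("INSERT INTO users (email, name) VALUES (NULL, 'bob')")

try:
    # Unique index: a repeated non-NULL email is rejected.
    conn.execute("INSERT INTO users (email, name) VALUES ('a@example.com', 'cat')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, row_count)
```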

18. What are the advantages and disadvantages of indexes?

Advantages of indexes:

  • They can greatly speed up data retrieval, which is the main reason for creating indexes.
  • With an index in place, the query optimizer can make use of it during query processing, which improves system performance.

Disadvantages of indexes:

  • In terms of time: creating and maintaining indexes takes time. When rows are added, deleted, or modified, the indexes must be maintained as well, which lowers the efficiency of inserts, updates, and deletes.
  • In terms of space: indexes occupy physical storage.

19. Have you paid attention to slow SQL in your business systems? How have you optimized slow queries?

  • Analyze the statement to see whether it loads unnecessary columns or rows.
  • Analyze the execution plan to see whether the statement hits an index.
  • If the SQL is complex, simplify its structure.
  • If the table holds too much data, consider splitting the table.
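
Checking whether a statement hits an index usually means looking at its execution plan. The sketch below assumes an in-memory SQLite database and its EXPLAIN QUERY PLAN statement as a stand-in for MySQL's EXPLAIN; the orders table, the index name, and the plan helper are made up, and the exact plan text differs between engines and versions.

```python
import sqlite3

# Assumption: SQLite's EXPLAIN QUERY PLAN stands in for MySQL's EXPLAIN.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"c{i % 100}", i) for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return the plan steps of a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[-1] for row in rows)  # last column holds the description


query = "SELECT id, customer, amount FROM orders WHERE customer = ?"

plan_before = plan(conn, query, ("c7",))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = plan(conn, query, ("c7",))   # the new index is used for the lookup

print(plan_before)
print(plan_after)
```

Comparing the plan before and after adding the index is the same workflow one would follow with EXPLAIN in MySQL, only the output format differs.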

20. Should the primary key be an auto-increment ID or a UUID? Why?

Answer: For a single-node system, choose an auto-increment ID. For a distributed system, prefer a UUID, although it is best for the company to have its own scheme for generating distributed unique IDs.

  • Auto-increment ID: small storage footprint and high query efficiency. However, with a very large amount of data the value can exceed the range of the column type, and conflicts can occur when multiple databases are merged.
  • UUID: values do not collide across databases, which suits large volumes of inserts and updates in distributed setups; but it is unordered, which makes inserts slow, and it takes up a lot of space.
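
The space and ordering trade-off can be illustrated with Python's standard uuid module. This is only a sketch: the column types named in the comments (CHAR(36), BINARY(16), BIGINT) are assumptions about how such keys are typically stored, and random version-4 UUIDs are used as the example.

```python
import uuid

# Generate random (version 4) UUIDs in "insert order".
ids = [uuid.uuid4() for _ in range(1000)]

text_len = len(str(ids[0]))     # length of the textual form, e.g. for a CHAR(36) column
binary_len = len(ids[0].bytes)  # length of the raw form, e.g. for a BINARY(16) column
bigint_len = 8                  # for comparison: a BIGINT auto-increment key is 8 bytes

# Random UUIDs do not arrive in sorted order, so each insert lands at an
# arbitrary position in a B+ tree index instead of at the end.
in_insert_order = ids == sorted(ids)

print(text_len, binary_len, bigint_len, in_insert_order)
```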

21. Why is the redo log needed?

Answer: The redo log is mainly used to recover data after MySQL restarts abnormally, ensuring data consistency. It works together with MySQL's WAL (write-ahead logging) mechanism. To respond quickly, an update is written to memory and the call returns, while the modified pages are written back to disk asynchronously. On its own this would risk losing in-memory data after a crash; because the change is recorded in the redo log first, MySQL is crash-safe.

22. How can MySQL be used to implement distributed locks?

  • Method 1: use a lock table. Create a table with a UNIQUE KEY (or use the primary key) on the value to be locked. The same key can be inserted only once, so the competition for the lock is handed over to the database: only one node inserts successfully, and the others fail. In other words, acquiring the lock means inserting a row whose id identifies the lock. For example, once a request has inserted the row with id 1, other concurrent requests that try to insert the same id must wait until the first request has finished and deleted that row before their insert can succeed.
  • Method 2: use a serial number plus a timestamp to make the operation idempotent, which can be regarded as a lock that is never released.
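
Method 1 can be sketched in a few lines. The example assumes an in-memory SQLite database as a stand-in for a shared MySQL instance; the dist_lock table, its columns, and the acquire and release helpers are hypothetical. In this sketch a losing node gets an immediate failure and would have to retry, and there is no expiry, so a crashed holder would leave the lock in place.

```python
import sqlite3

# Assumption: SQLite stands in for a shared MySQL instance; autocommit mode is
# used so that each statement takes effect immediately.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE dist_lock (lock_key TEXT PRIMARY KEY, owner TEXT NOT NULL)"
)


def acquire(conn, key, owner):
    """Try to take the lock by inserting its row; the key's uniqueness decides the winner."""
    try:
        conn.execute(
            "INSERT INTO dist_lock (lock_key, owner) VALUES (?, ?)", (key, owner)
        )
        return True
    except sqlite3.IntegrityError:
        return False  # another node already holds the lock


def release(conn, key, owner):
    """Delete the lock row, but only if this owner actually holds it."""
    cur = conn.execute(
        "DELETE FROM dist_lock WHERE lock_key = ? AND owner = ?", (key, owner)
    )
    return cur.rowcount == 1


got_a = acquire(conn, "order:42", "node-a")        # first node wins
got_b = acquire(conn, "order:42", "node-b")        # second node is rejected
released_by_b = release(conn, "order:42", "node-b")  # a non-owner cannot release
released_by_a = release(conn, "order:42", "node-a")
got_b_retry = acquire(conn, "order:42", "node-b")  # succeeds after the release

print(got_a, got_b, released_by_b, released_by_a, got_b_retry)
```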

23. How do you delete millions of rows or more?

Answer: Indexes carry extra maintenance cost. Adding, modifying, or deleting rows also triggers operations on the indexes, and this extra I/O lowers the efficiency of inserts, updates, and deletes. According to the official MySQL manual, the time needed to delete rows is proportional to the number of indexes on the table.

  • So to delete millions of rows, first drop the indexes (this took about three minutes).
  • Then delete the unwanted data (this took less than two minutes).
  • After the deletion, re-create the indexes; there is less data now, so this is also fast (about ten minutes).
  • This is much faster than deleting directly. In addition, if a direct deletion is interrupted, the whole deletion is rolled back, which is an even bigger pitfall.
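
The three steps above can be sketched as follows. The example assumes an in-memory SQLite database as a stand-in for MySQL, with a small made-up logs table in place of millions of rows; note that MySQL's syntax for dropping an index is DROP INDEX index_name ON table_name, while SQLite omits the ON clause. The sketch shows the order of operations only and makes no claim about timings.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the logs table and its data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, created INTEGER)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.executemany(
    "INSERT INTO logs (level, created) VALUES (?, ?)",
    [("debug" if i % 10 else "error", i) for i in range(10000)],
)
conn.commit()

# Step 1: drop the secondary index so the delete does not have to maintain it.
conn.execute("DROP INDEX idx_logs_level")

# Step 2: delete the unwanted rows.
deleted = conn.execute("DELETE FROM logs WHERE level = 'debug'").rowcount

# Step 3: re-create the index over the much smaller remaining data set.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(deleted, remaining)
```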

24. Describe the principle of MySQL master-slave replication. What causes master-slave delay?

Answer: Principle of master-slave replication: the master writes changes to its binlog. After a slave connects to the master, an I/O thread on the slave copies the master's binlog to the slave and writes it into a relay log. An SQL thread on the slave then reads the events from the relay log and executes them, that is, it replays the SQL locally.
Causes of master-slave delay:
a. The master has too many slaves attached
b. The hardware of the slave is weaker than that of the master
c. Too many slow SQL statements
d. Network latency between the master and the slave
e. High read and write pressure on the master

25. What is your understanding of SQL injection attacks?

Answer: In a SQL injection attack, the attacker inserts SQL commands into a web form input field or the query string of a page request to trick the server into executing malicious SQL.
How can SQL injection attacks be prevented?
Filter all input before using it to construct a SQL command. Input can be filtered in several ways.

  • When constructing SQL queries dynamically:
    a. Escape single quotation marks, that is, replace every single quotation mark with two single quotation marks, so that attackers cannot change the meaning of the SQL command.
    b. Remove hyphens from user input (a double hyphen starts a SQL comment).
    c. Restrict the privileges of the database account used to execute the query. Use different user accounts for query, insert, update, and delete operations.
  • Use stored procedures to execute all queries.
  • Limit the length of form or query string inputs.
  • Check the validity of user input.
  • Check the number of records returned by the query that fetched the data.
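
A commonly recommended complement to the filtering measures above is to avoid building SQL by string concatenation at all and to pass user input as bound parameters instead. The sketch below assumes an in-memory SQLite database as a stand-in for MySQL; the users table and the malicious input are invented, and the "?" placeholder is SQLite's style (MySQL drivers typically use a different placeholder such as %s).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the users table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("ann", "s1"), ("bob", "s2")])

# Input crafted to turn the WHERE clause into a condition that is always true.
malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes its meaning.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is bound as a parameter and is only ever treated as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every user, while the parameterized query looks for a user literally named like the malicious string and finds none.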

26. The differences between PostgreSQL and MySQL

  • Data types: PostgreSQL supports more data types, such as arrays, JSON, and hstore; MySQL also supports spatial data types (GIS).

  • Extensibility: compared with MySQL, PostgreSQL is more extensible and supports custom data types, functions, and stored procedures. It also provides features such as asynchronous replication, streaming replication, and hot standby.

  • ACID: PostgreSQL is often regarded as stricter about ACID (atomicity, consistency, isolation, durability) compliance. Note that the default isolation levels differ: PostgreSQL defaults to READ COMMITTED, while MySQL's InnoDB defaults to REPEATABLE READ.

  • Performance: MySQL is often considered a good fit for simple, read-heavy, highly concurrent workloads, while PostgreSQL is often considered stronger at complex queries and large data sets. Actual results depend on the workload.

  • Open source license: MySQL is released under the GPL (General Public License), which means that derivative products that modify MySQL must be released under the same license. PostgreSQL uses a permissive BSD-style license, which means that it can be used in commercial software and that modified code can be kept private.

  • Cross-platform support: MySQL runs on many operating systems, such as Windows, Linux, macOS, and FreeBSD. PostgreSQL also supports these operating systems, although it was originally designed to run on UNIX systems.

Origin blog.csdn.net/lishangke/article/details/131692147