Java interview questions (database)

MySQL

  1. Fundamentals of indexing

An index is used to find specific records quickly by turning unordered data into ordered data that can be searched efficiently.

  1. Sort the values of the column on which the index is created

  2. Build an inverted list from the sorted result

  3. Attach the chain of data addresses to each entry of the inverted list

  4. When querying, first fetch the inverted-list entry, then its address chain, and finally the data itself
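The four steps above can be sketched in Java. This is only an illustration, not how MySQL stores an index: the class `ToyIndex` and its methods are hypothetical names, a `TreeMap` stands in for the sorted structure, and the "address" of a row is assumed to be its position in an array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy index: sorted column values, each mapped to the addresses of the matching rows. */
final class ToyIndex {
    // TreeMap keeps its keys sorted, standing in for the sorted column data.
    private final TreeMap<String, List<Integer>> postings = new TreeMap<>();

    /** Builds the index from an unordered "table"; the row address is the array position. */
    static ToyIndex build(String[] columnValues) {
        ToyIndex index = new ToyIndex();
        for (int address = 0; address < columnValues.length; address++) {
            index.postings
                 .computeIfAbsent(columnValues[address], k -> new ArrayList<>())
                 .add(address);
        }
        return index;
    }

    /** Finds the inverted-list entry for the value and returns its address chain. */
    List<Integer> addressesOf(String value) {
        return postings.getOrDefault(value, List.of());
    }
}
```

With the returned addresses, the caller reads the rows themselves, which corresponds to the last step of the list.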

  2. The difference between a clustered index and a non-clustered index

Clustered index: the data and the index are stored together, so finding the index entry also finds the data.

Non-clustered index: the data and the index are stored separately. The leaf nodes of the index point to the location of the data, and the data is then fetched through that location.

The differences:

  1. A lookup on the clustered index returns the data directly, while a non-clustered index needs a second lookup (unless the index already contains every column the query needs)

  2. Because rows are stored in key order, a clustered index suits range queries and sorting on its key; a non-clustered index must follow a pointer for each matching row, so large range scans through it cost more

  3. MySQL index structures and their advantages and disadvantages

The data structure of an index depends on the storage engine. Hash indexes and B+ tree indexes are the most common.

Hash index: a hash function converts the key into a hash value that locates the entry. For equality queries a hash index has a clear advantage, provided the keys are unique. If keys are not unique, the position of the key is found first and the linked list at that position is then scanned for the matching value. Range queries cannot use a hash index, because hashing does not preserve key order.

B+ tree index: lookup cost is stable. Every search walks from the root node to a leaf node, so search times are nearly the same for all keys and do not fluctuate much. During an index scan, the bidirectional pointers between leaf nodes also allow moving quickly left and right, which makes range scans very efficient.
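The contrast between the two structures can be seen with Java's standard collections. This is a rough analogy under stated assumptions: `HashMap` plays the hash index, and `TreeMap` (a red-black tree, not a B+ tree) plays the ordered index. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class HashVsOrderedIndex {
    /** Equality lookup: a hash index goes straight to the bucket of the key. */
    static Integer lookup(Map<Integer, Integer> hashIndex, int key) {
        return hashIndex.get(key);
    }

    /** Range query: an ordered index returns the keys in [low, high] as one sorted slice. */
    static SortedMap<Integer, Integer> range(TreeMap<Integer, Integer> orderedIndex, int low, int high) {
        return orderedIndex.subMap(low, true, high, true);
    }
}
```

A `HashMap` has no equivalent of `subMap`: answering the range query with it would mean checking every key, which is why a hash index is a poor fit for range conditions.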

  4. Index design principles

The goal is faster queries with less space used.

  1. Index columns that appear in WHERE clauses or are specified in JOIN clauses

  2. Columns with low cardinality (few distinct values) benefit little from an index and usually do not need one

  3. Do not over-index. Every index takes disk space

  4. Columns defined as foreign keys must be indexed

  5. Frequently updated columns are poor candidates for indexing

  6. Do not index columns that are rarely queried and contain many repeated values

  7. Do not index columns of type text, image, or bit
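One way to judge the cardinality rule is to compute a column's selectivity, the ratio of distinct values to total rows. The sketch below is an assumption-laden illustration: `IndexSelectivity` is a hypothetical helper working on an in-memory sample of column values, not a MySQL feature.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IndexSelectivity {
    /** Selectivity = distinct values / total rows; values close to 1.0 filter well. */
    static double selectivity(List<String> columnValues) {
        if (columnValues.isEmpty()) {
            return 0.0;
        }
        Set<String> distinct = new HashSet<>(columnValues);
        return (double) distinct.size() / columnValues.size();
    }
}
```

On a large table, a gender-like column with only two distinct values scores close to 0, while a unique column scores 1.0, which matches the advice to skip indexes on columns with few distinct values.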

  5. Basic characteristics of transactions and isolation levels

Basic characteristics (ACID):

Atomicity: a transaction is the smallest unit of execution and cannot be split. Its operations either all complete or none of them take effect.

Consistency: the data is in a consistent state before and after the transaction executes, and multiple transactions reading the same data get the same result.

Isolation: changes made by a transaction are not visible to other transactions until it commits.

Durability: once a transaction commits, its changes are stored permanently in the database.

Isolation levels:

Read uncommitted: a transaction may read uncommitted data of other transactions, which is called a dirty read.

Read committed (the Oracle default): only committed data is read, but two reads within the same transaction may return different results, which is called a non-repeatable read.

Repeatable read (the MySQL default): repeated reads within a transaction return the same data, but phantom reads may still occur.

Serializable: generally not used. Every row read is locked, which leads to many timeouts and heavy lock contention.
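The four levels and the anomalies each one still allows can be summarized as a small Java enum. The enum `IsolationLevel` and its field names are hypothetical; the only real API used is the set of `java.sql.Connection.TRANSACTION_*` constants from JDBC. The flags follow the descriptions above, not the behavior of any particular engine.

```java
import java.sql.Connection;

enum IsolationLevel {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true, true, true),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED, false, true, true),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ, false, false, true),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE, false, false, false);

    final int jdbcConstant;           // value accepted by Connection.setTransactionIsolation
    final boolean dirtyRead;          // may read uncommitted data of other transactions
    final boolean nonRepeatableRead;  // two reads in one transaction may differ
    final boolean phantomRead;        // new rows may appear between two range reads

    IsolationLevel(int jdbcConstant, boolean dirtyRead, boolean nonRepeatableRead, boolean phantomRead) {
        this.jdbcConstant = jdbcConstant;
        this.dirtyRead = dirtyRead;
        this.nonRepeatableRead = nonRepeatableRead;
        this.phantomRead = phantomRead;
    }
}
```

Given an open JDBC connection, a level would be applied with `connection.setTransactionIsolation(IsolationLevel.READ_COMMITTED.jdbcConstant)`.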

  6. How do you shard MySQL databases and tables? At what data volume is sharding needed? What are the sharding methods and strategies? How is SQL executed after sharding?

What sharding is: when the amount of data grows too large, queries slow down. To improve efficiency, the data of one table is distributed across multiple tables in multiple databases.

Commonly used sharding tools: MyCat, ShardingSphere

Sharding methods:

Vertical sharding: split different tables into different databases along business lines. This addresses oversized database data files, but does not fundamentally solve slow queries on a large table.

Horizontal sharding: split the rows of one table across different databases or tables. This fundamentally addresses the low query efficiency caused by excessive data volume.

Sharding strategies:

  1. By remainder (modulo): data is stored evenly, but expanding capacity is very troublesome

  2. By range: capacity is easy to expand, but data is not distributed evenly

  3. By time: hot data is easier to separate

  4. By enumerated value: for example, sharding by region

  5. By prefix of a target field: sharding under custom business rules
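The first two strategies in the list can be sketched as routing functions. This is a minimal illustration and not the routing code of MyCat or ShardingSphere: `ShardRouter`, its method names, and the assumption of a numeric, non-negative id for the range strategy are all hypothetical.

```java
final class ShardRouter {
    /** Remainder strategy: spreads ids evenly, but changing shardCount remaps most ids. */
    static int byRemainder(long id, int shardCount) {
        return (int) Math.floorMod(id, (long) shardCount);
    }

    /** Range strategy: shard n holds ids in [n * rangeSize, (n + 1) * rangeSize); assumes id >= 0. */
    static int byRange(long id, long rangeSize) {
        return (int) (id / rangeSize);
    }
}
```

The trade-off from the list shows up directly: adding a shard changes the result of `byRemainder` for most existing ids, so data has to move, while `byRange` only needs a new shard for the next range, at the cost of recent ids all landing on the newest shard.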

In theory, horizontal sharding breaks through the data-volume limit of a single machine and scales out freely. It is the standard approach to sharding.

The Alibaba development manual suggests sharding when a table exceeds 5 million rows or its data file reaches 2 GB (estimate three years of business volume in advance, before the business goes live).

Execution process after sharding (ShardingSphere):

SQL parsing -> query optimization -> SQL routing -> SQL rewriting -> SQL execution -> result merging
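The last stage, result merging, is the least obvious one. The sketch below shows one common technique for it, a k-way merge of per-shard results that are already sorted, as would be needed for an ORDER BY with LIMIT across shards. It is an assumption about how such a step can be done, not ShardingSphere's implementation; `ResultMerger` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class ResultMerger {
    /** Merges per-shard results (each sorted ascending) and keeps the first `limit` values. */
    static List<Integer> mergeSorted(List<List<Integer>> shardResults, int limit) {
        // Each heap entry is {shardIndex, positionInShard}, ordered by the value it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(
                shardResults.get(a[0]).get(a[1]),
                shardResults.get(b[0]).get(b[1])));
        for (int s = 0; s < shardResults.size(); s++) {
            if (!shardResults.get(s).isEmpty()) {
                heap.add(new int[] {s, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty() && merged.size() < limit) {
            int[] top = heap.poll();
            List<Integer> shard = shardResults.get(top[0]);
            merged.add(shard.get(top[1]));
            if (top[1] + 1 < shard.size()) {
                heap.add(new int[] {top[0], top[1] + 1}); // advance within the same shard
            }
        }
        return merged;
    }
}
```

Each shard sorts its own rows, and the merge step only ever compares the current head of every shard, so the full result set never has to be sorted again in one place.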

Problems introduced by sharding:

Cross-database queries, cross-database sorting, distributed transactions, shared (public) tables, duplicate primary keys...

  7. How MySQL master-slave synchronization works

Master-slave synchronization: when data in the master database changes, the changes are propagated to the slave database in near real time.

Benefits of master-slave synchronization:

Horizontal scaling of the database

Fault tolerance and high availability

Data backup

Implementation: on the master, every change is written as an event to a dedicated log file (the binary log). The slave reads these events and applies the corresponding changes to its own data.
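The log-and-replay idea can be modeled in a few lines. This is a toy model under strong assumptions: the data is a key/value map, the log is an in-memory list, and all class names (`ToyReplication`, `Master`, `Replica`, `Event`) are hypothetical. Real replication ships events over the network and applies them asynchronously.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyReplication {
    /** One change event; a null value means "delete this key". */
    static final class Event {
        final String key;
        final String value;
        Event(String key, String value) { this.key = key; this.value = value; }
    }

    static final class Master {
        final Map<String, String> data = new HashMap<>();
        final List<Event> log = new ArrayList<>(); // stands in for the master's log file

        void write(String key, String value) {
            Event event = new Event(key, value);
            apply(data, event);
            log.add(event); // every change is also recorded as an event
        }
    }

    static final class Replica {
        final Map<String, String> data = new HashMap<>();
        private int position = 0; // index of the next event to read

        /** Reads the events not yet seen and replays them locally. */
        void catchUp(Master master) {
            while (position < master.log.size()) {
                apply(data, master.log.get(position++));
            }
        }
    }

    static void apply(Map<String, String> data, Event event) {
        if (event.value == null) data.remove(event.key);
        else data.put(event.key, event.value);
    }
}
```

Until `catchUp` runs, the replica lags behind the master, which is the reason reads from a slave can briefly return stale data.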

  8. The difference between MyISAM and InnoDB
  1. The InnoDB primary index is a clustered index, while MyISAM indexes are non-clustered

  2. The leaf nodes of InnoDB's primary key index store the row data, so lookups by primary key are very efficient

  3. The leaf nodes of a MyISAM index store the address of the row data, so a further lookup is needed to fetch the data

  4. The leaf nodes of an InnoDB non-primary-key index store the indexed column values and the primary key, so queries that can be answered by a covering index are very efficient
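The InnoDB layout described above can be mimicked with two sorted maps. This is a simplified model, not InnoDB's storage format: `InnoDbLookupSketch` is a hypothetical class, the table is assumed to have columns (id, name, city) with unique names, and `TreeMap` stands in for the B+ trees.

```java
import java.util.TreeMap;

final class InnoDbLookupSketch {
    // Clustered (primary key) index: each leaf entry holds the full row {name, city}.
    final TreeMap<Integer, String[]> clustered = new TreeMap<>();
    // Secondary index on name: each leaf entry holds the indexed value and the primary key.
    final TreeMap<String, Integer> byName = new TreeMap<>();

    void insert(int id, String name, String city) {
        clustered.put(id, new String[] {name, city});
        byName.put(name, id);
    }

    /** SELECT id WHERE name = ?: answered from the secondary index alone. */
    Integer idByName(String name) {
        return byName.get(name);
    }

    /** SELECT city WHERE name = ?: secondary index first, then a second lookup by primary key. */
    String cityByName(String name) {
        Integer id = byName.get(name);
        return id == null ? null : clustered.get(id)[1];
    }
}
```

`idByName` touches one structure and `cityByName` touches two, which is the cost difference between a query served by the index alone and one that needs the row.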

  9. Index types in MySQL and their impact on database performance

Primary key index: a special unique index; a table can have only one

Unique index: guarantees the uniqueness of the data

Ordinary index: allows the indexed column to contain duplicate values

Indexes can greatly speed up queries and improve system performance. However, inserts, deletes, and updates become slower, and every index takes up physical space.

  10. What each field in the EXPLAIN output means

id: each SELECT in the statement is assigned a unique id; a subquery that is optimized into a join shares the same id

select_type: the type of query that the SELECT keyword corresponds to

table: the table name

partitions: the matching partitions

type: the access method used for the table (for example, full table scan or index access)

possible_keys: the indexes that might be used

key: the index actually used

key_len: the length of the index actually used

ref: when an index is used, what is compared against the indexed column

rows: the estimated number of rows to be read

filtered: the estimated percentage of rows that remain after the table condition is applied

Extra: additional information

  11. What is a covering index

When a SQL statement executes, all the columns it needs are already contained in the B+ tree of the index being used, so there is no need to go back to the table for the row and the result is returned directly from the index.
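Whether a query is covered can be phrased as a simple set check. The helper below is a hypothetical sketch (`CoveringIndexCheck.isCovered` is not a MySQL API), and it assumes the InnoDB layout in which a secondary index entry also carries the primary key columns.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CoveringIndexCheck {
    /**
     * True when every column the query needs is stored in the index entry:
     * the indexed columns plus the primary key columns.
     */
    static boolean isCovered(List<String> indexColumns, List<String> primaryKey, List<String> neededColumns) {
        Set<String> available = new HashSet<>(indexColumns);
        available.addAll(primaryKey);
        return available.containsAll(neededColumns);
    }
}
```

For an index on (name, age) in a table keyed by id, a query that needs only id and name is covered, while one that also needs city is not and has to fetch the row.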

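  12. What is the leftmost prefix principle

Leftmost first: a composite index is matched starting from its leftmost column, so a query can use the index only if its conditions include that leading column. When creating the index, put the most frequently queried column on the left.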
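The matching rule can be written as a short function that counts how many leading columns of a composite index a query can use. It is a simplified sketch: `LeftmostPrefix` is a hypothetical helper, and it assumes every condition is an equality test, ignoring how range conditions end the usable prefix.

```java
import java.util.List;
import java.util.Set;

final class LeftmostPrefix {
    /** Number of leading index columns usable given the columns filtered by equality. */
    static int usablePrefixLength(List<String> indexColumns, Set<String> filteredColumns) {
        int usable = 0;
        for (String column : indexColumns) {
            if (!filteredColumns.contains(column)) {
                break; // the first missing column ends the usable prefix
            }
            usable++;
        }
        return usable;
    }
}
```

For an index on (a, b, c), conditions on a and b use two columns, conditions on a and c use only the first, and conditions on b and c use none.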

  13. How InnoDB implements transactions

Take an update as an example:

  1. After InnoDB receives the update statement, it locates the page containing the data according to the conditions and caches it in the Buffer Pool

  2. It executes the update statement, modifying the data in the Buffer Pool

  3. It generates a redo log record for the update and stores it in the Log Buffer

  4. It generates an undo log record for the update, to be used for transaction rollback

  5. If the transaction commits, the redo log records are persisted to disk, and a separate mechanism later flushes the modified data pages to disk; if the transaction rolls back, the undo log is used to undo the changes
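The interplay of the buffer, the redo log, and the undo log in the steps above can be modeled as follows. This is a toy model with hypothetical names (`ToyTransaction` and its fields): pages are reduced to key/value pairs, redo records to strings, and "disk" to an in-memory list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyTransaction {
    final Map<String, Integer> bufferPool = new HashMap<>();      // cached data (simplified pages)
    final List<String> redoLogBuffer = new ArrayList<>();         // redo records not yet persisted
    final List<String> redoLogOnDisk = new ArrayList<>();         // persisted redo records
    private final Deque<Object[]> undoLog = new ArrayDeque<>();   // {key, old value}, newest first

    void update(String key, int newValue) {
        undoLog.push(new Object[] {key, bufferPool.get(key)});    // keep the old value for rollback
        bufferPool.put(key, newValue);                            // modify the cached data
        redoLogBuffer.add(key + "=" + newValue);                  // describe the change for recovery
    }

    void commit() {
        redoLogOnDisk.addAll(redoLogBuffer); // persist redo; data pages are flushed separately
        redoLogBuffer.clear();
        undoLog.clear();
    }

    void rollback() {
        while (!undoLog.isEmpty()) {
            Object[] entry = undoLog.pop();
            String key = (String) entry[0];
            if (entry[1] == null) bufferPool.remove(key);         // the key did not exist before
            else bufferPool.put(key, (Integer) entry[1]);
        }
        redoLogBuffer.clear();
    }
}
```

Undo entries are replayed newest first, so several updates to the same key inside one transaction roll back to the value that existed before the transaction started.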

  14. The difference between a B tree and a B+ tree, and why MySQL uses B+ trees

B tree: the nodes are sorted, a node can store multiple elements, and the elements within a node are also sorted.

B+ tree: it has the characteristics of a B tree. In addition, the leaf nodes are linked by pointers, the elements of non-leaf nodes are duplicated in the leaf nodes, and everything is kept sorted.

An index exists to speed up queries, and a B+ tree speeds them up by keeping the data sorted. Because one node can hold many elements, a B+ tree is short and wide and needs fewer I/O operations to reach a leaf. An InnoDB page is 16 KB by default, and a B+ tree of height 3 can typically store about 20 million rows. The ordered linked list formed by the leaf nodes supports range searches and full table scans well.
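The figure of about 20 million rows can be reproduced with a back-of-the-envelope calculation. The sizes used here are assumptions chosen for illustration, not values from the text: an 8-byte key with a 6-byte page pointer in non-leaf pages, rows of about 1 KB, and no allowance for page headers. `BPlusTreeCapacity` is a hypothetical helper.

```java
final class BPlusTreeCapacity {
    /** Rows addressable by a B+ tree of the given height under the stated size assumptions. */
    static long rows(int height, int pageBytes, int keyBytes, int pointerBytes, int rowBytes) {
        long fanout = pageBytes / (keyBytes + pointerBytes); // entries per non-leaf page
        long rowsPerLeaf = pageBytes / rowBytes;             // rows per leaf page
        long leafPages = 1;
        for (int level = 1; level < height; level++) {
            leafPages *= fanout;                             // each non-leaf level multiplies the leaves
        }
        return leafPages * rowsPerLeaf;
    }
}
```

With a 16 KB page, `rows(3, 16384, 8, 6, 1024)` gives a fanout of 1170 and 16 rows per leaf, so 1170 × 1170 × 16 = 21,902,400 rows, in line with the estimate above.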

  15. What types of locks does MySQL have
  1. Row lock: locks one or more rows of a table. Other transactions cannot access the locked rows, while the remaining rows are unaffected.

  2. Table lock: locks the entire table. Under a table read lock, other requests can only read, not write; writes must wait until the read lock is released.

  3. Deadlock: during execution, multiple transactions compete for resources and wait on each other, so none of them can continue.

  4. Optimistic lock: assumes the data will not conflict. The conflict check happens when the update is submitted, and an error is returned if a conflict is found.

  5. Pessimistic lock: when modifying a piece of data, lock it right away so that others cannot modify it concurrently.

  6. Shared lock (read lock): once data is read-locked, other transactions can add only read locks, not write locks; a write lock can be added only after all read locks are released.

  7. Exclusive lock (write lock): once a transaction write-locks data, other requests cannot add any lock until it is released.
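Of the entries above, the optimistic lock is usually implemented by the application rather than by the database, commonly with a version column. The sketch below shows that pattern in memory; `VersionedRow` and its methods are hypothetical names, and the version-column approach is one common technique, not the only one.

```java
final class VersionedRow {
    private int value;
    private int version;

    VersionedRow(int value) {
        this.value = value;
    }

    synchronized int value() { return value; }
    synchronized int version() { return version; }

    /** Mirrors "UPDATE ... SET value = ?, version = version + 1 WHERE version = ?". */
    synchronized boolean updateIfUnchanged(int expectedVersion, int newValue) {
        if (version != expectedVersion) {
            return false; // another writer updated the row first: report the conflict
        }
        value = newValue;
        version++;
        return true;
    }
}
```

A caller reads the version together with the data, computes the new value, and calls `updateIfUnchanged`; on `false` it re-reads and retries or reports an error, which is the "detected when the update is submitted" behavior described for optimistic locks.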

  16. How to optimize a slow MySQL query?
  1. Check whether the query uses an index; if not, rewrite the SQL so that it does

  2. Check whether the best available index is used

  3. Check whether every selected field is actually needed, or whether too many fields are queried and redundant data is returned

  4. Check whether the data needs to be sharded

  5. Check the database configuration and whether more resources are needed

  17. What indexes are there in MySQL?
  1. Primary key index: values cannot repeat and cannot be NULL. A table can have only one.

  2. Unique index: values cannot repeat, but NULL is allowed. A table can have unique indexes on several columns.

  3. Ordinary index: the basic index type, with no uniqueness restriction; NULL values are allowed.

  18. Under what circumstances does an index fail in MySQL
  1. Not matching the leftmost prefix (a composite index is usable only for conditions that start from its leftmost column)

  2. Wrong kind of fuzzy query (only a right-side wildcard, such as LIKE 'abc%', can use the index)

  3. Column arithmetic (an operation is applied to the indexed column)

  4. Functions (a function is applied to the indexed column)

  5. Type conversion (the column is a string type but an int is passed in, so the index is not used)

  6. Using IS NOT NULL (the optimizer often falls back to a full scan)

  7. Using != or <> (the optimizer often falls back to a full scan)

  8. Using OR on the indexed column (the optimizer often falls back to a full scan)

  19. The difference between a primary key and a unique index
  1. A primary key column cannot be NULL and must be unique; a unique index allows NULL, and unique indexes can be created on several columns

  2. A primary key always creates a unique index, but a column with a unique index is not necessarily the primary key

  3. A table can have only one primary key, but several unique indexes

  4. A primary key is the usual target of foreign keys from other tables (MySQL also lets a foreign key reference a unique index)

  5. The primary key takes precedence over unique indexes in execution order

  6. A primary key is a constraint, while a unique index is an index

  20. What is MVCC

Multi-version concurrency control: data is kept in snapshot-like versions when it is read, so read locks and write locks do not conflict. Each transaction session sees its own version of the data along the version chain.

MVCC works only under the read committed and repeatable read isolation levels.
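The version chain can be illustrated with a small model in which each committed write adds a version and a reader picks the newest version visible to its snapshot. This is a deliberate simplification: it uses commit timestamps, whereas InnoDB decides visibility with transaction ids and read views. `MvccSketch` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class MvccSketch {
    static final class Version {
        final long committedAt;
        final String value;
        Version(long committedAt, String value) {
            this.committedAt = committedAt;
            this.value = value;
        }
    }

    // Version chain of one row, newest version first.
    private final List<Version> chain = new ArrayList<>();

    /** Records a committed write; commit timestamps are assumed to increase. */
    void commitWrite(long commitTimestamp, String value) {
        chain.add(0, new Version(commitTimestamp, value));
    }

    /** Returns the newest version committed at or before the reader's snapshot, or null. */
    String read(long snapshotTimestamp) {
        for (Version version : chain) {
            if (version.committedAt <= snapshotTimestamp) {
                return version.value;
            }
        }
        return null;
    }
}
```

In this model, repeatable read corresponds to reusing one snapshot timestamp for every read in a transaction, and read committed corresponds to taking a fresh snapshot for each statement, so later reads can see newly committed versions.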

Reprinted from: blog.csdn.net/qq_35056891/article/details/129675144