C++ backend development: common interview knowledge summary (4), databases

  • Types of database indexes
  • Difference between clustered and non-clustered indexes (what the leaf nodes store)
  • Difference between a primary key and a unique index
  • Advantages and disadvantages of indexes, when to use an index, when not to use one (key topic)
  • The leftmost-prefix rule for indexes
  • ACID in database transactions
  • Problems that appear under different isolation settings (dirty reads, non-repeatable reads, lost updates, phantom reads)
  • MySQL has four isolation levels: read uncommitted, read committed, repeatable read, serializable.
  • MySQL optimization (frequently asked: index optimization, performance optimization)
  • Introduction to database engines; characteristics of and differences between InnoDB and MyISAM
  • The role of a database connection pool
  • Talk about all the locks you have used; besides mutexes and read-write locks, e.g. spin locks, recursive locks, optimistic locking, pessimistic locking
  • Two-phase locking protocol
  • The difference between relational and non-relational databases (respective merits)
  • Database normal forms
  • MySQL tablespace types and their characteristics
  • Distributed transactions
  • The role and use of views (how to delete, etc.)
  • Splitting databases and tables (sharding), master-slave replication, read/write splitting. (I do not know this well and have not come across it)
  • How memcache's data structures are implemented
  • memcache and redis internal data storage principles

(1) Types of database indexes

Indexes are implemented at the storage engine layer, not the server layer, so different storage engines support different index types and implement them differently.

B+Tree index: the default index type of most MySQL storage engines, including InnoDB and MyISAM.

Hash index: InnoDB has a special feature called the "adaptive hash index". When an index value is used very frequently, InnoDB builds a hash index on top of the B+Tree index, which gives the B+Tree index some of the advantages of a hash index, such as fast hash lookup. A hash index can find a value in O(1) time but loses ordering, so it has the following limitations: 1. it cannot be used for sorting or grouping; 2. it supports only exact-match lookups and cannot be used for partial or range lookups.
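The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not how InnoDB is implemented: a `dict` stands in for the hash index, a sorted list stands in for the ordered leaf level of a B+Tree, and the keys and row ids are made up.

```python
import bisect

# Hypothetical single-column "index" over six keys, each mapping to a row id.
keys = [17, 3, 42, 8, 25, 31]
row_ids = {k: i for i, k in enumerate(keys)}   # hash index: key -> row id

# Ordered index: the same keys kept sorted, like the leaf level of a B+Tree.
sorted_keys = sorted(keys)

def hash_lookup(key):
    """Exact-match lookup; average O(1), but carries no ordering information."""
    return row_ids.get(key)

def range_lookup(low, high):
    """Keys in [low, high], found with two binary searches on the sorted keys."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]

print(hash_lookup(42))        # exact match works with the hash index
print(range_lookup(8, 30))    # a range query needs the ordered structure
```

The hash structure has no way to answer `range_lookup` other than scanning every key, which is why sorting, grouping, and range conditions fall back to the ordered index.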

Full-text index: the MyISAM storage engine supports full-text indexes, which are used to find keywords in text rather than to compare values for equality. Lookups use MATCH AGAINST instead of an ordinary WHERE condition. Full-text indexes are generally implemented with an inverted index, which records the mapping from each keyword to the documents that contain it. InnoDB has also supported full-text indexes since MySQL 5.6.4.

Spatial index (R-Tree): the MyISAM storage engine supports spatial indexes, which can be used to store geographic data. A spatial index indexes data across all dimensions, so queries that combine any of the dimensions can use it efficiently.

GIS-related functions must be used to maintain the data.

(2) Difference between clustered and non-clustered indexes (what the leaf nodes store)

Clustered index: InnoDB's primary index is a clustered index. The data is stored in the index itself: the table is organized by the primary index, and the data field of each B+Tree leaf node holds a complete row. A table can have only one clustered index.

The data field of a secondary index's leaf node holds the primary key value, so a lookup through a secondary index first finds the primary key and then looks up the row in the primary index. The usual default is to use an auto-increment column as the primary key rather than a unique business column. With a unique but non-sequential column as the primary key, inserting new rows forces the data file to split and rebalance frequently to keep the B+Tree properties, which is very inefficient, so an auto-increment column is a good choice.

Non-clustered index: MyISAM uses non-clustered indexes. Both its primary index and its secondary indexes are non-clustered. The B+Tree leaf nodes store a pointer to the data record.
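To make the difference in leaf contents concrete, here is a minimal sketch with plain Python dictionaries standing in for the B+Trees. The table contents and the function names are hypothetical, and real engines store pages on disk rather than Python objects.

```python
# InnoDB-style layout: the primary (clustered) index holds the full row.
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}
# Secondary index on name: the leaf stores the primary key, not the row.
secondary_name = {"alice": 1, "bob": 2}

def innodb_find_by_name(name):
    pk = secondary_name.get(name)      # step 1: secondary index -> primary key
    if pk is None:
        return None
    return clustered[pk]               # step 2: primary index -> full row

# MyISAM-style layout: rows live in a separate data file (a list here),
# and every index leaf, primary or secondary, stores the row's position.
data_file = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
]
myisam_primary = {1: 0, 2: 1}
myisam_name = {"alice": 0, "bob": 1}

def myisam_find_by_name(name):
    pos = myisam_name.get(name)        # index leaf -> pointer into the data file
    return None if pos is None else data_file[pos]

print(innodb_find_by_name("bob"), myisam_find_by_name("alice"))
```

The two-step lookup in `innodb_find_by_name` is the extra trip back to the primary index that the text describes for secondary indexes.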


(3) Difference between a primary key and a unique index

A primary key constraint is stricter than a unique index constraint. When no primary key is defined, a non-null unique index is automatically used as the primary key. The main differences between a primary key and a unique index are: 1. a primary key does not allow NULL values, while a unique index does; 2. a table can have only one primary key but multiple unique indexes; 3. a primary key produces a unique clustered index, while a unique index produces a unique non-clustered index.

When using the InnoDB storage engine, unless there is a special need, always use an auto-increment column unrelated to the business as the primary key, rather than a unique business field such as a student number or an ID card number.

The purpose of creating a unique index is often not to speed up access but only to prevent duplicate data.
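The behaviour of a unique index next to an auto-increment primary key can be tried out with Python's built-in `sqlite3` module. SQLite is used here only because it runs without a server; it is an assumption of this sketch that the same points carry over to MySQL, and the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"   # surrogate key, no business meaning
    " student_no TEXT)"
)
conn.execute("CREATE UNIQUE INDEX idx_student_no ON student(student_no)")

conn.execute("INSERT INTO student (student_no) VALUES ('S001')")
conn.execute("INSERT INTO student (student_no) VALUES (NULL)")
conn.execute("INSERT INTO student (student_no) VALUES (NULL)")  # NULLs do not collide

try:
    conn.execute("INSERT INTO student (student_no) VALUES ('S001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True          # the unique index blocked the duplicate

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, row_count)
```

The unique index exists here purely to keep `student_no` free of duplicates, while the auto-increment `id` keeps inserts appending at the end of the primary index.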

(4) Advantages and disadvantages of indexes, when to use an index, when not to use one (key topic)

Although an index speeds up queries, it comes at a price: the index file itself consumes storage space, every index adds overhead to inserts, deletes, and updates, and MySQL also spends resources at run time maintaining indexes. More indexes are therefore not always better. Building an index is generally not recommended in two cases.

The first case is a table with relatively few records, where there is no need for an index and a full table scan is fine. As a rule of thumb, a table with fewer than about 2000 records does not need an index; above 2000, an index can be considered as appropriate.

The second case is an index with low selectivity (selectivity is the ratio of distinct index values to the total number of records).


(5) The leftmost-prefix rule for indexes

When a query exactly matches all columns of the index (an exact match here means a "=" or "IN" condition), the whole index can be used.

When a query exactly matches only the leftmost one or several consecutive columns of the index, only that part of the index, the leftmost prefix covered by the conditions, can be used.
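The rule can be written down as a small helper. This is a simplified model, assuming only "=" and "IN" conditions as in the text above; the function name and the column names are hypothetical, and a real optimizer considers more cases.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns a query can use.

    index_columns: column names in index order, e.g. ["a", "b", "c"].
    equality_columns: columns the query constrains with "=" or "IN".
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break            # the first gap ends the usable prefix
        used.append(column)
    return used

index_abc = ["a", "b", "c"]
print(usable_prefix(index_abc, {"a", "b", "c"}))   # conditions on every column
print(usable_prefix(index_abc, {"a", "c"}))        # gap at "b"
print(usable_prefix(index_abc, {"b", "c"}))        # leftmost column missing
```

For an index on (a, b, c), a query on a and c can use only the a part, and a query on b and c cannot use the index at all under this model.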


(6) ACID in database transactions (you must be able to explain the four properties with examples and understand them thoroughly, e.g. how atomicity relates to consistency, and which problems appear when isolation is poor)

Atomicity: a transaction is treated as the smallest indivisible unit. Either all of its operations commit successfully, or all of them fail and are rolled back. Rollback can be implemented with a log: the log records the modifications the transaction performs, and on rollback those modifications are undone in reverse order.
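The log-based rollback described above can be sketched as follows. This is a toy model over an in-memory `dict`, not a database's actual undo log; the function name and the "a `None` value means failure" convention are assumptions made for the example.

```python
def run_transaction(store, operations):
    """Apply (key, new_value) writes atomically to the dict `store`.

    A None value raises, simulating a failure partway through.
    """
    undo_log = []                       # entries: (key, existed_before, old_value)
    try:
        for key, new_value in operations:
            if new_value is None:
                raise ValueError(f"invalid write to {key!r}")
            undo_log.append((key, key in store, store.get(key)))
            store[key] = new_value
        return True                     # "commit": keep every change
    except ValueError:
        # Roll back: undo the logged writes in reverse order.
        for key, existed, old_value in reversed(undo_log):
            if existed:
                store[key] = old_value
            else:
                del store[key]
        return False

accounts = {"alice": 100, "bob": 50}
ok = run_transaction(accounts, [("alice", 70), ("bob", 80)])
failed = run_transaction(accounts, [("alice", 0), ("carol", 10), ("bob", None)])
print(ok, failed, accounts)
```

The second call fails on its third write, so the two writes before it are undone and `accounts` is left exactly as the first, successful transaction committed it.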

Consistency: the database is in a consistent state before and after a transaction executes. In a consistent state, all transactions read the same result for the same piece of data.

Isolation: the changes a transaction makes are not visible to other transactions until it finally commits.

Durability: once a transaction commits, its changes are saved to the database permanently. Even if the system crashes, the result of the transaction must not be lost. This can be implemented with database backup and recovery: when the system crashes, the database is restored from the backup.

The ACID properties of a transaction are not on an equal footing; they relate to each other as follows:

  • The result of a transaction is correct only if consistency is satisfied.
  • Without concurrency, transactions execute serially and isolation is always satisfied. In that case, satisfying atomicity is enough to satisfy consistency.
  • With concurrency, multiple transactions execute at the same time, and a transaction must satisfy both atomicity and isolation in order to satisfy consistency.
  • Durability is there so that the database can cope with a crash.

MySQL uses autocommit mode by default. In other words, if you do not explicitly begin a transaction with a START TRANSACTION statement, each statement is treated as its own transaction and committed automatically.

Under concurrency, when transaction isolation cannot be guaranteed, problems such as dirty reads, non-repeatable reads, lost updates, and phantom reads appear.

 

(7) Problems that appear under different isolation settings (dirty reads, non-repeatable reads, lost updates, phantom reads)

Lost update: transactions T1 and T2 both modify the same piece of data. T1 modifies it first, T2 modifies it afterwards, and T2's change overwrites T1's, so T1's update is lost. To prevent this, T1 can place an X lock on the data when it modifies it and hold the lock until T1 ends; T2 cannot modify the data during that time.

Dirty read: transaction T1 modifies data item a but does not commit. Transaction T2 reads a, and then T1 rolls back and undoes the modification. The value of a that T2 read is dirty data. The read committed isolation level solves the dirty read problem.

Non-repeatable read: reading the same data several times within one transaction gives different results. T2 reads a piece of data and T1 then modifies it. If T2 reads the data again, the result differs from the first read. To prevent this, T2 can place an S lock on the data when it reads it and hold the lock until T2 ends; T1 cannot modify the data during that time. The repeatable read isolation level solves the non-repeatable read problem.

Phantom read: T1 reads the data in a range, T2 inserts new data into that range, and T1 reads the range again and gets a result different from the first read. As above, this can also be addressed with an S lock, here covering the range rather than a single row.

In MySQL's InnoDB storage engine, at the REPEATABLE READ isolation level, MVCC + next-key locks solve the phantom read problem.


(8) What isolation levels do databases have, and what are the isolation levels of MySQL and Oracle

MySQL has four isolation levels: read uncommitted, read committed, repeatable read, serializable.

Read uncommitted: changes made in a transaction are visible to other transactions even before they are committed. At this level a read always returns the latest version of the row.

Read committed: a transaction can read only changes that other transactions have already committed. In other words, the changes a transaction makes are not visible to other transactions until it commits.

Repeatable read: guarantees that reading the same data several times within one transaction gives the same result. MVCC (multi-version concurrency control) is used to implement the read committed and repeatable read isolation levels.

Serializable: forces transactions to execute serially. Following the two-phase locking protocol achieves serializability.

(9) MySQL optimization (frequently asked: index optimization, performance optimization)

Query optimization: when writing SQL, filter first and JOIN afterwards. Push unary operations (selection) down to the leaves of the query tree as far as possible, so that the binary operations work on smaller inputs.

Index optimization: in a query, an indexed column must not be part of an expression or an argument of a function, otherwise the index cannot be used.

When a query needs conditions on several columns, one multi-column index performs better than several single-column indexes.

Put the most selective column first in the index. The selectivity of an index is the ratio of distinct index values to the total number of records. Its maximum is 1, in which case every record has its own unique index value. The higher the selectivity, the more efficient the query.

For long string columns such as VARCHAR, use a prefix index that indexes only the leading characters. The prefix length should be chosen based on index selectivity.
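One way to choose a prefix length is to compute the selectivity for several candidate lengths and stop where it gets close to the selectivity of the full column. The sketch below does this in Python over a made-up list of values; in practice the same ratio would be computed over the real column.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(values)) / len(values)

def prefix_selectivity(values, length):
    """Selectivity of an index built on only the first `length` characters."""
    return selectivity([v[:length] for v in values])

# Hypothetical column contents.
emails = ["alice@example.com", "alina@example.com", "albert@example.com",
          "bob@example.com", "bobby@example.com", "carol@example.com"]

for length in (1, 2, 3, 4):
    print(length, round(prefix_selectivity(emails, length), 2))
```

With this sample data the selectivity rises as the prefix grows and reaches that of the full column at length 4, so a longer prefix would only make the index bigger without making it more selective.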


(10) Introduction to database engines; features of and differences between InnoDB and MyISAM

InnoDB: InnoDB is MySQL's default transactional storage engine; consider another storage engine only when you need a feature that InnoDB does not support. It implements the four standard isolation levels, and the default level is repeatable read. At the repeatable read level, multi-version concurrency control (MVCC) + next-key locking (gap locks) prevents phantom reads. The primary index is a clustered index that stores the data in the index itself, which avoids reading the disk directly and so greatly improves query performance. InnoDB also contains many internal optimizations, including read-ahead when reading data from disk, an automatically created adaptive hash index that speeds up reads, and an insert buffer that speeds up inserts.

MyISAM: MyISAM provides a number of features, including compressed tables and spatial data indexes. It does not support transactions. It does not support row-level locks and can lock only whole tables: a read takes a shared lock on every table it reads, and a write takes an exclusive lock on the table. However, new records can be inserted into a table while it is being read; this is called concurrent insert.

Comparison

Transactions: InnoDB is transactional and supports the COMMIT and ROLLBACK statements.

Concurrency: MyISAM supports only table-level locks, while InnoDB also supports row-level locks.

Foreign keys: InnoDB supports foreign keys.

Crash recovery: after a crash, MyISAM is much more likely than InnoDB to be corrupted, and recovery is also slower.

Backup: InnoDB supports online hot backup.

Other features: MyISAM supports compressed tables and spatial data indexes.

 

(11) The role of a database connection pool

What a database connection pool is: creating a database connection is a very time-consuming operation, and it can easily create security risks for the database. So at program initialization, multiple database connections are created up front and managed centrally for the program to use. This gives faster database reads and writes and is also safer and more reliable. The connection pool is responsible for allocating, managing, and releasing database connections, and it lets an application reuse an existing connection instead of creating a new one. The pool must ensure that a connection is assigned to only one thread at any given time. Transactions on different connections are independent of each other.

The traditional way for a Java application to access a database: ① load the database driver; ② establish a database connection through JDBC; ③ access the database and execute SQL statements; ④ close the database connection.

With a database connection pool: ① create the connection pool during program initialization; ② when needed, request an available connection from the pool; ③ after use, return the connection to the pool; ④ when the program exits, close all connections and release the resources.
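The four pool steps above can be sketched as a small class. This is a minimal illustration, not a production pool: `ConnectionPool` and its method names are invented for the example, `sqlite3` stands in for a real database driver, and there is no health checking or resizing.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """A fixed-size pool; sqlite3 stands in for a real database driver."""

    def __init__(self, size, database=":memory:"):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):                       # step 1: create connections up front
            # check_same_thread=False lets a connection be handed to another thread.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)      # step 2: borrow; blocks if none idle
        try:
            yield conn
        finally:
            self._idle.put(conn)                    # step 3: return it to the pool

    def idle_count(self):
        return self._idle.qsize()

    def close(self):
        while not self._idle.empty():               # step 4: close everything on exit
            self._idle.get_nowait().close()

pool = ConnectionPool(size=2)
with pool.connection() as conn:
    value = conn.execute("SELECT 1 + 1").fetchone()[0]
    idle_while_borrowed = pool.idle_count()
idle_after_return = pool.idle_count()
pool.close()
print(value, idle_while_borrowed, idle_after_return)
```

Because a borrowed connection is taken out of the thread-safe queue until it is returned, no two threads can hold the same connection at once, which is the guarantee the text requires of a pool.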

 

(12) Talk about all the locks you have used; besides mutexes and read-write locks, e.g. spin locks, recursive locks, optimistic locking, pessimistic locking

1. Shared locks (also called read locks) and exclusive locks (also called write locks):

InnoDB's locking mechanism: InnoDB supports transactions and supports both row locks and table locks, with row locks used more often. MyISAM does not support transactions and supports only table locks.

Shared lock (S): allows a transaction to read a row and prevents other transactions from obtaining an exclusive lock on the same data set.

Exclusive lock (X): allows the transaction that holds it to update the data and prevents other transactions from obtaining either a shared read lock or an exclusive write lock on the same data set.

Intention shared lock (IS): the transaction intends to place shared locks on rows. Before placing a shared lock on a row, the transaction must first obtain the IS lock on the table.

Intention exclusive lock (IX): the transaction intends to place exclusive locks on rows. Before placing an exclusive lock on a row, the transaction must first obtain the IX lock on the table.

2. Optimistic locking and pessimistic locking:

Pessimistic locking: as the name suggests, pessimistic locking takes a conservative attitude toward the data being modified by the outside world (including other transactions in the current system and transactions from external systems), so the data is kept locked for the whole of the processing. Pessimistic locking usually relies on the lock mechanism provided by the database (only a lock at the database layer can truly guarantee exclusive access to the data; otherwise, even if this system implements its own locking, there is no guarantee that an external system will not modify the data).

Optimistic locking: in contrast to pessimistic locking, optimistic locking assumes that data will not conflict under normal circumstances, so it checks for conflicts only when the update is formally submitted. If a conflict is found, an error is returned to the user, who decides what to do (usually roll back the transaction). How is optimistic locking implemented? In general there are two ways: a version number or a timestamp.
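A common way to implement optimistic locking is a version column that every update checks and increments. The sketch below shows the idea with Python's built-in `sqlite3`; the `account` table, its columns, and the helper names are assumptions made for the example, and the same UPDATE pattern would apply in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")
conn.commit()

def read_account(account_id):
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()

def update_balance(account_id, new_balance, expected_version):
    """Write only if nobody else changed the row since we read it."""
    cursor = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1      # 0 rows touched means a conflict

# Two writers read the same version, then both try to update.
balance_a, version_a = read_account(1)
balance_b, version_b = read_account(1)
first = update_balance(1, balance_a + 10, version_a)    # matches version 0
second = update_balance(1, balance_b - 30, version_b)   # stale version, no row matches
print(first, second, read_account(1))
```

No lock is held between the read and the write. The second writer simply finds that its version is stale and has to re-read and retry, or report the conflict to the user.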

(13) Two-phase locking protocol

Locking and unlocking are divided into two phases. A transaction following the two-phase locking protocol is a sufficient condition for the schedule to be serializable.

(Sufficient but not necessary: following the protocol guarantees serializability, but a schedule that does not follow it may still be serializable.) A serializable schedule is one where, through concurrency control, the result of executing the transactions concurrently is the same as the result of executing them in some serial order.
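The "two phases" can be checked mechanically: once a transaction has released any lock, it must not acquire another. The helper below is a small sketch of that check for a single transaction; the function name and the tuple format are made up for the example.

```python
def follows_two_phase_locking(operations):
    """Check one transaction's lock sequence against two-phase locking.

    operations: list of ("lock", item) / ("unlock", item) tuples in order.
    Growing phase: locks only. Shrinking phase: begins at the first unlock,
    after which no new lock may be acquired.
    """
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False
    return True

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```

The second sequence releases A before locking B, so another transaction could slip in between and see A and B in inconsistent states; that is the kind of interleaving the protocol rules out.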

(14) The difference between relational and non-relational databases (respective merits)

(15) Database normal forms

A relation (table) that does not satisfy the normal forms produces many anomalies, and normal form theory exists to resolve them. The common anomalies are data redundancy, update anomalies, deletion anomalies, and insertion anomalies. Higher normal forms build on lower ones, and 1NF is the lowest.

1NF: attributes are indivisible. A table that does not satisfy 1NF is not a valid table.

2NF: every non-prime attribute is fully functionally dependent on the key. A table that does not satisfy 2NF may show the four anomalies above. 2NF can be achieved by decomposing the table.

3NF: no non-prime attribute is transitively functionally dependent on the key. A table that does not satisfy 3NF may show the four anomalies above. 3NF can be achieved by decomposing the table.

 


MySQL tablespace types and their characteristics

With the InnoDB engine, MySQL can store data in two ways: a shared tablespace or independent (per-table) tablespaces.

Shared tablespace: all the table data and index data of a database are placed in one shared tablespace, where multiple tables and indexes are stored mixed together.

Advantages: the tablespace can be split into multiple files placed on different disks, so a table can be spread across several files and its size is not limited by the size of a single file or disk. Data and files are kept together, which makes management easier.

Disadvantages: multiple tables and indexes are stored mixed together in the tablespace, so after a large number of deletes on a table the tablespace contains many holes. A shared tablespace does not shrink.

Independent tablespace: each table has its own tablespace, and each table's data and indexes are stored in it.

Other information, such as rollback (undo) information, still lives in the default tablespace. The default tablespace does not shrink.

Advantages: a single table can be moved between databases. With independent tablespaces, however much is deleted, the fragmentation of the tablespace does not hurt performance too badly, and there is still a chance to clean it up. An independent tablespace can shrink.

Disadvantages: a single table can grow too large, for example beyond 100 GB.

Summary: by comparison, independent tablespaces give somewhat better efficiency and performance. You can switch between the shared tablespace and independent tablespaces by setting the innodb_file_per_table parameter in the configuration file. OFF means the shared tablespace is used; ON means independent tablespaces are used.


(18) Distributed transactions

A large operation is made up of different small operations that are spread across different servers and belong to different applications. A distributed transaction must ensure that these small operations either all succeed or all fail. In essence, a distributed transaction exists to keep the data in different databases consistent.

(21) The role and use of views (how to delete, etc.)

A view is a virtual table, and operating on a view is the same as operating on an ordinary table. However, a view itself contains no data, so it cannot be indexed. Views have the following advantages: 1. giving users access only through views protects the data; 2. they simplify complex SQL operations, such as complicated joins; 3. they can change the format and presentation of the data; 4. they expose only part of the actual table's data.

CREATE VIEW myview AS
SELECT CONCAT(col1, col2) AS concat_col, col3 * col4 AS compute_col
FROM mytable
WHERE col5 = val;

Summary: a view contains no data of its own; it wraps a SELECT query. It can simplify data processing and reformat the underlying data.
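Creating, querying, and deleting a view can be tried end to end with Python's built-in `sqlite3`. This is a sketch under stated assumptions: the sample rows are made up, the condition `col5 = 1` replaces the placeholder `val`, and SQLite's `||` operator is used for concatenation where MySQL would use `CONCAT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER, col5 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [("a", "b", 2, 3, 1), ("c", "d", 4, 5, 0)],
)

# SQLite concatenates with || where MySQL would use CONCAT(col1, col2).
conn.execute(
    "CREATE VIEW myview AS "
    "SELECT col1 || col2 AS concat_col, col3 * col4 AS compute_col "
    "FROM mytable WHERE col5 = 1"
)
rows = conn.execute("SELECT concat_col, compute_col FROM myview").fetchall()

conn.execute("DROP VIEW myview")          # deleting a view leaves mytable untouched
remaining = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
print(rows, remaining)
```

A view is deleted with `DROP VIEW`. Since the view holds no data, dropping it removes only the stored query, and the rows in `mytable` stay as they were.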

(22) Splitting databases and tables (sharding), master-slave replication, read/write splitting. (I do not know this well and have not come across it)

Splitting databases and tables:

Horizontal partitioning: also called sharding, this splits the records of one table across multiple tables with the same structure. When the data in a table keeps growing, sharding becomes the inevitable choice: it distributes the data to different nodes of a cluster and so relieves the pressure on a single database. Transactions: use distributed transactions to handle transactions that span shards. JOIN: decompose the original JOIN query into several single-table queries and then perform the JOIN in the application.
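The routing and the application-side JOIN described above can be sketched in memory. Everything here is hypothetical: the shard count, the modulo routing rule, and the `users`/`orders` layout are chosen only to keep the example short, and real systems often use other routing schemes.

```python
NUM_SHARDS = 3   # hypothetical cluster size

def shard_for(key):
    """Route a record to a shard by taking the key modulo the shard count."""
    return key % NUM_SHARDS

# Each shard holds tables with the same structure but different rows.
user_shards = [dict() for _ in range(NUM_SHARDS)]     # user_id -> name
order_shards = [list() for _ in range(NUM_SHARDS)]    # (order_id, user_id)

def insert_user(user_id, name):
    user_shards[shard_for(user_id)][user_id] = name

def insert_order(order_id, user_id):
    # Orders are sharded by order_id here, so one user's orders span shards.
    order_shards[shard_for(order_id)].append((order_id, user_id))

def orders_with_user_names():
    """Application-side JOIN: query each table separately, then combine."""
    result = []
    for shard in order_shards:                        # single-table query per shard
        for order_id, user_id in shard:
            name = user_shards[shard_for(user_id)].get(user_id)
            result.append((order_id, name))
    return sorted(result)

insert_user(1, "alice")
insert_user(5, "bob")
insert_order(100, 1)
insert_order(101, 5)
insert_order(102, 1)
print(orders_with_user_names())
```

The database never sees a JOIN in this model. Each shard answers only single-table lookups, and `orders_with_user_names` stitches the results together in the program.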

 

How memcache's data structures are implemented

memcache is a distributed cache server. By caching data and objects read from the database in memory, it reduces the number of database reads and speeds up dynamic, database-driven websites. The data saved in memcache is stored in memcache's built-in memory storage space, so restarting memcache or the operating system makes all the data disappear. When the contents reach the specified capacity, memcache evicts cached entries based on the LRU algorithm. memcache is designed as a cache server in the first place, so it does not pay much attention to making data permanent.
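The LRU eviction mentioned above can be illustrated with a small cache class. This is a generic LRU sketch built on Python's `OrderedDict`, not a description of how memcache lays out memory internally; the class and method names are made up for the example.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once `capacity` is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()       # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None                   # cache miss: caller would go to the database
        self._items.move_to_end(key)      # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the least recently used entry

cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")          # "a" is now more recent than "b"
cache.set("c", 3)       # over capacity: "b" is evicted
print(list(cache._items))
```

Reading "a" before inserting "c" is what saves it: eviction follows recency of use, not insertion order.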

memcache and redis internal data storage principles

https://blog.csdn.net/session_time/article/details/52618215

Like memcached, redis keeps its data cached in memory for efficiency. The difference is that redis periodically writes updated data to disk, or appends each modification to a log file, and implements master-slave synchronization on top of this.


(23) Where does your project use a database, and how is it used


Origin www.cnblogs.com/springwind2006/p/12093007.html