Database (MySQL) interview questions

1 What are the three normal forms?

  1. First normal form (1NF): every field of a table holds a single, indivisible value. Each value is of one basic type, such as integer, real number, character, logical, or date.
  2. Second normal form (2NF): on top of 1NF, no non-key field is functionally dependent on only part of any candidate key (a partial functional dependency means that some subset of a composite key determines a non-key field). In other words, every non-key field is fully dependent on every candidate key.
  3. Third normal form (3NF): on top of 2NF, no non-key field is transitively dependent on any candidate key. A transitive functional dependency means that if "A → B → C" holds, then C depends on A transitively. A table in third normal form should therefore contain no dependency of the form: key field → non-key field x → non-key field y.

2 What is a B-Tree?

A B-tree is a balanced multi-way search tree, a tree structure that supports very efficient dynamic lookup. The maximum number of children over all nodes of a B-tree is called its order, usually written m, so a B-tree of order m is an m-ary tree. In general, m >= 3. A B-tree of order m is either an empty tree or an m-ary tree that satisfies the following conditions:

  1. Each node has at most m child nodes
  2. If the root is not a leaf node, it has at least two child nodes
  3. Every node other than the root has at least ceil(m/2) child nodes
  4. Each node holds n keys d1, ..., dn and n + 1 child pointers c0, ..., cn, where n is the number of keys in the node and, for non-root nodes, ceil(m/2) - 1 <= n <= m - 1; di (1 <= i <= n) is the i-th key of the node, with di < d(i+1); ci (0 <= i <= n) points to a child node, and every key in the subtree that ci points to is greater than di and less than d(i+1)
  5. All leaf nodes are on the same level and carry no information (they can be regarded as external nodes, i.e. the nodes where a lookup fails; these nodes do not actually exist, and the pointers to them are null)

Searching a B-tree is similar to searching a binary search tree. The difference is that each B-tree node is an ordered list of several keys: on reaching a node, first search that ordered list; if the key is found, the lookup succeeds; otherwise follow the corresponding pointer into the subtree and continue there. If the search reaches a leaf node, the key is not in the tree. Because B-tree lookups are so efficient, B-trees are mainly used in file systems and databases: for large database files stored on disk, they greatly reduce the number of disk accesses and so greatly improve retrieval efficiency.

With either a binary search tree or an AVL tree, a large amount of data makes the tree very deep, which causes too many disk I/O reads and writes and poor query efficiency, so multi-way trees became the structure of choice for indexes. In particular, every B-tree operation keeps the tree height low, which keeps lookups efficient.
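
The lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, the `search` function, and the sample keys are hypothetical, and the external "failure" leaves are represented simply by a node having no children.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]                                        # sorted keys d1..dn
    children: List["Node"] = field(default_factory=list)   # pointers c0..cn; empty at the lowest level


def search(node: Optional[Node], key: int) -> bool:
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)       # search the node's ordered key list
        if i < len(node.keys) and node.keys[i] == key:
            return True                       # found in this node
        if not node.children:
            return False                      # nothing below: the key is absent
        node = node.children[i]               # follow child pointer c_i
    return False


# A small hand-built tree of order 3 (hypothetical data).
root = Node([20, 40], [Node([5, 10]), Node([30]), Node([50, 60])])
print(search(root, 30), search(root, 35))
```

Each loop iteration touches one node, so the number of node visits is bounded by the height of the tree, which is the quantity a disk-based index tries to keep small.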

3 What is a B+ tree?

The InnoDB storage engine implements its indexes as B+ trees. A B+ tree is a variant of the B-tree that arose from the needs of file systems. A B+ tree of order m differs from a B-tree of order m in the following three ways:

  1. A node with n subtrees contains n keys;
  2. The leaf nodes together contain all of the keys, along with pointers to the corresponding records, and the leaf nodes themselves are linked in ascending key order;
  3. All non-leaf nodes can be regarded as index parts; each contains only the largest (or smallest) key of each of its subtrees.

The figure below shows a B+ tree of order 3. A B+ tree usually has two head pointers: one points to the root node, and the other points to the leaf node with the smallest key. A B+ tree therefore supports two kinds of search: a sequential search that starts from the smallest key, and a random search that starts from the root. Random search, insertion, and deletion on a B+ tree work much as they do on a B-tree. The only difference in lookup is that when a key in a non-leaf node equals the search value, the search does not stop but continues down to the leaf level. So in a B+ tree every lookup, successful or not, follows a path from the root to a leaf node.
[Figure: a B+ tree of order 3]

4 Why is the B+ tree better suited than the B-tree to file indexes and database indexes in real operating systems?

  1. B+ trees have a lower disk read/write cost: the internal nodes of a B+ tree hold no pointers to the actual records, so they are smaller than the internal nodes of a B-tree. If all the keys of one internal node are stored in the same disk block, that block can hold more keys, so more of the keys being searched for are read into memory at once, which reduces the number of I/O reads and writes;
  2. B+ tree query efficiency is more stable: internal nodes do not point to the record contents but only index the keys in the leaf nodes, so every key lookup must follow a path from the root to a leaf node. All lookup paths have the same length, so every lookup costs about the same;
  3. The main reason databases index with B+ trees rather than B-trees: traversing the linked leaf nodes of a B+ tree traverses the whole tree, and range queries are very frequent in databases, whereas a B-tree can only do this with an in-order traversal of all its nodes, which is too inefficient.
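
To make the range-query argument concrete, here is a minimal Python sketch of a scan over linked leaf nodes. The `Leaf` class, the `range_scan` function, and the keys are hypothetical; for brevity the scan starts at the leftmost leaf (the "smallest key" head pointer) instead of first descending from the root to the starting leaf, as a real index would.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Leaf:
    keys: List[int]                  # sorted keys stored in this leaf
    next: Optional["Leaf"] = None    # link to the next leaf in key order


def range_scan(first_leaf: Optional[Leaf], low: int, high: int) -> Iterator[int]:
    """Yield every key k with low <= k <= high by walking the leaf chain."""
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return               # keys are ordered, so nothing further can match
            if k >= low:
                yield k
        leaf = leaf.next


# Three linked leaves (hypothetical data).
l3 = Leaf([9, 11])
l2 = Leaf([5, 7], l3)
l1 = Leaf([1, 3], l2)
print(list(range_scan(l1, 4, 9)))
```

The scan never revisits internal nodes; it only follows `next` links, which is why the leaf chain suits range queries.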

5 In what situations is an index defined but not used?

  1. A LIKE pattern that begins with "%" (which matches zero or more arbitrary characters), i.e. a leading-wildcard fuzzy match
  2. The conditions before and after an OR do not both use an index
  3. An implicit data type conversion occurs (for example, a varchar column compared with a value that is not in single quotes may be converted automatically to int)
  4. A multi-column index must satisfy the leftmost-prefix rule. For example, with a multi-column index on col1, col2, and col3, the index takes effect for conditions on col1; on col1, col2; or on col1, col2, col3, but not for conditions on col2 or col3 alone.
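
The leftmost-prefix rule in item 4 can be modeled with a small Python helper. This is a simplification under assumptions: `usable_prefix` is a hypothetical function, it considers only which columns appear in equality conditions, and it ignores everything else a real optimizer weighs.

```python
from typing import List, Set


def usable_prefix(index_cols: List[str], filtered_cols: Set[str]) -> List[str]:
    """Return the leading index columns that the filter can use, in index order."""
    prefix = []
    for col in index_cols:
        if col not in filtered_cols:
            break                    # a gap ends the usable prefix
        prefix.append(col)
    return prefix


index = ["col1", "col2", "col3"]
print(usable_prefix(index, {"col1", "col2"}))   # both leading columns are usable
print(usable_prefix(index, {"col2", "col3"}))   # col1 is missing, so nothing is usable
print(usable_prefix(index, {"col1", "col3"}))   # only col1 is usable
```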

6 What are the advantages and disadvantages of indexes?

Advantages:

  1. Greatly speeds up data retrieval, which is the main reason for creating an index
  2. Speeds up joins between tables
  3. When data is retrieved with grouping and sorting clauses, it can also significantly reduce the time spent grouping and sorting
  4. By creating a unique index, you can guarantee the uniqueness of each row of data in a table

Disadvantages:

  1. Time: creating and maintaining indexes takes time. In particular, when rows are inserted, deleted, or updated, the indexes must be maintained dynamically, which slows down data modification
  2. Space: indexes occupy physical storage

Indexes suit data that is queried frequently; for data that is modified frequently, adding indexes is not recommended.

7 What kinds of indexes are there?

  1. Hash index: suitable for equality queries; it cannot be used for sorting or for range queries
  2. Sorted array: suitable for equality queries and range queries. However, inserting a record in the middle requires moving every record after it, which is costly, so it suits only static storage engines, i.e. tables whose data is not modified once created.
  3. B+ tree index: suitable for ordered data and range queries

8 Which fields are suitable for an index? When should a field not be indexed?

Suitable for an index:

  1. Fields that are frequently used in query conditions
  2. Fields that are frequently used to join tables
  3. Fields that frequently appear after ORDER BY, GROUP BY, or DISTINCT

Not suitable for an index:

  1. Columns that are rarely used in queries, or columns that contain many duplicate values
  2. Columns of certain special data types, such as text fields (TEXT)

9 What should you watch out for when creating an index?

  1. Non-null fields: declare columns NOT NULL unless you really need to store NULL. In MySQL, columns that contain NULL are hard to optimize queries for, because they make indexes, index statistics, and comparison operations more complex. Use 0, a special value, or an empty string instead of NULL
  2. Put highly dispersed fields first: in a composite index, put the column with the highest dispersion (the degree to which its values differ from one another) first. You can check a field's dispersion with the count() function, for example by counting its distinct values: the larger the result, the more unique values the field has and the higher its dispersion
  3. Keep indexed fields as small as possible: the database stores data in pages, so the more entries fit in one page, the more data a single I/O operation retrieves and the higher the efficiency
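
The dispersion check in item 2 can be tried out as follows. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL (the COUNT(DISTINCT ...) query itself is standard SQL), and the `users` table, its columns, the sample rows, and the `selectivity` helper are all hypothetical.

```python
import sqlite3


def selectivity(conn: sqlite3.Connection, table: str, column: str) -> float:
    """Ratio of distinct values to rows; closer to 1.0 means higher dispersion."""
    # Identifiers are interpolated directly, so pass only trusted names (sketch only).
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, gender TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (gender, email) VALUES (?, ?)",
    [("m", "a@x.test"), ("f", "b@x.test"), ("m", "c@x.test"), ("f", "d@x.test")],
)

print(selectivity(conn, "users", "gender"))   # few distinct values: a poor leading column
print(selectivity(conn, "users", "email"))    # every value distinct: a good leading column
```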

10 How are indexes classified?

  1. Ordinary index and unique index: whether the values of the indexed column must be unique
  2. Single-column index and composite index: the number of columns the index covers
  3. Clustered and non-clustered index: in InnoDB the clustered index is also called the primary key index, and its leaf nodes store entire rows of data. A query by primary key only needs to search the primary key index. A secondary index, also called a non-primary-key index, stores primary key values in its leaf nodes. A query through a secondary index first searches the secondary index tree to find the primary key and then searches the primary key index. This process is called going back to the table.

11 What is the difference between a primary key index and a unique index?

A primary key index is the index on the primary key; it is a special kind of unique index. When a primary key is created, the database creates a unique primary key index by default. A unique index requires the values of the indexed columns to be unique but allows NULL values. It is correct to say that a primary key is a unique index, but wrong to say that a unique index is a primary key: a unique index allows NULL values and a primary key does not.

12 What is a transaction?

A database transaction is an indivisible sequence of operations and the basic unit of database concurrency control. Its execution must take the database from one consistent state to another consistent state.

13 What are the properties of a transaction (ACID)?

  1. Atomicity: a transaction must be an atomic unit of work; its data modifications are either all performed or none performed.
  2. Consistency: the database must be in a consistent state both before and after a transaction executes. Executing a transaction must take the database from one consistent state to another consistent state.
  3. Isolation: the execution of one transaction must not be interfered with by other transactions. That is, the operations and data inside a transaction are isolated from other concurrent transactions, and concurrently executing transactions must not interfere with one another. Databases offer several isolation levels for this.
  4. Durability: once a transaction completes, its changes to the data in the database are permanent and survive even a system failure.
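
Atomicity is the easiest of these properties to observe directly. The sketch below rests on assumptions: Python's built-in sqlite3 module stands in for a transactional MySQL engine such as InnoDB, and the `account` table, the CHECK constraint, and the `transfer` and `balances` helpers are hypothetical. A transfer that would overdraw the source account fails halfway, and the credit that was already applied is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT id, balance FROM account"))


print(transfer(conn, 1, 2, 150), balances(conn))  # fails: balances are unchanged
print(transfer(conn, 1, 2, 60), balances(conn))   # succeeds: both rows change together
```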

14 What are the transaction isolation levels?

  1. Read Uncommitted: the changes a transaction makes are visible to other transactions before it commits
  2. Read Committed: the default isolation level of most database systems; the changes a transaction makes become visible to other transactions only after it commits
  3. Repeatable Read: MySQL's default isolation level; the data a transaction reads is consistent from its start to its commit, and its own uncommitted changes are likewise invisible to other transactions
  4. Serializable: for the same row, writes take a write lock and reads take a read lock. When a read/write lock conflict occurs, the transaction that accesses the row later must wait for the earlier transaction to finish before it can proceed

15 What problems does transaction concurrency cause?

  1. Dirty read: a transaction reads data that another transaction has not yet committed
  2. Non-repeatable read: the emphasis is on modification; reading twice under the same condition gives different results, because the data read was modified by another transaction in between
  3. Phantom read: the emphasis is on insertion or deletion; reading twice under the same condition returns a different number of records

16 How does MySQL support transactions?

MySQL transaction support is not a property of the MySQL server itself but of the storage engine: MyISAM does not support transactions, while InnoDB does.

17 How to optimize MySQL?

You may consider the following four aspects:

  1. SQL optimization and indexing
  2. Optimization of the structure of the table
  3. System configuration optimization
  4. Hardware optimization

18 Optimizing MySQL: SQL and index optimization?

For SQL statements, we can use the slow query log to find inefficient statements and use EXPLAIN to inspect and analyze their execution plans. EXPLAIN shows the order in which tables are read, the access type of each read, which indexes could be used, which indexes are actually used, the references between tables, and how many rows of each table the optimizer expects to examine. From this you can locate the performance bottleneck in a statement or in the table structure.

So how should SQL statements be optimized?

  1. Optimize INSERT statements: insert multiple rows with a single statement
  2. Avoid the != or <> operator in WHERE clauses where possible; otherwise the engine may give up the index and do a full table scan
  3. Avoid NULL checks on fields in WHERE clauses where possible, as they can cause the engine to give up the index and do a full table scan
  4. Optimize nested queries: a subquery can often be replaced with a more efficient join (JOIN)
  5. In many cases, using EXISTS instead of IN is a good choice

For index optimization, see the earlier questions on which fields are suitable for an index and when a field should not be indexed.

19 Optimizing MySQL: optimizing the table structure?

  1. Choose appropriate data types
  2. Normalize the tables
  3. Split tables vertically
  4. Split tables horizontally

Choose appropriate data types

  1. Use the smallest data type that does the job
  2. Use simple data types (MySQL handles int more easily than varchar)
  3. Define fields as NOT NULL wherever possible
  4. Avoid the text type where possible; if you must use it, consider moving it to a separate table

Normalize the tables

See the three normal forms described above.

Split tables vertically

Vertical splitting breaks a table with many columns into several tables, which solves the problem of overly wide tables. The usual principles are:

  1. Move rarely used fields into a separate table
  2. Move large fields into a separate table
  3. Keep fields that are often used together in the same table

The advantages are obvious: the business boundaries are clear after the split, the splitting rules are clear, systems are easy to integrate or extend, and data maintenance is simple.

Split tables horizontally

Horizontal splitting solves the problem of a single table holding too much data; every table produced by the split has exactly the same structure. Data is generally split evenly across N tables in the following two steps:

  1. Apply a hash function to the ID; to split into five tables, take mod(id, 5), which yields a value from 0 to 4
  2. Store each row in the table that corresponds to its hash value
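
The two steps above amount to a small routing function. In this Python sketch the `shard_table` name, the `orders` base name, and the `_N` suffix convention are hypothetical; only the mod(id, 5) rule comes from the text.

```python
def shard_table(base_name: str, row_id: int, shards: int = 5) -> str:
    """Pick the table for a row by taking its ID modulo the number of tables."""
    return f"{base_name}_{row_id % shards}"


for row_id in (5, 12, 1004):
    print(row_id, shard_table("orders", row_id))
```

Every reader and writer must use the same function, because a row stored under one rule cannot be found under another; changing the number of tables later means moving existing rows.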

Horizontal splitting brings some problems and challenges, including queries that span the split tables and statistics and back-office reporting, but it also brings some tangible benefits:

  1. After the split, a query reads fewer data and index pages per table, and the index has fewer levels, which speeds up queries
  2. It suits tables whose data is naturally independent, for example tables that record data for separate regions or separate time periods, especially when some of the data is used often and the rest rarely
  3. Data can be stored across multiple databases, which improves overall system availability (after splitting into several databases, not all the eggs are in one basket)

20 What is a stored procedure? What are its advantages and disadvantages?

A stored procedure is a set of precompiled SQL statements. Put more plainly: a stored procedure can be thought of as a recording, a block of code made up of several T-SQL statements that together implement some function, like a method (inserts, deletes, updates, and queries on one or more tables). You give the block a name and call it by that name whenever you need the function. Because a stored procedure is a precompiled block of code, it executes relatively efficiently; one stored procedure can replace a large number of T-SQL statements, which reduces network traffic and improves communication speed; and it can protect data security to some extent.

21 What is the difference between drop, delete, and truncate? In which scenarios is each used?

In SQL, drop, delete, and truncate all delete something, but the three differ:

  1. delete removes all or some of the rows of a table. After running delete, the user must commit or roll back to confirm or undo the deletion. The delete command fires all delete triggers on the table
  2. truncate removes all the data in a table. The operation cannot be rolled back and does not fire the triggers on the table. truncate is faster than delete and takes up less space
  3. drop removes the table itself from the database, together with all of its rows, indexes, and privileges. No DML triggers are fired, and the command cannot be rolled back

So: when a table is no longer needed, use drop; to delete some of the rows, use delete; to delete all the data while keeping the table, use truncate.

22 What is a view? In which scenarios are views used?

  1. A view is a virtual table with the same functionality as a physical table. Rows of a view can be inserted, updated, and queried; a view is usually a subset of the rows or columns of one or more tables. Changing a view's definition does not affect the base tables. Compared with a multi-table query, a view makes it easier to obtain data
  2. To expose only some of the fields to a visitor, build a virtual table, i.e. a view
  3. When the data being queried comes from different tables and the person querying wants to query it in a uniform way, create a view that combines the results from the multiple tables; the query then reads directly from the view without regard to the differences between the source tables
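
The second scenario, exposing only some fields, can be sketched as follows. The assumptions: Python's built-in sqlite3 module stands in for MySQL (the CREATE VIEW statement used here is plain SQL), and the `employee` table, the `employee_public` view, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Ann', 5000), (2, 'Bob', 4000);
    -- The view exposes id and name but hides the salary column.
    CREATE VIEW employee_public AS SELECT id, name FROM employee;
    """
)

cursor = conn.execute("SELECT * FROM employee_public ORDER BY id")
columns = [d[0] for d in cursor.description]   # column names visible through the view
rows = cursor.fetchall()
print(columns, rows)
```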

23 What is a trigger?

A trigger is a database object associated with a table: when its defined trigger condition is met, it fires and executes the set of statements in its definition. This lets the application enforce data integrity on the database side.
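
As an illustration, the sketch below keeps a derived total consistent with a trigger. It rests on assumptions: Python's built-in sqlite3 module stands in for MySQL, and SQLite's CREATE TRIGGER syntax differs in detail from MySQL's (MySQL, for instance, requires FOR EACH ROW); the `orders` and `order_item` tables and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE order_item (order_id INTEGER NOT NULL, price INTEGER NOT NULL);

    -- After each inserted item, add its price to the order's running total.
    CREATE TRIGGER order_item_after_insert AFTER INSERT ON order_item
    BEGIN
        UPDATE orders SET total = total + NEW.price WHERE id = NEW.order_id;
    END;

    INSERT INTO orders (id) VALUES (1);
    """
)

conn.executemany("INSERT INTO order_item VALUES (?, ?)", [(1, 30), (1, 12)])
total = conn.execute("SELECT total FROM orders WHERE id = 1").fetchone()[0]
print(total)
```

The application only inserts items; the database itself keeps `orders.total` in step, which is the kind of integrity rule the paragraph above has in mind.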

24 What are optimistic and pessimistic locking in a database?

The task of concurrency control in a database management system is to ensure that multiple transactions accessing the same data at the same time do not violate the isolation and consistency of transactions or the consistency of the database. Optimistic concurrency control (optimistic locking) and pessimistic concurrency control (pessimistic locking) are the main concurrency control techniques.

  1. Pessimistic locking: assume that concurrency conflicts will occur and block every operation that might violate data integrity. Its characteristic is to acquire the lock first and then perform the business operation. In a database this is generally implemented with select ... for update. When the database executes select ... for update it acquires row locks on the selected rows, so another concurrent select ... for update that tries to select the same rows is blocked (it waits for the row locks to be released), which gives the locking effect. Row locks acquired by select ... for update are released automatically when the current transaction ends, so the statement must be used inside a transaction. (Note: in MySQL, select ... for update locks every row it scans, so when using pessimistic locking in MySQL make sure the statement uses an index rather than scanning the whole table)
  2. Optimistic locking: assume that concurrency conflicts will not occur and check for data integrity violations only when committing. Its characteristic is to perform the business operation first and check only at the final update whether the data has been changed in the meantime: if it has not, the update succeeds; otherwise the operation fails and is retried. The usual approach is to add a version number or timestamp to the data that needs locking, and then proceed as follows:
1. SELECT data AS old_data, version AS old_version FROM ...;
2. Perform the business operation on the fetched data to obtain new_data and new_version
3. UPDATE ... SET data = new_data, version = new_version WHERE version = old_version
if (updated row > 0) {
    // optimistic lock acquired, operation complete
} else {
    // optimistic lock failed, roll back and retry
}

In general, optimistic locking suits workloads with many reads and few writes, and pessimistic locking suits workloads with few reads and many writes. When no lock failure occurs, optimistic locking costs less than pessimistic locking, but once a failure occurs the rollback is relatively expensive, so it suits scenarios where lock failure is unlikely, where it can improve the system's concurrency.
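
The version-number scheme can be run end to end as a small experiment. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, and the `item` table and the `read_item` and `try_update` helpers are hypothetical. The key point is that the UPDATE carries the version that was read, so a writer holding a stale version changes zero rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, data TEXT, version INTEGER NOT NULL)")
conn.execute("INSERT INTO item VALUES (1, 'old', 0)")
conn.commit()


def read_item(conn: sqlite3.Connection, item_id: int) -> tuple:
    """Return (data, version) for the row."""
    return conn.execute(
        "SELECT data, version FROM item WHERE id = ?", (item_id,)
    ).fetchone()


def try_update(conn: sqlite3.Connection, item_id: int, new_data: str, old_version: int) -> bool:
    """Write new_data only if the row still has the version that was read."""
    with conn:
        cur = conn.execute(
            "UPDATE item SET data = ?, version = version + 1 WHERE id = ? AND version = ?",
            (new_data, item_id, old_version),
        )
    return cur.rowcount == 1   # 0 rows updated means another writer got there first


_, seen_version = read_item(conn, 1)                     # both writers read version 0
first = try_update(conn, 1, "writer A", seen_version)    # wins and bumps the version
second = try_update(conn, 1, "writer B", seen_version)   # stale version, updates nothing
print(first, second, read_item(conn, 1))
```

A caller that gets `False` would normally re-read the row and retry, which is the "fail and retry" step described above.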

25 What is the difference between MyISAM and InnoDB?

MyISAM and InnoDB are both MySQL storage engines. They differ as follows:

  1. Storage: MyISAM tables can be compressed and occupy less storage space; InnoDB needs more memory and storage, and it creates its own dedicated buffer pool in main memory to cache data and indexes.
  2. Transaction support: MyISAM emphasizes performance; each query is atomic and it can execute faster than InnoDB, but it provides no transaction support. InnoDB provides advanced database features such as transactions and foreign keys, with commit, rollback, and crash recovery capabilities.
  3. Locking: MyISAM supports only table-level locks; when a user operates on a MyISAM table, select, update, delete, and insert statements automatically lock the whole table. InnoDB supports transactions and row-level locks. Row locks greatly improve the performance of concurrent multi-user operations, but InnoDB row locks are effective only when the WHERE condition uses an index (such as the primary key); a WHERE condition that cannot use an index ends up locking every row in the table.
  4. Foreign keys: MyISAM does not support foreign keys; InnoDB does.
  5. Row count: MyISAM stores the total number of rows of a table; InnoDB does not.

Reference: interview / written test third bomb - Database Interview Questions Collection

Origin blog.csdn.net/Geffin/article/details/104043595