DDIA study notes: Chapter 3


Index: Without an index, a query has no way to locate data other than scanning every record, so a lookup costs O(N) and gets slower as the data volume grows. Creating an index speeds up queries, but the index structure has to be kept in sync on every write, which reduces write efficiency.

 

Hash index:

A hash index maintains key-value pairs: the key is the indexed value, and the value is the address of the data on disk. Its weakness is range queries: hashed keys are not kept in order, so finding all keys within a range means traversing every entry.
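The idea can be sketched in a few lines of Python. This is a minimal illustration, not a real storage engine: the class name `HashIndexedLog`, the `key,value` line format, and the file name are assumptions made up for the example, and keys are assumed not to contain commas.

```python
import os
import tempfile


class HashIndexedLog:
    """Append-only data file plus an in-memory hash index (key -> byte offset)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> offset of the latest record for that key
        open(path, "ab").close()  # make sure the data file exists

    def put(self, key, value):
        record = f"{key},{value}\n".encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # the record will start here
            f.write(record)
        self.index[key] = offset  # later writes shadow earlier ones

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)  # jump straight to the record, no scan
            line = f.readline().decode("utf-8").rstrip("\n")
        _, value = line.split(",", 1)
        return value

    def range(self, low, high):
        # A hash table has no key order, so every key must be checked.
        return sorted((k, self.get(k)) for k in self.index if low <= k <= high)


db = HashIndexedLog(os.path.join(tempfile.mkdtemp(), "data.log"))
db.put("user:1", "alice")
db.put("user:2", "bob")
db.put("user:1", "carol")  # overwrites by appending a newer record
print(db.get("user:1"))    # carol
```

A point lookup is one dictionary access plus one seek, while `range` has to visit every key in the index, which is the weakness described above.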

B-tree:

Each B-tree node corresponds to a fixed-size page on disk, typically 4 KB or larger. Each node contains: an ordered set of index keys; the data for each key, or a pointer to that data; and a set of pointers to other (child) nodes. In a B+ tree the non-leaf nodes do not hold the data, so each node fits more keys and the tree is shallower.

To add a key to a B-tree, first find the node where the key belongs, then insert it in sorted order. If the node has enough free space, the key is added directly; if not, the node is split into two nodes, the key is added, and the parent is updated to point to both.
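The split step can be sketched as follows. This is a simplified illustration for a single leaf only: `PAGE_CAPACITY` and `insert_into_leaf` are hypothetical names, the leaf is modeled as a plain sorted list of keys, and the sketch inserts first and splits afterwards, which yields the same two nodes as splitting first. The separator is taken as the first key of the right half, as in a B+ tree leaf split.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical limit; real pages hold hundreds of keys


def insert_into_leaf(leaf, key):
    """Insert key into a sorted leaf (a list of keys).

    Returns (left, None, None) if the key fits, otherwise
    (left, separator, right) where separator must go into the parent.
    """
    bisect.insort(leaf, key)  # keep the keys in sorted order
    if len(leaf) <= PAGE_CAPACITY:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    return left, right[0], right  # parent gets right[0] as the boundary key


leaf = [10, 20, 30]
print(insert_into_leaf(leaf, 25))  # ([10, 20, 25, 30], None, None)
print(insert_into_leaf(leaf, 15))  # ([10, 15], 20, [20, 25, 30])
```

A full B-tree would apply the same rule recursively: adding the separator to the parent may overflow the parent, which is then split in turn.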

At the lowest level, a B-tree write overwrites an entire node (page) on disk in place. To make this crash-safe, databases keep an additional data structure: the write-ahead log (WAL, also called the redo log). Every modification to the B-tree is first written to the log, and after a crash the log is used to restore the B-tree to a consistent state. This matters because a table may have several indexes and therefore several B-trees to maintain; if the database crashes halfway through updating them, the index structures end up inconsistent with one another.

However, this also adds performance overhead: every write must first be written to the additional log, and the B-tree write itself overwrites a whole node even when only one entry changed, which wastes resources (a single B-tree node stores many index values).
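The log-first ordering and the recovery step can be sketched like this. It is a toy model under stated assumptions: `WalStore` is a hypothetical name, the log is a JSON-lines file, and an in-memory dict stands in for the B-tree pages, so "crash" here simply means losing that dict.

```python
import json
import os
import tempfile


class WalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.pages = {}  # stand-in for the B-tree pages
        self._replay()   # recovery: rebuild state from the log

    def _replay(self):
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn record from a crash mid-write: stop here
                self.pages[entry["key"]] = entry["value"]

    def put(self, key, value):
        # 1. Append the intended change to the log and force it to disk.
        with open(self.wal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only then modify the actual data structure.
        self.pages[key] = value


wal = os.path.join(tempfile.mkdtemp(), "redo.log")
store = WalStore(wal)
store.put("k1", "v1")
store.put("k2", "v2")

recovered = WalStore(wal)  # simulates a restart after a crash
print(recovered.pages)     # {'k1': 'v1', 'k2': 'v2'}
```

The `fsync` on every `put` is where the extra write cost mentioned above comes from: each logical write becomes at least two physical writes.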

 

Clustered index (stores the full row data) and non-clustered index (stores the indexed value together with the corresponding clustered index key)

 

Clustered index: mostly seen in the InnoDB engine. By default InnoDB uses the primary key as the clustered index. If no primary key is specified, the first unique index whose columns are all non-null is used as the clustered index; if no such index exists, InnoDB generates a hidden unique row ID and uses it as the clustered index.

Clustered index: the nodes of the clustered index store the entire row. The clustered index key uniquely identifies its row, so clustered index keys and rows have a one-to-one relationship.

Non-clustered index: a non-unique index cannot locate a row directly, because a table may contain several rows with the same index value; an index that uniquely identifies each row is therefore required. That index is the clustered index, and all other indexes are called non-clustered indexes, also known as secondary (auxiliary) indexes. A query first uses the secondary index to find the clustered index key, then uses the clustered index to fetch the entire row (this second step is called going "back to the table").
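The two-step lookup can be modeled with plain dictionaries. This is only an illustration of the access path, not of how InnoDB lays out pages; the table contents and the names `clustered`, `secondary_by_name`, and `find_by_name` are made up for the example.

```python
# Clustered index: primary key -> the entire row.
clustered = {
    1: {"id": 1, "name": "alice", "city": "Paris"},
    2: {"id": 2, "name": "bob", "city": "Berlin"},
    3: {"id": 3, "name": "alice", "city": "Rome"},
}

# Secondary (non-clustered) index on name: indexed value -> primary keys.
# The value is not unique, so one entry may point at several rows.
secondary_by_name = {}
for pk, row in clustered.items():
    secondary_by_name.setdefault(row["name"], []).append(pk)


def find_by_name(name):
    pks = secondary_by_name.get(name, [])  # step 1: secondary index lookup
    return [clustered[pk] for pk in pks]   # step 2: back to the table


print([row["city"] for row in find_by_name("alice")])  # ['Paris', 'Rome']
```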

Covering index: a covering index is not a separate type of index. A non-clustered index already carries the values of its own indexed columns, so if a query needs only values that the non-clustered index contains, it can be answered from that index alone without going back to the table. For that query, the non-clustered index is a covering index.
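One way to observe this is with SQLite from Python's standard library. SQLite is used here only as a convenient stand-in, not as a model of InnoDB; the table, the index name `idx_name_age`, and the helper `plan` are assumptions for the example. SQLite's `EXPLAIN QUERY PLAN` output labels an index as `COVERING INDEX` when the query can be answered from it alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, email TEXT)"
)
conn.execute("CREATE INDEX idx_name_age ON users (name, age)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


# Needs only name and age, both stored in the index: no table access.
covered = plan("SELECT age FROM users WHERE name = 'alice'")

# Needs email, which the index does not contain: must go back to the table.
not_covered = plan("SELECT email FROM users WHERE name = 'alice'")

print(covered)
print(not_covered)
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than assuming a fixed string.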

——————————————————————————————————————————————————————————————————————————————————————

Counterintuitively, the performance advantage of in-memory databases does not come from the fact that they avoid reading from disk. Even a disk-based storage engine may never need to read from disk if there is enough memory, because the operating system caches recently used disk blocks in memory. Instead, in-memory databases are faster because they avoid the overhead of encoding in-memory data structures into a form that can be written to disk [44].

OLTP (online transaction processing): Even though databases came to be used for many different kinds of data, such as blog posts, actions in games, and address book contacts, the basic access pattern remained similar to processing business transactions. An application typically uses an index to look up a small number of records by key, and inserts or updates records based on the user's input. Because these applications are interactive, this access pattern is called online transaction processing (OLTP).

OLAP (online analytical processing): Databases are also increasingly used for data analysis, which has very different access patterns. An analytic query typically needs to scan a large number of records, reads only a few columns of each record, and calculates aggregate statistics (such as count, sum, or average) rather than returning the raw data to the user. These queries are usually written by business analysts and feed into reports that help the company's management make better decisions (business intelligence, BI). To distinguish this pattern of using a database from transaction processing, it is called online analytical processing (OLAP).
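The contrast between the two access patterns can be shown on a tiny in-memory data set. The `orders` records and both function names are made up for illustration; a real system would run these as SQL against very different storage layouts.

```python
orders = [
    {"order_id": 1, "customer": "alice", "region": "EU", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "region": "US", "amount": 20.0},
    {"order_id": 3, "customer": "carol", "region": "EU", "amount": 50.0},
]

# OLTP style: an index by key, used to fetch or update a single record.
orders_by_id = {o["order_id"]: o for o in orders}


def oltp_get_order(order_id):
    return orders_by_id.get(order_id)


# OLAP style: scan every record, read only two columns, return an aggregate.
def olap_revenue_by_region():
    totals = {}
    for o in orders:
        totals[o["region"]] = totals.get(o["region"], 0.0) + o["amount"]
    return totals


print(oltp_get_order(2)["customer"])  # bob
print(olap_revenue_by_region())       # {'EU': 80.0, 'US': 20.0}
```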

 

Databases are typically optimized differently for these two scenarios: on the surface both use SQL statements to query and aggregate data, but their internal optimizations differ.


Origin www.cnblogs.com/ybonfire/p/12172841.html