Java interview questions: do you understand the MySQL B+tree index?

  • What is an index?

  An index is a data structure, stored in scattered blocks on disk, that is created to speed up the retrieval of rows in a table.

  Working principle:

  With an index in place, a query first hits the index and quickly obtains the disk address of the table data it needs. Compared with scanning the whole table, this greatly improves search efficiency.

  In a relational database, indexes are stored on disk.
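  The difference between the two access paths can be sketched in memory. This is only an illustration under stated assumptions: the class `IndexLookupSketch`, its method names, and the use of a list position as a stand-in for a disk address are all hypothetical, and `java.util.TreeMap` (a red-black tree) merely plays the role of the index; MySQL does not store indexes this way.

```java
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table and its index (not MySQL's storage format).
class IndexLookupSketch {
    // Each row is {id, value}; the row's position in the list stands in for its disk address.
    // Full table scan: inspect rows one by one until the id matches.
    static int fullScan(List<int[]> table, int id) {
        for (int pos = 0; pos < table.size(); pos++) {
            if (table.get(pos)[0] == id) {
                return pos;
            }
        }
        return -1; // not found
    }

    // Build the index once: id -> position of the row.
    static TreeMap<Integer, Integer> buildIndex(List<int[]> table) {
        TreeMap<Integer, Integer> index = new TreeMap<>();
        for (int pos = 0; pos < table.size(); pos++) {
            index.put(table.get(pos)[0], pos);
        }
        return index;
    }

    // Indexed lookup: ask the sorted index for the position instead of scanning rows.
    static int indexedLookup(TreeMap<Integer, Integer> index, int id) {
        Integer pos = index.get(id);
        return pos == null ? -1 : pos;
    }
}
```

  Both lookups return the same position; the scan touches up to every row, while the tree lookup needs a number of comparisons that grows with the logarithm of the row count.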

  • Common data structures

  1) Binary search tree

  

  Most of us are familiar with the binary search tree: starting from a root node, values smaller than a node go to its left and larger values go to its right. To search, follow the links down from the root, going left or right at each node, until you reach the node holding the value.

  Disadvantage: the binary search tree has a fatal flaw. When data is inserted in ascending or descending order, the tree degenerates into a one-sided chain.

  In this case a lookup is no more efficient than a linear scan.
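  The degeneration is easy to reproduce. The following is a minimal sketch of an unbalanced binary search tree; the class `BstSketch` and its helpers are hypothetical names chosen for illustration.

```java
// Minimal unbalanced binary search tree, used only to show the one-sided chain.
class BstSketch {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Smaller keys go left, larger keys go right; duplicates are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Height counted in levels: an empty tree has height 0.
    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Insert the keys in the given order and return the root.
    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }
}
```

  `BstSketch.height(BstSketch.build(1, 2, 3, 4, 5))` is 5, one level per key, whereas the same keys inserted as 3, 2, 4, 1, 5 give a height of 3.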

  2) Balanced binary search tree

  The balanced binary search tree is an improved version of the binary search tree that fixes this shortcoming.

  Definition: it is either an empty tree, or a tree whose left and right subtrees differ in height by at most 1 (in absolute value) and are themselves balanced binary trees.

  That is, if inserting data into a balanced binary search tree would produce a structure that violates this definition, the tree is restructured at that point so that it remains a relatively balanced search tree.

  If the sequence 1 to 5 from above is inserted, the tree stays balanced: 2 at the root, 1 on its left, and 4 on its right with children 3 and 5.
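  The restructuring step can be sketched with AVL-style rotations. This is one common way to keep the height difference within 1, chosen here as an assumption since the article does not name a specific algorithm; `AvlSketch` and its method names are hypothetical.

```java
// Minimal AVL-style insertion: rebalance with rotations whenever the height rule is violated.
class AvlSketch {
    static final class Node {
        int key, height = 1; // a freshly created node is a leaf of height 1
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? 0 : n.height; }
    static void update(Node n) { n.height = 1 + Math.max(height(n.left), height(n.right)); }
    static int balance(Node n) { return height(n.left) - height(n.right); }

    // Right rotation: the left child becomes the new root of this subtree.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    // Left rotation: the right child becomes the new root of this subtree.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n; // duplicate key: nothing to do
        update(n);
        int b = balance(n);
        if (b > 1) { // left side too tall
            if (balance(n.left) < 0) n.left = rotateLeft(n.left); // left-right case
            return rotateRight(n);
        }
        if (b < -1) { // right side too tall
            if (balance(n.right) > 0) n.right = rotateRight(n.right); // right-left case
            return rotateLeft(n);
        }
        return n;
    }
}
```

  Inserting 1, 2, 3, 4, 5 in order into this sketch yields a tree of height 3 with 2 at the root, instead of the five-level chain an unbalanced tree produces.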

  

  The red-black tree is one implementation of a self-balancing binary search tree, although its balance rule is looser than the height-difference-of-1 definition above.

  Indexing columns whose data changes frequently is not recommended, because every change restructures the index, which inevitably costs CPU and I/O.

  Disadvantages:

    1. Insufficient search efficiency.

      It is still a binary tree, so with a large amount of data the tree becomes tall, which means a single lookup may need many I/O operations.

    2. Too little data per node.

      The single value stored in each node comes nowhere near filling one memory-disk transfer. If one transfer between memory and disk is 4 KB, a node that holds one value wastes most of that space.
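  Both disadvantages can be put into numbers. The sketch below assumes one node per disk transfer, so the tree height is also the worst-case number of I/O operations; `BinaryTreeCost` and its methods are hypothetical names, and the 4-byte key is an assumed `int`.

```java
// Back-of-the-envelope cost of a perfectly balanced binary tree stored one node per block.
class BinaryTreeCost {
    // Smallest height h (in levels) such that a binary tree can hold n keys: 2^h - 1 >= n.
    static int minHeight(long n) {
        int h = 0;
        long capacity = 0; // keys that fit in a full tree of height h
        while (capacity < n) {
            capacity = capacity * 2 + 1;
            h++;
        }
        return h;
    }

    // Fraction of one disk transfer actually occupied by a single key.
    static double usedFraction(int keyBytes, int blockBytes) {
        return (double) keyBytes / blockBytes;
    }
}
```

  For one million keys `minHeight(1_000_000)` is 20, so even a perfectly balanced binary tree may need 20 reads per lookup, and `usedFraction(4, 4096)` shows a 4-byte key using less than 0.1% of a 4 KB transfer.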

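  3) B-tree (multiway balanced search tree, absolutely balanced)

  So how does the B-tree solve the shortcomings of the balanced binary tree?

  The figure above shows a three-way balanced search tree. Because a B-tree is a multiway balanced search tree, it can have three, four, five, six or more ways. The more ways it has, the lower the tree, and the fewer I/O operations a single search needs in the worst case.

  Also because of its multiway nature, the disk block read in a single I/O can hold a very large number of keys. Take an id of type int as an example: assuming one I/O transfers 4 KB, one transfer can hold up to 1024 keys (a rough estimate, of course, since a node also contains a data area and child references).

  So when we choose column types and column lengths in a database, keeping both reasonable ensures that each index block loaded from disk contains more keys, which improves lookup efficiency.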
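  The rough estimate can be worked through for different key widths. In this sketch the 8-byte child reference and the 36-byte string key are assumptions chosen for illustration, and `BTreeFanout` is a hypothetical helper rather than anything MySQL exposes.

```java
// Rough fan-out arithmetic for a multiway tree whose node fills one disk block.
class BTreeFanout {
    // Keys that fit in one block when every key is stored next to one child reference.
    static int keysPerBlock(int blockBytes, int keyBytes, int pointerBytes) {
        return blockBytes / (keyBytes + pointerBytes);
    }

    // Smallest h such that fanout^h >= n, i.e. levels of branching needed to tell n keys apart.
    static int levelsNeeded(long n, int fanout) {
        if (fanout < 2) throw new IllegalArgumentException("fanout must be at least 2");
        int h = 0;
        long reachable = 1;
        while (reachable < n) {
            reachable *= fanout;
            h++;
        }
        return h;
    }
}
```

  `keysPerBlock(4096, 4, 0)` reproduces the 1024 figure above; counting an assumed 8-byte child reference gives 341 keys for an int key and 93 for a 36-byte string key. For one million keys, `levelsNeeded` returns 20 with a fan-out of 2 but only 3 with a fan-out of 341, which is why narrower keys and more ways mean fewer reads.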

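  4) B+tree (enhanced multiway absolutely balanced search tree)

  We again use a three-way search tree as the example.

  Characteristics of the B+tree:

  1. Non-leaf nodes hold no row data, only keys and references to child nodes.
  2. All data is stored in the leaf nodes.
  3. It uses left-closed intervals.
  4. Data in the leaf nodes is kept in order, and adjacent leaf nodes are linked to each other in sequence.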
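  The four characteristics can be sketched on a small hand-built tree. Everything here is an assumption made for illustration: the class `BPlusTreeSketch`, the sample keys, the `"row" + key` payloads, and the convention that child i of an internal node covers the left-closed interval [keys[i], keys[i+1]). Insertion and node splitting are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-built two-level B+tree: internal nodes route, leaves hold the rows and are chained.
class BPlusTreeSketch {
    static class Node { int[] keys; }
    static final class Internal extends Node { Node[] children; } // keys and child references only
    static final class Leaf extends Node { String[] rows; Leaf next; } // data plus link to next leaf

    static Leaf leaf(int... keys) {
        Leaf l = new Leaf();
        l.keys = keys;
        l.rows = new String[keys.length];
        for (int i = 0; i < keys.length; i++) l.rows[i] = "row" + keys[i];
        return l;
    }

    // keys[i] is the smallest key under children[i], so child i covers [keys[i], keys[i+1]).
    static Internal internal(Node... children) {
        Internal n = new Internal();
        n.children = children;
        n.keys = new int[children.length];
        for (int i = 0; i < children.length; i++) n.keys[i] = children[i].keys[0];
        return n;
    }

    // Every lookup descends to a leaf, because only leaves hold data.
    static Leaf findLeaf(Node root, int key) {
        Node n = root;
        while (n instanceof Internal) {
            Internal in = (Internal) n;
            int i = 0;
            while (i + 1 < in.keys.length && in.keys[i + 1] <= key) i++;
            n = in.children[i];
        }
        return (Leaf) n;
    }

    static String get(Node root, int key) {
        Leaf l = findLeaf(root, key);
        for (int i = 0; i < l.keys.length; i++) {
            if (l.keys[i] == key) return l.rows[i];
        }
        return null; // key not present
    }

    // Range scan: locate the first leaf, then follow the leaf links in key order.
    static List<String> range(Node root, int lo, int hi) {
        List<String> out = new ArrayList<>();
        for (Leaf l = findLeaf(root, lo); l != null; l = l.next) {
            for (int i = 0; i < l.keys.length; i++) {
                if (l.keys[i] > hi) return out;
                if (l.keys[i] >= lo) out.add(l.rows[i]);
            }
        }
        return out;
    }

    static Node sample() {
        Leaf a = leaf(1, 8, 15), b = leaf(28, 35, 56), c = leaf(66, 78, 90);
        a.next = b;
        b.next = c;
        return internal(a, b, c); // root keys: 1, 28, 66
    }
}
```

  On the sample tree, `range(sample(), 10, 70)` descends from the root once and then walks the leaf chain, collecting row15, row28, row35, row56 and row66 in sorted order without returning to the upper levels.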

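  Advantages of the B+tree over the B-tree:

  1. Better at scanning whole databases and whole tables
  2. Better disk read/write capability (non-leaf nodes store no data area, so each block holds more keys)
  3. Better sorting capability
  4. More stable lookup efficiency (a B-tree lookup may hit at the first node or only after several nodes, whereas a B+tree, whose non-leaf nodes store no data area, is relatively shorter but always has to descend the full height to a leaf)
  • How the B+tree is actually implemented in MySQL (to be continued)
  1. MyISAM
  2. InnoDB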

  

   

 


Origin www.cnblogs.com/keeplearningclc/p/10960890.html