C++ (Data Structures and Algorithms): 70 --- Balanced Search Trees: B-Trees

First, indexed sequential access method

  • When a dictionary is small enough to reside entirely in memory, AVL trees and red-black trees guarantee good performance. Large dictionaries (external dictionaries, or files) must be stored on disk, and for them search trees of higher degree are needed to get good performance from the dictionary operations. Before studying such search trees, look at the Indexed Sequential Access Method (ISAM) for external dictionaries. This method performs well for both sequential and random access

The concept of a block

  • In the ISAM method, the available disk space is divided into many blocks
  • A block is the smallest unit of disk space that is input or output
  • A block is generally one track long and can be input or output with a single seek and latency delay
  • The dictionary elements are stored in the blocks in ascending order, and the blocks are used in an order that minimizes the delay in moving from one block to the next

Sequential access

  • For sequential access, the blocks are input in order and the elements of each block are retrieved in ascending order
  • If each block contains m elements, the number of disk accesses per element retrieved is 1/m

Random access (ISAM indexing technology)

  • To support random access, an index must be maintained. The index contains the largest key of each block, so it holds exactly as many keys as there are blocks. Because each block generally stores many elements (that is, m is usually large), the entire index is usually small enough to reside in memory
  • To randomly access the element with key k, first search the index to determine the block the element belongs to, then read that block from disk and search it internally for the element
  • As a result, a single disk access is sufficient to complete a random access
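
The index lookup can be illustrated with a minimal C++ sketch. It is not a real disk implementation: the hypothetical `IsamFile` keeps the blocks in a vector and only counts simulated block reads, and the names `findBlock`, `contains`, and `diskReads` are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory model of an ISAM file. Each inner vector stands in
// for one disk block; keys are ascending within a block and across blocks.
struct IsamFile {
    std::vector<std::vector<int>> blocks;  // stand-in for the blocks on disk
    std::vector<int> index;                // largest key of each block (held in memory)
    std::size_t diskReads;                 // number of simulated block reads

    explicit IsamFile(std::vector<std::vector<int>> b)
        : blocks(std::move(b)), diskReads(0) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            assert(!blocks[i].empty());
            index.push_back(blocks[i].back());
        }
    }

    // Number of the only block that can hold key k, or blocks.size() if
    // k is larger than every key in the file.
    std::size_t findBlock(int k) const {
        return static_cast<std::size_t>(
            std::lower_bound(index.begin(), index.end(), k) - index.begin());
    }

    // Random access: search the index in memory, then read exactly one block.
    bool contains(int k) {
        std::size_t b = findBlock(k);
        if (b == blocks.size()) return false;  // no block can hold k
        ++diskReads;                           // the single disk access
        return std::binary_search(blocks[b].begin(), blocks[b].end(), k);
    }
};
```

With blocks {1,3,5}, {8,9,12}, {20,25,30} the index is {5,12,30}; a lookup of key 9 selects block 1 and costs one simulated read, whether or not the key turns out to be present.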

Applying ISAM to larger dictionaries

  • This technique can be extended to larger dictionaries that are stored across several disks. The elements are assigned to the disks in ascending order, and within each disk they are assigned to blocks in ascending order
  • Each disk has a block index, which holds the largest key of each block on that disk. In addition, there is a disk index, which holds the largest key of each disk. The disk index generally resides in memory
  • To access a random element:
    • First, search the disk index in memory to determine the disk the element belongs to
    • Then fetch that disk's block index from the disk and search it to determine the block the element belongs to
    • Finally, fetch that block from the disk and search it internally for the element
  • Thus, a random access requires two disk accesses (one to fetch the block index and another to fetch the block)
  • Because ISAM is essentially an array-based representation, insertions and deletions cause serious difficulty. To partially alleviate it, some space can be reserved in each block:
    • A small number of elements can then be inserted without moving elements across block boundaries
    • Similarly, the space vacated by a deletion can be left empty, rather than moving elements between blocks to reclaim it
  • These difficulties can be overcome, but at the expense of sequential access: the elements are stored in any order, an index containing every key is maintained, and all accesses go through the index. When this index is stored on disk, the B-tree is a suitable data structure for it
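
The three-step random access over several disks can be sketched in the same spirit. This is only an illustration under stated assumptions: `Disk`, `MultiDiskIsam`, `addDisk`, and `diskAccesses` are hypothetical names, the "disks" are plain vectors, and disk accesses are merely counted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<int> Block;  // keys in ascending order

// Hypothetical model of one disk: its blocks plus the block index stored with them.
struct Disk {
    std::vector<Block> blocks;
    std::vector<int> blockIndex;  // largest key of each block on this disk
};

struct MultiDiskIsam {
    std::vector<Disk> disks;
    std::vector<int> diskIndex;   // largest key of each disk (held in memory)
    std::size_t diskAccesses;     // number of simulated disk accesses

    MultiDiskIsam() : diskAccesses(0) {}

    // Disks must be added in ascending key order.
    void addDisk(std::vector<Block> blocks) {
        assert(!blocks.empty());
        Disk d;
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            assert(!blocks[i].empty());
            d.blockIndex.push_back(blocks[i].back());
        }
        d.blocks = std::move(blocks);
        diskIndex.push_back(d.blockIndex.back());
        disks.push_back(std::move(d));
    }

    bool contains(int k) {
        // Step 1: search the disk index in memory.
        std::size_t d = static_cast<std::size_t>(
            std::lower_bound(diskIndex.begin(), diskIndex.end(), k) - diskIndex.begin());
        if (d == disks.size()) return false;  // k exceeds every key
        const Disk& disk = disks[d];
        // Step 2: fetch that disk's block index (first disk access) and search it.
        ++diskAccesses;
        std::size_t b = static_cast<std::size_t>(
            std::lower_bound(disk.blockIndex.begin(), disk.blockIndex.end(), k) -
            disk.blockIndex.begin());
        // b is always valid here because k <= largest key on this disk.
        // Step 3: fetch the block (second disk access) and search it internally.
        ++diskAccesses;
        return std::binary_search(disk.blocks[b].begin(), disk.blocks[b].end(), k);
    }
};
```

Every lookup that reaches a disk costs exactly two counted accesses, matching the count given above; a key larger than every entry in the disk index is rejected without touching a disk.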

Two, m-way search trees

  • An m-way search tree may be empty. If it is not empty, it must satisfy the following properties:
    • ① In the corresponding extended search tree (the search tree obtained by replacing each null pointer with an external node), each internal node has at most m children and between 1 and m-1 elements (external nodes contain no elements and have no children)
    • ② Every node that contains p elements has exactly p+1 children
    • ③ For any node containing p elements:
      • Let K1, K2, ......, Kp be the keys of these elements. The elements are in ascending order, i.e. K1 < K2 < ...... < Kp
      • Let C0, C1, ......, Cp be the p+1 children of the node
      • Every element in the subtree rooted at C0 has a key less than K1
      • Every element in the subtree rooted at Ci has a key between Ki and K(i+1), i.e. Ki < key < K(i+1), where 1 <= i < p
      • Every element in the subtree rooted at Cp has a key greater than Kp
  • External nodes are useful in the definition of an m-way search tree, but actual code does not represent them explicitly; a null pointer is used wherever an external node would appear
  • The figure shows a seven-way search tree, in which the solid black nodes are external nodes and all other nodes are internal nodes. The root, for example, has two elements and three children

Searching an m-way search tree

  • Method: start at the root and compare the search key with the keys in the node. If the key is not in the node, move to the subtree whose key range contains it. Repeat until the key is found or an empty subtree (external node) is reached
  • For example, to search for key 31:
    • Start at the root. Since 31 lies between 10 and 80, move to the second child of the root
    • In that child, 31 lies between 30 and 40, so move to the third child of this node
    • In that node, 31 is smaller than 32, so move to its first child. That child is empty, so 31 is not in the tree
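
The search can be sketched in C++ as follows. The node layout (`Node`, with a null child standing for an external node), the helpers `makeNode` and `makeExampleTree`, and the function name `searchKey` are assumptions for illustration. The example tree is only modeled on the walkthrough: keys that the text does not mention (such as 2 and 3) are made up, since the figure itself is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node: keys[0..p-1] ascending, children[0..p];
// a null child plays the role of an external node.
struct Node {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::vector<int> k)
        : keys(std::move(k)), children(keys.size() + 1) {
        assert(!keys.empty());
        assert(std::is_sorted(keys.begin(), keys.end()));
    }
};

std::unique_ptr<Node> makeNode(std::vector<int> keys) {
    return std::unique_ptr<Node>(new Node(std::move(keys)));
}

// Returns true if key k occurs in the tree rooted at root.
bool searchKey(const Node* root, int k) {
    const Node* node = root;
    while (node != nullptr) {
        // Position of the first key that is >= k.
        std::size_t i = static_cast<std::size_t>(
            std::lower_bound(node->keys.begin(), node->keys.end(), k) -
            node->keys.begin());
        if (i < node->keys.size() && node->keys[i] == k) return true;
        node = node->children[i].get();  // k can only be in child i
    }
    return false;  // reached an external node: k is absent
}

// A seven-way search tree shaped like the one in the walkthrough (assumed).
std::unique_ptr<Node> makeExampleTree() {
    std::unique_ptr<Node> root = makeNode({10, 80});
    root->children[0] = makeNode({5});
    root->children[0]->children[0] = makeNode({2, 3, 4});
    root->children[1] = makeNode({20, 30, 40, 50, 60, 70});
    root->children[1]->children[2] = makeNode({32});
    root->children[2] = makeNode({82, 84, 86, 88});
    return root;
}
```

Calling `searchKey(makeExampleTree().get(), 31)` follows the same path as the walkthrough: root, second child, the node holding 32, then an empty first child.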

Inserting into an m-way search tree

  • Method: start at the root and search for the key. If the key falls at a position whose child is not empty, move down into that child, and so on. When the search reaches a position whose child is empty, insert the key at that position of the current node if the node has free space; if the node is full, place the key in a new child node at that position
  • For example, to insert key 31:
    • At the root, 31 lies between 10 and 80 but cannot be inserted there, because the child between 10 and 80 is not empty
    • Move to the second child of the root. 31 lies between 30 and 40 but cannot be inserted there either, because the child between 30 and 40 is not empty
    • Move to that child. 31 belongs to the left of 32, where there is no child, and the node has free space, so 31 is inserted to the left of 32
  • For example, to insert key 65:
    • At the root, 65 lies between 10 and 80 but cannot be inserted there, because the child between 10 and 80 is not empty
    • Move to the second child of the root. 65 belongs between 60 and 70, where there is no child. This node already holds six elements, the maximum for a seven-way tree, so 65 is placed in a new node that becomes the child between 60 and 70
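
A minimal C++ sketch of this insertion follows. It assumes the usual rule for a plain (unbalanced) m-way search tree: the key goes into the last node on the search path if that node holds fewer than m-1 keys, and otherwise into a new child node at the position where the search fell off the tree. `Node`, `makeNode`, and `insertKey` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node: keys ascending, children.size() == keys.size() + 1,
// a null child stands for an external node.
struct Node {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::vector<int> k)
        : keys(std::move(k)), children(keys.size() + 1) {
        assert(!keys.empty());
    }
};

std::unique_ptr<Node> makeNode(std::vector<int> keys) {
    return std::unique_ptr<Node>(new Node(std::move(keys)));
}

// Inserts k into the m-way search tree rooted at root.
// Returns false (and changes nothing) if k is already present.
bool insertKey(std::unique_ptr<Node>& root, int k, std::size_t m) {
    assert(m >= 2);
    if (!root) {
        root = makeNode({k});
        return true;
    }
    Node* node = root.get();
    for (;;) {
        std::size_t i = static_cast<std::size_t>(
            std::lower_bound(node->keys.begin(), node->keys.end(), k) -
            node->keys.begin());
        if (i < node->keys.size() && node->keys[i] == k) return false;
        if (!node->children[i]) {
            // The search falls off the tree below this node.
            if (node->keys.size() < m - 1) {
                // Free space: put k at position i and add one more null child.
                node->keys.insert(node->keys.begin() + i, k);
                node->children.insert(node->children.begin() + i,
                                      std::unique_ptr<Node>());
            } else {
                // Node is full: k goes into a new child at position i.
                node->children[i] = makeNode({k});
            }
            return true;
        }
        node = node->children[i].get();  // keep descending
    }
}
```

On a seven-way tree shaped like the example, inserting 31 turns the node holding 32 into [31,32], while inserting 65 leaves the full six-key node unchanged and hangs a new node [65] between 60 and 70.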

Deleting from an m-way search tree

  • Method:
    • If the element has no adjacent children (both neighboring children are empty), delete it directly
    • If it has a non-empty adjacent child (left or right), replace the deleted element with the largest element of the left child or the smallest element of the right child, whichever exists
  • For example, to delete key 20:
    • This element has no adjacent children, so it can be deleted directly. The node becomes [30,40,50,60,70]
  • For example, to delete key 84:
    • This element has no adjacent children either, so it can also be deleted directly. The node becomes [82,86,88]
  • For example, to delete key 5:
    • 5 is the only element in its node, so simply deleting it would leave the node empty
    • Its left child is not empty, however, so the largest element of the left child (4) can replace the deleted element
  • For example, to delete key 10:
    • This element has two adjacent children
    • It can be replaced by the largest element of the left child (5)
    • It can also be replaced by the smallest element of the right child (20)
    • If the element that moves up has adjacent children of its own, the same replacement must be applied recursively in the child node
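
The deletion rules can be sketched as a short recursive C++ function. This is an illustration, not a definitive implementation: `Node`, `makeNode`, and `eraseKey` are hypothetical names, the sketch prefers the left neighbor when both adjacent children exist, and it discards a node once its last element has been removed (a choice the text does not spell out).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node: keys ascending, children.size() == keys.size() + 1,
// a null child stands for an external node.
struct Node {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::vector<int> k)
        : keys(std::move(k)), children(keys.size() + 1) {
        assert(!keys.empty());
    }
};

std::unique_ptr<Node> makeNode(std::vector<int> keys) {
    return std::unique_ptr<Node>(new Node(std::move(keys)));
}

// Removes k from the subtree owned by node; returns false if k is absent.
bool eraseKey(std::unique_ptr<Node>& node, int k) {
    if (!node) return false;  // fell off the tree: k is not present
    std::vector<int>& keys = node->keys;
    std::vector<std::unique_ptr<Node>>& kids = node->children;
    std::size_t i = static_cast<std::size_t>(
        std::lower_bound(keys.begin(), keys.end(), k) - keys.begin());
    if (i == keys.size() || keys[i] != k) return eraseKey(kids[i], k);

    if (kids[i]) {
        // Replace with the largest key of the left neighbor subtree.
        const Node* p = kids[i].get();
        while (p->children.back()) p = p->children.back().get();
        keys[i] = p->keys.back();
        eraseKey(kids[i], keys[i]);  // remove the key that moved up
    } else if (kids[i + 1]) {
        // Replace with the smallest key of the right neighbor subtree.
        const Node* p = kids[i + 1].get();
        while (p->children.front()) p = p->children.front().get();
        keys[i] = p->keys.front();
        eraseKey(kids[i + 1], keys[i]);
    } else {
        // No adjacent children: delete directly.
        keys.erase(keys.begin() + i);
        kids.erase(kids.begin() + i);
        if (keys.empty()) node.reset();  // drop a node with no elements left
    }
    return true;
}
```

On a tree shaped like the example, deleting 10 from the root moves 5 up, which in turn moves 4 up into the node that used to hold 5, mirroring the recursive case described above.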

Height of an m-way search tree

  • An m-way search tree of height h (not counting external nodes) has at least h elements (one node per level and one element per node) and at most m^{h}-1 elements
  • The upper bound is reached when every node on levels 1 through h-1 has m children and the nodes on level h have no children. The number of nodes is then 1+m+m^{2}+......+m^{h-1} = (m^{h}-1)/(m-1), and each node holds at most m-1 elements, so the number of elements is m^{h}-1
  • Since an m-way search tree of height h has between h and m^{h}-1 elements, an m-way search tree with n elements has a height between log_{m}(n+1) and n
  • For example, a 200-way search tree of height 5 can hold as many as 32*10^{10}-1 elements and as few as 5. Equivalently, a 200-way search tree containing 32*10^{10}-1 elements has a height between 5 and 32*10^{10}-1
  • When the search tree is stored on disk, the search, insert, and delete times depend on the number of disk accesses (assuming that each node is no larger than a disk block). For a tree of height h this number is O(h), so to reduce disk accesses the height must be kept close to log_{m}(n+1). This is what balanced m-way search trees provide
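
The bounds can be checked numerically with a few lines of C++. The function names are hypothetical, and the sketch assumes that m^h fits in 64 bits (there is no overflow check).

```cpp
#include <cassert>
#include <cstdint>

// Largest number of elements in an m-way search tree of height h: m^h - 1.
// Assumes m^h fits in 64 bits.
std::uint64_t maxElements(std::uint64_t m, unsigned h) {
    assert(m >= 2);
    std::uint64_t power = 1;
    for (unsigned i = 0; i < h; ++i) power *= m;
    return power - 1;
}

// Smallest number of elements: one node per level, one element per node.
std::uint64_t minElements(unsigned h) { return h; }

// Smallest possible height of an m-way search tree with n elements:
// the least h with m^h - 1 >= n, i.e. the ceiling of log_m(n + 1).
unsigned minHeight(std::uint64_t m, std::uint64_t n) {
    assert(m >= 2);
    unsigned h = 0;
    std::uint64_t capacity = 0;  // m^h - 1 for the current h
    while (capacity < n) {
        capacity = capacity * m + (m - 1);  // (m^h - 1) * m + (m - 1) = m^(h+1) - 1
        ++h;
    }
    return h;
}
```

For m = 200 and h = 5, `maxElements` gives 200^5 - 1 = 32*10^{10}-1 and `minHeight` maps that count back to 5, while the largest possible height for n elements is simply n.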

Three, B-trees of order m

  • A B-tree of order m is an m-way search tree. If the B-tree is not empty, the corresponding extended tree satisfies the following rules:
    • ① The root has at least two children
    • ② Every internal node other than the root has at least ⌈m/2⌉ children
    • ③ All external nodes are on the same level

Full binary trees

  • In a B-tree of order 2, no internal node has more than two children, and every internal node has at least two children, so every internal node of a B-tree of order 2 has exactly two children
  • Because all external nodes are also on the same level, a B-tree of order 2 is a full binary tree
  • Thus such a tree exists only when the number of elements is 2^{h}-1 for some integer h
  • The figure shows a B-tree of order 7:
    • The root has three children, satisfying rule ①
    • Every other internal node has at least 3 elements, i.e. at least ⌈7/2⌉ = 4 children, satisfying rule ②
    • All external nodes are on the same level, satisfying rule ③
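
The three rules can be turned into a small structural check. The C++ sketch below uses a hypothetical `Node` layout (a null child stands for an external node) and hypothetical names `checkSubtree` and `isBTree`. It verifies only the child counts and the level of the external nodes, not the ordering of the keys.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node: keys ascending, children.size() == keys.size() + 1,
// a null child stands for an external node.
struct Node {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::vector<int> k)
        : keys(std::move(k)), children(keys.size() + 1) {
        assert(!keys.empty());
    }
};

std::unique_ptr<Node> makeNode(std::vector<int> keys) {
    return std::unique_ptr<Node>(new Node(std::move(keys)));
}

// Returns the number of internal levels between node and the external nodes
// below it, or -1 if some rule is violated in this subtree.
int checkSubtree(const Node* node, std::size_t m, bool isRoot) {
    if (node == nullptr) return 0;  // external node
    std::size_t kids = node->children.size();
    std::size_t minKids = isRoot ? 2 : (m + 1) / 2;  // rule 1, or ceil(m/2) for rule 2
    if (kids > m || kids < minKids || kids != node->keys.size() + 1) return -1;
    int depth = -2;  // -2 means "not set yet"
    for (std::size_t i = 0; i < kids; ++i) {
        int d = checkSubtree(node->children[i].get(), m, false);
        if (d < 0) return -1;
        if (depth == -2) depth = d;
        else if (d != depth) return -1;  // rule 3: external nodes on different levels
    }
    return depth + 1;
}

// An empty tree counts as a B-tree.
bool isBTree(const Node* root, std::size_t m) {
    return checkSubtree(root, m, true) >= 0;
}
```

A root [10,80] with three leaf children of 3, 6, and 4 keys passes the check for order 7; hanging one extra node under a single leaf breaks rules ② and ③, so the check then fails.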

  • Figure: a B-tree of order 2 (a full binary tree)


Origin blog.csdn.net/qq_41453285/article/details/104208309