Getting Started with Java Collections Series (XI): Hashtable Source Code Analysis

Foreword

In the previous section we implemented a hashing algorithm ourselves and resolved collisions in two ways: open addressing and separate chaining. In this section we analyze the Hashtable source code in detail, see which approach it uses to resolve collisions, compare it with our own implementation, and see what we could improve.

Hashtable source code analysis

We start from a small console program that instantiates a Hashtable and adds a key, and analyze what happens behind the scenes:

 public static void main(String[] args) {

        Hashtable hashtable = new Hashtable();
        hashtable.put(-100, "first");
 }

First, what work is done behind the scenes when we initialize a Hashtable?

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable {

    // stores the key-value data
    private transient Entry<?,?>[] table;

    // number of entries stored
    private transient int count;

    // threshold: (int)(capacity * loadFactor)
    private int threshold;

    // load factor: a time/space trade-off, 0.75 by default. A higher value reduces space overhead but increases the time cost of looking up an element
    private float loadFactor;

    // constructor with specified capacity and load factor
    public Hashtable(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: "+loadFactor);

        if (initialCapacity==0)
            initialCapacity = 1;
            
        this.loadFactor = loadFactor;
        
        table = new Entry<?,?>[initialCapacity];
        
        // with the defaults (capacity 11, load factor 0.75) the threshold is 8
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    } 

    // constructor with specified capacity
    public Hashtable(int initialCapacity) {
        this(initialCapacity, 0.75f);
    }

    // default no-argument constructor (initial capacity 11, load factor 0.75f)
    public Hashtable() {
        this(11, 0.75f);
    }


    private static class Entry<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Entry<K,V> next;

        protected Entry(int hash, K key, V value, Entry<K,V> next) {
            this.hash = hash;
            this.key =  key;
            this.value = value;
            this.next = next;
        }
    }
}

Hashtable stores its data in an internal Entry array, and the chained structure of Entry (each entry holds a next reference) shows that it resolves hash collisions by separate chaining. When neither the capacity nor the load factor is specified, the default initial capacity is 11, the load factor is 0.75, and the threshold is 8, that is (int)(11 * 0.75). If the specified capacity is less than 0, an exception is thrown. If the capacity is 0, it is set to 1, which gives a threshold of 0 with the default load factor. Otherwise the threshold is the specified capacity * 0.75, or the specified capacity * the specified load factor when a load factor is given.
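
The threshold rule above can be checked with a few lines of Java. This is only a sketch that mirrors the formula from the constructor; the class and method names (ThresholdDemo, threshold) are made up for illustration, and the value of MAX_ARRAY_SIZE is copied in as an assumption because the excerpt above does not show its definition.

```java
// Illustrative sketch, not part of the JDK: mirrors the threshold formula in the constructor.
class ThresholdDemo {
    // Assumption: the JDK source defines this constant as Integer.MAX_VALUE - 8.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int threshold(int initialCapacity, float loadFactor) {
        // A capacity of 0 is bumped to 1, as in the constructor.
        if (initialCapacity == 0) {
            initialCapacity = 1;
        }
        return (int) Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(11, 0.75f)); // defaults: 8
        System.out.println(threshold(0, 0.75f));  // capacity 0 becomes 1: 0
        System.out.println(threshold(16, 0.5f));  // custom load factor: 8
    }
}
```

A threshold of 0 simply means that the very first insertion already triggers an expansion.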

These conclusions follow directly from the definitions above, so we will not discuss them further. Next, when we add the key-value pair from our example, how does the source code handle it internally?

public synchronized V put(K key, V value) {
        if (value == null) {
            throw new NullPointerException();
        }

        Entry<?,?> tab[] = table;

        int hash = key.hashCode();
       
        int index = (hash & 0x7FFFFFFF) % tab.length;
        
        Entry<K,V> entry = (Entry<K,V>)tab[index];
        
        for(; entry != null ; entry = entry.next) {
            if ((entry.hash == hash) && entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }

        addEntry(hash, key, value, index);
        
        return null;
    }    

Let's go through it step by step. First, if the value is null, an exception is thrown. Next, the hash value of the key is obtained. The important part comes now: what does the following code fragment do?

 int index = (hash & 0x7FFFFFFF) % tab.length;

An array index cannot be negative, so the bitwise AND turns the key's hash value into a non-negative number; in essence it guarantees a valid index. How is int index = (hash & 0x7FFFFFFF) % tab.length; calculated? In binary, 0x7FFFFFFF is thirty-one 1 bits (1111111111111111111111111111111); as a 32-bit value its sign bit is 0, i.e. 01111111111111111111111111111111. The key we added is -100, whose hash value is also -100, in binary 11111111111111111111111110011100. ANDing the two gives 01111111111111111111111110011100, which is 2147483548 in decimal. That is the principle; we can also do the arithmetic in decimal: 0x7FFFFFFF is 2147483647, and subtracting (100 - 1), i.e. 99, again gives 2147483548. Finally, taking this result modulo the initial capacity 11 gives index 1. If the key's hash value is already non-negative there is nothing to fix: the AND operation returns the original value. Next the code fetches the entry at that index in the array and then loops. Why loop at this point? That is, the following code fragment:

       for(; entry != null ; entry = entry.next) {
            if ((entry.hash == hash) && entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }

The loop handles the case where the same key already exists, in which case its value is overwritten. Still not clear? Let's add one more line to the console program:

public static void main(String[] args) {

        Hashtable hashtable = new Hashtable();
        
        hashtable.put(-100, "first");

        hashtable.put(-100, "second");
}

Both keys are -100, so by our analysis of the loop, the second call replaces the value "first" with "second". In other words, when the same key is added again, the later value overwrites the earlier one. We can also tell from the return value: null means nothing was overwritten, while a non-null return value means the key already existed and the returned value is the one that was overwritten. This differs from Hashtable in C#, where adding a key that already exists throws an exception. We can confirm the result by printing the contents of the Hashtable:

        Enumeration keys = hashtable.keys();

        while (keys.hasMoreElements()) {

            Object key =  keys.nextElement();

            String values = (String) hashtable.get(key);
            System.out.println(key + "------>" + values);
        }
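
The index arithmetic and the overwrite behaviour described above can be verified with a short program. This is a minimal sketch; the class name IndexAndOverwriteDemo and the helper bucketIndex are hypothetical names introduced here, and only the expression inside bucketIndex comes from the put() source.

```java
import java.util.Hashtable;

// Illustrative sketch: reproduces the index calculation from put() and shows put()'s return value.
class IndexAndOverwriteDemo {
    // Same expression as in put(): clear the sign bit, then take the remainder.
    static int bucketIndex(Object key, int capacity) {
        return (key.hashCode() & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        int hash = Integer.valueOf(-100).hashCode();   // an Integer's hash code is its own value
        System.out.println(hash & 0x7FFFFFFF);          // 2147483548
        System.out.println(bucketIndex(-100, 11));      // 1

        Hashtable<Integer, String> hashtable = new Hashtable<>();
        System.out.println(hashtable.put(-100, "first"));  // null: nothing was overwritten
        System.out.println(hashtable.put(-100, "second")); // first: the overwritten value
        System.out.println(hashtable.get(-100));           // second
    }
}
```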

We are not finished yet. Let's continue with the following code, which adds the key-value pair to the array:

private void addEntry(int hash, K key, V value, int index) {
        modCount++;

        // local reference to the storage array
        Entry<?,?> tab[] = table;

        // if the number of entries has reached the threshold, expand the array
        if (count >= threshold) {
            rehash();

            // the array has changed, so recompute the index
            tab = table;
            hash = key.hashCode();
            index = (hash & 0x7FFFFFFF) % tab.length;
        }

        // store the hash, key and value in a new entry at the head of the bucket
        Entry<K,V> e = (Entry<K,V>) tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        count++;
}

Before adding data to the storage array, the code has to check whether the threshold has been reached; in the end this is about expanding the hash table. So what does the implementation look like?

protected void rehash() {

        // get the current capacity of the storage array
        int oldCapacity = table.length;
        Entry<?,?>[] oldMap = table;

        // new capacity = current capacity * 2 + 1
        int newCapacity = (oldCapacity << 1) + 1;

        // if the new capacity exceeds the maximum array size, cap it at the maximum
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE)
                return;
            newCapacity = MAX_ARRAY_SIZE;
        }
        Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

        modCount++;

        // recalculate the threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);

        // switch to the expanded storage array
        table = newMap;

        // walk the old array and relink every entry into the expanded array
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
                Entry<K,V> e = old;
                old = old.next;

                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = (Entry<K,V>)newMap[index];
                newMap[index] = e;
            }
        }
}

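To get a feel for the growth rule in rehash(), the sketch below prints the first few capacities and thresholds starting from the default capacity of 11. It is a simplified illustration: the names GrowthDemo, nextCapacity and thresholdFor are hypothetical, and the MAX_ARRAY_SIZE cap is deliberately left out because these small values never reach it.

```java
// Illustrative sketch: the capacity sequence produced by newCapacity = oldCapacity * 2 + 1.
class GrowthDemo {
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity << 1) + 1; // same expression as in rehash()
    }

    static int thresholdFor(int capacity, float loadFactor) {
        // Simplification: ignores the MAX_ARRAY_SIZE cap applied by the real code.
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 11;
        for (int step = 0; step < 4; step++) {
            System.out.println(capacity + " -> threshold " + thresholdFor(capacity, 0.75f));
            capacity = nextCapacity(capacity);
        }
        // prints capacities 11, 23, 47, 95 with thresholds 8, 17, 35, 71
    }
}
```

Because the bucket index depends on the table length, every entry has to be relinked after an expansion, which is what the nested loops in rehash() do.
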
The comments above should make this clear. Next we add the following code to the console program:

public static void main(String[] args) {

        Hashtable hashtable = new Hashtable();

        hashtable.put(-100, "first");

        hashtable.put(-100, "second");

        hashtable.put("Aa", "third");
        hashtable.put("BB", "fourth");

        Enumeration keys = hashtable.keys();

        while (keys.hasMoreElements()) {

            Object key =  keys.nextElement();

            String values = (String) hashtable.get(key);
            System.out.println(key + "------>" + values);
        }
}

After adding these two lines, what do we expect to be printed? In the output, key BB appears before key Aa.

Hey, there seems to be a small problem: we clearly added key Aa first, so shouldn't Aa be printed first? Why does BB come before it? There is no need to puzzle over it. The main reason is that keys Aa and BB have the same hash value. If you don't believe it, print the two hash values; both are 2112:

 System.out.println("Aa".hashCode());
 System.out.println("BB".hashCode());
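
The collision can also be worked out by hand. For a two-character string, the documented String.hashCode() formula reduces to 31 * s[0] + s[1], so the sketch below recomputes both values and the resulting bucket index for the default capacity of 11. The class name StringHashDemo and the helper twoCharHash are hypothetical names used only for this illustration.

```java
// Illustrative sketch: why "Aa" and "BB" collide.
class StringHashDemo {
    // String.hashCode() for a two-character string: 31 * first char + second char.
    static int twoCharHash(String s) {
        return 31 * s.charAt(0) + s.charAt(1);
    }

    public static void main(String[] args) {
        System.out.println(twoCharHash("Aa"));              // 31 * 65 + 97 = 2112
        System.out.println(twoCharHash("BB"));              // 31 * 66 + 66 = 2112
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println((2112 & 0x7FFFFFFF) % 11);       // bucket index 0
    }
}
```

Both keys therefore land in the same bucket of the 11-slot array.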

Next, how exactly is an entry stored into the array? We extract the following fragment from the code above:

  Entry<K,V> e = (Entry<K,V>) tab[index];
  tab[index] = new Entry<>(hash, key, value, e);

The explanation lies here. When we resolved collisions with separate chaining in our own hash algorithm, we appended entries whose keys had the same hash value to the tail of a singly linked list. Here it is just the opposite: the entry added later becomes the head of the list, and the previous head becomes its next reference. The entry for key Aa that was already stored at this index is taken out, a new entry for key BB is stored in its place, and its next points to Aa. That is why the keys are printed in that order, and it is worth remembering. So why do it this way? Compared with our implementation in the previous section, the main difference is how the list is built: we looped through the list to reach its tail, whereas the source code simply passes the old head as next in the constructor. The source code's approach performs better because it avoids the traversal. Now let's look at the remove method. We add the following code to the console program:

hashtable.remove("Aa");

Let's look at how the source code implements removal:

public synchronized V remove(Object key) {

    // local reference to the storage array
    Entry<?,?> tab[] = table;

    // compute the hash value of the key
    int hash = key.hashCode();

    // compute the bucket index of the key
    int index = (hash & 0x7FFFFFFF) % tab.length;

    // get the head entry stored at that index
    Entry<K,V> e = (Entry<K,V>)tab[index];

    for (Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {

        // if this entry matches the key, remove it; otherwise move on to the next entry
        if ((e.hash == hash) && e.key.equals(key)) {

            modCount++;

            // the entry is not the head of the list: unlink it from its predecessor
            if (prev != null) {
                prev.next = e.next;
            } else {
                // the entry is the head of the list: its successor becomes the new head
                tab[index] = e.next;
            }

            // decrement the element count
            count--;

            // keep the removed value so it can be returned
            V oldValue = e.value;

            // clear the value of the removed entry
            e.value = null;

            return oldValue;
        }
    }
    return null;
}

With this analysis in hand, consider removing key Aa. The head of the list at this point is BB, so the loop advances to the next entry and then enters the inner if statement (prev != null). If we remove key BB instead, it is the head of the list, so prev is null and the else branch performs the removal. This concludes our analysis of the Hashtable source. Other operations, such as getting the value for a key or checking whether a key is contained in the storage array, are relatively simple and are not described here.
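
The head insertion and the two removal branches can be reproduced with a stripped-down bucket. This is a minimal sketch under simplifying assumptions, not the JDK implementation: ChainDemo and Node are hypothetical names, there is a single chain instead of an array of buckets, and hashing and synchronization are left out.

```java
// Illustrative sketch: one bucket's chain with head insertion and prev-based removal.
class ChainDemo {
    static final class Node {
        final String key;
        String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    Node head;

    // Like addEntry(): the new node becomes the head and the old head becomes its next.
    void addFirst(String key, String value) {
        head = new Node(key, value, head);
    }

    // Like remove(): unlink via prev, or move the head when prev is null.
    String remove(String key) {
        for (Node prev = null, e = head; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev != null) {
                    prev.next = e.next;
                } else {
                    head = e.next;
                }
                return e.value;
            }
        }
        return null;
    }

    // Keys from head to tail, joined with " -> ".
    String keys() {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainDemo bucket = new ChainDemo();
        bucket.addFirst("Aa", "third");
        bucket.addFirst("BB", "fourth");
        System.out.println(bucket.keys());       // BB -> Aa
        System.out.println(bucket.remove("Aa")); // third (prev != null branch)
        System.out.println(bucket.keys());       // BB
        System.out.println(bucket.remove("BB")); // fourth (head branch)
    }
}
```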

Summary

In this section we analyzed the Hashtable source code in detail. Hashtable resolves hash collisions with separate chaining; when a collision occurs, the new entry is stored at the head of the singly linked list and the previous head becomes its next reference. Hashtable does not allow null keys or null values. Its methods are declared with the synchronized keyword, so Hashtable is thread-safe. The default initial capacity is 11 and the default load factor is 0.75f. Why 0.75f? If collisions happen very frequently, operations on elements slow down, because knowing the index is no longer enough and the list has to be traversed to find the stored element. Reducing the number of collisions is therefore very important. The larger the array, the smaller the chance of a collision, and the load factor determines the balance between performance and array size: with a load factor of 0.75, the array is expanded once the number of entries reaches 75% of its capacity, and the expansion is performed by the rehash() method. In the next section we will look further at hashCode and equals, how hashCode is computed, and the HashMap source code. Thank you for reading, and see you in the next section.

Origin www.cnblogs.com/CreateMyself/p/11525112.html