ArrayMap and SparseArray: better Maps on Android

Comparison of three containers

Item | ArrayMap | SparseArray | HashMap
Concept | Memory-optimized alternative to HashMap | Performance-optimized variant of ArrayMap whose key is an int | A Map with O(1) access that uses more memory
Data structure | Two arrays: one stores the hashes of the keys, the other stores the keys and values | Two arrays: one for keys, the other for values | Array + linked list / red-black tree
Use case | 1. Fewer than one thousand entries; 2. The data structure contains Maps | 1. Fewer than one thousand entries; 2. The key must be an int | When the first two are not suitable

  • None of the three is thread-safe.

ArrayMap

data structure

Two arrays:

  1. mHashes: store the hash value of the key
  2. mArray: an Object array; each even index stores a key and the following odd index stores its value
mHashes[index] = hash;
// Bit shifts are used instead of multiplication
mArray[index<<1] = key;  // same as mArray[index * 2] = key;
mArray[(index<<1)+1] = value; // same as mArray[index * 2 + 1] = value;

So the size of mArray is twice the size of mHashes.

storage logic

get()

  1. Compute the hash of the key (if the key is null, use 0 directly)
  2. Find the index of that hash in the mHashes array
  3. If index < 0, the key was not found, so return null; if index >= 0, read the value from mArray and return it

The second step is the most expensive one, and it is implemented as a binary search.
Therefore, the lookup time complexity of ArrayMap is O(log n).
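
The lookup steps above can be sketched in plain Java. This is a simplified illustration, not the AOSP class: the name ArrayMapGetSketch and the sample data are made up, the caller passes the hash directly, and hashes are assumed to be unique so collision probing is left out.

```java
import java.util.Arrays;

// Hypothetical, simplified model of ArrayMap's two-array layout (not the AOSP class).
class ArrayMapGetSketch {
    // Sorted key hashes; entry i owns slots 2*i (key) and 2*i+1 (value) of ARRAY.
    static final int[] HASHES = {10, 20, 30};
    static final Object[] ARRAY = {"k10", "a", "k20", "b", "k30", "c"};

    // Assumption: hashes are unique, so no forward/backward collision scan is needed.
    static Object get(int hash) {
        int index = Arrays.binarySearch(HASHES, hash); // O(log n) search over the hashes
        if (index < 0) {
            return null;                               // negative result: hash not present
        }
        return ARRAY[(index << 1) + 1];                // the value sits right after its key
    }

    public static void main(String[] args) {
        System.out.println(get(20)); // prints b
        System.out.println(get(25)); // prints null
    }
}
```

Note that index 0 is a valid hit, which is why the check is index < 0 rather than index <= 0.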

put()

  1. Compute the hash of the key (if the key is null, use 0 directly)
  2. Find the index of that hash in the mHashes array
  3. If index >= 0, the key already exists in mArray; just replace its value with the new one.
  4. If index < 0, the key does not exist yet. Take the bitwise complement (~index) to get the position at which to insert.
  5. Decide whether the arrays need to grow, based on the current size
  6. Shift the array contents to free up the slot at index
  7. Store the new key and value at index
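
A minimal sketch of these seven steps follows. It is an illustration under stated assumptions rather than the real implementation: the class ArrayMapPutSketch is hypothetical, keys are Integers whose hash is the value itself (so collisions cannot occur), and growth uses Arrays.copyOf instead of the cached allocation described later.

```java
import java.util.Arrays;

// Hypothetical mini map illustrating the put() steps; not the AOSP implementation.
// Assumption: keys are Integers and the hash is the int value, so hashes never collide.
class ArrayMapPutSketch {
    int[] hashes = new int[4];
    Object[] array = new Object[8];
    int size = 0;

    Object put(Integer key, Object value) {
        int hash = key;                                          // step 1
        int index = Arrays.binarySearch(hashes, 0, size, hash);  // step 2
        if (index >= 0) {                                        // step 3: replace in place
            Object old = array[(index << 1) + 1];
            array[(index << 1) + 1] = value;
            return old;
        }
        index = ~index;                                          // step 4: insertion position
        if (size >= hashes.length) {                             // step 5: grow when full
            int n = size >= 8 ? size + (size >> 1) : 8;
            hashes = Arrays.copyOf(hashes, n);
            array = Arrays.copyOf(array, n << 1);
        }
        // step 6: shift the tail one entry to the right to open the slot
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        hashes[index] = hash;                                    // step 7: store the entry
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
        size++;
        return null;
    }
}
```

Inserting 30, 10, 20 in that order leaves hashes sorted as 10, 20, 30, because each insert lands at the position reported by the binary search.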

Key point: finding the index with indexOf()

int indexOf(Object key, int hash) {
        final int N = mSize;

        // Important fast case: if the map is empty, return immediately
        if (N == 0) {
            return ~0; // ~0 is -1, the largest negative int: not found, insert at position 0
        }
        // Binary search; on a miss it returns the complement of the insertion index
        int index = binarySearchHashes(mHashes, N, hash);

        // Negative: the hash is not stored yet, so the key is absent
        if (index < 0) {
            return index;
        }

        // index >= 0 and the key at that index matches: found it
        if (key.equals(mArray[index<<1])) {
            return index;
        }
        // Hash collision: scan neighbouring entries with the same hash
        // Search forward
        int end;
        for (end = index + 1; end < N && mHashes[end] == hash; end++) {
            if (key.equals(mArray[end << 1])) return end;
        }

        // Search backward
        for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
            if (key.equals(mArray[i << 1])) return i;
        }

        // Not found anywhere: end is the insertion position, returned as its complement
        return ~end;
    }

caching mechanism

  • In many Android scenarios the amount of data is small at first. To avoid frequently creating and garbage-collecting array objects, ArrayMap maintains two cache pools, mBaseCache and mTwiceBaseCache.
  • The cache is only used for capacities 4 and 8, so new ArrayMap(4) or new ArrayMap(8) is the best way to minimize object creation.
  • What gets cached and reused are the arrays of capacity 4 or 8.
  • Implementation idea: chain the cached arrays into a singly linked list. Adding to the cache inserts a node at the head of the list, and reusing an array removes the node at the head.
private static final int BASE_SIZE = 4;

Adding to the cache

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
    if (hashes.length == (BASE_SIZE * 2)) {  // cache the hashes and array of a size-8 map
        synchronized (ArrayMap.class) {
            // Only add to the size-8 pool while it holds fewer than 10 entries
            if (mTwiceBaseCacheSize < CACHE_SIZE) {
                // Treat array[0] as the "next" pointer of a linked-list node: link to the existing cache
                array[0] = mTwiceBaseCache;
                // Store hashes inside the node
                array[1] = hashes;
                for (int i = (size << 1) - 1; i >= 2; i--) {
                    // Clear the data that is no longer needed
                    array[i] = null;
                }
                // Point the head at array, completing the insertion at the head of the list
                mTwiceBaseCache = array;
                mTwiceBaseCacheSize++;
            }
        }
    } else if (hashes.length == BASE_SIZE) {  // same logic for size 4
        synchronized (ArrayMap.class) {
            if (mBaseCacheSize < CACHE_SIZE) {
                array[0] = mBaseCache;
                array[1] = hashes;
                for (int i = (size << 1) - 1; i >= 2; i--) {
                    array[i] = null;
                }
                mBaseCache = array;
                mBaseCacheSize++;
            }
        }
    }
}

[Figure: 3-tier cache structure]

When freeArrays() is triggered:

Trigger | Reason
removeAt() removes the last element | The arrays are no longer needed, so try to recycle them
clear() is called | Same as above
ensureCapacity() is called while the current capacity is less than the requested capacity | The old arrays are released once larger ones have been allocated
put() is called while the arrays are full | Same as above

Reusing the cache

private void allocArrays(final int size) {
    if (size == (BASE_SIZE*2)) {  // when allocating size 8, check the cache pool first
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCache != null) { // the pool is not empty
                final Object[] array = mTwiceBaseCache;
                // Take mArray from the pool
                mArray = array;
                // Remove the head node: the old head's next becomes the new head
                mTwiceBaseCache = (Object[])array[0];
                // Take mHashes out of the old head node
                mHashes = (int[])array[1];
                // Clear the bookkeeping slots of the old head node
                array[0] = array[1] = null;
                mTwiceBaseCacheSize--;  // one fewer entry in the pool
                return;
            }
        }
    } else if (size == BASE_SIZE) { // same logic for size 4
        synchronized (ArrayMap.class) {
            if (mBaseCache != null) {
                final Object[] array = mBaseCache;
                mArray = array;
                mBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mBaseCacheSize--;
                return;
            }
        }
    }

    // For any size other than 4 or 8, or when the pool is empty, create new arrays
    mHashes = new int[size];
    mArray = new Object[size<<1];
}
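
To see the linked-list trick in isolation, here is a small stand-alone sketch of a single pool for size-4 arrays. The class ArrayPoolSketch, its Pair holder, and the method names free/alloc are invented for this illustration; only the idea of threading the list through array[0] and parking the hashes in array[1] comes from the code above.

```java
import java.util.Arrays;

// Hypothetical stand-alone model of one cache pool; not AOSP code.
class ArrayPoolSketch {
    static final int BASE_SIZE = 4;
    static final int CACHE_SIZE = 10;  // maximum number of pooled array pairs
    static Object[] head;              // plays the role of mBaseCache
    static int pooled;                 // plays the role of mBaseCacheSize

    // Small holder for the two arrays handed back by alloc().
    static final class Pair {
        final int[] hashes;
        final Object[] array;
        Pair(int[] hashes, Object[] array) { this.hashes = hashes; this.array = array; }
    }

    // Push onto the head: array[0] links to the previous head, array[1] keeps the hashes.
    static synchronized void free(int[] hashes, Object[] array) {
        if (pooled >= CACHE_SIZE) return;  // pool is full: leave the arrays to the GC
        Arrays.fill(array, null);          // drop stale keys and values
        array[0] = head;
        array[1] = hashes;
        head = array;
        pooled++;
    }

    // Pop from the head, or create fresh arrays when the pool is empty.
    static synchronized Pair alloc() {
        if (head == null) {
            return new Pair(new int[BASE_SIZE], new Object[BASE_SIZE << 1]);
        }
        Object[] array = head;
        head = (Object[]) array[0];        // the next node becomes the new head
        int[] hashes = (int[]) array[1];
        array[0] = array[1] = null;        // hand back a clean array
        pooled--;
        return new Pair(hashes, array);
    }
}
```

Because both operations work on the head, the pool behaves like a stack: the most recently freed arrays are the first to be reused, and no extra node objects are allocated.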

[Figure: data structure of the 2-tier cache]

When allocArrays() is triggered:

Trigger | Reason
The ArrayMap constructor runs | Allocate the initial arrays
removeAt() is called and the shrink condition is met | Try to reuse cached arrays
ensureCapacity() is called while the current capacity is less than the requested capacity | Allocate larger arrays
put() is called while the arrays are full | Same as above

capacity adjustment mechanism

expansion

  • put() checks whether the arrays are full and grows them if so
public V put(K key, V value) {
...
    final int osize = mSize;
    // Grow once mSize reaches the length of the mHashes array
    if (osize >= mHashes.length) {
        // Below 4, grow to 4; below 8, grow to 8; otherwise grow to 1.5x (osize + (osize >> 1))
        final int n = osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
        allocArrays(n);
    }
}
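
As a worked example, the growth expression from put() can be lifted into a stand-alone method and applied repeatedly. The class and method names here are hypothetical; the expression itself is copied from the snippet above.

```java
// Hypothetical helper that isolates the growth rule shown in put().
class ArrayMapGrowthSketch {
    static final int BASE_SIZE = 4;

    // Capacity chosen when a map holding osize entries is full.
    static int grownCapacity(int osize) {
        return osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    public static void main(String[] args) {
        int capacity = 0;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            capacity = grownCapacity(capacity);  // fill up, then grow again
            sb.append(capacity).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints 4 8 12 18 27 40
    }
}
```

The first two steps land exactly on the two cached sizes (4 and 8), which is what lets small maps benefit from the cache pools.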

Shrinking

public V removeAt(int index) {
    final int osize = mSize;
    // With more than one entry, decide whether the arrays should be tightened
    if (osize > 1) {
        // Shrink when the capacity (mHashes.length) exceeds 8 and fewer than 1/3 of the slots are in use
        if (mHashes.length > (BASE_SIZE * 2) && mSize < mHashes.length / 3) {
            // If the size is 8 or less, use 8; otherwise request 1.5x the size
            final int n = osize > (BASE_SIZE * 2) ? (osize + (osize >> 1)) : (BASE_SIZE * 2);
            allocArrays(n);
        }
    }
}
  • When fewer than 1/3 of the array slots are in use, the arrays shrink to 1.5 times the size, which is at most half of the old capacity ($\frac{1}{3} \times 1.5 = \frac{1}{2}$), but never below 8.
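
The shrink rule can be checked with a few numbers. The sketch below copies the condition and the size expression from removeAt() into a hypothetical helper, capacityAfterRemove, that returns the capacity the map would end up with.

```java
// Hypothetical helper that isolates the shrink rule shown in removeAt().
class ArrayMapShrinkSketch {
    static final int BASE_SIZE = 4;

    // Returns the new capacity, or the unchanged capacity when no shrink is triggered.
    static int capacityAfterRemove(int capacity, int size) {
        if (capacity > (BASE_SIZE * 2) && size < capacity / 3) {
            return size > (BASE_SIZE * 2) ? (size + (size >> 1)) : (BASE_SIZE * 2);
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterRemove(40, 12)); // prints 18 (12 < 13, so shrink to 12 * 1.5)
        System.out.println(capacityAfterRemove(40, 13)); // prints 40 (13 is not below 40 / 3)
        System.out.println(capacityAfterRemove(12, 3));  // prints 8  (small maps settle at 8)
    }
}
```

In the first case the capacity drops from 40 to 18, less than half; in the last case the floor of 8 applies, so the reduction is smaller than 50%.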

Parameter mIdentityHashCode

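  • The default is false, which allows equal hash values to appear in mHashes.
  • Implementation:
    public V put(K key, V value) {
        ...
        // If true, use the system identity hash code, which is derived from the object itself
        // and normally differs between objects; if false, call the key's own (possibly
        // overridden) hashCode(), so that equal objects map to the same key.
        hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
        index = indexOf(key, hash);
        ...
}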
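
The difference between the two hash sources is easy to observe with two equal but distinct String objects. This is only an illustration of the standard Java calls used in put(); the helper hashFor is hypothetical.

```java
// Hypothetical demo of the two hash sources selected by mIdentityHashCode.
class IdentityHashSketch {
    static int hashFor(Object key, boolean identity) {
        return identity ? System.identityHashCode(key) : key.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("key");
        String b = new String("key");  // equal content, different object

        // String.hashCode() depends only on the characters, so this prints true.
        System.out.println(hashFor(a, false) == hashFor(b, false));

        // Identity hashes usually differ between distinct objects,
        // but the JVM does not guarantee that they are unique.
        System.out.println(hashFor(a, true) == hashFor(b, true));
    }
}
```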

SparseArray

  • The key of a SparseArray must be an int.
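  • It is intended for up to a few hundred entries. Because lookups use binary search it is generally slower than HashMap, but at that scale the performance difference is less than 50%, and it saves memory by not boxing the keys.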
  • Internally it uses two arrays, mKeys and mValues, which store the keys and the values at matching indexes.

Expansion mechanism

public static int growSize(int currentSize) {
        // Growth rule: if the current capacity is 4 or less, return 8; otherwise double it
        return currentSize <= 4 ? 8 : currentSize * 2;
    }
  • Consequently, no maximum capacity is imposed.

Delayed deletion mechanism

  • When an entry is deleted from a SparseArray, the arrays are not rearranged. Instead, the value slot is set to the marker DELETED, and the flag mGarbage records that there may be entries whose removal has been deferred (any delete operation sets it to true).
  • Advantage: fewer array operations and better efficiency. For example, frequent deletions never move array elements, and inserting into a slot marked DELETED does not move them either.
// A constant Object that serves as the marker
 private static final Object DELETED = new Object();
  • When inserting an element, if the target slot holds DELETED, it can be overwritten directly without moving any array elements, which is more efficient. Otherwise, if mGarbage is set and the arrays are full, gc() first clears out the deleted entries, and then the array insertion is performed.
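    // Slide the live entries forward, squeezing out the DELETED slots
    private void gc() {
        // n is the number of entries before gc
        int n = mSize;
        int o = 0; // number of live entries seen so far, i.e. the next write position
        int[] keys = mKeys;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            Object val = values[i];
            // Each DELETED entry that is skipped widens the gap i - o by 1
            if (val != DELETED) {
                // Once a gap exists, move the later live keys and values forward
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }

                o++;
            }
        }
        // No garbage remains; o is the new mSize
        mGarbage = false;
        mSize = o;

    }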

[Figure: changes before and after gc()]

access logic

put()

  1. Binary-search mKeys for the index of the key
  2. If index >= 0, the key already exists, so replace its value directly.
  3. If index < 0, the key does not exist. Take ~index as the insertion position; if the value at that position is DELETED, overwrite the slot directly.
  4. Otherwise, run gc() first if needed, recompute the insertion index, and then insert the data into mKeys and mValues

get()

  1. Binary-search mKeys for the index of the key
  2. If index < 0 or the corresponding value is DELETED, the data does not exist, so return null. Otherwise return the value.
  • Because the key of a SparseArray must be an int, there is no need to handle a null key
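
Putting the pieces together, the following sketch models delayed deletion with put(), get(), delete() and gc(). It is a simplified stand-in, not the Android class: SparseArraySketch is a made-up name, the capacity is fixed at 8 (growth is omitted), and gc() runs only when there is garbage and the arrays are full.

```java
import java.util.Arrays;

// Hypothetical fixed-capacity model of SparseArray's delayed deletion; growth is omitted.
class SparseArraySketch {
    static final Object DELETED = new Object();  // tombstone marker
    int[] keys = new int[8];
    Object[] values = new Object[8];
    int size = 0;
    boolean garbage = false;

    Object get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return (i < 0 || values[i] == DELETED) ? null : values[i];
    }

    void delete(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0 && values[i] != DELETED) {
            values[i] = DELETED;  // mark only; no elements are moved
            garbage = true;
        }
    }

    void put(int key, Object value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) { values[i] = value; return; }     // key present: overwrite
        i = ~i;                                        // insertion position
        if (i < size && values[i] == DELETED) {        // reuse a tombstone in place
            keys[i] = key;
            values[i] = value;
            return;
        }
        if (garbage && size >= keys.length) {          // full: compact, then search again
            gc();
            i = ~Arrays.binarySearch(keys, 0, size, key);
        }
        if (size >= keys.length) throw new IllegalStateException("sketch has fixed capacity");
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }

    void gc() {
        int o = 0;                                     // next write position
        for (int i = 0; i < size; i++) {
            Object val = values[i];
            if (val != DELETED) {
                if (i != o) { keys[o] = keys[i]; values[o] = val; values[i] = null; }
                o++;
            }
        }
        garbage = false;
        size = o;
    }
}
```

For example, after putting keys 1, 3 and 5 and deleting 3, a put of key 2 lands on the tombstone left by 3, so the size stays at 3 and no elements are shifted.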



Origin blog.csdn.net/Reven_L/article/details/120310809