An Analysis of Parts of the Java BitSet Source Code

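Copyright notice: this is the blogger's original article, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/lidelin10/article/details/80636186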

1. How BitSet stores its bits

private long[] words;
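
To make the layout concrete, here is a minimal sketch that uses only the public java.util.BitSet API. It assumes what the field above implies: bit i lives in words[i >> 6], at position i & 63 within that long. The class name WordLayoutDemo and the helper wordsFor are made up for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

final class WordLayoutDemo {
    /** Returns the backing words of a BitSet that has exactly the given bits set. */
    static long[] wordsFor(int... bitIndexes) {
        BitSet bits = new BitSet();
        for (int i : bitIndexes) {
            bits.set(i);
        }
        // toLongArray() returns a copy of the words that are in use
        return bits.toLongArray();
    }

    public static void main(String[] args) {
        // bit 0 -> word 0, position 0; bit 65 -> word 1, position 1
        System.out.println(Arrays.toString(wordsFor(0, 65)));   // prints [1, 2]
        // bit 63 is the sign bit of word 0
        System.out.println(Long.toHexString(wordsFor(63)[0]));  // prints 8000000000000000
    }
}
```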

2. Main BitSet constructors and factory methods

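// Simplified: in the JDK this constructor is private and keeps the array it is given;
// the defensive copy is made by valueOf(long[]) before calling it.
public BitSet(long[] longs){
     words=Arrays.copyOf(longs,longs.length);
}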
public static BitSet valueOf(byte[] bytes){
     return BitSet.valueOf(ByteBuffer.wrap(bytes));
}
public static BitSet valueOf(ByteBuffer bb) {
        bb = bb.slice().order(ByteOrder.LITTLE_ENDIAN);
        int n;
        // Strip trailing zero bytes (the most significant ones in little-endian order)
        for (n = bb.remaining(); n > 0 && bb.get(n - 1) == 0; n--)
            ;
        long[] words = new long[(n + 7) / 8];
        // Move the buffer limit so the stripped bytes are excluded
        bb.limit(n);
        int i = 0;
        while (bb.remaining() >= 8)
            words[i++] = bb.getLong();
        // Fewer than 8 bytes left: shift each one into its place in the last word
        for (int remaining = bb.remaining(), j = 0; j < remaining; j++)
            words[i] |= (bb.get() & 0xffL) << (8 * j);
        return new BitSet(words);
    }
        for (int remaining = bb.remaining(), j = 0; j < remaining; j++)
            words[i] |= (bb.get() & 0xffL) << (8 * j);

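I think these two statements deserve a closer look, because they show how a byte is turned into a long. The expression bb.get() & 0xffL widens the byte to a long and keeps only its low 8 bits. I benchmarked this against a plain cast, and the cast was slightly faster, though only by about 1 to 3 ms. The mask is not there for readability, however: byte is signed in Java, so a plain (long) cast sign-extends a negative byte such as (byte) 0x80 into 0xffffffffffffff80L, and OR-ing that into the word would overwrite the higher bytes. The masked value is then shifted left by 8 * j and OR-ed into the word. I also tried replacing 8 * j with j << 3, and the benchmark actually got slower, which puzzled me. I first suspected my CPU (Pentium G4560), but the two expressions compute the same value, so with gaps this small the difference is more likely measurement noise.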
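
The effect of the mask can be checked directly. The sketch below packs bytes into a long in little-endian order twice, once with & 0xffL and once with a plain cast, and compares the result against BitSet.valueOf. BytePackingDemo and its method names are hypothetical helpers written for this illustration, not JDK code.

```java
import java.util.BitSet;

final class BytePackingDemo {
    /** Packs up to 8 bytes little-endian, masking each byte to its low 8 bits. */
    static long packMasked(byte[] bytes) {
        long word = 0L;
        for (int j = 0; j < bytes.length && j < 8; j++) {
            word |= (bytes[j] & 0xffL) << (8 * j);
        }
        return word;
    }

    /** Same loop with a plain cast: negative bytes are sign-extended. */
    static long packCast(byte[] bytes) {
        long word = 0L;
        for (int j = 0; j < bytes.length && j < 8; j++) {
            word |= ((long) bytes[j]) << (8 * j);
        }
        return word;
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x80, 0x01};
        System.out.println(Long.toHexString(packMasked(bytes))); // prints 180
        System.out.println(Long.toHexString(packCast(bytes)));   // prints ffffffffffffff80
        // The masked result matches the word that BitSet.valueOf builds
        System.out.println(BitSet.valueOf(bytes).toLongArray()[0] == packMasked(bytes)); // prints true
    }
}
```

With the cast, the sign bits of the first byte swallow the second byte entirely, which is why the JDK masks before shifting.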

3. Bit operations

    // Index of the element in the long array that holds this bit
    private static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD; // the constant is 6, since a long has 2^6 = 64 bits
    }
     public void set(int fromIndex, int toIndex) {
        checkRange(fromIndex, toIndex);

        if (fromIndex == toIndex)
            return;

        // Increase capacity if necessary
        int startWordIndex = wordIndex(fromIndex);
        int endWordIndex   = wordIndex(toIndex - 1);
        expandTo(endWordIndex);
        //WORD_MASK=0xffffffffffffffffL
        long firstWordMask = WORD_MASK << fromIndex;
        long lastWordMask  = WORD_MASK >>> -toIndex;
        if (startWordIndex == endWordIndex) {
            // Case 1: fromIndex and toIndex fall within the same long
            words[startWordIndex] |= (firstWordMask & lastWordMask);
        } else {
            // Case 2: the range spans several longs
            // Handle the first long
            words[startWordIndex] |= firstWordMask;

            // Handle intermediate words, if any
            for (int i = startWordIndex+1; i < endWordIndex; i++)
                words[i] = WORD_MASK;

            // Handle last word (restores invariants)
            words[endWordIndex] |= lastWordMask;
        }

        checkInvariants();
    }

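This method sets a range of bits to 1. Because toIndex is exclusive, the last affected word is found with wordIndex(toIndex - 1), so a single call can set anywhere from 1 to all 64 bits of one long: set(0, 64), for instance, touches only word 0 and fills it completely.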
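
A quick check with the real java.util.BitSet shows how a range maps onto words. This is only a usage sketch; SetRangeDemo and setRange are made-up names.

```java
import java.util.BitSet;

final class SetRangeDemo {
    /** Sets [from, to) on an empty BitSet and returns the resulting words. */
    static long[] setRange(int from, int to) {
        BitSet bits = new BitSet();
        bits.set(from, to);
        return bits.toLongArray();
    }

    public static void main(String[] args) {
        // A range inside one word: bits 4..7
        System.out.println(Long.toHexString(setRange(4, 8)[0]));   // prints f0
        // toIndex is exclusive, so [0, 64) fills word 0 completely
        System.out.println(Long.toHexString(setRange(0, 64)[0]));  // prints ffffffffffffffff
        // A range crossing a word boundary: bits 62..65
        long[] words = setRange(62, 66);
        System.out.println(Long.toHexString(words[0]));            // prints c000000000000000
        System.out.println(Long.toHexString(words[1]));            // prints 3
    }
}
```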

     long lastWordMask  = WORD_MASK >>> -toIndex;

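What makes this statement unusual is that the right-hand operand of >>> is negative. This is not special to 0xffffffffffffffffL: for a long, Java uses only the low six bits of the shift distance, so >>> -i shifts by (64 - i) mod 64. For 1 <= i <= 63, 0xffffffffffffffffL >>> -i therefore leaves i consecutive 1 bits counted from the right. When i is 0 or a multiple of 64, the shift distance is 0 and the value is unchanged. For example, 0xffffffffffffffffL >>> -4 is 0x000..00FL.
To put the two masks together: long firstWordMask = WORD_MASK << fromIndex produces a binary pattern of the form 1111…0000, with the 1s gathered on the left (high-order) side starting at bit fromIndex mod 64, while long lastWordMask = WORD_MASK >>> -toIndex produces 0000…1111, with the 1s gathered on the right (low-order) side below bit toIndex mod 64. As the figure in the original post shows, OR-ing these masks into the corresponding longs sets the bits to 1.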
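
The two masks can be reproduced outside of BitSet. The sketch below assumes WORD_MASK is all ones, as the comment in set states, and relies on Java using only the low six bits of the shift distance for long shifts. MaskDemo and its method names are illustrative.

```java
final class MaskDemo {
    static final long WORD_MASK = 0xffffffffffffffffL;

    /** Ones from bit (fromIndex mod 64) up to bit 63. */
    static long firstWordMask(int fromIndex) {
        return WORD_MASK << fromIndex;
    }

    /** Ones in the low (toIndex mod 64) bits; all ones when toIndex is a multiple of 64. */
    static long lastWordMask(int toIndex) {
        return WORD_MASK >>> -toIndex;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(lastWordMask(4)));   // prints f
        System.out.println(Long.toHexString(lastWordMask(64)));  // prints ffffffffffffffff
        System.out.println(Long.toHexString(firstWordMask(60))); // prints f000000000000000
        // Single-word range [4, 8): the intersection of both masks
        System.out.println(Long.toHexString(firstWordMask(4) & lastWordMask(8))); // prints f0
        // -4 and 60 are the same shift distance modulo 64
        System.out.println((WORD_MASK >>> -4) == (WORD_MASK >>> 60)); // prints true
    }
}
```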

    public void clear(int fromIndex, int toIndex) {
        checkRange(fromIndex, toIndex);

        if (fromIndex == toIndex)
            return;

        int startWordIndex = wordIndex(fromIndex);
        if (startWordIndex >= wordsInUse)
            return;

        int endWordIndex = wordIndex(toIndex - 1);
        if (endWordIndex >= wordsInUse) {
            toIndex = length();
            endWordIndex = wordsInUse - 1;
        }

        long firstWordMask = WORD_MASK << fromIndex;
        long lastWordMask  = WORD_MASK >>> -toIndex;
        if (startWordIndex == endWordIndex) {
            // Case 1: One word
            words[startWordIndex] &= ~(firstWordMask & lastWordMask);
        } else {
            // Case 2: Multiple words
            // Handle first word
            words[startWordIndex] &= ~firstWordMask;

            // Handle intermediate words, if any
            for (int i = startWordIndex+1; i < endWordIndex; i++)
                words[i] = 0;

            // Handle last word
            words[endWordIndex] &= ~lastWordMask;
        }

        recalculateWordsInUse();
        checkInvariants();
    }
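This method clears a range of bits to 0. It works on the same principle as set, so I won't walk through it again. I experimented with the following variations on clear: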
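public void clearJDK0(int from, int to) {
             // Nothing to clear for an empty or inverted range
             if (from >= to) {
                    return;
             }
             int fromWordIndex = wordIndex(from);
             // to is exclusive, so the last affected word is the one holding bit to - 1
             int toWordIndex = wordIndex(to - 1);
             // Invert the masks once up front to save a negation later
             long startWord = ~(MASK << from);
             // When to is a multiple of 64 this is ~(0xffff..fL) = 0, which clears the whole last word
             long toWord = ~(MASK >>> -to);
             // System.out.println("from:" + ByteUtil.toBinaryString(startWord));

             if (fromWordIndex == toWordIndex) {
                    words[fromWordIndex] &= startWord | toWord;
             } else {
                    words[fromWordIndex] &= startWord;
                    for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
                           // Intermediate words are cleared entirely
                           words[i] = 0x0L;
                    }
                    words[toWordIndex] &= toWord;
             }
       }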
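       public void clear(int from, int to) {
             // Nothing to clear for an empty or inverted range
             if (from >= to) {
                    return;
             }
             int fromWordIndex = wordIndex(from);
             // to is exclusive, so the last affected word is the one holding bit to - 1
             int toWordIndex = wordIndex(to - 1);
             // Variant: build the keep-masks directly, with no negation.
             // A shift distance that is a multiple of 64 shifts by 0, so those cases need an explicit 0x0L.
             long startWord = (from & 63) == 0 ? 0x0L : MASK >>> -from;
             long toWord = (to & 63) == 0 ? 0x0L : MASK << to;
             // System.out.println("from:" + ByteUtil.toBinaryString(startWord));

             if (fromWordIndex == toWordIndex) {
                    words[fromWordIndex] &= startWord | toWord;
             } else {
                    words[fromWordIndex] &= startWord;
                    for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
                           // Intermediate words are cleared entirely
                           words[i] = 0x0L;
                    }
                    words[toWordIndex] &= toWord;
             }
       }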

A condensed version of the JDK approach:

    public void clearJDK1(int from, int to) {
             // Nothing to clear for an empty or inverted range
             if (from >= to) {
                    return;
             }
             int fromWordIndex = wordIndex(from);
             // to is exclusive, so the last affected word is the one holding bit to - 1
             int toWordIndex = wordIndex(to - 1);

             long startWord = MASK << from;
             long toWord = MASK >>> -to;
             // System.out.println("from:" + ByteUtil.toBinaryString(startWord));

             if (fromWordIndex == toWordIndex) {
                    words[fromWordIndex] &= ~(startWord & toWord);
             } else {
                    words[fromWordIndex] &= ~startWord;
                    for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
                           // Intermediate words are cleared entirely
                           words[i] = 0x0L;
                    }
                    words[toWordIndex] &= ~toWord;
             }
       }

All of these variants performed within 1 to 2 ms of each other, so there is little point in optimizing here.
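
Before timing variants like these, it helps to confirm that they produce the same result as the JDK. The following sketch is a standalone helper written for this illustration, not the code above: it clears [from, to) on a bare long[] using keep-masks (the De Morgan form of the JDK masks) and compares the outcome with java.util.BitSet.clear. It assumes from < to and an array that is large enough; ClearRangeCheck, clearRange and matchesJdk are hypothetical names.

```java
import java.util.Arrays;
import java.util.BitSet;

final class ClearRangeCheck {
    static final long MASK = 0xffffffffffffffffL;

    /** Clears bits [from, to) in words; assumes 0 <= from < to <= words.length * 64. */
    static void clearRange(long[] words, int from, int to) {
        int first = from >> 6;
        int last = (to - 1) >> 6;          // to is exclusive
        long keepLow = ~(MASK << from);    // bits below from survive
        long keepHigh = ~(MASK >>> -to);   // bits at or above to survive
        if (first == last) {
            words[first] &= keepLow | keepHigh;
        } else {
            words[first] &= keepLow;
            for (int i = first + 1; i < last; i++) {
                words[i] = 0L;
            }
            words[last] &= keepHigh;
        }
    }

    /** Compares clearRange with BitSet.clear on a fully set 192-bit array. */
    static boolean matchesJdk(int from, int to) {
        long[] words = {MASK, MASK, MASK};
        clearRange(words, from, to);
        BitSet expected = BitSet.valueOf(new long[] {MASK, MASK, MASK});
        expected.clear(from, to);
        // Pad to three words: toLongArray() drops trailing zero words
        return Arrays.equals(words, Arrays.copyOf(expected.toLongArray(), 3));
    }

    public static void main(String[] args) {
        for (int from = 0; from < 192; from++) {
            for (int to = from + 1; to <= 192; to++) {
                if (!matchesJdk(from, to)) {
                    throw new AssertionError("mismatch for [" + from + ", " + to + ")");
                }
            }
        }
        System.out.println("all ranges match");
    }
}
```

Exhaustively checking every range over three words covers the boundary cases where from or to is a multiple of 64, which are the easiest ones to get wrong.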

    public boolean get(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        checkInvariants();

        int wordIndex = wordIndex(bitIndex);
        return (wordIndex < wordsInUse)
            && ((words[wordIndex] & (1L << bitIndex)) != 0);
    }
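 This method returns the value of the bit at a given position. The only thing to understand is how a single bit is read: 1L << bitIndex uses just the low six bits of bitIndex, so it selects bit (bitIndex mod 64) inside the word chosen by wordIndex.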
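
Reading a single bit from a bare long[] can be sketched as follows. testBit is a hypothetical helper that mirrors the expression in get; its array-length check stands in for the wordsInUse comparison.

```java
final class GetBitDemo {
    /** Returns true if bit bitIndex is set; bits beyond the array read as false. */
    static boolean testBit(long[] words, int bitIndex) {
        if (bitIndex < 0) {
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        }
        int wordIndex = bitIndex >> 6;
        // 1L << bitIndex uses only the low 6 bits of bitIndex, i.e. bitIndex mod 64
        return wordIndex < words.length
            && (words[wordIndex] & (1L << bitIndex)) != 0;
    }

    public static void main(String[] args) {
        long[] words = {0x1L, 0x2L};                 // bits 0 and 65 are set
        System.out.println(testBit(words, 65));      // prints true
        System.out.println(testBit(words, 64));      // prints false
        System.out.println(testBit(words, 1000));    // prints false
    }
}
```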
 My simulated BitSet code has been uploaded to GitHub: https://github.com/delin10/frame
 The project structure is shown below.

[Image: project structure]
