[Hollow Series] An In-Depth Look at Hollow's Memory Layout


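Preface

The previous article covered Hollow's data model in detail; readers who are not yet familiar with it may want to read that first. Building on that article, this one looks at the logic underneath the data model definitions, namely how Hollow's data model is laid out in memory.

Data Type Definitions

Primitive Data Types

Primitive data types are the foundation of any programming language or framework, and Hollow is no exception. The table below summarizes how each of Hollow's eight primitive field types is stored and encoded. Pay particular attention to the last column, which describes how NULL is represented.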

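| Field type | Storage | Encoding | NULL representation |
| --- | --- | --- | --- |
| INT | An integer value up to 32 bits | ZigZag encoding | Encoded as a value with all bits set to 1 |
| LONG | An integer value up to 64 bits | ZigZag encoding | Encoded as a value with all bits set to 1 |
| FLOAT | A 32-bit floating-point value | Not encoded; this is also why Hollow discourages floating-point types such as float and double | Encoded as a special bit sequence |
| DOUBLE | A 64-bit floating-point value | Not encoded; this is also why Hollow discourages floating-point types such as float and double | Encoded as a special bit sequence |
| BOOLEAN | true or false | Fixed size (two bits per value) | One of the three states (true, false, null) that the two bits can carry |
| STRING | An array of characters | Stored in a packed byte array; each record holds a fixed-length field with the end offset of its value in that array | Encoded by setting a designated null bit at the beginning of the field, followed by the end offset of the last populated value |
| BYTES | An array of bytes | Stored in a packed byte array; each record holds a fixed-length field with the end offset of its value in that array | Encoded by setting a designated null bit at the beginning of the field, followed by the end offset of the last populated value |
| REFERENCE | A reference to another specific type. The referenced type must be defined by the schema. | Stored as the ordinal of the referenced record | Encoded as a value with all bits set to 1 |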

FieldType is defined as follows:

/**
* All allowable field types.
*
*/
public enum FieldType {
    /**
    * A reference to another field.  References are typed, and are fixed-length fields are encoded as the ordinal of the referenced record.
    */
    REFERENCE(-1, false),
    /**
    * An integer value up to 32 bits.  Integers are fixed-length fields encoded with zig-zag encoding.
    * The value Integer.MIN_VALUE is reserved for a sentinel value indicating null.
    */
    INT(-1, false),
    /**
    * An integer value up to 64 bits.  Longs are fixed-length fields encoded with zig-zag encoding.
    * The value Long.MIN_VALUE is reserved for a sentinel value indicating null.
    */
    LONG(-1, false),
    /**
    * A boolean value.  Booleans are encoded as fields requiring two bits each.  Two bits are required
    * because boolean fields can carry any of the three values: true, false, or null.
    */
    BOOLEAN(1, false),
    /**
    * A floating-point number.  Floats are encoded as fixed-length fields four bytes long.
    */
    FLOAT(4, false),
    /**
    * A double-precision floating point number.  Doubles are encoded as fixed-length fields eight bytes long.
    */
    DOUBLE(8, false),
    /**
    * A String of characters.  All Strings for all records containing a given field are encoded in a packed array
    * of variable-length characters.  The values are ordered by the ordinal of the record to which they belong.
    * Each individual record contains a fixed-length field which holds an integer which points to the end of the
    * array range containing the value for the specific record.  The beginning of the range is determined by
    * reading the pointer from the previous record.
    */
    STRING(-1, true),
    /**
    * A byte array.  All byte arrays for all records containing a given field are encoded in a packed array
    * of bytes.  The values are ordered by the ordinal of the record to which they belong.
    * Each individual record contains a fixed-length field which holds an integer which points to the end of the
    * array range containing the value for the specific record.  The beginning of the range is determined by
    * reading the pointer from the previous record.
    */
    BYTES(-1, true);

    private final int fixedLength;
    private final boolean varIntEncodesLength;

    FieldType(int fixedLength, boolean varIntEncodesLength) {
        this.fixedLength = fixedLength;
        this.varIntEncodesLength = varIntEncodesLength;
    }

    public int getFixedLength() {
        return fixedLength;
    }

    public boolean isVariableLength() {
        return varIntEncodesLength;
    }
}
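
To make the two-bit BOOLEAN encoding more concrete, here is a minimal sketch that packs three-state values (true, false, null) into a long[]. The class name TwoBitBooleans and the specific bit patterns (0 = false, 1 = true, 3 = null) are assumptions made for this illustration and are not taken from the Hollow source.

```java
/** Hypothetical sketch: packs nullable booleans into two bits each. */
final class TwoBitBooleans {
    // Bit patterns are assumed for illustration only.
    private static final long FALSE_BITS = 0b00;
    private static final long TRUE_BITS = 0b01;
    private static final long NULL_BITS = 0b11;

    private final long[] words;

    TwoBitBooleans(int capacity) {
        // 32 two-bit slots fit in one 64-bit word.
        this.words = new long[(capacity + 31) / 32];
    }

    void set(int index, Boolean value) {
        long bits = value == null ? NULL_BITS : (value ? TRUE_BITS : FALSE_BITS);
        int word = index / 32;
        int shift = (index % 32) * 2;
        words[word] &= ~(0b11L << shift); // clear the slot
        words[word] |= bits << shift;     // write the new pattern
    }

    Boolean get(int index) {
        long bits = (words[index / 32] >>> ((index % 32) * 2)) & 0b11L;
        if (bits == NULL_BITS) {
            return null;
        }
        return bits == TRUE_BITS;
    }
}
```

A single bit could only distinguish true from false; the second bit is what leaves room for the null state.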

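The sections below walk through the logic of the data types with reference to the source code.

Implementation

Let's start with the simplest type, INT. Hollow defines HInteger as follows.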

package com.netflix.hollow.core.type;

import com.netflix.hollow.api.custom.HollowAPI;
import com.netflix.hollow.api.objects.HollowObject;
import com.netflix.hollow.core.type.delegate.IntegerDelegate;

public class HInteger extends HollowObject {

    public HInteger(IntegerDelegate delegate, int ordinal) {
        super(delegate, ordinal);
    }

    public int getValue() {
        return delegate().getValue(ordinal);
    }

    public Integer getValueBoxed() {
        return delegate().getValueBoxed(ordinal);
    }

    public HollowAPI api() {
        return typeApi().getAPI();
    }

    public IntegerTypeAPI typeApi() {
        return delegate().getTypeAPI();
    }

    protected IntegerDelegate delegate() {
        return (IntegerDelegate)delegate;
    }

}

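The source above shows that HInteger holds no data of its own: every read goes through IntegerDelegate. IntegerDelegate has two concrete implementations, IntegerDelegateCachedImpl and IntegerDelegateLookupImpl. As the names suggest, IntegerDelegateCachedImpl keeps a cached copy of the value, while IntegerDelegateLookupImpl reads the underlying data directly on every call. The difference between the two is discussed in more detail later in this article. First, let's look at how the two are defined for INT: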

package com.netflix.hollow.core.type.delegate;

import com.netflix.hollow.api.custom.HollowTypeAPI;
import com.netflix.hollow.api.objects.delegate.HollowCachedDelegate;
import com.netflix.hollow.api.objects.delegate.HollowObjectAbstractDelegate;
import com.netflix.hollow.core.read.dataaccess.HollowObjectTypeDataAccess;
import com.netflix.hollow.core.schema.HollowObjectSchema;
import com.netflix.hollow.core.type.IntegerTypeAPI;

public class IntegerDelegateCachedImpl extends HollowObjectAbstractDelegate implements HollowCachedDelegate, IntegerDelegate {

    private final Integer value;
    private IntegerTypeAPI typeAPI;

    public IntegerDelegateCachedImpl(IntegerTypeAPI typeAPI, int ordinal) {
        this.value = typeAPI.getValueBoxed(ordinal);
        this.typeAPI = typeAPI;
    }

    @Override
    public int getValue(int ordinal) {
        if(value == null)
            return Integer.MIN_VALUE;
        return value.intValue();
    }

    @Override
    public Integer getValueBoxed(int ordinal) {
        return value;
    }

    @Override
    public HollowObjectSchema getSchema() {
        return typeAPI.getTypeDataAccess().getSchema();
    }

    @Override
    public HollowObjectTypeDataAccess getTypeDataAccess() {
        return typeAPI.getTypeDataAccess();
    }

    @Override
    public IntegerTypeAPI getTypeAPI() {
        return typeAPI;
    }

    @Override
    public void updateTypeAPI(HollowTypeAPI typeAPI) {
        this.typeAPI = (IntegerTypeAPI) typeAPI;
    }

}

The lookup variant, IntegerDelegateLookupImpl, looks like this:

package com.netflix.hollow.core.type.delegate;

import com.netflix.hollow.api.objects.delegate.HollowObjectAbstractDelegate;
import com.netflix.hollow.core.read.dataaccess.HollowObjectTypeDataAccess;
import com.netflix.hollow.core.schema.HollowObjectSchema;
import com.netflix.hollow.core.type.IntegerTypeAPI;

public class IntegerDelegateLookupImpl extends HollowObjectAbstractDelegate implements IntegerDelegate {

    private final IntegerTypeAPI typeAPI;

    public IntegerDelegateLookupImpl(IntegerTypeAPI typeAPI) {
        this.typeAPI = typeAPI;
    }

    @Override
    public int getValue(int ordinal) {
        return typeAPI.getValue(ordinal);
    }

    @Override
    public Integer getValueBoxed(int ordinal) {
        return typeAPI.getValueBoxed(ordinal);
    }

    @Override
    public IntegerTypeAPI getTypeAPI() {
        return typeAPI;
    }

    @Override
    public HollowObjectSchema getSchema() {
        return typeAPI.getTypeDataAccess().getSchema();
    }

    @Override
    public HollowObjectTypeDataAccess getTypeDataAccess() {
        return typeAPI.getTypeDataAccess();
    }

}
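
The difference between the two flavors can be reduced to a few lines. The sketch below is a simplified stand-in, not Hollow code: the names IntDelegateSketch, LookupDelegateSketch, and CachedDelegateSketch are hypothetical, and an IntUnaryOperator plays the role of the type API that maps an ordinal to a value.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical stand-in for a delegate that exposes one int field per ordinal. */
interface IntDelegateSketch {
    int getValue(int ordinal);
}

/** Lookup flavor: every call reads through to the backing data. */
final class LookupDelegateSketch implements IntDelegateSketch {
    private final IntUnaryOperator dataAccess;

    LookupDelegateSketch(IntUnaryOperator dataAccess) {
        this.dataAccess = dataAccess;
    }

    @Override
    public int getValue(int ordinal) {
        return dataAccess.applyAsInt(ordinal);
    }
}

/** Cached flavor: copies the value once at construction, then serves the copy. */
final class CachedDelegateSketch implements IntDelegateSketch {
    private final int value;

    CachedDelegateSketch(IntUnaryOperator dataAccess, int ordinal) {
        this.value = dataAccess.applyAsInt(ordinal);
    }

    @Override
    public int getValue(int ordinal) {
        return value; // the ordinal is ignored, the delegate is bound to one record
    }
}
```

One lookup delegate can serve every ordinal of a type, whereas a cached delegate is created per record and trades memory for POJO-like access speed.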

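The sections above walked through how INT is implemented in Hollow. The other types (Long, Boolean, Double, Float, and String) follow the same pattern, so they are not repeated here; the Hollow source code has the details for anyone interested. The figure below shows the inheritance hierarchy of the delegates for each data type: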

HollowRecordDelegate.png

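Encoding

The previous section covered how Integer is implemented in Hollow. When a concrete int value needs to be stored, Hollow encodes it in the following steps:

  1. Check whether the int value equals Integer.MIN_VALUE; if so, Hollow treats the field as NULL.
  2. Look up the field index by field name and validate that the field at that position is of type INT.
  3. Fetch the ByteDataArray buffer that belongs to that field index.
  4. ZigZag-encode the value and write it to the buffer as a variable-length integer (VarInt).

The code is as follows: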

public void setInt(String fieldName, int value) {
    if(value == Integer.MIN_VALUE) {
        setNull(fieldName);
    } else {
        int fieldIndex = getSchema().getPosition(fieldName);

        validateFieldType(fieldIndex, fieldName, FieldType.INT);

        ByteDataArray buf = getFieldBuffer(fieldIndex);

        // zig zag encoding
        VarInt.writeVInt(buf, ZigZag.encodeInt(value));
    }
}
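
The ZigZag and VarInt steps can be sketched in isolation. The class below is a hypothetical illustration, not Hollow's ZigZag or VarInt classes: the ZigZag formulas are the standard ones, while the byte order of the variable-length encoding (low-order group first) is a generic choice that may differ from what Hollow writes. Null handling is not modeled.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of zig-zag plus variable-length integer encoding. */
final class ZigZagVarIntSketch {

    /** Maps small-magnitude signed ints to small unsigned ints: 0, -1, 1, -2 become 0, 1, 2, 3. */
    static int zigZagEncode(int n) {
        return (n << 1) ^ (n >> 31);
    }

    static int zigZagDecode(int n) {
        return (n >>> 1) ^ -(n & 1);
    }

    /** Writes 7 bits per byte, low-order group first; the high bit flags "more bytes follow". */
    static byte[] writeVarInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    static int readVarInt(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break; // last byte of this value
            }
        }
        return result;
    }
}
```

ZigZag matters because a plain two's-complement negative number has its high bits set and would always need the maximum number of VarInt bytes; after ZigZag, values close to zero stay short regardless of sign.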

IntegerTypeAPI

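The IntegerDelegate implementations rely on the API class IntegerTypeAPI. It is the interface through which producers and consumers work with the type: it reads the int or Integer value from HollowObjectTypeDataAccess, and getValueBoxed maps the Integer.MIN_VALUE sentinel back to null.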

package com.netflix.hollow.core.type;

import com.netflix.hollow.api.custom.HollowAPI;
import com.netflix.hollow.api.custom.HollowObjectTypeAPI;
import com.netflix.hollow.core.read.dataaccess.HollowObjectTypeDataAccess;
import com.netflix.hollow.core.type.delegate.IntegerDelegateLookupImpl;

public class IntegerTypeAPI extends HollowObjectTypeAPI {

    private final IntegerDelegateLookupImpl delegateLookupImpl;

    public IntegerTypeAPI(HollowAPI api, HollowObjectTypeDataAccess typeDataAccess) {
        super(api, typeDataAccess, new String[] {
            "value"
        });
        this.delegateLookupImpl = new IntegerDelegateLookupImpl(this);
    }

    public int getValue(int ordinal) {
        if(fieldIndex[0] == -1)
            return missingDataHandler().handleInt("Integer", ordinal, "value");
        return getTypeDataAccess().readInt(ordinal, fieldIndex[0]);
    }

    public Integer getValueBoxed(int ordinal) {
        int i;
        if(fieldIndex[0] == -1) {
            i = missingDataHandler().handleInt("Integer", ordinal, "value");
        } else {
            boxedFieldAccessSampler.recordFieldAccess(fieldIndex[0]);
            i = getTypeDataAccess().readInt(ordinal, fieldIndex[0]);
        }
        if(i == Integer.MIN_VALUE)
            return null;
        return Integer.valueOf(i);
    }

    public IntegerDelegateLookupImpl getDelegateLookupImpl() {
        return delegateLookupImpl;
    }

}

A HollowObjectTypeAPI can be implemented in two ways: by hand-writing the concrete API class, or by defining a schema and generating the class with HollowAPIGenerator. This is where HollowFactory comes in.

import com.netflix.hollow.api.custom.HollowTypeAPI;
import com.netflix.hollow.core.read.dataaccess.HollowTypeDataAccess;

/**
 * A HollowFactory is responsible for returning objects in a generated Hollow Object API.  The HollowFactory for individual
 * types can be overridden to return hand-coded implementations of specific record types.
 */
public abstract class HollowFactory<T> {

    public abstract T newHollowObject(HollowTypeDataAccess dataAccess, HollowTypeAPI typeAPI, int ordinal);

    public T newCachedHollowObject(HollowTypeDataAccess dataAccess, HollowTypeAPI typeAPI, int ordinal) {
        return newHollowObject(dataAccess, typeAPI, ordinal);
    }
}

Staying with INT as the example, IntegerHollowFactory is implemented as follows.

package com.netflix.hollow.core.type;

import com.netflix.hollow.api.custom.HollowTypeAPI;
import com.netflix.hollow.api.objects.provider.HollowFactory;
import com.netflix.hollow.core.read.dataaccess.HollowTypeDataAccess;
import com.netflix.hollow.core.type.delegate.IntegerDelegateCachedImpl;

public class IntegerHollowFactory extends HollowFactory<HInteger> {

    @Override
    public HInteger newHollowObject(HollowTypeDataAccess dataAccess, HollowTypeAPI typeAPI, int ordinal) {
        return new HInteger(((IntegerTypeAPI)typeAPI).getDelegateLookupImpl(), ordinal);
    }

    @Override
    public HInteger newCachedHollowObject(HollowTypeDataAccess dataAccess, HollowTypeAPI typeAPI, int ordinal) {
        return new HInteger(new IntegerDelegateCachedImpl((IntegerTypeAPI)typeAPI, ordinal), ordinal);
    }

}

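This ties back to the IntegerDelegateLookupImpl and IntegerDelegateCachedImpl discussed above: newHollowObject hands out the shared lookup delegate, while newCachedHollowObject creates a new cached delegate for the given ordinal.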

REFERENCE

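REFERENCE is a special primitive type: through a REFERENCE field, a record can point to another type defined in the data model. Hollow keeps the referenced type name of every REFERENCE field in the referencedTypes[] array of HollowObjectSchema.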

public class HollowObjectSchema extends HollowSchema {

    private final Map<String, Integer> nameFieldIndexLookup;
    private final String fieldNames[];
    private final FieldType fieldTypes[];
    protected final String referencedTypes[];
    private final HollowTypeReadState referencedFieldTypeStates[];  /// populated during deserialization
    private final PrimaryKey primaryKey;

    private int size;

    public HollowObjectSchema(String schemaName, int numFields, String... keyFieldPaths) {
        this(schemaName, numFields, keyFieldPaths == null || keyFieldPaths.length == 0 ? null : new PrimaryKey(schemaName, keyFieldPaths));
    }

    public HollowObjectSchema(String schemaName, int numFields, PrimaryKey primaryKey) {
        super(schemaName);

        this.nameFieldIndexLookup = new HashMap<>(numFields);
        this.fieldNames = new String[numFields];
        this.fieldTypes = new FieldType[numFields];
        this.referencedTypes = new String[numFields];
        this.referencedFieldTypeStates = new HollowTypeReadState[numFields];
        this.primaryKey = primaryKey;
    }
}

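When Hollow processes a schema, it iterates over all of its fields, as in the excerpt below, which builds the schema of the fields that two object schemas have in common. When a field's FieldType is REFERENCE, the referenced type has to be compared as well.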

PrimaryKey primaryKey = isNullableObjectEquals(this.primaryKey, otherSchema.getPrimaryKey()) ? this.primaryKey : null;
HollowObjectSchema commonSchema = new HollowObjectSchema(getName(), commonFields, primaryKey);

for (int i = 0; i < fieldNames.length; i++) {
    int otherFieldIndex = otherSchema.getPosition(fieldNames[i]);
    if (otherFieldIndex != -1) {
        if (fieldTypes[i] != otherSchema.getFieldType(otherFieldIndex)
                || !referencedTypesEqual(referencedTypes[i], otherSchema.getReferencedType(otherFieldIndex))) {
            String fieldType = fieldTypes[i] == FieldType.REFERENCE ? referencedTypes[i]
                : fieldTypes[i].toString().toLowerCase();
            String otherFieldType = otherSchema.getFieldType(otherFieldIndex) == FieldType.REFERENCE
                ? otherSchema.getReferencedType(otherFieldIndex)
                : otherSchema.getFieldType(otherFieldIndex).toString().toLowerCase();
            throw new IncompatibleSchemaException(getName(), fieldNames[i], fieldType, otherFieldType);
        }

        commonSchema.addField(fieldNames[i], fieldTypes[i], referencedTypes[i]);
    }
}
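
The same idea can be sketched without any Hollow types. In the hypothetical class below, a schema is reduced to a map from field name to type name, where a REFERENCE field is represented by the name of the type it refers to. CommonFieldsSketch and its exception choice are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a schema reduced to field name -> type name. */
final class CommonFieldsSketch {

    /** Keeps the fields present on both sides; fails if a shared field has different types. */
    static Map<String, String> commonFields(Map<String, String> mine, Map<String, String> other) {
        Map<String, String> common = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : mine.entrySet()) {
            String otherType = other.get(field.getKey());
            if (otherType == null) {
                continue; // field is missing on the other side, so it is not common
            }
            if (!otherType.equals(field.getValue())) {
                throw new IllegalStateException("Incompatible types for field '" + field.getKey()
                        + "': " + field.getValue() + " vs " + otherType);
            }
            common.put(field.getKey(), field.getValue());
        }
        return common;
    }
}
```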

Collection Data Types

The base class of Hollow's collection data model is HollowSchema. Let's first look at the inheritance hierarchy.

HollowSchema.png

The following sections look at how each collection type is implemented.

List

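Encoding

A List is an ordered collection. It is made up of two FixedLengthElementArrays: one stores offsets and the other stores the elements themselves. For each record, the offset array holds a fixed-length offset into the element array marking where that record's elements end. To find where the record with ordinal n starts, read the end value of record (n-1).

A List cannot store NULL values.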
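
The end-offset scheme can be sketched with two plain int arrays. PackedListsSketch is a hypothetical name, and plain int[] arrays stand in for the bit-packed FixedLengthElementArrays.

```java
import java.util.Arrays;

/** Hypothetical sketch of a list type stored as end offsets plus one packed element array. */
final class PackedListsSketch {
    private final int[] endOffsets; // endOffsets[n] = index just past the last element of record n
    private final int[] elements;   // element ordinals of all records, back to back

    PackedListsSketch(int[] endOffsets, int[] elements) {
        this.endOffsets = endOffsets;
        this.elements = elements;
    }

    /** Record 0 starts at 0; every other record starts where the previous one ended. */
    private int start(int ordinal) {
        return ordinal == 0 ? 0 : endOffsets[ordinal - 1];
    }

    int size(int ordinal) {
        return endOffsets[ordinal] - start(ordinal);
    }

    int[] get(int ordinal) {
        return Arrays.copyOfRange(elements, start(ordinal), endOffsets[ordinal]);
    }
}
```

With end offsets {3, 3, 5} and elements {7, 8, 9, 1, 2}, record 0 is [7, 8, 9], record 1 is empty, and record 2 is [1, 2]. Storing only the end offset is enough because the start of each record is implied by its predecessor.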

public class HollowListSchema extends HollowCollectionSchema {

    private final String elementType;

    private HollowTypeReadState elementTypeState;

    public HollowListSchema(String schemaName, String elementType) {
        super(schemaName);
        this.elementType = elementType;
    }
    // other code 
}

Storage Structure

image.png

Set

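Encoding

A Set is an unordered collection. Its elements are hashed and placed into a hash table. A Set type is made up of two FixedLengthElementArrays: one stores offsets and the other stores the element buckets (the hash table). In the bucket array, the number of buckets for each record is a power of 2, large enough that all of the record's elements fit with a load factor of no more than 70%. The offset array holds two fixed-length fields per record: the size of the set, and the offset of the bucket at which the record's data ends.

Like a List, a Set cannot store NULL values.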
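
The bucket sizing rule is easy to state in code. The sketch below models a single set record as an open-addressing table over an int[]. The class name HashedSetSketch, the -1 empty marker, the hash function, and the use of linear probing are all assumptions made for this illustration; only the "power of 2, load factor at most 70%" rule comes from the description above.

```java
import java.util.Arrays;

/** Hypothetical sketch of one set record: open addressing over a power-of-two bucket array. */
final class HashedSetSketch {
    private static final int EMPTY = -1; // assumed marker for an unused bucket
    private final int[] buckets;

    /** Element ordinals are expected to be non-negative. */
    HashedSetSketch(int[] elementOrdinals) {
        buckets = new int[bucketCountFor(elementOrdinals.length)];
        Arrays.fill(buckets, EMPTY);
        for (int ordinal : elementOrdinals) {
            insert(ordinal);
        }
    }

    /** Smallest power of two that keeps size / buckets at or below 70%. */
    static int bucketCountFor(int size) {
        int buckets = 1;
        while (size * 10L > buckets * 7L) {
            buckets <<= 1;
        }
        return buckets;
    }

    private static int hash(int ordinal) {
        int h = ordinal * 0x9E3779B9; // simple multiplicative mix, chosen only for the sketch
        return h ^ (h >>> 16);
    }

    private void insert(int ordinal) {
        int mask = buckets.length - 1;
        int bucket = hash(ordinal) & mask;
        while (buckets[bucket] != EMPTY && buckets[bucket] != ordinal) {
            bucket = (bucket + 1) & mask; // linear probing
        }
        buckets[bucket] = ordinal;
    }

    boolean contains(int ordinal) {
        int mask = buckets.length - 1;
        int bucket = hash(ordinal) & mask;
        while (buckets[bucket] != EMPTY) {
            if (buckets[bucket] == ordinal) {
                return true;
            }
            bucket = (bucket + 1) & mask;
        }
        return false;
    }

    int bucketCount() {
        return buckets.length;
    }
}
```

Because the bucket count is a power of 2, the modulo in the probe sequence reduces to a bit mask, and the 70% cap guarantees that a probe always reaches an empty bucket.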

public class HollowSetSchema extends HollowCollectionSchema {

    private final String elementType;
    private final PrimaryKey hashKey;

    private HollowTypeReadState elementTypeState;

    public HollowSetSchema(String schemaName, String elementType, String... hashKeyFieldPaths) {
        super(schemaName);
        this.elementType = elementType;
        this.hashKey = hashKeyFieldPaths == null || hashKeyFieldPaths.length == 0 ? null : new PrimaryKey(elementType, hashKeyFieldPaths);
    }
    // other code 
}

Storage Structure

image.png

Map

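Encoding

A Map is similar to a Set and is also unordered. The difference is that it stores key/value pairs: the key is hashed and placed into a hash table. A Map type is made up of two FixedLengthElementArrays: one stores offsets and the other stores the entry buckets (the hash table). As with Set, the number of buckets for each record is a power of 2, large enough that all of the record's entries fit with a load factor of no more than 70%.

Neither the keys nor the values of a Map may be NULL.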

public class HollowMapSchema extends HollowSchema {

    private final String keyType;
    private final String valueType;
    private final PrimaryKey hashKey;

    private HollowTypeReadState keyTypeState;
    private HollowTypeReadState valueTypeState;

    public HollowMapSchema(String schemaName, String keyType, String valueType, String... hashKeyFieldPaths) {
        super(schemaName);
        this.keyType = keyType;
        this.valueType = valueType;
        this.hashKey = hashKeyFieldPaths == null || hashKeyFieldPaths.length == 0 ? null : new PrimaryKey(keyType, hashKeyFieldPaths);
    }
    // other code
}

Storage Structure

image.png

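Memory Layout

Optimizing how objects are laid out in memory is one of Hollow's defining features. The core idea is pooling, which an earlier article, 性能优化利器 - 池化 (A Performance Optimization Tool: Pooling), covers in detail for interested readers.

This article does not go into pooling techniques in general; it focuses on how Hollow implements pooling. Hollow does not use POJOs as its in-memory representation. Instead it uses a more compact, fixed-length, strongly typed encoding of the data.

This encoding minimizes both the heap footprint of the dataset and the CPU cost of accessing the data at any time. All encoded records are packed into reusable memory blocks (slabs) that are pooled on the JVM heap, which keeps them from affecting GC behavior when the server is under heavy load.

ArraySegmentRecycler is the interface behind this pooling. It is defined as follows: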

package com.netflix.hollow.core.memory.pool;

import com.netflix.hollow.core.memory.SegmentedByteArray;
import com.netflix.hollow.core.memory.SegmentedLongArray;

/**
* An ArraySegmentRecycler is a memory pool.
* <p>
* Hollow pools and reuses memory to minimize GC effects while updating data.  
* This pool of memory is kept arrays on the heap.  Each array in the pool has a fixed length.  
* When a long array or a byte array is required in Hollow, it will stitch together pooled array 
* segments as a {@link SegmentedByteArray} or {@link SegmentedLongArray}.  
* These classes encapsulate the details of treating segmented arrays as contiguous ranges of values.
*/
public interface ArraySegmentRecycler {

    public int getLog2OfByteSegmentSize();

    public int getLog2OfLongSegmentSize();

    public long[] getLongArray();

    public void recycleLongArray(long[] arr);

    public byte[] getByteArray();

    public void recycleByteArray(byte[] arr);

    public void swap();

}
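
A pool of this shape can be sketched in a few lines. SegmentPoolSketch below is a hypothetical, single-threaded illustration covering only the long[] half of the interface; it is not one of Hollow's recycler implementations, and the swap() step is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical sketch of a fixed-size segment pool; not Hollow's recycler implementation. */
final class SegmentPoolSketch {
    private final int log2OfLongSegmentSize;
    private final Deque<long[]> freeLongSegments = new ArrayDeque<>();

    SegmentPoolSketch(int log2OfLongSegmentSize) {
        this.log2OfLongSegmentSize = log2OfLongSegmentSize;
    }

    int getLog2OfLongSegmentSize() {
        return log2OfLongSegmentSize;
    }

    /** Hands out a pooled segment when one is free, otherwise allocates a new one. */
    long[] getLongArray() {
        long[] pooled = freeLongSegments.poll();
        return pooled != null ? pooled : new long[1 << log2OfLongSegmentSize];
    }

    /** Returns a segment to the pool, zeroed so the next user sees a clean array. */
    void recycleLongArray(long[] segment) {
        Arrays.fill(segment, 0L);
        freeLongSegments.push(segment);
    }

    int pooledCount() {
        return freeLongSegments.size();
    }
}
```

Since every segment has the same length, any recycled segment can satisfy any later request, which is what lets the same arrays survive across data updates instead of becoming garbage.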

The figure below shows the ArraySegmentRecycler inheritance hierarchy.

ArraySegmentRecycler.png

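This raises a question: what structure does Hollow use at the lowest level to store a blob? The answer is in the ByteData interface, which defines how long and int values are read from a sequence of bytes: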

public interface ByteData {

    default long readLongBits(long position) {
        long longBits = (long) (get(position++) & 0xFF) << 56;
        longBits |= (long) (get(position++) & 0xFF) << 48;
        longBits |= (long) (get(position++) & 0xFF) << 40;
        longBits |= (long) (get(position++) & 0xFF) << 32;
        longBits |= (long) (get(position++) & 0xFF) << 24;
        longBits |= (get(position++) & 0xFF) << 16;
        longBits |= (get(position++) & 0xFF) << 8;
        longBits |= (get(position) & 0xFF);
        return longBits;
    }

    default int readIntBits(long position) {
        int intBits = (get(position++) & 0xFF) << 24;
        intBits |= (get(position++) & 0xFF) << 16;
        intBits |= (get(position++) & 0xFF) << 8;
        intBits |= (get(position) & 0xFF);
        return intBits;
    }

    default long length() {
        throw new UnsupportedOperationException();
    }

    /**
     * Get the value of the byte at the specified position.
     * @param index the position (in byte units)
     * @return the byte value
     */
    byte get(long index);

}
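
To show how a get(long index) call can be served by pooled fixed-size segments, here is a minimal sketch. SegmentedBytesSketch is a hypothetical class, not Hollow's SegmentedByteArray; it only illustrates the shift-and-mask addressing that a power-of-two segment size makes possible, together with the same big-endian read shown in ByteData.

```java
/** Hypothetical sketch: fixed-size byte segments presented as one contiguous range. */
final class SegmentedBytesSketch {
    private final int log2OfSegmentSize;
    private final int mask;
    private final byte[][] segments;

    SegmentedBytesSketch(int log2OfSegmentSize, long capacity) {
        this.log2OfSegmentSize = log2OfSegmentSize;
        this.mask = (1 << log2OfSegmentSize) - 1;
        int segmentCount = (int) ((capacity + mask) >>> log2OfSegmentSize);
        this.segments = new byte[segmentCount][1 << log2OfSegmentSize];
    }

    void set(long index, byte value) {
        // High bits select the segment, low bits select the byte within it.
        segments[(int) (index >>> log2OfSegmentSize)][(int) (index & mask)] = value;
    }

    byte get(long index) {
        return segments[(int) (index >>> log2OfSegmentSize)][(int) (index & mask)];
    }

    /** Big-endian read, same shape as ByteData.readIntBits. */
    int readIntBits(long position) {
        int intBits = (get(position++) & 0xFF) << 24;
        intBits |= (get(position++) & 0xFF) << 16;
        intBits |= (get(position++) & 0xFF) << 8;
        intBits |= (get(position) & 0xFF);
        return intBits;
    }
}
```

Because reads go through get one byte at a time, a value that straddles two segments needs no special handling.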

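In addition, Hollow defines two interfaces for the memory layout: FixedLengthData for fixed-length data and VariableLengthData for variable-length data. VariableLengthData extends ByteData.

In most cases FixedLengthData is sufficient. VariableLengthData is for situations where the exact length is not known in advance: when a byte is written at an index beyond the currently allocated array or buffer, the storage grows automatically.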

image.png

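Fixed-Length Data

Writing (setElementValue): based on the type definitions in the data model, the fixed lengths of all fields involved are added up to give the total length of one record. Records are packed tightly one after another, so once the number of records is known, the memory occupied by the whole set of records is known as well.

Reading (getElementValue): because the type definitions in the data model are known, the space each record occupies can be computed, and every record can be read by stepping through the data in increments of that length.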

package com.netflix.hollow.core.memory;

import com.netflix.hollow.core.memory.encoding.VarInt;
import com.netflix.hollow.core.read.HollowBlobInput;
import java.io.IOException;

public interface FixedLengthData {

    long getElementValue(long index, int bitsPerElement);

    long getElementValue(long index, int bitsPerElement, long mask);

    long getLargeElementValue(long index, int bitsPerElement);

    long getLargeElementValue(long index, int bitsPerElement, long mask);

    void setElementValue(long index, int bitsPerElement, long value);

    void copyBits(FixedLengthData copyFrom, long sourceStartBit, long destStartBit, long numBits);

    void incrementMany(long startBit, long increment, long bitsBetweenIncrements, int numIncrements);

    void clearElementValue(long index, int bitsPerElement);

    static void discardFrom(HollowBlobInput in) throws IOException {
        long numLongs = VarInt.readVLong(in);
        long bytesToSkip = numLongs * 8;

        while(bytesToSkip > 0) {
            bytesToSkip -= in.skipBytes(bytesToSkip);
        }
    }

    static int bitsRequiredToRepresentValue(long value) {
        if(value == 0)
            return 1;
        return 64 - Long.numberOfLeadingZeros(value);
    }

}
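
The bit-level packing behind these methods can be sketched as follows. BitPackedSketch is a hypothetical class, not FixedLengthElementArray: it uses a plain long[] instead of pooled segments, takes a bit index as its position argument, and assumes each slot is written once while still zero. Only bitsRequired reuses the formula from the interface above.

```java
/** Hypothetical sketch of fixed-width values packed back to back in a long[]. */
final class BitPackedSketch {
    private final long[] words;

    /** numBits is expected to be positive. */
    BitPackedSketch(long numBits) {
        // One spare word so a value that straddles the last boundary has somewhere to spill.
        this.words = new long[(int) (((numBits - 1) >>> 6) + 2)];
    }

    /** Same formula as FixedLengthData.bitsRequiredToRepresentValue. */
    static int bitsRequired(long value) {
        return value == 0 ? 1 : 64 - Long.numberOfLeadingZeros(value);
    }

    /** Writes value at bitIndex; assumes the slot is still zero and value fits in bitsPerElement. */
    void set(long bitIndex, int bitsPerElement, long value) {
        int word = (int) (bitIndex >>> 6);
        int shift = (int) (bitIndex & 63);
        words[word] |= value << shift;
        int bitsInFirstWord = 64 - shift;
        if (bitsInFirstWord < bitsPerElement) {
            words[word + 1] |= value >>> bitsInFirstWord; // spill the high bits into the next word
        }
    }

    long get(long bitIndex, int bitsPerElement) {
        long mask = bitsPerElement == 64 ? -1L : (1L << bitsPerElement) - 1;
        int word = (int) (bitIndex >>> 6);
        int shift = (int) (bitIndex & 63);
        long value = words[word] >>> shift;
        int bitsInFirstWord = 64 - shift;
        if (bitsInFirstWord < bitsPerElement) {
            value |= words[word + 1] << bitsInFirstWord; // pull in the bits that spilled over
        }
        return value & mask;
    }
}
```

For a record with a 3-bit field followed by a 10-bit field, each record takes 13 bits, so field values of record n live at bit offsets n * 13 and n * 13 + 3. The widths come from bitsRequired applied to the largest value of each field, for example bitsRequired(5) is 3 and bitsRequired(1000) is 10.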

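FixedLengthElementArray is one implementation of FixedLengthData. Fixed-length data obtains and returns its memory through ArraySegmentRecycler, which is what provides the memory reuse, or pooling.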

public class FixedLengthElementArray extends SegmentedLongArray implements FixedLengthData {

    private static final Unsafe unsafe = HollowUnsafeHandle.getUnsafe();

    private final int log2OfSegmentSizeInBytes;
    private final int byteBitmask;

    public FixedLengthElementArray(ArraySegmentRecycler memoryRecycler, long numBits) {
        super(memoryRecycler, ((numBits - 1) >>> 6) + 1);
        this.log2OfSegmentSizeInBytes = log2OfSegmentSize + 3;
        this.byteBitmask = (1 << log2OfSegmentSizeInBytes) - 1;
    }
}

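Variable-Length Data

Variable-length data is closely tied to Hollow's pooling. The memory pool is managed by ArraySegmentRecycler and is kept as arrays on the heap, each with a fixed length. When Hollow needs a long array or a byte array, it stitches pooled array segments together as a SegmentedLongArray or SegmentedByteArray. These classes encapsulate the details of treating segmented arrays as contiguous ranges of values.

VariableLengthData is defined below. The loadFrom method reads length bytes from a HollowBlobInput stream. The copy and orderedCopy methods copy ranges of bytes from another source and can be used to resize variable-length data.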

package com.netflix.hollow.core.memory;

import com.netflix.hollow.core.read.HollowBlobInput;
import java.io.IOException;

public interface VariableLengthData extends ByteData {

    void loadFrom(HollowBlobInput in, long length) throws IOException;

    void copy(ByteData src, long srcPos, long destPos, long length);

    void orderedCopy(VariableLengthData src, long srcPos, long destPos, long length);

    long size();
}
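
The "grows automatically" behavior can be illustrated with a small sketch. GrowableBytesSketch is a hypothetical class, not one of Hollow's VariableLengthData implementations: it allocates segments directly instead of borrowing them from an ArraySegmentRecycler, and reading an index in a segment that was never allocated simply fails.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a byte range that adds fixed-size segments as writes move past its end. */
final class GrowableBytesSketch {
    private final int log2OfSegmentSize;
    private final int mask;
    private final List<byte[]> segments = new ArrayList<>();
    private long size; // one past the highest index written so far

    GrowableBytesSketch(int log2OfSegmentSize) {
        this.log2OfSegmentSize = log2OfSegmentSize;
        this.mask = (1 << log2OfSegmentSize) - 1;
    }

    void set(long index, byte value) {
        int segment = (int) (index >>> log2OfSegmentSize);
        while (segments.size() <= segment) {
            segments.add(new byte[1 << log2OfSegmentSize]); // grow on demand
        }
        segments.get(segment)[(int) (index & mask)] = value;
        size = Math.max(size, index + 1);
    }

    byte get(long index) {
        return segments.get((int) (index >>> log2OfSegmentSize))[(int) (index & mask)];
    }

    /** Copies length bytes from src, starting at srcPos, into this range at destPos. */
    void copy(GrowableBytesSketch src, long srcPos, long destPos, long length) {
        for (long i = 0; i < length; i++) {
            set(destPos + i, src.get(srcPos + i));
        }
    }

    long size() {
        return size;
    }

    int segmentCount() {
        return segments.size();
    }
}
```

Growing by whole segments means existing bytes never have to be moved when the range gets longer, which a single resizable array could not offer.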

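Note: each field stored for a collection type is encoded with a fixed number of bits, equal to the number of bits needed to represent the largest value of that field across all records. This wastes some space, because every record pays for the width of the largest value.

Cache Layout

As mentioned earlier, HollowObjectAbstractDelegate has two kinds of implementations: the ordinary lookup implementation, and a cached one that also implements HollowCachedDelegate.

All of the delegates mentioned above derive from HollowRecordDelegate.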

/**
 * A HollowRecordDelegate is used by a generated Hollow Objects API to access data from the data model.
 * <p>
 * Two flavors of delegate currently exist -- lookup and cached.
 * <p>
 * The lookup delegate reads directly from a HollowDataAccess.  The cached delegate will copy the data from a HollowDataAccess,
 * then read from the copy of the data.  The intention is that the cached delegate has the performance profile of a POJO, while
 * the lookup delegate imposes the minor performance penalty incurred by reading directly from Hollow.
 * <p>
 * The performance penalty of a lookup delegate is minor enough that it doesn't usually matter except in the tightest of loops.  
 * If a type exists which has a low cardinality but is accessed disproportionately frequently, then it may be a good candidate
 * to be represented with a cached delegate. 
 * 
 */
public interface HollowRecordDelegate {
}

HollowObjectAbstractDelegate

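HollowObjectAbstractDelegate is the default, general-purpose implementation of HollowRecordDelegate. As the source shows, the values of Hollow's primitive types all come from HollowObjectTypeDataAccess. To keep this article to a reasonable length, HollowObjectTypeDataAccess will be covered in a later article in the series, 【Hollow的Producer】 (Hollow's Producer).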

package com.netflix.hollow.api.objects.delegate;

import com.netflix.hollow.core.read.dataaccess.HollowObjectTypeDataAccess;
import com.netflix.hollow.core.read.missing.MissingDataHandler;

/**
 * Contains some basic convenience access methods for OBJECT record fields.
 * 
 * @see HollowRecordDelegate
 */
public abstract class HollowObjectAbstractDelegate implements HollowObjectDelegate {

    @Override
    public boolean isNull(int ordinal, String fieldName) {
        try {
            HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
            int fieldIndex = getSchema().getPosition(fieldName);

            if(fieldIndex == -1)
                return missingDataHandler().handleIsNull(getSchema().getName(), ordinal, fieldName);

            return dataAccess.isNull(ordinal, fieldIndex);
        } catch(Exception ex) {
            throw new RuntimeException(String.format("Unable to handle ordinal=%s, fieldName=%s", ordinal, fieldName), ex);
        }
    }

    @Override
    public boolean getBoolean(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        Boolean bool = (fieldIndex != -1) ?
                dataAccess.readBoolean(ordinal, fieldIndex)
                : missingDataHandler().handleBoolean(getSchema().getName(), ordinal, fieldName);

        return bool == null ? false : bool.booleanValue();
    }

    @Override
    public int getOrdinal(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleReferencedOrdinal(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readOrdinal(ordinal, fieldIndex);
    }

    @Override
    public int getInt(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleInt(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readInt(ordinal, fieldIndex);
    }

    @Override
    public long getLong(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleLong(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readLong(ordinal, fieldIndex);
    }

    @Override
    public float getFloat(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleFloat(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readFloat(ordinal, fieldIndex);
    }

    @Override
    public double getDouble(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleDouble(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readDouble(ordinal, fieldIndex);
    }

    @Override
    public String getString(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1)
            return missingDataHandler().handleString(getSchema().getName(), ordinal, fieldName);

        return dataAccess.readString(ordinal, fieldIndex);
    }

    @Override
    public boolean isStringFieldEqual(int ordinal, String fieldName, String testValue) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1) {
            return missingDataHandler().handleStringEquals(getSchema().getName(), ordinal, fieldName, testValue);
        }

        return dataAccess.isStringFieldEqual(ordinal, fieldIndex, testValue);
    }

    @Override
    public byte[] getBytes(int ordinal, String fieldName) {
        HollowObjectTypeDataAccess dataAccess = getTypeDataAccess();
        int fieldIndex = getSchema().getPosition(fieldName);

        if(fieldIndex == -1) {
            return missingDataHandler().handleBytes(getSchema().getName(), ordinal, fieldName);
        }

        return dataAccess.readBytes(ordinal, fieldIndex);
    }

    private MissingDataHandler missingDataHandler() {
        return getTypeDataAccess().getDataAccess().getMissingDataHandler();
    }
}

HollowCachedDelegate

HollowCachedDelegate extends the general HollowRecordDelegate and is the interface for delegates that serve cached data.

package com.netflix.hollow.api.objects.delegate;

import com.netflix.hollow.api.custom.HollowTypeAPI;
import com.netflix.hollow.api.objects.provider.HollowObjectCacheProvider;

/**
 * This is the extension of the {@link HollowRecordDelegate} interface for cached delegates.
 * 
 * @see HollowRecordDelegate
 */
public interface HollowCachedDelegate extends HollowRecordDelegate {

    /**
     * Called by the {@link HollowObjectCacheProvider} when the api is updated.
     * @param typeAPI the type api that is updated
     */
    void updateTypeAPI(HollowTypeAPI typeAPI);

}

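Summary

By optimizing its memory layout, Hollow greatly improves memory utilization and reduces the overhead of repeatedly creating objects on the heap.

The ideas behind this memory layout are not limited to Hollow: pooling, compact encoding, and similar techniques carry over to performance tuning in almost any system.

Hopefully this article does more than introduce Hollow, and also suggests ideas that can be applied to other real-world scenarios to improve memory usage.

Closing Remarks

It has been raining all day in Shanghai, so here is a piece of light music to share: 淅淅沥沥的雨 (Pattering Rain). Have a great weekend, everyone.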


Reposted from juejin.im/post/7108287309681786917