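Before serializing with Kryo, the classes to be serialized are registered with it through the register() method.
Registration builds a Registration for the class being registered. The Registration records the class's unique id and the serializer assigned to it. Kryo's default serializer is FieldSerializer, so any class registered without an explicit serializer is serialized by FieldSerializer.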
public Registration register (Class type, Serializer serializer) {
    Registration registration = classResolver.getRegistration(type);
    if (registration != null) {
        registration.setSerializer(serializer);
        return registration;
    }
    return classResolver.register(new Registration(type, serializer, getNextRegistrationId()));
}
public int getNextRegistrationId () {
    while (nextRegisterID != -2) {
        if (classResolver.getRegistration(nextRegisterID) == null) return nextRegisterID;
        nextRegisterID++;
    }
    throw new KryoException("No registration IDs are available.");
}
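The code above is the register() method itself. It assigns the next free auto-incremented id to the class being registered and builds its Registration. Kryo then stores the Registration in a map, with the id as the key and the Registration as the value.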
public Registration register (Registration registration) {
    if (registration == null) throw new IllegalArgumentException("registration cannot be null.");
    if (registration.getId() != NAME) {
        if (TRACE) {
            trace("kryo", "Register class ID " + registration.getId() + ": " + className(registration.getType()) + " ("
                + registration.getSerializer().getClass().getName() + ")");
        }
        idToRegistration.put(registration.getId(), registration);
    } else if (TRACE) {
        trace("kryo", "Register class name: " + className(registration.getType()) + " ("
            + registration.getSerializer().getClass().getName() + ")");
    }
    classToRegistration.put(registration.getType(), registration);
    if (registration.getType().isPrimitive()) classToRegistration.put(getWrapperClass(registration.getType()), registration);
    return registration;
}
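At the end of registration, two key-value pairs have been stored in two separate maps: one from the unique id to the Registration, and one from the class being serialized to the Registration.
Once the classes have been registered, instances of them can be serialized with the writeClassAndObject() method. The core code is as follows.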
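The two lookup tables and the id allocation can be modelled in a few lines of plain Java. This is a simplified sketch, not Kryo's code: MiniRegistry, Entry, byId and byClass are hypothetical names, and the serializer part of a Registration is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the two registration maps; not Kryo's actual classes.
class MiniRegistry {
    // Hypothetical stand-in for Registration (the serializer is omitted).
    static final class Entry {
        final Class<?> type;
        final int id;

        Entry(Class<?> type, int id) {
            this.type = type;
            this.id = id;
        }
    }

    private final Map<Integer, Entry> idToEntry = new HashMap<>();
    private final Map<Class<?>, Entry> classToEntry = new HashMap<>();
    private int nextId = 0;

    // Register a type under the next free id, or return its existing entry.
    Entry register(Class<?> type) {
        Entry existing = classToEntry.get(type);
        if (existing != null) return existing;
        return register(type, nextFreeId());
    }

    // Register a type under an explicit id and fill both maps.
    Entry register(Class<?> type, int id) {
        Entry entry = new Entry(type, id);
        idToEntry.put(id, entry);
        classToEntry.put(type, entry);
        return entry;
    }

    // Skip ids that are already taken, in the spirit of getNextRegistrationId().
    private int nextFreeId() {
        while (idToEntry.containsKey(nextId)) nextId++;
        return nextId;
    }

    Entry byId(int id) { return idToEntry.get(id); }

    Entry byClass(Class<?> type) { return classToEntry.get(type); }
}
```

Keeping both maps means the writer can go from a class to its id, and a reader can go from an id back to the class, each with a single lookup.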
try {
    if (object == null) {
        writeClass(output, null);
        return;
    }
    Registration registration = writeClass(output, object.getClass());
    if (references && writeReferenceOrNull(output, object, false)) {
        registration.getSerializer().setGenerics(this, null);
        return;
    }
    if (TRACE || (DEBUG && depth == 1)) log("Write", object);
    registration.getSerializer().write(this, output, object);
First, writeClass() writes the unique id held by the class's Registration at the start of the output.
The code is as follows:
Registration registration = kryo.getRegistration(type);
if (registration.getId() == NAME)
    writeName(output, type, registration);
else {
    if (TRACE) trace("kryo", "Write class " + registration.getId() + ": " + className(type));
    output.writeVarInt(registration.getId() + 2, true);
}
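Here the value written at the start of the serialized data is the class id plus 2 (Kryo reserves 0 and 1 to mean null and not-null). It identifies the type of the object that follows.
Kryo writes an int such as this one with writeVarInt(), which emits a variable number of bytes depending on the magnitude of the value instead of always writing 4 bytes.
The implementation is as follows (the excerpt covers the one- to four-byte cases):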
if (value >>> 7 == 0) {
    require(1);
    buffer[position++] = (byte)value;
    return 1;
}
if (value >>> 14 == 0) {
    require(2);
    buffer[position++] = (byte)((value & 0x7F) | 0x80);
    buffer[position++] = (byte)(value >>> 7);
    return 2;
}
if (value >>> 21 == 0) {
    require(3);
    buffer[position++] = (byte)((value & 0x7F) | 0x80);
    buffer[position++] = (byte)(value >>> 7 | 0x80);
    buffer[position++] = (byte)(value >>> 14);
    return 3;
}
if (value >>> 28 == 0) {
    require(4);
    buffer[position++] = (byte)((value & 0x7F) | 0x80);
    buffer[position++] = (byte)(value >>> 7 | 0x80);
    buffer[position++] = (byte)(value >>> 14 | 0x80);
    buffer[position++] = (byte)(value >>> 21);
    return 4;
}
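The unrolled branches above follow one pattern: emit 7 bits per byte, lowest bits first, and set the high bit of a byte whenever another byte follows. The loop below is a minimal sketch of that pattern for non-negative values, together with a matching decoder. VarIntSketch, encode and decode are hypothetical names for this illustration and are not part of Kryo.

```java
import java.io.ByteArrayOutputStream;

// Loop form of the 7-bits-per-byte encoding; a sketch, not Kryo's Output class.
class VarIntSketch {
    // Encode a non-negative int. Each byte carries 7 payload bits; the high
    // bit (0x80) is set when at least one more byte follows.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value >>> 7) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decode by accumulating 7-bit groups until a byte without the high bit.
    static int decode(byte[] bytes) {
        int result = 0;
        for (int i = 0; i < bytes.length; i++) {
            result |= (bytes[i] & 0x7F) << (7 * i);
            if ((bytes[i] & 0x80) == 0) break;
        }
        return result;
    }
}
```

For example, a class id of 10 is written as 12, which fits in a single byte, while 300 needs two bytes: 0xAC followed by 0x02.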
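After the class id has been written, Kryo's references setting decides whether shared and circular references need to be handled. It is true by default in the version examined here, and when it is true, writeReferenceOrNull() writes the object's reference id.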
boolean writeReferenceOrNull (Output output, Object object, boolean mayBeNull) {
    if (object == null) {
        if (TRACE || (DEBUG && depth == 1)) log("Write", null);
        output.writeVarInt(Kryo.NULL, true);
        return true;
    }
    if (!referenceResolver.useReferences(object.getClass())) {
        if (mayBeNull) output.writeVarInt(Kryo.NOT_NULL, true);
        return false;
    }
    // Determine if this object has already been seen in this object graph.
    int id = referenceResolver.getWrittenId(object);
    // If not the first time encountered, only write reference ID.
    if (id != -1) {
        if (DEBUG) debug("kryo", "Write object reference " + id + ": " + string(object));
        output.writeVarInt(id + 2, true); // + 2 because 0 and 1 are used for NULL and NOT_NULL.
        return true;
    }
    // Otherwise write NOT_NULL and then the object bytes.
    id = referenceResolver.addWrittenObject(object);
    output.writeVarInt(NOT_NULL, true);
    if (TRACE) trace("kryo", "Write initial object reference " + id + ": " + string(object));
    return false;
}
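First, if the target object is non-null, useReferences() checks whether its class is a wrapper class of a primitive type. If it is, the method returns false immediately.
Otherwise, the referenceResolver keeps track of the objects that have already been written, storing them in a list or a map. If the object is already known to the referenceResolver, it is being referenced again (a shared or circular reference). In that case only its id in the referenceResolver is written to output, the object itself is not written a second time, and the method returns true to signal that the object needs no further serialization.
If the object is not found, this is its first occurrence: it is recorded, a not-null marker is written to output, and the method returns false so that serialization of the object continues.
With these steps done, the registered serializer's write() method performs the actual serialization of the object.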
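The first-occurrence versus repeat-occurrence logic can be sketched without Kryo. In the sketch below, ReferenceSketch and writeReference are hypothetical names, a list of integers stands in for Output, and looking objects up by identity (rather than equals()) is an assumption of this illustration.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Minimal model of reference tracking during a write; not Kryo's resolver.
class ReferenceSketch {
    static final int NOT_NULL = 1; // same marker value the excerpt writes

    // Objects already written in this graph, mapped to their reference ids.
    private final Map<Object, Integer> written = new IdentityHashMap<>();

    // Stands in for Output: every value that would be written as a varint.
    final List<Integer> output = new ArrayList<>();

    // Returns true when only a back-reference was written, so the caller
    // can skip serializing the object's fields.
    boolean writeReference(Object object) {
        Integer id = written.get(object);
        if (id != null) {
            output.add(id + 2); // 0 and 1 are reserved for NULL and NOT_NULL
            return true;
        }
        written.put(object, written.size()); // ids are handed out in order
        output.add(NOT_NULL);
        return false;
    }
}
```

Because a repeated object costs only one small integer, a cycle such as a -> b -> a terminates: the second visit to a writes its id and returns instead of descending into its fields again.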
registration.getSerializer().write(this, output, object);
Now back to the default serializer implementation, FieldSerializer.
private CachedField[] fields = new CachedField[0];
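Each FieldSerializer is bound to one class. Its write() method walks the fields array in order to serialize the corresponding fields of the object. fields is an array of CachedField: once the FieldSerializer constructor has received the class to bind to, rebuildCachedFields() populates this array.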
List<Field> allFields = new ArrayList();
Class nextClass = type;
while (nextClass != Object.class) {
    Field[] declaredFields = nextClass.getDeclaredFields();
    if (declaredFields != null) {
        for (Field f : declaredFields) {
            if (Modifier.isStatic(f.getModifiers())) continue;
            allFields.add(f);
        }
    }
    nextClass = nextClass.getSuperclass();
}
ObjectMap context = kryo.getContext();
// Sort fields by their offsets
if (useMemRegions && !useAsmEnabled && unsafeAvailable) {
    try {
        Field[] allFieldsArray = (Field[])sortFieldsByOffsetMethod.invoke(null, allFields);
        allFields = Arrays.asList(allFieldsArray);
    } catch (Exception e) {
        throw new RuntimeException("Cannot invoke UnsafeUtil.sortFieldsByOffset()", e);
    }
}
// TODO: useAsm is modified as a side effect, this should be pulled out of buildValidFields
// Build a list of valid non-transient fields
validFields = buildValidFields(false, allFields, context, useAsm);
// Build a list of valid transient fields
validTransientFields = buildValidFields(true, allFields, context, useAsm);
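Here the code walks up the class's inheritance chain and collects every non-static field. If the condition guarding the sort holds (memory regions enabled, ASM disabled, Unsafe available), the fields are then sorted by their offsets within the class. After that, buildValidFields() filters them. By default, a field that is not accessible is made accessible with setAccessible(true), which turns off the Java access check for that field. Every field carrying the Optional annotation is also checked: if the context contains no key for the annotation's value, the field is skipped.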
private List<Field> buildValidFields (boolean transientFields, List<Field> allFields, ObjectMap context, IntArray useAsm) {
    List<Field> result = new ArrayList(allFields.size());
    for (int i = 0, n = allFields.size(); i < n; i++) {
        Field field = allFields.get(i);
        int modifiers = field.getModifiers();
        if (Modifier.isTransient(modifiers) != transientFields) continue;
        if (Modifier.isStatic(modifiers)) continue;
        if (field.isSynthetic() && ignoreSyntheticFields) continue;
        if (!field.isAccessible()) {
            if (!setFieldsAsAccessible) continue;
            try {
                field.setAccessible(true);
            } catch (AccessControlException ex) {
                continue;
            }
        }
        Optional optional = field.getAnnotation(Optional.class);
        if (optional != null && !context.containsKey(optional.value())) continue;
        result.add(field);
        // BOZO - Must be public?
        useAsm.add(!Modifier.isFinal(modifiers) && Modifier.isPublic(modifiers)
            && Modifier.isPublic(field.getType().getModifiers()) ? 1 : 0);
    }
    return result;
}
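The hierarchy walk and the modifier filters rely only on standard Java reflection, so they can be tried in isolation. The sketch below is a reduced version that keeps just the static, synthetic and transient checks. FieldScanSketch, Base, Child and fieldNames are hypothetical names for this illustration, and the accessibility and Optional handling are left out.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Reduced version of the field scan; sample classes are for illustration only.
class FieldScanSketch {
    static class Base {
        int id;
        static int counter;  // static: never serialized as instance state
        transient int cache; // transient: collected separately
    }

    static class Child extends Base {
        String name;
    }

    // Walk from the type up to (but excluding) Object and keep the instance
    // fields whose transient flag matches the request.
    static List<String> fieldNames(Class<?> type, boolean transientFields) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int modifiers = f.getModifiers();
                if (Modifier.isStatic(modifiers)) continue;
                if (f.isSynthetic()) continue;
                if (Modifier.isTransient(modifiers) != transientFields) continue;
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

Calling fieldNames(Child.class, false) yields the subclass field before the superclass field, because the walk starts at the most derived class, and calling it with true yields only the transient field, mirroring the two buildValidFields() calls.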
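Once the fields have been collected and sorted, createCachedFields() converts them one by one into the CachedField objects that are ultimately needed.
The conversion itself is carried out in the createUnsafeCachedFieldsAndRegions() method of FieldSerializerUnsafeUtilImpl.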
Field field = validFields.get(i);
int accessIndex = -1;
if (serializer.access != null && useAsm.get(baseIndex + i) == 1)
    accessIndex = ((FieldAccess)serializer.access).getIndex(field.getName());
fieldOffset = unsafe().objectFieldOffset(field);
fieldEndOffset = fieldOffset + fieldSizeOf(field.getType());
if (!field.getType().isPrimitive() && lastWasPrimitive) {
    // This is not a primitive field. Therefore, it marks
    // the end of a region of primitive fields
    endPrimitives = lastFieldEndOffset;
    lastWasPrimitive = false;
    if (primitiveLength > 1) {
        if (TRACE)
            trace("kryo", "Class " + serializer.getType().getName()
                + ". Found a set of consecutive primitive fields. Number of fields = " + primitiveLength
                + ". Byte length = " + (endPrimitives - startPrimitives) + " Start offset = " + startPrimitives
                + " endOffset=" + endPrimitives);
        // TODO: register a region instead of a field
        CachedField cf = new UnsafeRegionField(startPrimitives, (endPrimitives - startPrimitives));
        cf.field = lastField;
        cachedFields.add(cf);
    } else {
        if (lastField != null)
            cachedFields.add(serializer.newCachedField(lastField, cachedFields.size(), lastAccessIndex));
    }
    cachedFields.add(serializer.newCachedField(field, cachedFields.size(), accessIndex));
} else if (!field.getType().isPrimitive()) {
    cachedFields.add(serializer.newCachedField(field, cachedFields.size(), accessIndex));
} else if (!lastWasPrimitive) {
    // If previous field was non primitive, it marks a start
    // of a region of primitive fields
    startPrimitives = fieldOffset;
    lastWasPrimitive = true;
    primitiveLength = 1;
} else {
    primitiveLength++;
}
lastAccessIndex = accessIndex;
lastField = field;
lastFieldEndOffset = fieldEndOffset;
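While the CachedField objects are generated, consecutive primitive fields are merged. As the fields are traversed, every run of more than one consecutive primitive field becomes a single UnsafeRegionField that is serialized as a block, while the other fields are stored individually. This is why the fields had to be sorted by offset beforehand.
Returning to the last step of serialization: when FieldSerializer's write() method runs, the CachedField objects generated above call write() one after another to serialize the target object.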
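The grouping step can be illustrated on plain data. The sketch below takes fields already sorted by offset and produces a write plan in which each run of two or more primitives becomes one region. RegionSketch, FieldInfo and plan are hypothetical names, and the offsets and sizes in the usage note are invented for the example rather than taken from a real JVM layout.

```java
import java.util.ArrayList;
import java.util.List;

// Groups runs of primitive fields into regions; a sketch, not Kryo's code.
class RegionSketch {
    // Hypothetical field descriptor with a made-up offset and size in bytes.
    static final class FieldInfo {
        final String name;
        final boolean primitive;
        final long offset;
        final int size;

        FieldInfo(String name, boolean primitive, long offset, int size) {
            this.name = name;
            this.primitive = primitive;
            this.offset = offset;
            this.size = size;
        }
    }

    // The input must already be sorted by offset.
    static List<String> plan(List<FieldInfo> fields) {
        List<String> plan = new ArrayList<>();
        int i = 0;
        while (i < fields.size()) {
            FieldInfo first = fields.get(i);
            if (!first.primitive) {
                plan.add("field " + first.name);
                i++;
                continue;
            }
            // Extend the run while the following fields are primitive too.
            int j = i;
            while (j + 1 < fields.size() && fields.get(j + 1).primitive) j++;
            if (j > i) {
                FieldInfo last = fields.get(j);
                plan.add("region " + first.offset + ".." + (last.offset + last.size));
            } else {
                plan.add("field " + first.name); // a lone primitive stays a field
            }
            i = j + 1;
        }
        return plan;
    }
}
```

With an int at offset 12, a long at 16, a String reference at 24 and a boolean at 28, the plan is one region covering 12..24, then the String field, then the lone boolean as an ordinary field. Without sorting by offset first, neighbouring entries in the list would not be neighbours in memory and the region bounds would be meaningless.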
CachedField[] fields = this.fields;
for (int i = 0, n = fields.length; i < n; i++)
    fields[i].write(output, object);
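The loop delegates everything to the per-field objects, so the serializer itself stays small. Below is a minimal sketch of that design using standard reflection. FieldLoopSketch, FieldWriter, reflective and Point are hypothetical names, and a list of values stands in for Output.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// One writer object per field, invoked in array order; not Kryo's classes.
class FieldLoopSketch {
    // Hypothetical sample type.
    static class Point {
        int x = 3;
        String label = "p";
    }

    // Stand-in for CachedField.
    interface FieldWriter {
        void write(List<Object> output, Object object);
    }

    // Builds a writer that reads one named field reflectively.
    static FieldWriter reflective(Class<?> type, String name) {
        final Field field;
        try {
            field = type.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
        field.setAccessible(true);
        return (output, object) -> {
            try {
                output.add(field.get(object));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    // Same shape as the loop in the excerpt: each writer handles its own field.
    static List<Object> write(FieldWriter[] fields, Object object) {
        List<Object> output = new ArrayList<>();
        for (int i = 0, n = fields.length; i < n; i++)
            fields[i].write(output, object);
        return output;
    }
}
```

Because the order of the array fixes the order of the output, a reader built from the same array can consume the values in the same sequence without any field names in the stream.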
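When a field has a non-primitive type, its write() method fetches the field's value and serializes it, starting with a call to writeClass(). Any circular reference is handled by the mechanism described earlier.
For a primitive field, the value is read directly from the field's offset within the object through a native method.
Finally, an UnsafeRegionField produced from consecutive primitive fields reads the region eight bytes at a time as long values, then reads whatever remains one byte at a time, until the whole region has been serialized.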
long off;
Unsafe unsafe = unsafe();
for (off = offset; off < offset + len - 8; off += 8) {
    output.writeLong(unsafe.getLong(object, off));
}
if (off < offset + len) {
    for (; off < offset + len; ++off) {
        output.write(unsafe.getByte(object, off));
    }
}
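The chunked copy can be reproduced without Unsafe by treating a ByteBuffer as the object's memory. The sketch below keeps the same loop bounds as the excerpt. RegionCopySketch and copy are hypothetical names, and the returned count of 8-byte writes exists only to make the behaviour observable.

```java
import java.nio.ByteBuffer;

// Copies a byte range in 8-byte chunks plus a byte-wise tail; a sketch only.
class RegionCopySketch {
    // source stands in for the object's memory, output for Kryo's Output.
    // Returns how many 8-byte writes were performed.
    static int copy(ByteBuffer source, int offset, int len, ByteBuffer output) {
        int longWrites = 0;
        int off = offset;
        // Same bound as the excerpt: stop while more than 8 bytes remain.
        for (; off < offset + len - 8; off += 8) {
            output.putLong(source.getLong(off));
            longWrites++;
        }
        // Whatever is left (between 1 and 8 bytes) goes out byte by byte.
        for (; off < offset + len; off++) {
            output.put(source.get(off));
        }
        return longWrites;
    }
}
```

Note the strict comparison in the first loop: when exactly 8 bytes remain they are written by the byte loop, so a 16-byte region produces one 8-byte write followed by eight single-byte writes.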