Kryo Serialization: A Source Code Analysis

Before Kryo serializes anything, the classes to be serialized are normally registered with it through the register() method.

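Registration essentially builds a Registration for the class being serialized. A Registration records the class's unique id together with the serializer that handles it. Kryo's default serializer is FieldSerializer: any class registered without an explicit serializer is serialized with FieldSerializer.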

public Registration register (Class type, Serializer serializer) {
   Registration registration = classResolver.getRegistration(type);
   if (registration != null) {
      registration.setSerializer(serializer);
      return registration;
   }
   return classResolver.register(new Registration(type, serializer, getNextRegistrationId()));
}

public int getNextRegistrationId () {
   while (nextRegisterID != -2) {
      if (classResolver.getRegistration(nextRegisterID) == null) return nextRegisterID;
      nextRegisterID++;
   }
   throw new KryoException("No registration IDs are available.");
}

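The code above is the concrete register() method. If the class is already registered, only its serializer is replaced. Otherwise the next free auto-incremented id is assigned to the class and a Registration is built from it. Kryo then stores the Registration in a map, with the id as the key and the Registration as the value.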

public Registration register (Registration registration) {
   if (registration == null) throw new IllegalArgumentException("registration cannot be null.");
   if (registration.getId() != NAME) {
      if (TRACE) {
         trace("kryo", "Register class ID " + registration.getId() + ": " + className(registration.getType()) + " ("
            + registration.getSerializer().getClass().getName() + ")");
      }
      idToRegistration.put(registration.getId(), registration);
   } else if (TRACE) {
      trace("kryo", "Register class name: " + className(registration.getType()) + " ("
         + registration.getSerializer().getClass().getName() + ")");
   }
   classToRegistration.put(registration.getType(), registration);
   if (registration.getType().isPrimitive()) classToRegistration.put(getWrapperClass(registration.getType()), registration);
   return registration;
}

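By the end of registration, two key-value mappings have been set up in two separate maps: idToRegistration maps the unique id to the Registration, and classToRegistration maps the class being serialized to the same Registration. For a primitive type, its wrapper class is mapped to the Registration as well.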
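
The two-map bookkeeping described above can be sketched in plain Java. This is a simplified illustration, not Kryo's code: the names RegistrySketch, Entry, byId and byClass are hypothetical, and the serializer is reduced to a String label so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for Kryo's registration bookkeeping. */
final class RegistrySketch {
    /** Pairs a class with its id and a label naming the serializer that handles it. */
    static final class Entry {
        final Class<?> type;
        final int id;
        String serializerName;

        Entry(Class<?> type, int id, String serializerName) {
            this.type = type;
            this.id = id;
            this.serializerName = serializerName;
        }
    }

    // Two views of the same entries: lookup by id (used when reading)
    // and lookup by class (used when writing).
    private final Map<Integer, Entry> idToEntry = new HashMap<>();
    private final Map<Class<?>, Entry> classToEntry = new HashMap<>();
    private int nextId = 0;

    Entry register(Class<?> type, String serializerName) {
        Entry existing = classToEntry.get(type);
        if (existing != null) {
            // Already registered: keep the id, only swap the serializer.
            existing.serializerName = serializerName;
            return existing;
        }
        // Skip ids that are already taken, then claim the first free one.
        while (idToEntry.containsKey(nextId)) nextId++;
        Entry entry = new Entry(type, nextId, serializerName);
        idToEntry.put(entry.id, entry);
        classToEntry.put(type, entry);
        return entry;
    }

    Entry byId(int id) { return idToEntry.get(id); }

    Entry byClass(Class<?> type) { return classToEntry.get(type); }
}
```

Registering the same class twice returns the same Entry with an updated serializer label, which mirrors the early-return branch in the register() excerpt.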

Once the classes are registered, instances of them can be serialized with the writeClassAndObject() method. The core code is as follows.

try {
   if (object == null) {
      writeClass(output, null);
      return;
   }
   Registration registration = writeClass(output, object.getClass());
   if (references && writeReferenceOrNull(output, object, false)) {
      registration.getSerializer().setGenerics(this, null);
      return;
   }
   if (TRACE || (DEBUG && depth == 1)) log("Write", object);
   registration.getSerializer().write(this, output, object);

First, writeClass() writes the unique id from the class's Registration at the start of the output.

The code is as follows:

Registration registration = kryo.getRegistration(type);
if (registration.getId() == NAME)
   writeName(output, type, registration);
else {
   if (TRACE) trace("kryo", "Write class " + registration.getId() + ": " + className(type));
   output.writeVarInt(registration.getId() + 2, true);
}

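Here the class id plus 2 is written at the start of the serialized data to identify the type of the object that follows. The offset of 2 is needed because the values 0 and 1 are reserved: 0 marks a null class, and 1 (NAME + 2, since NAME is -1) marks a class that is written by name instead of by id.

Kryo writes this int with writeVarInt(), which emits a variable number of bytes depending on the magnitude of the value instead of always writing 4 bytes. Each byte carries 7 bits of the value, and the high bit of a byte signals that another byte follows.

The implementation is as follows: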

if (value >>> 7 == 0) {
   require(1);
   buffer[position++] = (byte)value;
   return 1;
}
if (value >>> 14 == 0) {
   require(2);
   buffer[position++] = (byte)((value & 0x7F) | 0x80);
   buffer[position++] = (byte)(value >>> 7);
   return 2;
}
if (value >>> 21 == 0) {
   require(3);
   buffer[position++] = (byte)((value & 0x7F) | 0x80);
   buffer[position++] = (byte)(value >>> 7 | 0x80);
   buffer[position++] = (byte)(value >>> 14);
   return 3;
}
if (value >>> 28 == 0) {
   require(4);
   buffer[position++] = (byte)((value & 0x7F) | 0x80);
   buffer[position++] = (byte)(value >>> 7 | 0x80);
   buffer[position++] = (byte)(value >>> 14 | 0x80);
   buffer[position++] = (byte)(value >>> 21);
   return 4;
}
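
To see the effect of this encoding on concrete values, here is a minimal sketch of the same 7-bits-per-byte scheme written as a loop, together with a matching decoder. It is a restatement for illustration, not Kryo's unrolled implementation; the class name VarIntSketch and its methods are hypothetical, and only the unsigned ("optimize positive") case is covered.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a 7-bits-per-byte variable-length int encoding. */
final class VarIntSketch {
    /** Encodes value using 1 to 5 bytes; the high bit of a byte means "more bytes follow". */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value >>> 7) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;
        }
        out.write(value); // last byte, continuation flag clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        for (int i = 0; i < bytes.length; i++) {
            result |= (bytes[i] & 0x7F) << (7 * i);
            if ((bytes[i] & 0x80) == 0) break; // no continuation flag: done
        }
        return result;
    }
}
```

With this scheme a small class id such as 10 (written as 12 after the +2 offset) takes a single byte, while values needing more than 28 bits take five.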

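After the class id has been written, Kryo checks its references setting to decide whether object references, including circular ones, need to be tracked. The setting is true by default in the version analyzed here. When it is true, writeReferenceOrNull() writes the reference id for the object.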

boolean writeReferenceOrNull (Output output, Object object, boolean mayBeNull) {
   if (object == null) {
      if (TRACE || (DEBUG && depth == 1)) log("Write", null);
      output.writeVarInt(Kryo.NULL, true);
      return true;
   }
   if (!referenceResolver.useReferences(object.getClass())) {
      if (mayBeNull) output.writeVarInt(Kryo.NOT_NULL, true);
      return false;
   }

   // Determine if this object has already been seen in this object graph.
   int id = referenceResolver.getWrittenId(object);

   // If not the first time encountered, only write reference ID.
   if (id != -1) {
      if (DEBUG) debug("kryo", "Write object reference " + id + ": " + string(object));
      output.writeVarInt(id + 2, true); // + 2 because 0 and 1 are used for NULL and NOT_NULL.
      return true;
   }

   // Otherwise write NOT_NULL and then the object bytes.
   id = referenceResolver.addWrittenObject(object);
   output.writeVarInt(NOT_NULL, true);
   if (TRACE) trace("kryo", "Write initial object reference " + id + ": " + string(object));
   return false;
}

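A null object is written as NULL and the method returns true. For a non-null object, the method first asks the reference resolver whether references apply to the object's class. They do not apply to the wrapper classes of primitive types, in which case the method returns false immediately.

Otherwise the referenceResolver, which keeps the objects that have already been written in a list or a map, is consulted. If the object is found there, it has been written before in this object graph (a repeated or circular reference). In that case only the object's id in the referenceResolver, plus 2, is written to the output instead of writing the object again, and the method returns true to indicate that the object does not need to be serialized any further.

If the object is not found, this is the first encounter: it is recorded in the resolver, NOT_NULL is written to the output, and the method returns false so that serialization of the object continues.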
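
The first-encounter versus repeat-encounter logic can be sketched with an identity-based map. This is a simplified illustration under stated assumptions: the class ReferenceSketch is hypothetical, the output is modeled as a list of ints instead of a byte stream, and the wrapper-class shortcut is left out.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of reference tracking while writing an object graph. */
final class ReferenceSketch {
    static final int NULL = 0;
    static final int NOT_NULL = 1;

    // Identity semantics: two equal but distinct objects get distinct ids.
    private final Map<Object, Integer> written = new IdentityHashMap<>();

    /** Stands in for the output stream; each element is one written marker or id. */
    final List<Integer> output = new ArrayList<>();

    /** Returns true when nothing more needs to be written for this object. */
    boolean writeReferenceOrNull(Object object) {
        if (object == null) {
            output.add(NULL);
            return true;
        }
        Integer id = written.get(object);
        if (id != null) {
            // Seen before: write only the id, shifted past the two reserved values.
            output.add(id + 2);
            return true;
        }
        // First encounter: remember the object, then let the caller write its body.
        written.put(object, written.size());
        output.add(NOT_NULL);
        return false;
    }
}
```

Because a repeated object is replaced by its id, a cycle such as a node pointing back to its parent terminates instead of recursing forever.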

After these steps, the object itself is serialized by calling write() on the registered serializer.

registration.getSerializer().write(this, output, object);

This brings us back to the default serializer implementation, FieldSerializer.

private CachedField[] fields = new CachedField[0];

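Each FieldSerializer is bound to one class. When write() runs, it walks the fields array in order to serialize the object's fields. fields is an array of CachedField: once the class to bind has been passed to the FieldSerializer constructor, rebuildCachedFields() fills this array.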

List<Field> allFields = new ArrayList();
Class nextClass = type;
while (nextClass != Object.class) {
   Field[] declaredFields = nextClass.getDeclaredFields();
   if (declaredFields != null) {
      for (Field f : declaredFields) {
         if (Modifier.isStatic(f.getModifiers())) continue;
         allFields.add(f);
      }
   }
   nextClass = nextClass.getSuperclass();
}

ObjectMap context = kryo.getContext();

// Sort fields by their offsets
if (useMemRegions && !useAsmEnabled && unsafeAvailable) {
   try {
      Field[] allFieldsArray = (Field[])sortFieldsByOffsetMethod.invoke(null, allFields);
      allFields = Arrays.asList(allFieldsArray);
   } catch (Exception e) {
      throw new RuntimeException("Cannot invoke UnsafeUtil.sortFieldsByOffset()", e);
   }
}

// TODO: useAsm is modified as a side effect, this should be pulled out of buildValidFields
// Build a list of valid non-transient fields
validFields = buildValidFields(false, allFields, context, useAsm);
// Build a list of valid transient fields
validTransientFields = buildValidFields(true, allFields, context, useAsm);

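This code walks up the inheritance chain from the class, stopping at Object, and collects every non-static declared field. When the Unsafe-based memory region mode is active (useMemRegions && !useAsmEnabled && unsafeAvailable), the fields are then sorted by their offsets within the object. Next, buildValidFields() filters the list. It keeps either the transient or the non-transient fields, skips static fields and, if configured, synthetic ones. By default it calls setAccessible(true) on fields that are not yet accessible, which suppresses the access check for them. It also inspects fields carrying the Optional annotation: if the context has no entry for the annotation's value, the field is skipped.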

private List<Field> buildValidFields (boolean transientFields, List<Field> allFields, ObjectMap context, IntArray useAsm) {
   List<Field> result = new ArrayList(allFields.size());

   for (int i = 0, n = allFields.size(); i < n; i++) {
      Field field = allFields.get(i);

      int modifiers = field.getModifiers();
      if (Modifier.isTransient(modifiers) != transientFields) continue;
      if (Modifier.isStatic(modifiers)) continue;
      if (field.isSynthetic() && ignoreSyntheticFields) continue;

      if (!field.isAccessible()) {
         if (!setFieldsAsAccessible) continue;
         try {
            field.setAccessible(true);
         } catch (AccessControlException ex) {
            continue;
         }
      }

      Optional optional = field.getAnnotation(Optional.class);
      if (optional != null && !context.containsKey(optional.value())) continue;

      result.add(field);

      // BOZO - Must be public?
      useAsm.add(!Modifier.isFinal(modifiers) && Modifier.isPublic(modifiers)
         && Modifier.isPublic(field.getType().getModifiers()) ? 1 : 0);
   }
   return result;
}
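
The field collection and filtering steps can be tried out with plain reflection. The sketch below is a reduced illustration, not Kryo's code: FieldScanSketch, Base and Child are hypothetical names, and the accessibility handling and the Optional annotation check are left out.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: collect the fields a field-based serializer would write. */
final class FieldScanSketch {
    // Small example hierarchy used only for demonstration.
    static class Base {
        int baseId;
        static int counter;      // static: skipped
    }

    static class Child extends Base {
        String name;
        transient long cache;    // transient: skipped in the non-transient pass
    }

    /** Walks from type up to (but not including) Object and keeps eligible field names. */
    static List<String> serializableFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int modifiers = f.getModifiers();
                if (Modifier.isStatic(modifiers)) continue;
                if (Modifier.isTransient(modifiers)) continue;
                if (f.isSynthetic()) continue;
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

Calling serializableFieldNames(FieldScanSketch.Child.class) yields the subclass field first and the inherited one second, because the walk starts at the concrete class and moves upward.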

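Once the fields have been collected and sorted, createCachedFields() converts them one by one into the CachedField objects that are ultimately needed.

The actual conversion is done in the createUnsafeCachedFieldsAndRegions() method of FieldSerializerUnsafeUtilImpl.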

Field field = validFields.get(i);

int accessIndex = -1;
if (serializer.access != null && useAsm.get(baseIndex + i) == 1)
   accessIndex = ((FieldAccess)serializer.access).getIndex(field.getName());

fieldOffset = unsafe().objectFieldOffset(field);
fieldEndOffset = fieldOffset + fieldSizeOf(field.getType());

if (!field.getType().isPrimitive() && lastWasPrimitive) {
   // This is not a primitive field. Therefore, it marks
   // the end of a region of primitive fields
   endPrimitives = lastFieldEndOffset;
   lastWasPrimitive = false;
   if (primitiveLength > 1) {
      if (TRACE)
         trace("kryo", "Class " + serializer.getType().getName()
            + ". Found a set of consecutive primitive fields. Number of fields = " + primitiveLength
            + ". Byte length = " + (endPrimitives - startPrimitives) + " Start offset = " + startPrimitives
            + " endOffset=" + endPrimitives);
      // TODO: register a region instead of a field
      CachedField cf = new UnsafeRegionField(startPrimitives, (endPrimitives - startPrimitives));
      cf.field = lastField;
      cachedFields.add(cf);
   } else {
      if (lastField != null)
         cachedFields.add(serializer.newCachedField(lastField, cachedFields.size(), lastAccessIndex));
   }
   cachedFields.add(serializer.newCachedField(field, cachedFields.size(), accessIndex));
} else if (!field.getType().isPrimitive()) {
   cachedFields.add(serializer.newCachedField(field, cachedFields.size(), accessIndex));
} else if (!lastWasPrimitive) {
   // If previous field was non primitive, it marks a start
   // of a region of primitive fields
   startPrimitives = fieldOffset;
   lastWasPrimitive = true;
   primitiveLength = 1;
} else {
   primitiveLength++;
}

lastAccessIndex = accessIndex;
lastField = field;
lastFieldEndOffset = fieldEndOffset;

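While the CachedFields are generated, runs of consecutive primitive fields are merged. As the fields are traversed, each run of more than one consecutive primitive field becomes a single UnsafeRegionField that is serialized as one block, while all other fields are stored as ordinary CachedFields. This is why the fields had to be sorted by offset beforehand.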
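
The grouping rule can be sketched independently of Unsafe by working on a list of field types that is assumed to be in offset order already. RegionMergeSketch and its string labels are hypothetical, and the sketch flushes a trailing run of primitives at the end of the list, a step the excerpt above does not show.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: merge runs of consecutive primitive fields into regions. */
final class RegionMergeSketch {
    /** Input: field types in offset order. Output: one label per resulting cached field. */
    static List<String> merge(List<Class<?>> fieldTypes) {
        List<String> cached = new ArrayList<>();
        int run = 0;                  // length of the current primitive run
        Class<?> lastPrimitive = null;
        for (Class<?> type : fieldTypes) {
            if (type.isPrimitive()) {
                run++;
                lastPrimitive = type;
                continue;
            }
            // A non-primitive field ends the current run of primitives.
            flush(cached, run, lastPrimitive);
            run = 0;
            cached.add("field:" + type.getSimpleName());
        }
        flush(cached, run, lastPrimitive); // trailing run, if any
        return cached;
    }

    private static void flush(List<String> cached, int run, Class<?> lastPrimitive) {
        if (run > 1) {
            cached.add("region:" + run);   // several primitives: one merged region
        } else if (run == 1) {
            cached.add("field:" + lastPrimitive.getSimpleName()); // single primitive stays a field
        }
    }
}
```

A lone primitive between two object fields stays an ordinary field, matching the primitiveLength > 1 check in the excerpt.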

Returning to the last step of serialization: when FieldSerializer's write() is called, the CachedFields generated above call their own write() in turn to serialize the target object.

CachedField[] fields = this.fields;
for (int i = 0, n = fields.length; i < n; i++)
   fields[i].write(output, object);

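When a field holds a non-primitive type, its write() fetches the field's value and serializes it through writeClass(). Circular references are handled by the mechanism described earlier.

When a field is a primitive type, its value is read directly from the field's offset within the object through a native method.

Finally, an UnsafeRegionField produced from consecutive primitive fields is written as raw memory: the region is read 8 bytes at a time with getLong(), and the remaining bytes are then read one at a time with getByte() until the whole region has been serialized.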

long off;
Unsafe unsafe = unsafe();
for (off = offset; off < offset + len - 8; off += 8) {
   output.writeLong(unsafe.getLong(object, off));
}

if (off < offset + len) {
   for (; off < offset + len; ++off) {
      output.write(unsafe.getByte(object, off));
   }
}
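
The same chunked copy can be sketched without Unsafe by treating a byte array as the object's memory. RegionCopySketch is a hypothetical name, ByteBuffer replaces the Unsafe reads, and the loop bound is written as off + 8 <= end, which differs slightly from the excerpt's condition but copies the same bytes.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: copy a memory region 8 bytes at a time, then the remainder. */
final class RegionCopySketch {
    /** Copies len bytes starting at offset; memory stands in for the object's raw memory. */
    static byte[] copyRegion(byte[] memory, int offset, int len) {
        ByteBuffer src = ByteBuffer.wrap(memory);
        ByteBuffer out = ByteBuffer.allocate(len);
        int end = offset + len;
        int off = offset;
        // Bulk phase: move whole 8-byte words while at least 8 bytes remain.
        for (; off + 8 <= end; off += 8) {
            out.putLong(src.getLong(off));
        }
        // Tail phase: move the last 0 to 7 bytes individually.
        for (; off < end; off++) {
            out.put(src.get(off));
        }
        return out.array();
    }
}
```

Both buffers use the same default byte order, so the output is byte-for-byte identical to the source region regardless of how many words the bulk phase handled.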

Reposted from blog.csdn.net/weixin_40318210/article/details/89346564