[Java Basics] Commonly used serialization techniques and methods

Serialization is an important and fairly simple topic, but as basic knowledge it is easily overlooked. It is widely used in network applications, especially in today's distributed systems, so understanding serialization is a foundation for understanding network applications and distributed system architectures.

It is also a frequent interview topic, so this material is worth mastering.

1. Serialization and deserialization

Serialization: the process of converting an object into transferable bytes.
Deserialization: the process of restoring transferable bytes into an object.

The ultimate goal of serialization is to allow objects to be stored and transmitted over the network across platforms. Both storage and network transmission go through IO, and the data format that IO works with is the byte array.

Summary: serialization and deserialization are essentially a data conversion process.
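To make the "data conversion" idea concrete, here is a minimal hand-written sketch. The Point class and the fixed 8-byte layout are assumptions made up for this illustration; they are not part of any framework discussed in this article.

```java
import java.nio.ByteBuffer;

public class ManualCodecSketch {
    // Hypothetical data class, used only for this illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialization: object -> bytes (two 4-byte big-endian ints).
    static byte[] serialize(Point p) {
        return ByteBuffer.allocate(8).putInt(p.x).putInt(p.y).array();
    }

    // Deserialization: bytes -> object, reading the fields in the same order.
    static Point deserialize(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int x = buffer.getInt();
        int y = buffer.getInt();
        return new Point(x, y);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Point(3, 4));
        Point copy = deserialize(bytes);
        // prints "8 bytes -> (3, 4)"
        System.out.println(bytes.length + " bytes -> (" + copy.x + ", " + copy.y + ")");
    }
}
```

Both sides have to agree on the field order and sizes. Serialization frameworks exist largely to manage that agreement for you.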

2. Serialization technologies

There are many serialization implementations in common use. This article covers the following: JDK native serialization, JSON serialization, ProtoBuf serialization, Hessian serialization, and Kryo serialization.

3. Key points for choosing a serialization technology

When choosing a serialization technology, the following basic considerations usually apply:

  1. Performance: the faster serialization and deserialization run, the better.

  2. Security: JDK serialization has known deserialization vulnerabilities. For example, untrusted input can be crafted to hang the thread that deserializes it.

  3. Occupied space: the size of the serialized result. The serialized bytes are usually persisted to disk (consuming storage) or transmitted to other servers over the network (consuming bandwidth), so the smaller the better.

  4. Cross-language: whether the enterprise's internal systems need to communicate across languages.

  5. Maintainability: the more widely adopted a technology is, the easier it is to maintain and the lower the maintenance cost.

Cross-language means that data serialized by a process written in Java can be deserialized by a process written in another language such as Python, PHP, or C#.

4. JDK native serialization

JDK native serialization is the classic serialization technology that ships with the JDK.

A class makes its objects serializable by implementing the java.io.Serializable interface. Objects are then serialized with the java.io.ObjectOutputStream class and deserialized with the java.io.ObjectInputStream class.
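A minimal round trip with these classes might look like the sketch below. The User class and the helper method names are hypothetical and exist only for this example. Checked exceptions are wrapped to keep the calling code short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class JdkSerializationSketch {
    // Hypothetical data class; implementing Serializable opts it in.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Object -> bytes via ObjectOutputStream.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Bytes -> object via ObjectInputStream. Only use this on trusted data.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of the serialized object was not found", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new User("Ann", 30));
        User copy = (User) deserialize(bytes);
        System.out.println(copy.name + ", " + copy.age); // prints "Ann, 30"
    }
}
```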

The serialization technology that comes with the JDK has many shortcomings, so it is not recommended: it performs poorly, the serialized data takes up a lot of space, and it has serious deserialization security vulnerabilities. (See Chapter 12, "Serialization", Item 85: "Prefer alternatives to Java serialization" in the book "Effective Java, Third Edition".)
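The space overhead can be seen with a small experiment. The sketch below compares the JDK-serialized size of an object with a hand-written encoding of the same two fields. The User class and the manual layout (a 4-byte length, the UTF-8 name, a 4-byte age) are assumptions chosen for this illustration, not a standard format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SizeComparisonSketch {
    // Hypothetical data class, used only for this illustration.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // JDK native serialization: the stream also carries a header and a class descriptor.
    static byte[] jdkBytes(User u) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(u);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hand-written encoding: only the field values, in a fixed order.
    static byte[] manualBytes(User u) {
        byte[] name = u.name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length + 4)
                .putInt(name.length)
                .put(name)
                .putInt(u.age)
                .array();
    }

    public static void main(String[] args) {
        User u = new User("Ann", 30);
        System.out.println("JDK native: " + jdkBytes(u).length + " bytes");
        System.out.println("manual:     " + manualBytes(u).length + " bytes"); // 11 bytes
    }
}
```

The JDK stream is larger because it describes the class (name, serialVersionUID, field names and types) in addition to the values.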

5. JSON serialization

JSON is arguably the most popular serialization format today, and it is the most widely used format for front-end and back-end interaction. As a general-purpose format, it is supported by virtually every language and can represent complex, nested objects.

JSON (JavaScript Object Notation) is a lightweight data exchange format. JSON serialization converts data objects into JSON strings. Type information is discarded during serialization, so the data can only be deserialized accurately when the target type is supplied at deserialization time. JSON is human-readable and easy to debug.

Common JSON serialization frameworks include Fastjson, Jackson, and Gson.
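The point about discarded type information can be illustrated without any library. The sketch below is hand-written for this article and is not how Fastjson, Jackson, or Gson work internally. The User class, the method names, and the tiny regex-based parser (flat objects only, no escaping) are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsonTypeInfoSketch {
    // Hypothetical data class, used only for this illustration.
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Matches "key":"text" or "key":123 pairs in a flat JSON object.
    private static final Pattern FIELD =
            Pattern.compile("\"(\\w+)\":(?:\"([^\"]*)\"|(-?\\d+))");

    // Serialization emits field names and values only; the class name is not written.
    static String toJson(User u) {
        return "{\"name\":\"" + u.name + "\",\"age\":" + u.age + "}";
    }

    // Without a target type, all that can be recovered is keys and raw values.
    static Map<String, String> parseFlat(String json) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(json);
        while (m.find()) {
            fields.put(m.group(1), m.group(2) != null ? m.group(2) : m.group(3));
        }
        return fields;
    }

    // The caller supplies the knowledge that this text represents a User.
    static User userFromJson(String json) {
        Map<String, String> fields = parseFlat(json);
        return new User(fields.get("name"), Integer.parseInt(fields.get("age")));
    }

    public static void main(String[] args) {
        String json = toJson(new User("Ann", 30));
        System.out.println(json); // prints {"name":"Ann","age":30}
        User copy = userFromJson(json);
        System.out.println(copy.name + ", " + copy.age); // prints "Ann, 30"
    }
}
```

Real frameworks handle this the same way at the API level: the caller passes the target class, for example Gson's fromJson(json, User.class) or Jackson's ObjectMapper.readValue(json, User.class).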

6. ProtoBuf serialization

ProtoBuf (Protocol Buffers), introduced by Google, is a language-independent, platform-independent, and extensible way of serializing structured data. It can be used for communication protocols, data storage, and so on. The serialized output is small, so it is generally used in systems with high requirements on transmission performance.
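One reason the output is small is that the Protobuf wire format stores integers as variable-length "varints", so small values take fewer bytes. The sketch below is a hand-written illustration of that encoding only. It is not the Protobuf library or its API, and the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;

public class VarintSketch {
    // Encodes a value as a base-128 varint: 7 payload bits per byte,
    // least significant group first, high bit set on every byte except the last.
    // The value is treated as an unsigned 64-bit number.
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Decodes a single varint from the start of the array.
    static long decodeVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = encodeVarint(300);
        StringBuilder hex = new StringBuilder();
        for (byte b : encoded) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        // prints "AC 02 -> 300": two bytes instead of a fixed four-byte int
        System.out.println(hex + "-> " + decodeVarint(encoded));
    }
}
```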

However, Protobuf is more troublesome to use, because it has its own schema syntax and its own compiler. Adopting it means investing time in learning the technology.

Another drawback is that the structure of every transmitted message must be defined in a .proto file, from which the corresponding classes are generated. If a structure changes, the .proto file must be updated and the classes regenerated.

7. Hessian serialization

Hessian is a lightweight binary web service protocol, mainly used to transmit binary data.

Hessian supports serializing objects into binary streams before transferring data. Compared with JDK native serialization, Hessian serialization is smaller in size and better in performance.

8. Kryo serialization

Kryo is an open source, high-performance serialization/deserialization library for Java, often described as one of the fastest Java serialization frameworks. It is efficient, and the byte[] it produces is compact, so it is used in many well-known projects. Its speed advantage comes in part from relying on bytecode generation under the hood.

Because it is limited to JVM languages, Kryo does not support cross-language use.

end!

Origin blog.csdn.net/weixin_44299027/article/details/129050484