Java SE foundation consolidation (VIII): Serialization

In data processing, serialization is the act of converting a data structure or object into a format that can be stored persistently or sent over a network stream. Deserialization is the reverse process.

Microservices are popular today, and services communicate with each other over RPC or HTTP. When the message being sent is an object, it must be serialized first; otherwise the recipient may be unable to interpret it (in a microservice architecture, the services are not necessarily written in the same language). On receiving the message, the recipient deserializes it, according to an agreed protocol, into a data structure that its own system can recognize.

Java currently has two main approaches to serialization. One is Java native serialization, which converts a Java object into a byte stream; this approach is dangerous, as discussed later. The other is to use a third-party structured data format such as JSON or Google Protobuf. I will briefly introduce both approaches below.

1 Java native serialization

With this approach, the class of the object being serialized must implement the Serializable interface. It is only a marker interface with no abstract methods, so implementing it does not require overriding anything. To serialize a Java object you need the java.io.ObjectOutputStream class, whose writeObject method takes the object to serialize as its parameter. To deserialize you need the java.io.ObjectInputStream class, whose readObject() method takes no parameters.

ObjectOutputStream and ObjectInputStream both belong to the byte-stream side of the BIO (blocking I/O) system, from which we can infer that Java native serialization is byte-stream oriented. The following is an example of serializing and deserializing a Java object:

public class User {

    private Long id;

    private String username;

    private String password;
	
    //setter and getter
}

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class Main {

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        String fileName = "E:\\Java_project\\effective-java\\src\\top\\yeonon\\serializable\\origin\\user.txt";
        User user = new User();
        user.setId(1L);
        user.setUsername("yeonon");
        user.setPassword("yeonon");

        // Serialize; try-with-resources closes the stream even if writing fails
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(user); // write the object
            out.flush();           // flush the buffer
        }

        // Deserialize
        User newUser;
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(fileName))) {
            newUser = (User) in.readObject();
        }

        // Compare the two objects
        System.out.println(user);
        System.out.println(newUser);
        System.out.println(newUser.equals(user));
    }
}

The code that serializes and deserializes with ObjectOutputStream and ObjectInputStream is very simple and needs no further explanation. If you run the program as it stands, you should get a java.io.NotSerializableException. Why? Because the User class does not implement the Serializable interface. Let's add it:

import java.io.Serializable;

public class User implements Serializable {

    private Long id;

    private String username;

    private String password;

    // setters and getters omitted
}

Now run the program again. The output looks roughly like this:

top.yeonon.serializable.User@61bbe9ba
top.yeonon.serializable.User@4e50df2e
false

The first line is the object before serialization, and the second is the object after a round trip through serialization and deserialization. Judging by the identifiers after the @ sign (the hexadecimal hash codes), they are two different objects. The third line is the result of comparing the two objects with equals, and it is false. Why? Even if they are different objects, shouldn't equals return true, since their fields hold the same values? This is really a matter of the equals method itself. Our User class does not override equals, so it uses the one inherited from the Object class, which simply checks whether the two references are the same:

    public boolean equals(Object obj) {
        return (this == obj);
    }

That is why the result is false. If we override equals in the User class, it can return true. How to implement equals correctly is beyond the scope of this article; see the chapter on methods common to all objects in "Effective Java", Third Edition.
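As a rough illustration of the point above, here is a minimal sketch of a User class with value-based equals and hashCode, round-tripped through an in-memory buffer instead of a file. The class name EqualsRoundTrip, the roundTrip helper and the constructor are assumptions made for this example, not part of the original code, and this is only one reasonable way to write equals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

public class EqualsRoundTrip {

    // Hypothetical variant of the article's User class with value-based equality.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;

        private final Long id;
        private final String username;
        private final String password;

        User(Long id, String username, String password) {
            this.id = id;
            this.username = username;
            this.password = password;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof User)) {
                return false;
            }
            User other = (User) obj;
            return Objects.equals(id, other.id)
                    && Objects.equals(username, other.username)
                    && Objects.equals(password, other.password);
        }

        // equals and hashCode must be overridden together.
        @Override
        public int hashCode() {
            return Objects.hash(id, username, password);
        }
    }

    // Serializes the object into an in-memory buffer and reads it back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User user = new User(1L, "yeonon", "yeonon");
        User copy = (User) roundTrip(user);
        System.out.println(copy == user);      // false: two distinct instances
        System.out.println(copy.equals(user)); // true: same field values
    }
}
```

The deserialized copy is still a different instance, but equals now compares field values, so the comparison succeeds.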

This completes our first serialization and deserialization. It also creates a file named user.txt in the target directory. Despite the .txt extension it is a binary file, and its content is a sequence of bytes that the JVM can interpret (it is serialized data, not bytecode). We can copy this file to another computer and, provided that machine has a compatible JVM, read it directly with ObjectInputStream and deserialize the object (a further prerequisite is that the program on that computer contains the User class).

That concludes the introduction to Java native serialization and deserialization. Next come serialization and deserialization based on third-party structured data formats. Two formats are covered: JSON and Google Protobuf.

2 JSON serialization and deserialization

JSON, short for JavaScript Object Notation, is a lightweight data-interchange format. It was originally designed to serve JavaScript and is widely used on the Web, but today JSON is no longer limited to JavaScript and Web environments; it has become a language-independent structured data format. JSON is text-based, so its content is highly readable and easy for humans to understand.

2.1 Serializing and deserializing a plain object

Here we use Jackson, a third-party Java library, to demonstrate how to convert a Java object to JSON and JSON back to a Java object:

import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;

public class Main {

    // ObjectMapper instance; all Jackson operations go through it
    private static final ObjectMapper objectMapper = new ObjectMapper();

    public static void main(String[] args) throws IOException {
        User user = new User();
        user.setId(1L);
        user.setUsername("yeonon");
        user.setPassword("yeonon");
        // Serialize to a JSON string
        String jsonStr = objectMapper.writeValueAsString(user);
        System.out.println(jsonStr);

        // Deserialize back into a Java object
        User newUser = objectMapper.readValue(jsonStr, User.class);

        System.out.println(newUser);
        System.out.println(user);
        System.out.println(newUser.equals(user));
    }
}

First we create an ObjectMapper object, through which all Jackson operations go. Then we call writeValueAsString(Object) to convert the object into a JSON string, that is, to serialize it, and readValue(String, Class<T>) to convert the JSON string back into a Java object, that is, to deserialize it. The output is as follows:

{"id":1,"username":"yeonon","password":"yeonon"}
top.yeonon.serializable.User@675d3402
top.yeonon.serializable.User@51565ec2
false

Note the first line of output, which is the object expressed as a JSON string. JSON syntax and formatting are simple and easy to look up online. The remaining three lines were explained earlier and need no further comment.

2.2 Serializing and deserializing a collection

Jackson is rich and powerful. Besides plain objects like User, it can also serialize collections, although deserializing them takes a little more work (it is still fairly simple):

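import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class Main {

    // ObjectMapper instance; all Jackson operations go through it
    private static final ObjectMapper objectMapper = new ObjectMapper();

    // Assumes User now also has an all-args constructor, a no-arg constructor
    // (needed by Jackson for deserialization) and a toString() method
    public static void main(String[] args) throws IOException {
        User user1 = new User(1L, "yeonon", "yeonon");
        User user2 = new User(2L, "weiyanyu", "weiyanyu");
        User user3 = new User(3L, "xiangjinwei", "xiangjinwei");

        Map<Long, User> map = new HashMap<>();
        map.put(1L, user1);
        map.put(2L, user2);
        map.put(3L, user3);

        // Serialize the collection
        String jsonStr = objectMapper
                .writerWithDefaultPrettyPrinter()
                .writeValueAsString(map);
        System.out.println(jsonStr);
        System.out.println("----------------------------------");

        // Deserialize the collection, telling Jackson the key and value types
        JavaType javaType = objectMapper
                .getTypeFactory()
                .constructParametricType(Map.class, Long.class, User.class);
        Map<Long, User> newMap = objectMapper.readValue(jsonStr, javaType);

        newMap.forEach((k, v) -> {
            System.out.println(v);
        });

    }
}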

Serialization works as before, except that writerWithDefaultPrettyPrinter is used to make the output prettier (this is not recommended in production, because the extra whitespace makes the output larger than the compact form). The key part is deserialization. If you call objectMapper.readValue(jsonStr, Map.class) directly as before, the result is still a Map, but its keys and values are not of type Long and User: the keys are Strings and the values are LinkedHashMaps, which is clearly not what we want.

So a collection needs some extra handling. First create a JavaType object, which represents a new type composed of several types. The first parameter of constructParametricType() is the raw type, Map in this code, and the parameters after it are the element types of the collection. Because a Map has two type parameters, we pass two classes, Long and User. Finally, call another overload of readValue(), passing the string and the javaType, and the deserialization is complete.

Running the program gives the following output:

{
  "1" : {
    "id" : 1,
    "username" : "yeonon",
    "password" : "yeonon"
  },
  "2" : {
    "id" : 2,
    "username" : "weiyanyu",
    "password" : "weiyanyu"
  },
  "3" : {
    "id" : 3,
    "username" : "xiangjinwei",
    "password" : "xiangjinwei"
  }
}
----------------------------------
User{id=1, username='yeonon', password='yeonon'}
User{id=2, username='weiyanyu', password='weiyanyu'}
User{id=3, username='xiangjinwei', password='xiangjinwei'}

Jackson has many other powerful features; if you want to learn more, look up its documentation. Next comes another structured data format: Google Protobuf.

3 Google Protobuf serialization and deserialization

Protobuf is a serialization tool from Google. It has the following main characteristics:

  • Language-independent and platform-independent
  • Concise
  • High performance
  • Good compatibility

It is language- and platform-independent because it is based on a protocol: as long as serialization and deserialization both follow that protocol, platform independence comes naturally. Although it is concise, it is not easy to read, because unlike text-based formats such as JSON and XML it is a binary format, which is also one reason for its high performance. Below is a comparison of its performance with other tools:

[Figures: performance comparison charts (iGgLEn.png, iGgONq.png)]

3.1 Preparations

First, download the tool from the official website. My operating system is Windows, so I downloaded protoc-3.6.1-win32.zip. After downloading and extracting it, go into the bin directory and find protoc.exe; this is the tool we will use later.

3.2 Writing the protobuf file

With the preparation done, we can start writing the protobuf file. The file format is close to C++ syntax; for details see the official website, which explains it very clearly. Here is an example:

syntax = "proto2";

option java_package = "top.yeonon.serializable.protobuf";
option java_outer_classname = "UserProtobuf";

message User {
    required int64 id = 1;
    required string name = 2;
    required string password = 3;
}

A brief explanation:

  • syntax. Specifies which syntax is used: proto2 means proto2 syntax, proto3 means proto3 syntax.
  • java_package option. The Java package name of the generated code.
  • java_outer_classname option. The name of the generated Java outer class. If it is not set, the file name converted to camel case is used as the class name.
  • message. Can be thought of simply as a class; for example, message User means that a User class will be generated.
  • required. Indicates that the field must have a value at serialization time; otherwise serialization fails.

3.3 Generating Java classes

Now we use the protoc.exe executable. Run a command of the following form:

protoc.exe -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/addressbook.proto

SRC_DIR is the directory containing the source file, and DST_DIR is the directory where the generated classes should go. Here is my test case:

protoc.exe -I=C:\Users\72419\Desktop\ --java_out=E:\Java_project\effective-java\src C:\Users\72419\Desktop\user.proto

After the command finishes, a top directory appears under E:\Java_project\effective-java\src. Starting there, directories are created according to the package name set in the protobuf file, so eventually we find UserProtobuf.java under E:\Java_project\effective-java\src\top\yeonon\serializable\protobuf\. This is the generated Java class, but it is not the class we really want. What we want is the User class, which is a nested class of UserProtobuf. It cannot be instantiated directly; it has to be constructed through its Builder class. The following is a simple usage example:

import com.google.protobuf.InvalidProtocolBufferException;
import top.yeonon.serializable.protobuf.UserProtobuf;

public class Main {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // Build the message through its Builder
        UserProtobuf.User user = UserProtobuf.User.newBuilder()
                .setId(1L)
                .setName("yeonon")
                .setPassword("yeonnon").build();

        System.out.println(user);

        // Serialize to a byte array
        byte[] userBytes = user.toByteArray();

        // Deserialize
        UserProtobuf.User newUser = UserProtobuf.User.parseFrom(userBytes);

        System.out.println(user);
        System.out.println(newUser);
        System.out.println(newUser.equals(user));
    }
}

First construct the object with the Builder, then call toByteArray to produce a byte array, which can be sent over the network or persisted. Deserialization is just as simple: call parseFrom(). Running the program gives the following output:

id: 1
name: "yeonon"
password: "yeonnon"

true

This time equals returns true, unlike before. Why? Because when the tool generated the User class, it also overrode the equals method. The generated equals compares the field values of the two objects, and since those values are the same here, it returns true.

These are only the basics of Protobuf. It has many more powerful features, but since I am passing on what I have only just learned myself, I will not embarrass myself by going further. I recommend searching online for more in-depth material.

4 Serialization and singletons

The most basic requirement of the singleton pattern is that the whole application holds only one instance of the class; every implementation technique revolves around this goal. Serialization can break a singleton, or more precisely, deserialization can. Why? From the discussion above you probably already know the reason: the object obtained by deserialization is not the same object as the original, so more than one instance of the class exists in the system, which clearly breaks the singleton. This is why the readObject method of the Enum class (an enum is a singleton) is implemented to throw an exception directly (mentioned in the article on enums). So, if you want to preserve a singleton, it is best not to allow it to be serialized and deserialized.
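The effect described above can be reproduced with a small sketch. The class names below (SingletonRoundTrip, Naive, Guarded, EnumSingleton) are made up for this illustration. The Guarded variant uses readResolve, a hook of Java native serialization, as one possible workaround when a singleton really has to be Serializable; it is not something the original article prescribes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SingletonRoundTrip {

    // A serializable singleton with no protection against deserialization.
    static class Naive implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Naive INSTANCE = new Naive();

        private Naive() {
        }
    }

    // The same singleton, but deserialization is redirected to INSTANCE.
    static class Guarded implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Guarded INSTANCE = new Guarded();

        private Guarded() {
        }

        // Invoked by the serialization runtime after the object has been read;
        // the returned object replaces the freshly deserialized copy.
        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Enum constants are serialized by name and resolved to the existing constant.
    enum EnumSingleton {
        INSTANCE
    }

    // Serializes the object into an in-memory buffer and reads it back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(Naive.INSTANCE) == Naive.INSTANCE);                 // false
        System.out.println(roundTrip(Guarded.INSTANCE) == Guarded.INSTANCE);             // true
        System.out.println(roundTrip(EnumSingleton.INSTANCE) == EnumSingleton.INSTANCE); // true
    }
}
```

The first line shows the broken singleton; the other two show that readResolve and enum constants keep a single instance after a round trip.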

5 Why serialization is dangerous

Objects are serialized in order to be stored persistently or sent over the network, and once that happens the system loses control over them. For example, suppose a user object is serialized and sent over the network to another system. If the serialized byte stream is tampered with in transit without damaging its structure, the content the recipient deserializes may differ from what the sender sent, which may cause serious failures on the recipient's side.

Or the system may deserialize a byte stream without knowing where it came from. Even assuming no attack on the network, that is, the byte stream has not been tampered with, deserialization may take a very long time and may even lead to an OutOfMemoryError (OOM) or a StackOverflowError. For example, if the sender transmits a deeply nested Map structure (Maps embedded in Maps) with n levels, the receiver has to unpack it layer by layer to deserialize it into Java objects, and the time complexity can reach 2^n, that is, exponential. The machine may never finish and is likely to hit a StackOverflowError, eventually crashing the whole system. Many denial-of-service attacks work this way. It is worth mentioning that some structured data formats (e.g. JSON) can effectively avoid this problem.
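If native deserialization of untrusted input cannot be avoided, one mitigation available since Java 9 is java.io.ObjectInputFilter, which can reject a stream whose object graph is nested too deeply. The sketch below is only an illustration under assumptions: the class name DepthLimitedRead, its helper methods and the limit of 10 are arbitrary choices for this example, and a depth limit alone is not a complete defense.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

public class DepthLimitedRead {

    // Builds a Map nested `levels` deep: {"next": {"next": ... {}}}.
    static Map<String, Object> nestedMap(int levels) {
        Map<String, Object> current = new HashMap<>();
        for (int i = 0; i < levels; i++) {
            Map<String, Object> outer = new HashMap<>();
            outer.put("next", current);
            current = outer;
        }
        return current;
    }

    // Serializes the object into a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Returns true if the stream was read, false if the filter rejected it.
    static boolean readWithDepthLimit(byte[] bytes, int maxDepth)
            throws IOException, ClassNotFoundException {
        // "maxdepth=N" rejects object graphs nested deeper than N.
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter("maxdepth=" + maxDepth);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.setObjectInputFilter(filter); // must be set before the first read
            in.readObject();
            return true;
        } catch (InvalidClassException rejected) {
            // A rejected stream surfaces as an InvalidClassException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] shallow = serialize(nestedMap(3));
        byte[] deep = serialize(nestedMap(50));
        System.out.println(readWithDepthLimit(shallow, 10));
        System.out.println(readWithDepthLimit(deep, 10));
    }
}
```

The filter is consulted while the stream is being read, so an over-deep structure is refused before the whole graph has been materialized.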

6 Summary

This article briefly described what serialization and deserialization are and showed simple uses of JSON and Protobuf. Serialization and deserialization carry certain risks, so avoid them unless necessary. If you must use them, prefer a structured data format such as JSON or Protobuf, which at least avoids some of the dangers (mentioned in section 5).


Origin juejin.im/post/5d6e808a5188252d43758f8b