Worth saving: the serialization details that 70% of programmers do not know, explained clearly

Implement the Serializable interface with caution

problem

Serialization is the process of encoding an object into a byte stream; the reverse process is called deserialization. Once an object has been serialized, its encoding can be transferred from one virtual machine to another, or saved to disk for later deserialization. There is a long-standing misconception that making a class serializable only requires implementing the Serializable interface. In fact, doing so casually has many drawbacks: the immediate convenience comes with long-term maintenance costs. So what should you watch out for when implementing Serializable?
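
The round trip described above can be sketched as follows. This is a minimal illustration, not code from the original article: the Point class and the helper names toBytes and fromBytes are hypothetical, and only the standard java.io object streams are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value class used only for this illustration.
final class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class RoundTripDemo {
    // Serialization: encode the object graph rooted at obj into a byte array.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialization: rebuild an object from the byte stream.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y); // prints 3,4
    }
}
```

The byte array could just as well be written to a file or a socket; the in-memory streams only keep the sketch self-contained.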

answer

Disadvantages of Serializable

Directly implementing the Serializable interface has the following disadvantages:

Reduced flexibility: If a class implements Serializable, its byte-stream encoding becomes part of its exported API. Once the class is widely used, it must support that serialized form forever. Moreover, with the default serialized form, the class's private and package-private instance fields also become part of the exported API, which defeats the practice of minimizing access to fields. In addition, if the internal structure of the class changes, and a client serializes an instance with the old version of the class and deserializes it with the new version, the program will fail. If a serializable class does not explicitly declare a serialVersionUID (serial version UID), the system generates one at runtime through a complex computation over the class. The value is derived from the class name, the names of the interfaces it implements, and most of its members. If you change any of these, for example by adding a method, the automatically generated serial version UID changes as well. Therefore, if you do not declare the version number explicitly, compatibility is broken and an InvalidClassException is thrown at runtime.
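
As a rough illustration of the serial version UID, the sketch below compares a class that declares the field with one that does not. The class names Account and LegacyAccount and the helper uidOf are hypothetical; ObjectStreamClass.lookup and getSerialVersionUID are standard java.io calls.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class that declares its serial version UID explicitly.
final class Account implements Serializable {
    private static final long serialVersionUID = 42L;
    private String owner;
}

// Hypothetical class without a declaration: the runtime derives a UID
// from the structure of the class.
final class LegacyAccount implements Serializable {
    private String owner;
}

final class SerialVersionDemo {
    // Returns the UID the serialization runtime will use for the given class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The declared value, 42, stays stable however the class evolves.
        System.out.println(uidOf(Account.class));
        // A computed hash; it may change whenever the class structure changes.
        System.out.println(uidOf(LegacyAccount.class));
    }
}
```

Declaring the field pins the value, so later changes such as adding a method no longer alter the identifier that the stream checks.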

Greater likelihood of bugs and security vulnerabilities:

Objects are normally created by constructors, but serialization is also an object-creation mechanism: deserialization constructs objects too. Because no explicit constructor is involved in deserialization, it is easy to forget that deserialization must ensure two things:

It preserves all of the invariants established by the real constructors.

It does not allow an attacker to gain access to the internals of the object under construction.

Relying on the default deserialization mechanism can easily leave an object open to invariant corruption and illegal access.

Increased testing burden:

When a serializable class is revised, you must check that an instance serialized in the new release can be deserialized in the old release, and that an instance serialized in the old release can be deserialized in the new one. As releases accumulate, the amount of testing is proportional to the product of the number of serializable classes and the number of releases.

Applicable scenarios for Serializable

If a class is to take part in a framework that relies on serialization for object transmission and persistence, the class needs to implement Serializable. More generally, if a class belongs to a component that must implement Serializable, the class also needs to implement it. As a rule of thumb, value classes such as Date and BigInteger should implement Serializable, as should most collection classes.

Scenarios where Serializable is not appropriate

Classes designed for inheritance should rarely implement Serializable, and interfaces should rarely extend it, because every subclass or implementing class then takes on the risks of serialization. This rule should be followed in most cases, although special circumstances can justify breaking it. For example, the classes that implement Serializable include Throwable (so that exceptions can be passed from the server to the client), Component (so that GUIs can be sent, saved, and restored), and the abstract class HttpServlet (so that session state can be cached).

Inner classes should not implement Serializable. An inner class needs to store a reference to its enclosing instance and the values of local variables from the enclosing scope. How these fields correspond to the class definition is unspecified, so the default serialized form of an inner class is ill-defined.

in conclusion

In short, making a class serializable is not simply a matter of implementing the Serializable interface. Consider the scenarios in which Serializable is appropriate and the precautions described above.

Consider using a custom serialization form

problem

Designing the serialized form of a class is as important as designing its API, so do not accept the default serialized form without first considering whether it is appropriate. Before deciding, examine the encoding from the standpoints of flexibility, performance, and correctness. Generally speaking, you should accept the default serialized form only if it is largely the same as the encoding you would choose if you designed a custom serialized form yourself. What should you consider when choosing a serialized form?

answer

The default serialized form describes the data contained in the object and in every object reachable from it; that is, it fully describes the topology by which all of these objects are linked. The ideal serialized form of an object contains only the logical data the object represents, independent of its physical representation. In other words, the default serialized form is suitable when the physical representation of an object is equivalent to its logical content. For example:

  public class Name implements Serializable {
      private final String lastName;
      private final String firstName;
      private final String middleName;
      ... ...
  }

Logically, a name consists of three attributes, lastName, firstName, and middleName, and these three fields precisely mirror that logical content. The default serialized form can therefore be used here, although a readObject method is often still needed to check validity and make defensive copies so that invariants and security are preserved.

With the default serialized form, instance fields marked transient are omitted from the serialized data. When an instance is deserialized, these fields are initialized to the default values of their types: null for object reference fields, 0 for numeric primitive fields, and false for boolean fields. If these values are unacceptable for any transient field, you must provide a readObject method that first calls defaultReadObject and then restores the transient fields to acceptable values.

During serialization and deserialization, the runtime invokes the writeObject and readObject methods of the object's class if they are declared, so you can implement your own serialization logic in these two methods. Even when these methods write and read all of the state themselves, they should still call ObjectOutputStream.defaultWriteObject() and ObjectInputStream.defaultReadObject(), which preserves backward and forward compatibility if non-transient fields are added in a later release.

Whichever serialized form you choose, declare an explicit serial version UID in every serializable class you write. This removes the serial version UID as a potential source of incompatibility, and it brings a small performance benefit because the UID does not have to be computed at runtime.
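
A custom serialized form along these lines might look like the sketch below. It assumes a hypothetical StringChain class whose physical representation is a chain of nodes while its logical content is just a sequence of strings; the class, its methods, and the roundTrip helper are illustrative and not from the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: the node chain is an implementation detail, so it is
// kept out of the serialized form by marking every field transient.
final class StringChain implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient int size;
    private transient Node head;
    private transient Node tail;

    private static final class Node { // deliberately not Serializable
        final String value;
        Node next;
        Node(String value) { this.value = value; }
    }

    void add(String s) {
        Node n = new Node(s);
        if (head == null) { head = n; } else { tail.next = n; }
        tail = n;
        size++;
    }

    List<String> toList() {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) { out.add(n.value); }
        return out;
    }

    // Writes only the logical content: the element count, then each element.
    private void writeObject(ObjectOutputStream s) throws IOException {
        s.defaultWriteObject(); // writes no fields today, but keeps the form extensible
        s.writeInt(size);
        for (Node n = head; n != null; n = n.next) { s.writeObject(n.value); }
    }

    // Transient fields start at their defaults (0 and null), so add() rebuilds the chain.
    private void readObject(ObjectInputStream s) throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        int count = s.readInt();
        for (int i = 0; i < count; i++) { add((String) s.readObject()); }
    }
}

final class StringChainDemo {
    static StringChain roundTrip(StringChain chain) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(chain);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (StringChain) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the stream contains only a count and the strings, the node structure can later be replaced by an array or another layout without changing the serialized form.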

in conclusion

When you decide to design a class to be serializable, you should carefully consider what serialization form should be used. Only when the default serialization form can reasonably describe the logical state of the object, can the default serialization form be used. Otherwise, it is necessary to design a custom serialization form, through which the state of the object can be reasonably described.

Use the readObject method with caution

problem

To make a program more secure and reliable, constructors and accessors must make defensive copies of mutable fields, as in the following code:

  public static final class Period {
      private final Date start;
      private final Date end;

      public Period(Date start, Date end) {
          this.start = new Date(start.getTime());
          this.end = new Date(end.getTime());
          if (this.start.compareTo(this.end) > 0) {
              throw new IllegalArgumentException(start + " after " + end);
          }
      }

      public Date getStart() {
          return new Date(start.getTime());
      }

      public Date getEnd() {
          return new Date(end.getTime());
      }
  }

But if this class is made serializable, a deserialized instance may violate the constraint that start must not come after end. How can we ensure that an object's key invariants also hold across serialization?

answer

Besides constructors, deserialization is another way to construct objects, so parameter validity checks and defensive copying are required there as well. The readObject method must therefore also enforce the key invariants of Period and preserve its immutability:

  private void readObject(ObjectInputStream s)
          throws IOException, ClassNotFoundException {
      s.defaultReadObject();
      // Defensively copy our mutable components
      // (start and end must be declared non-final for these assignments to compile)
      start = new Date(start.getTime());
      end = new Date(end.getTime());
      // Check that our invariants are satisfied
      if (start.compareTo(end) > 0) {
          throw new InvalidObjectException(start + " after " + end);
      }
  }

Note that the defensive copy is made before the validity check, and that the clone method is not used to make the copy: Date is not final, so clone could be overridden by a malicious subclass.
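
Putting the pieces together, a serializable Period might look like the sketch below. It follows the code shown in this section, with two assumptions that are not in the original listing: the fields are declared non-final so that readObject can reassign them, and an explicit serialVersionUID is added. The PeriodDemo helper is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

final class Period implements Serializable {
    private static final long serialVersionUID = 1L; // assumption: value chosen for the sketch

    // Non-final so that readObject can replace them with defensive copies.
    private Date start;
    private Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.compareTo(this.end) > 0) {
            throw new IllegalArgumentException(start + " after " + end);
        }
    }

    Date getStart() { return new Date(start.getTime()); }

    Date getEnd() { return new Date(end.getTime()); }

    private void readObject(ObjectInputStream s) throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        // Copy first, so no outside reference to our internal Date objects survives.
        start = new Date(start.getTime());
        end = new Date(end.getTime());
        // Then check the same invariant the constructor enforces.
        if (start.compareTo(end) > 0) {
            throw new InvalidObjectException(start + " after " + end);
        }
    }
}

final class PeriodDemo {
    static Period roundTrip(Period p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Period) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A well-formed stream round-trips unchanged; the checks in readObject only matter when the bytes did not come from a legitimate Period instance.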

in conclusion

All in all, whenever you write a readObject method, think like this:

You are writing a public constructor that must produce a valid instance no matter what byte stream it is given. The following guidelines help in writing a more robust readObject method:

For classes with object reference fields that must remain private, defensively copy each object in such a field. Mutable components of immutable classes fall into this category.

Check every invariant and throw an InvalidObjectException if a check fails. The checks should follow any defensive copying.

If an entire object graph must be validated after it is deserialized, use the ObjectInputValidation interface.

Do not invoke any overridable methods in the class from readObject, either directly or indirectly.
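
The ObjectInputValidation guideline can be sketched as follows. The IntRange class and the ValidationDemo helpers are hypothetical. To simulate a hostile stream, the sketch swaps the last two 4-byte words of the serialized data, which relies on the assumption that, for this particular class, the default stream layout ends with its two int fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class with the invariant low <= high.
final class IntRange implements Serializable, ObjectInputValidation {
    private static final long serialVersionUID = 1L;
    private int low;
    private int high;

    IntRange(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException(low + " > " + high);
        }
        this.low = low;
        this.high = high;
    }

    private void readObject(ObjectInputStream s) throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        // Ask the stream to call validateObject() once the whole graph has been read.
        s.registerValidation(this, 0);
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        if (low > high) {
            throw new InvalidObjectException(low + " > " + high);
        }
    }
}

final class ValidationDemo {
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Simulated tampering: exchange the two trailing ints (assumed to be the fields).
    static byte[] swapLastTwoInts(byte[] data) {
        byte[] copy = data.clone();
        int n = copy.length;
        for (int i = 0; i < 4; i++) {
            byte tmp = copy[n - 8 + i];
            copy[n - 8 + i] = copy[n - 4 + i];
            copy[n - 4 + i] = tmp;
        }
        return copy;
    }

    // Returns true if deserialization is rejected with an InvalidObjectException.
    static boolean isRejected(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidObjectException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a single object the same check could live directly in readObject; the validation callback is aimed at invariants that span several objects in the graph, since it runs only after the whole graph has been restored.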

Use an enum to implement a singleton

problem

For a singleton, the simplest implementation is:

  public class Elvis {
      public static final Elvis INSTANCE = new Elvis();
      private Elvis() {...}
      public void leaveTheBuilding() {...}
  }

If this class is made serializable, it is no longer a singleton, because every deserialization creates a new instance. This holds regardless of whether the class uses the default serialized form or a custom one, and regardless of whether it provides an explicit readObject method. So how can a singleton that must be serializable be implemented?

answer

There are two ways to implement a serializable singleton:

Use the readResolve method:

The readResolve feature allows you to substitute another instance for the one created by readObject. If the class of an object being deserialized defines a readResolve method with the proper declaration, this method is invoked on the newly created object after it is deserialized. The object reference returned by this method is then returned in place of the newly created object. You can therefore return the existing instance from readResolve on every deserialization, which ensures that there is still only one instance no matter how many times deserialization occurs. The sample code is:

  // readResolve for instance control - you can do better!
  private Object readResolve() {
      // Return the one true Elvis and let the garbage collector
      // take care of the Elvis impersonator.
      return INSTANCE;
  }

This method ignores the deserialized object and returns the distinguished Elvis instance created when the class was initialized. In fact, if you rely on readResolve for instance control, all instance fields with object reference types must be declared transient. Otherwise, an attacker may be able to obtain a reference to the deserialized object before its readResolve method runs, so the singleton can still be compromised.
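
The effect of readResolve can be observed with a small comparison. In this sketch, Elvis follows the article's example, while Impersonator (the same shape without readResolve) and the SingletonDemo helper are hypothetical names added for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Singleton that keeps its instance control across deserialization.
final class Elvis implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Elvis INSTANCE = new Elvis();

    private Elvis() { }

    // Replace whatever was deserialized with the one true instance.
    private Object readResolve() {
        return INSTANCE;
    }
}

// Same shape without readResolve: each deserialization yields a new object.
final class Impersonator implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Impersonator INSTANCE = new Impersonator();

    private Impersonator() { }
}

final class SingletonDemo {
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Elvis.INSTANCE) == Elvis.INSTANCE);               // true
        System.out.println(roundTrip(Impersonator.INSTANCE) == Impersonator.INSTANCE); // false
    }
}
```

Neither class has instance fields, so the caveat about non-transient object reference fields does not arise in this sketch.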

Use an enum:

Enums can be used to implement serializable singletons. The guarantee is provided by the JVM, the code is very concise, and instance fields do not need to be marked transient:

  // Enum singleton - the preferred approach
  public enum Elvis {
      INSTANCE;
      private String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };
      public void printFavorites() {
          System.out.println(Arrays.toString(favoriteSongs));
      }
  }
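
The enum guarantee can be checked with the same kind of round trip. In this sketch the enum is named ElvisEnum, and the helper class EnumSingletonDemo is hypothetical; the identity check relies on the standard rule that enum constants are deserialized to the existing constant.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Enum singleton with a non-transient object reference field.
enum ElvisEnum {
    INSTANCE;

    private final String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };

    String favorites() {
        return Arrays.toString(favoriteSongs);
    }
}

final class EnumSingletonDemo {
    // Serializes the constant and reads it back from the resulting bytes.
    static ElvisEnum roundTrip(ElvisEnum value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ElvisEnum) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(ElvisEnum.INSTANCE) == ElvisEnum.INSTANCE); // true
    }
}
```

No readResolve method and no transient modifiers are needed here, which is the main practical difference from the readResolve approach.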

in conclusion

The simplest and safest way to implement a serializable singleton is to use an enum, which should be preferred whenever possible. If you use readResolve instead, make sure that all of the class's instance fields are either primitive or transient.

Author: Listen ___

Link: https://juejin.im/post/6883434777416990728

Source: Juejin

Origin blog.csdn.net/GYHYCX/article/details/109098162