Flink - The object probably contains or references non serializable fields

I. Introduction

When a custom Flink Source is used to generate data and the job is submitted to the cluster, the submission fails with a serialization error: org.apache.log4j.Logger@72c927f1 is not serializable. The object probably contains or references non serializable fields.

II. Problem solving

1. Scala: the variable is not needed during class initialization

Incorrect code:

val logger = Logger.getLogger(classOf[T])

Correct code:

Combining Scala's lazy initialization with the @transient annotation keeps the variable out of serialization: @transient excludes the field from the serialized object, and lazy defers its creation until first access, so the logger is created on the task side instead of being shipped with the function. This works provided the variable is not needed while class[T] is being initialized. If the variable is used during initialization of class[T], adding @transient can leave it null and cause a NullPointerException.

  @transient lazy val logger = Logger.getLogger(classOf[T])

2. Scala: the variable is needed during class initialization

The logger above is not used during class initialization, so @transient lazy solves the problem. Some variables, however, cannot be initialized lazily, for example values that are initialized by reading from redis. If @transient is used for those, an error is reported.

In this case, declare the variable in the class with the var modifier and move the actual initialization into the open function, which runs on the task before any data is processed. With Flink's RichSourceFunction, the open function is used as follows:

  var redis: Jedis = _
  var initValue: T = _

  override def open(parameters: Configuration): Unit = {
    redis = getRedisClient(host, port)
    initValue = ... // initialization logic that includes reading from redis
  }
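
The lifecycle behind this pattern can be sketched without Flink. The following is a minimal Java sketch, assuming plain java.io serialization stands in for shipping the function to the cluster. FakeClient, SourceLike, and roundTrip are hypothetical names used only for illustration and are not part of Flink or Jedis. It shows that transient fields come back null after deserialization and only become usable once open() has run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class OpenLifecycleSketch {

    // Hypothetical stand-in for a client such as Jedis: it is NOT Serializable.
    static class FakeClient {
        final String address;
        FakeClient(String host, int port) { this.address = host + ":" + port; }
        String read(String key) { return key + "@" + address; }
    }

    // Hypothetical stand-in for a rich source function: only plain config is serialized.
    static class SourceLike implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String host;
        private final int port;
        // transient: skipped by Java serialization and assigned in open() instead.
        private transient FakeClient client;
        private transient String initValue;

        SourceLike(String host, int port) { this.host = host; this.port = port; }

        // Plays the role of open(): called after deserialization, before any data is processed.
        void open() {
            client = new FakeClient(host, port);
            initValue = client.read("init");
        }

        String initValue() { return initValue; }
    }

    // Serializes and deserializes an object, roughly what happens when a job is shipped.
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        SourceLike shipped = roundTrip(new SourceLike("localhost", 6379));
        System.out.println(shipped.initValue()); // null until open() runs
        shipped.open();
        System.out.println(shipped.initValue()); // init@localhost:6379
    }
}
```

Only host and port travel with the object. The client and the value derived from it are rebuilt on the receiving side, which is why the non-serializable client never has to pass through serialization.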

3. Java

Incorrect code:

private Logger logger = Logger.getLogger(T.class);

Correct code:

The log4j Logger is not serializable. To keep logger out of serialization, it has to be declared either transient or static. A transient field leads to the same problem as above, namely a NullPointerException, so static final is used here. Static fields belong to the class rather than the instance and are not serialized.

private static final Logger logger = Logger.getLogger(T.class);
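
The difference between the two declarations can be checked with plain Java serialization, which is what Flink uses to ship user functions. This is a minimal sketch under one loud assumption: java.util.logging.Logger stands in for the log4j Logger so the example has no external dependencies (it is likewise not Serializable). The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class LoggerFieldSketch {

    // Instance field: the logger becomes part of the object's serialized state.
    static class InstanceLogger implements Serializable {
        private static final long serialVersionUID = 1L;
        private Logger logger = Logger.getLogger("InstanceLogger");
    }

    // Static field: belongs to the class, so serialization never touches it.
    static class StaticLogger implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final Logger logger = Logger.getLogger("StaticLogger");
    }

    // Returns true if Java serialization accepts the object,
    // false if it hits a non-serializable field.
    static boolean isSerializable(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(isSerializable(new InstanceLogger())); // false
        System.out.println(isSerializable(new StaticLogger()));   // true
    }
}
```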

III. Expansion

The error above is caused by the logger variable in class[T]: the variable itself cannot be serialized, and the methods above solve that. If instead a class used inside class[T] cannot be serialized, that class needs to implement the java.io.Serializable interface so that it can be serialized. In my case the serialization problem appeared in a BroadcastStream scenario: class T in the broadcast stream contained variables that could not be serialized, so the broadcast stream did not work, and the Scala approach above resolved it.
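
The second case, a field whose own class is not serializable, can be sketched in the same way. This is a minimal illustration using plain java.io serialization. PlainConfig, SerializableConfig, Holder, and trySerialize are hypothetical names and do not come from the original job.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class NestedFieldSketch {

    // Hypothetical helper type that does not implement Serializable.
    static class PlainConfig {
        String name = "rules";
    }

    // Same shape, but declared Serializable.
    static class SerializableConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "rules";
    }

    // The outer class is Serializable, but that only helps if its fields are too.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object config;
        Holder(Object config) { this.config = config; }
    }

    // Returns "ok" on success, or a message naming the class that blocked serialization.
    static String trySerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return "ok";
        } catch (NotSerializableException e) {
            return "not serializable: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(trySerialize(new Holder(new PlainConfig())));
        System.out.println(trySerialize(new Holder(new SerializableConfig())));
    }
}
```

The first call reports the offending class and the second succeeds, which matches the rule that every object reachable from the shipped function must be serializable, transient, or static.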

Origin blog.csdn.net/BIT_666/article/details/123517030