Not long ago, the Flink community released Flink 1.9, which contains a very important new feature: the State Processor API. This framework supports operating on checkpoints and savepoints, including reading, modifying, and writing them. Here is a concrete example to illustrate how to use this framework.
1. First, create a sample job to generate a savepoint
Main class code:
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60 * 1000);
DataStream<Tuple2<Integer, Integer>> kafkaDataStream =
    env.addSource(new SourceFunction<Tuple2<Integer, Integer>>() {
        private boolean running = true;
        private int key;
        private int value;
        private Random random = new Random();

        @Override
        public void run(SourceContext<Tuple2<Integer, Integer>> sourceContext) throws Exception {
            while (running) {
                key = random.nextInt(5);
                sourceContext.collect(new Tuple2<>(key, value++));
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }).name("source").uid("source");

kafkaDataStream
    .keyBy(tuple -> tuple.f0)
    .map(new StateTest.StateMap()).name("map").uid("map")
    .print().name("print").uid("print");
In the code above, the only parts that need attention are the custom source, which emits Tuple2 messages, and the state, which is what matters when taking a savepoint. The state lives in the StateMap class, shown below:
public static class StateMap extends RichMapFunction<Tuple2<Integer, Integer>, String> {
    private transient ListState<Integer> listState;

    @Override
    public void open(Configuration parameters) throws Exception {
        ListStateDescriptor<Integer> lsd =
            new ListStateDescriptor<>("list", TypeInformation.of(Integer.class));
        listState = getRuntimeContext().getListState(lsd);
    }

    @Override
    public String map(Tuple2<Integer, Integer> value) throws Exception {
        listState.add(value.f1);
        return value.f0 + "-" + value.f1;
    }

    @Override
    public void close() throws Exception {
        listState.clear();
    }
}
The map function above first declares a ListState in open(); the message-processing logic is also very simple: it just adds the second field of each Tuple2 into the listState. Submit the job, let it run for a while, then trigger a savepoint and record the savepoint path. This completes the data preparation for verifying the State Processor API.
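Triggering the savepoint itself happens outside the job code, typically via the Flink CLI. A sketch with placeholder values (the job id comes from flink list; the target directory is up to you):

```
# List running jobs to find the job id
bin/flink list

# Trigger a savepoint; on success the CLI prints the savepoint path to record
bin/flink savepoint <jobId> hdfs://xxx/test/savepoint/
```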
2. Read the savepoint with the State Processor API
This step simply verifies that the savepoint can be read correctly. The code is as follows:
public class ReadListState {
    protected static final Logger logger = LoggerFactory.getLogger(ReadListState.class);

    public static void main(String[] args) throws Exception {
        final String operatorUid = "map";
        final String savepointPath =
            "hdfs://xxx/savepoint-41b05d-d517cafb61ba";

        final String checkpointPath = "hdfs://xxx/checkpoints";

        // set up the batch execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        RocksDBStateBackend db = new RocksDBStateBackend(checkpointPath);
        DataSet<String> dataSet = Savepoint
            .load(env, savepointPath, db)
            .readKeyedState(operatorUid, new ReaderFunction())
            .flatMap(new FlatMapFunction<KeyedListState, String>() {
                @Override
                public void flatMap(KeyedListState keyedListState, Collector<String> collector) throws Exception {
                    keyedListState.value.forEach(new Consumer<Integer>() {
                        @Override
                        public void accept(Integer integer) {
                            collector.collect(keyedListState.key + "-" + integer);
                        }
                    });
                }
            });

        dataSet.writeAsText("hdfs://xxx/test/savepoint/bravo");

        // execute program
        env.execute("read the list state");
    }

    static class KeyedListState {
        Integer key;
        List<Integer> value;
    }

    static class ReaderFunction extends KeyedStateReaderFunction<Integer, KeyedListState> {
        private transient ListState<Integer> listState;

        @Override
        public void open(Configuration parameters) {
            ListStateDescriptor<Integer> lsd =
                new ListStateDescriptor<>("list", TypeInformation.of(Integer.class));
            listState = getRuntimeContext().getListState(lsd);
        }

        @Override
        public void readKey(
            Integer key,
            Context ctx,
            Collector<KeyedListState> out) throws Exception {
            List<Integer> li = new ArrayList<>();
            listState.get().forEach(new Consumer<Integer>() {
                @Override
                public void accept(Integer integer) {
                    li.add(integer);
                }
            });

            KeyedListState kl = new KeyedListState();
            kl.key = key;
            kl.value = li;

            out.collect(kl);
        }
    }
}
After successfully reading the state from the savepoint, it is written out to a file; each row of the file is a key-value pair:
3. Rewrite the savepoint with the State Processor API
A savepoint is a persisted snapshot of a running job's state at a certain point in time, which makes it convenient to resume the job from that point. But sometimes the state in a savepoint needs to be rewritten so that the job can start from a specific state.
public class ReorganizeListState {
    protected static final Logger logger = LoggerFactory.getLogger(ReorganizeListState.class);

    public static void main(String[] args) throws Exception {
        final String operatorUid = "map";
        final String savepointPath =
            "hdfs://xxx/savepoint-41b05d-d517cafb61ba";

        final String checkpointPath = "hdfs://xxx/checkpoints";

        // set up the batch execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        RocksDBStateBackend db = new RocksDBStateBackend(checkpointPath);
        DataSet<KeyedListState> dataSet = Savepoint
            .load(env, savepointPath, db)
            .readKeyedState(operatorUid, new ReaderFunction())
            .flatMap(new FlatMapFunction<KeyedListState, KeyedListState>() {
                @Override
                public void flatMap(KeyedListState keyedListState, Collector<KeyedListState> collector) throws Exception {
                    KeyedListState newState = new KeyedListState();
                    newState.value = keyedListState.value.stream()
                        .map(x -> x + 10000).collect(Collectors.toList());
                    newState.key = keyedListState.key;
                    collector.collect(newState);
                }
            });

        BootstrapTransformation<KeyedListState> transformation = OperatorTransformation
            .bootstrapWith(dataSet)
            .keyBy(acc -> acc.key)
            .transform(new KeyedListStateBootstrapper());

        Savepoint.create(db, 128)
            .withOperator(operatorUid, transformation)
            .write("hdfs://xxx/test/savepoint/");

        // execute program
        env.execute("read the list state");
    }

    static class KeyedListState {
        Integer key;
        List<Integer> value;
    }

    static class ReaderFunction extends KeyedStateReaderFunction<Integer, KeyedListState> {
        private transient ListState<Integer> listState;

        @Override
        public void open(Configuration parameters) {
            ListStateDescriptor<Integer> lsd =
                new ListStateDescriptor<>("list", TypeInformation.of(Integer.class));
            listState = getRuntimeContext().getListState(lsd);
        }

        @Override
        public void readKey(
            Integer key,
            Context ctx,
            Collector<KeyedListState> out) throws Exception {
            List<Integer> li = new ArrayList<>();
            listState.get().forEach(new Consumer<Integer>() {
                @Override
                public void accept(Integer integer) {
                    li.add(integer);
                }
            });

            KeyedListState kl = new KeyedListState();
            kl.key = key;
            kl.value = li;

            out.collect(kl);
        }
    }

    static class KeyedListStateBootstrapper extends KeyedStateBootstrapFunction<Integer, KeyedListState> {
        private transient ListState<Integer> listState;

        @Override
        public void open(Configuration parameters) {
            ListStateDescriptor<Integer> lsd =
                new ListStateDescriptor<>("list", TypeInformation.of(Integer.class));
            listState = getRuntimeContext().getListState(lsd);
        }

        @Override
        public void processElement(KeyedListState value, Context ctx) throws Exception {
            listState.addAll(value.value);
        }
    }
}
The key point here: take the dataSet read in the previous step and, during the conversion, add 10000 to every accumulated value; then use the resulting dataSet as input to construct a BootstrapTransformation. Next, create an empty savepoint and write the state for the specified operatorUid into it. Once the write succeeds, we have a new savepoint whose state values differ from the originals.
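The core of the conversion is just a per-element shift. Isolated from the Flink job, it can be sketched as follows (ShiftValues and shift are hypothetical helper names used only for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ShiftValues {
    // Mirrors the flatMap above: add 10000 to every accumulated value.
    static List<Integer> shift(List<Integer> values) {
        return values.stream().map(x -> x + 10000).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(shift(Arrays.asList(0, 1, 2))); // prints [10000, 10001, 10002]
    }
}
```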
4. Verify that the newly generated savepoint is usable
Since the state being verified is a ListState, in other words a KeyedState, and KeyedState is state managed by Flink itself (meaning Flink controls the logic for saving and restoring it), in order to verify that the job starts correctly from the new savepoint, the earlier StateMap is rewritten as follows:
public static class StateMap extends RichMapFunction<Tuple2<Integer, Integer>, String> {
    // logger added so the snippet compiles on its own
    private static final Logger log = LoggerFactory.getLogger(StateMap.class);
    private transient ListState<Integer> listState;

    @Override
    public void open(Configuration parameters) throws Exception {
        ListStateDescriptor<Integer> lsd =
            new ListStateDescriptor<>("list", TypeInformation.of(Integer.class));
        listState = getRuntimeContext().getListState(lsd);
    }

    @Override
    public String map(Tuple2<Integer, Integer> value) throws Exception {
        listState.add(value.f1);
        log.info("get value:{}-{}", value.f0, value.f1);
        StringBuilder sb = new StringBuilder();
        listState.get().forEach(new Consumer<Integer>() {
            @Override
            public void accept(Integer integer) {
                sb.append(integer).append(";");
            }
        });
        log.info("***********************taskNameAndSubTask:{},restored value:{}",
            getRuntimeContext().getTaskNameWithSubtasks(), sb.toString());
        return value.f0 + "-" + value.f1;
    }

    @Override
    public void close() throws Exception {
        listState.clear();
    }
}
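Resubmitting the job from the rewritten savepoint is again a CLI operation. A sketch with placeholder paths and a hypothetical entry class and jar name:

```
# Resume the job from the rewritten savepoint
bin/flink run -s hdfs://xxx/test/savepoint/<savepointDir> -c com.example.StateTest state-test-job.jar
```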
The restored state cannot be inspected immediately after recovery; instead, when the next message for a key arrives, the current contents of that key's state are printed, so we can check whether the restore succeeded. The results are as follows:
Comparing the output for key 4 in the figure above, we can see that the printed values are the modified values, so the verification succeeds.
5. Conclusion
Flink divides state into KeyedState, OperatorState, and BroadcastState, and the State Processor API provides corresponding processing interfaces for all of them.
In addition, for KeyedState: what happens if the job's parallelism changes? What if the keys change? These questions need further exploration.
See the official documentation:
https://flink.apache.org/feature/2019/09/13/state-processor-api.html
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/libs/state_processor_api.html