Flink State

State is the data an operator stores at each point as it consumes its input; production environments require this state to be persisted. When a Job restarts because of an error or for some other reason, it can recover from a checkpoint. A checkpoint is a global snapshot of the state taken at regular intervals: to keep a Job fault tolerant while it runs, Flink periodically takes a snapshot of its state. State data is discussed in more detail in section 4.3.
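The recovery idea can be sketched without Flink at all. The following is a minimal plain-Java model, not Flink's API: the class `CheckpointSketch`, its `Snapshot` type, and the `run` parameters are hypothetical names chosen for illustration. It sums a list of records, takes a snapshot of the position and running sum every few records, and on a simulated failure restarts from the last snapshot instead of from the beginning.

```java
// Hypothetical, Flink-free model of checkpoint and recovery; none of these names are Flink API.
class CheckpointSketch {

    // A snapshot pairs the consumption position with the state at that position.
    static final class Snapshot {
        final int offset;
        final long runningSum;

        Snapshot(int offset, long runningSum) {
            this.offset = offset;
            this.runningSum = runningSum;
        }
    }

    /**
     * Sums the input and takes a snapshot every `interval` records.
     * If failAt >= 0, the job "crashes" once, just before processing that index,
     * and resumes from the most recent snapshot.
     */
    static long run(long[] input, int interval, int failAt) {
        Snapshot last = new Snapshot(0, 0L); // initial empty checkpoint
        int offset = 0;
        long sum = 0L;
        boolean failed = false;

        while (offset < input.length) {
            if (!failed && offset == failAt) {
                failed = true;
                // In-memory progress is lost; restore position and state together.
                offset = last.offset;
                sum = last.runningSum;
                continue;
            }
            sum += input[offset];
            offset++;
            if (offset % interval == 0) {
                last = new Snapshot(offset, sum);
            }
        }
        return sum;
    }
}
```

Because the position and the state are restored together, records processed after the last snapshot are replayed exactly once into the restored state, so the final sum is the same with or without the failure.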

  • Keyed State is always associated with a particular key, and it can only be used in functions and operators on a KeyedStream. You can think of Keyed State as a special case of Operator State that has been partitioned, or sharded: each Keyed State partition corresponds to the Operator State of one key, and on a given partition each key has its own unique state.
  • Operator State is bound to one parallel operator instance. The Kafka Connector is a good example: each parallel instance of the Kafka consumer maintains a map of topic partitions and offsets, and this map is its Operator State.
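The difference in scope between the two kinds of state can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: `StateScopeSketch`, `ConsumerInstance`, and `KeyedCounter` are hypothetical names, and nothing here is Flink or Kafka API. One map belongs to a whole parallel instance (Operator State), while the other holds a separate value per key (Keyed State).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plain-Java model; class and method names are illustrative only.
class StateScopeSketch {

    // Operator State: one map per parallel instance, topic partition -> last seen offset.
    static final class ConsumerInstance {
        final Map<Integer, Long> partitionOffsets = new HashMap<>();

        void record(int partition, long offset) {
            partitionOffsets.put(partition, offset);
        }
    }

    // Keyed State: one independent value per key.
    static final class KeyedCounter {
        private final Map<String, Long> countPerKey = new HashMap<>();

        long increment(String key) {
            return countPerKey.merge(key, 1L, Long::sum);
        }
    }
}
```

In this model, two `ConsumerInstance` objects never see each other's partitions, just as each parallel consumer instance holds only its own map, whereas `KeyedCounter` keeps the count for key "a" separate from the count for key "b".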

How to use managed Keyed State

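import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CountWindowAverage extends RichFlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {

    // The ValueState handle: the first field is the count, the second field is the running sum
    private transient ValueState<Tuple2<Long, Long>> sum;

    @Override
    public void flatMap(Tuple2<Long, Long> input, Collector<Tuple2<Long, Long>> out) throws Exception {

        // Access the state value
        Tuple2<Long, Long> currentSum = sum.value();

        // Update the count
        currentSum.f0 += 1;

        // Update the sum
        currentSum.f1 += input.f1;

        // Update the state
        sum.update(currentSum);

        // If the count reaches 2, emit the average and clear the state
        if (currentSum.f0 >= 2) {
            out.collect(new Tuple2<>(input.f0, currentSum.f1 / currentSum.f0));
            sum.clear();
        }
    }

    @Override
    public void open(Configuration config) {
        ValueStateDescriptor<Tuple2<Long, Long>> descriptor = new ValueStateDescriptor<>(
                "average", // state name
                TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}), // type information
                Tuple2.of(0L, 0L)); // default value of the state

        // Obtain the state handle
        sum = getRuntimeContext().getState(descriptor);
    }
}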

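// Usage in a streaming program, assuming a StreamExecutionEnvironment named env
env.fromElements(Tuple2.of(1L, 3L), Tuple2.of(1L, 5L), Tuple2.of(1L, 7L), Tuple2.of(1L, 4L), Tuple2.of(1L, 2L))
        .keyBy(0)
        .flatMap(new CountWindowAverage())
        .print();

// The printed output will be (1,4) and (1,5)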
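
To see why the result is (1,4) and (1,5), the same logic can be traced in plain Java without a Flink runtime. This is a hypothetical stand-in, not the Flink job itself: the class `CountWindowAverageSketch` and its `run` method are names made up for illustration, and a `HashMap` plays the role of the keyed `ValueState`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Flink job: a HashMap plays the role of keyed ValueState.
class CountWindowAverageSketch {

    /** Each input row is {key, value}; each output row is {key, average}. */
    static List<long[]> run(long[][] input) {
        Map<Long, long[]> state = new HashMap<>(); // key -> {count, running sum}
        List<long[]> out = new ArrayList<>();

        for (long[] record : input) {
            long key = record[0];
            // Missing state starts at {0, 0}, like the descriptor's default value.
            long[] current = state.computeIfAbsent(key, k -> new long[] {0L, 0L});
            current[0] += 1;         // update the count
            current[1] += record[1]; // update the sum

            if (current[0] >= 2) {
                // Integer division, as with the Long fields in the Flink version.
                out.add(new long[] {key, current[1] / current[0]});
                state.remove(key);   // corresponds to sum.clear()
            }
        }
        return out;
    }
}
```

With the five inputs above, the first two values give (3 + 5) / 2 = 4, the next two give (7 + 4) / 2 = 5 under integer division, and the last value, 2, stays in state with a count of 1 and is never emitted.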

Origin blog.csdn.net/qq_30782921/article/details/103568943