[flink]#13_Data Set

DataSource

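  1. Collection-based: creates a data set from an in-memory collection
    fromCollection(Collection)

  2. File-based: reads a text file line by line, one String element per line
    readTextFile(path)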

Transformation

  • Map

  • FlatMap

  • MapPartition: processes one whole partition of data per function call

  • Filter

  • Reduce

  • Aggregations

  • Distinct: returns the data set with duplicate elements removed

  • Join

  • OuterJoin

  • Cross

  • Union

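  • First-n: returns the first n (arbitrary) elements of the data set

  • Sort Partition: locally sorts every partition on a specified field and order

  • Rebalance: evenly redistributes the data across partitions to eliminate data skew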

  • Hash-Partition: partitions the data set by the hash value of the specified key
    partitionByHash()

  • Range-Partition: partitions the data set into ranges of the specified key
    partitionByRange()

  • Custom Partition
    partitionCustom(partitioner, "someKey")
    partitionCustom(partitioner, 0)
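
To make the partitioning bullets more concrete, here is a minimal plain-Java sketch of what hash partitioning followed by a per-partition function does. It does not use the Flink API. The class `PartitionSketch`, its method names, and the `hashCode()` modulo assignment are assumptions for illustration only; Flink's internal hash function may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of two DataSet ideas; this is not Flink code.
final class PartitionSketch {

    // Models Hash-Partition: each element goes to the partition chosen by its hash.
    // Assumption: hashCode() modulo the partition count; Flink's hashing may differ.
    static <T> List<List<T>> partitionByHash(List<T> data, int numPartitions) {
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (T element : data) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int index = Math.floorMod(element.hashCode(), numPartitions);
            partitions.get(index).add(element);
        }
        return partitions;
    }

    // Models MapPartition: the logic runs once per partition (not once per element),
    // here emitting a single element count for each partition.
    static <T> List<Integer> countPerPartition(List<List<T>> partitions) {
        List<Integer> counts = new ArrayList<>();
        for (List<T> partition : partitions) {
            counts.add(partition.size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = partitionByHash(Arrays.asList(1, 2, 3, 4, 5, 6), 2);
        System.out.println(parts);
        System.out.println(countPerPartition(parts));
    }
}
```

In a real job the same two steps would be expressed with the partitionByHash() and MapPartition operators from the list above; the sketch only shows how elements end up grouped and why a per-partition function is called far fewer times than a per-element Map.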

Sink

  • writeAsText()
  • writeAsCsv()
  • print()
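
As a rough illustration of what the text and CSV sinks produce, the plain-Java sketch below formats records the way those sinks are generally understood to: writeAsText() emits one line per element using its toString(), and writeAsCsv() emits one line per tuple with the fields joined by a delimiter. The class `SinkSketch` and its methods are hypothetical helpers, not Flink API, and the comma delimiter is passed in explicitly rather than assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the sink output formats; this is not Flink code.
final class SinkSketch {

    // Models writeAsText(): one output line per element, taken from toString().
    static <T> List<String> asTextLines(List<T> data) {
        List<String> lines = new ArrayList<>();
        for (T element : data) {
            lines.add(String.valueOf(element));
        }
        return lines;
    }

    // Models writeAsCsv(): the fields of one record joined by the field delimiter.
    static String asCsvLine(Object[] fields, String fieldDelimiter) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(fieldDelimiter);
            }
            line.append(String.valueOf(fields[i]));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(asTextLines(List.of(1, 2, 3)));
        System.out.println(asCsvLine(new Object[] {"flink", 13}, ","));
    }
}
```

print() needs no model of its own: it writes each element's string form to standard output, which is what the main method above does with the formatted lines.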

Origin blog.csdn.net/qq_30782921/article/details/102839066