Sudharsan :
In the code below, after GroupByKey I get a PCollection<KV<String, Iterable<String>>>. How do I flatten the Iterable in the value before sending it to FileIO?
.apply(GroupByKey.<String, String>create())
.apply("Write file to output", FileIO.<String, KV<String, String>>writeDynamic()
    .by(KV::getKey)
    .withDestinationCoder(StringUtf8Coder.of())
    .via(Contextful.fn(KV::getValue), TextIO.sink())
    .to("Out")
    .withNaming(key -> FileIO.Write.defaultNaming("file-" + key, ".txt")));
Thanks for the kind help.
Jayadeep Jayaraman :
You need to use a ParDo to flatten the Iterable portion of the PCollection, as shown below:
PCollection<KV<String, Doc>> urlDocPairs = ...;
PCollection<KV<String, Iterable<Doc>>> urlToDocs =
    urlDocPairs.apply(GroupByKey.<String, Doc>create());
// Emit one KV per grouped value so the Iterable is flattened again.
PCollection<KV<String, Doc>> results =
    urlToDocs.apply(ParDo.of(new DoFn<KV<String, Iterable<Doc>>, KV<String, Doc>>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        String url = c.element().getKey();
        for (Doc doc : c.element().getValue()) {
          c.output(KV.of(url, doc));
        }
      }}));
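To make the reshaping concrete, here is a minimal plain-Java sketch of what that loop does to the data. It deliberately uses no Beam classes: a Map stands in for the grouped PCollection and Map.Entry stands in for KV. The class name FlattenGrouped, the method flatten, and the sample data are hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the flattening step; no Beam classes are used here.
final class FlattenGrouped {

    // Turns (key, [v1, v2, ...]) groups into one (key, v) pair per value,
    // mirroring a DoFn that calls c.output(KV.of(key, v)) inside a loop.
    static List<Map.Entry<String, String>> flatten(
            Map<String, ? extends Iterable<String>> grouped) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (Map.Entry<String, ? extends Iterable<String>> group : grouped.entrySet()) {
            for (String value : group.getValue()) {
                pairs.add(Map.entry(group.getKey(), value));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the GroupByKey output.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        grouped.put("a", List.of("x", "y"));
        grouped.put("b", List.of("z"));
        for (Map.Entry<String, String> pair : flatten(grouped)) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

Each group of n values becomes n key/value pairs, which is the element shape that FileIO.<String, KV<String, String>>writeDynamic() in the question expects.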