How to read output file for collecting stats (post) processing

Thomas Humphries :

Summary

I need to build a set of statistics during a Camel server in-modify-out process, and emit those statistics as one object (a single json log line). Those statistics need to include:

  • input file metrics (size/chars/bytes and other, file-section specific measures)
  • processing time statistics (start/end/duration of processing time, start/end/duration of metrics gathering time)
  • output file metrics (same as input file metrics, and will be different numbers, output file being changed)

The output file metrics are the problem: I can't access the file until it's written to disk, and it's not written to disk until 'process'ing finishes.

Background

A log4j implementation is being used for service logging, but after some tinkering we realised it really doesn't suit the requirement here as it would output multi-line json and embed the json into a single top-level field. We need varying top level fields, depending on the file processed.

The server is expected to deal with multiple file operations asynchronously, and the files vary in size (from tiny to fairly immense, which is one reason we need to iterate on stats and measures before we start to tune or review).

Current State

Input file stats and even processing time stats are working OK, and I'm using the following technique to get them:

Inside the 'process' override method of "MyProcessor" I create a new instance of my JsonLogWriter class. (shortened pseudo code with ellipsis)

import org.apache.camel.Exchange;
import org.apache.camel.Processor;
...
@Component
public class MyProcessor implements Processor {
...
    @Override
    public void process(Exchange exchange) throws Exception {
        ...
        JsonLogWriter jlw = new JsonLogWriter();
        jlw.logfilePath = jsonLogFilePath;
        jlw.inputFilePath = inFilePath;
        jlw.outputfilePath = outFilePath;
        ...
        jlw.metricsInputFile();   // gathers metrics using inputFilePath - OK
        ...

        // the input file is processed / changed and returned as an InputStream:
        InputStream result = myEngine.readAndUpdate(inFilePath);
        ... get timings
        jlw.write();
    }

From this you can see that JsonLogWriter has

  • properties for file paths (input file, output file, log output)
  • a set of methods to populate data
  • a method to emit the data to a file (once ready)

Once I have populated all the json objects in the class, I call the write() method and the class pulls all the json objects together and the stats all arrive in a log file (in a single line of json) - OK.
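To make that shape concrete, here is a minimal sketch of a writer that collects varying top-level fields and emits them as one line of JSON. It is not the actual JsonLogWriter: `StatsLine`, `put` and `toJsonLine` are hypothetical names, and the JSON is assembled by hand only to keep the example free of dependencies (a JSON library would normally do this job).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for JsonLogWriter: gathers fields, emits a single JSON line. */
final class StatsLine {
    // LinkedHashMap keeps the fields in insertion order in the output line.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    /** Adds (or overwrites) one top-level field; the set of fields can vary per file. */
    StatsLine put(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    /** Renders all fields as one JSON object with no line breaks. */
    String toJsonLine() {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append(quote(e.getKey())).append(':').append(render(e.getValue()));
        }
        return sb.append('}').toString();
    }

    // Numbers and booleans are written bare, nested StatsLine objects recursively,
    // everything else as a quoted string. NaN and infinite doubles are not handled.
    private static String render(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        if (v instanceof StatsLine) return ((StatsLine) v).toJsonLine();
        return quote(v.toString());
    }

    // Escapes quotes, backslashes and control characters so the output stays on one line.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

Usage would be along the lines of `new StatsLine().put("inputBytes", 42L).put("timing", new StatsLine().put("durationMs", 7)).toJsonLine()`, with the resulting string appended to the log file as one line.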

Error - no output file (yet)

If I use the metricsOutputFile method however:

        InputStream result = myEngine.readAndUpdate(inFilePath);
        ... get timings

        jlw.metricsOutputFile(); // using outputfilePath

        jlw.write();
    }

... the JsonLogWriter fails as the file doesn't exist yet.

java.nio.file.NoSuchFileException: aroute\output\a_long_guid_filename

When debugging, I can't see any part of the exchange or result objects that I might pipe into a file read/statistics gathering process.

Will this require more Camel routes to solve? What might be an alternative approach where I can get all the stats from input and output files and keep them in one object / line of JSON?

(very happy to receive constructive criticism - as in why is your Java so heavy-handed - and yes it may well be, I am prototyping solutions at this stage, so this isn't production code, nor do I profess deep understanding of Java internals - I can usually get stuff working though)

Raúl Cancino :

Use one route and two processors: one for writing the file and the next for reading the file, so that one finishes writing before the other starts reading.

Alternatively, you can use two routes: one that writes the file (to:file) and another that listens for the file and reads it (from:file).

You can check the common EIPs (Enterprise Integration Patterns), which will solve most of these questions, here: https://www.enterpriseintegrationpatterns.com/patterns/messaging/
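The ordering behind the first suggestion can be sketched in plain Java, without Camel: step one drains the processed stream to disk and only returns when the file is complete, and step two collects the metrics afterwards, so the NoSuchFileException cannot occur. The names `WriteThenMeasure`, `writeOutput`, `measure` and `Metrics` are hypothetical, and UTF-8 text is assumed for the character count. In a Camel route the second step would presumably be a second processor placed after the file endpoint; Camel's file producer should record the path it wrote in the `CamelFileNameProduced` header, but check that against the Camel version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class WriteThenMeasure {

    /** Hypothetical holder for two of the output file metrics. */
    static final class Metrics {
        final long bytes;
        final long chars;
        Metrics(long bytes, long chars) { this.bytes = bytes; this.chars = chars; }
    }

    /** Step 1: write the processed stream to disk; returns once the file is fully written. */
    static Path writeOutput(InputStream processed, Path outFile) throws IOException {
        try (InputStream in = processed) {
            Files.copy(in, outFile, StandardCopyOption.REPLACE_EXISTING);
        }
        return outFile;
    }

    /** Step 2: runs strictly after step 1, so the output file is guaranteed to exist. */
    static Metrics measure(Path file) throws IOException {
        long bytes = Files.size(file);
        long chars = 0;
        // Streams the file in chunks so large files are not loaded into memory at once.
        // Assumes UTF-8; counts UTF-16 code units, as String.length() would.
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                chars += n;
            }
        }
        return new Metrics(bytes, chars);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("output-", ".txt");
        // Stand-in for the InputStream returned by the processing engine.
        InputStream processed =
                new ByteArrayInputStream("h\u00e9llo\n".getBytes(StandardCharsets.UTF_8));
        Metrics m = measure(writeOutput(processed, out));
        System.out.println("bytes=" + m.bytes + " chars=" + m.chars);
        Files.delete(out);
    }
}
```

Because both steps see the same stats object, the input metrics, timings and output metrics can still end up in one JSON line; the only change is that the final write happens after the file has been written rather than inside the processor that produces the stream.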
