Flume: modifying the source code to change the prefix and suffix used when renaming source files

Business scenario:

Requirement: Flume collects CSV files that are continuously produced on a local Windows server and writes them to HDFS.

Problem: The locally generated files can have repeated names. For example, a file named aaa.csv is generated in one second; once Flume has processed it, Flume renames it, by default to aaa.csv.COMPLETED. If another file named aaa.csv is generated in the next second, then when Flume has processed that file and tries to rename it, the rename fails with an error.
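
To make the failure concrete, here is a minimal sketch of the kind of rename step that breaks. It is not Flume's actual code; the class name `RenameCollisionSketch` and the method `markCompleted` are hypothetical, and the only thing taken from the scenario above is the aaa.csv to aaa.csv.COMPLETED rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of a "rename when done" step; not Flume source code. */
final class RenameCollisionSketch {

    /** Renames aaa.csv to aaa.csv.COMPLETED and refuses to overwrite an earlier file. */
    static Path markCompleted(Path source) throws IOException {
        Path dest = source.resolveSibling(source.getFileName() + ".COMPLETED");
        if (Files.exists(dest)) {
            // A second aaa.csv ends up here, because the first one already took this name.
            throw new IllegalStateException("Destination already exists: " + dest);
        }
        return Files.move(source, dest);
    }
}
```

The first aaa.csv is renamed without trouble; a second aaa.csv in the same directory hits the exception, which matches the situation described above.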

Solution: To keep duplicate file names from hanging the Flume program, there are two options:

1: Add a unique identifier to the file name when aaa.csv is generated

2: Have Flume add a unique identifier when it renames the file after collecting it

This post covers the second option, done by modifying the source code.
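
For comparison, the first option could look roughly like the sketch below if the program producing the CSV files happens to be written in Java. Everything here is an assumption for illustration: the class `UniqueCsvName`, the method `uniqueName`, and the timestamp-plus-random naming scheme are hypothetical and not part of the original setup.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

/** Hypothetical producer-side helper that makes each generated file name unique. */
final class UniqueCsvName {

    // Millisecond-resolution timestamp; the pattern is an illustrative choice.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    /** Turns "aaa.csv" into a name of the form aaa_<timestamp>_<8 hex chars>.csv. */
    static String uniqueName(String baseName, LocalDateTime now) {
        int dot = baseName.lastIndexOf('.');
        String stem = dot < 0 ? baseName : baseName.substring(0, dot);
        String ext = dot < 0 ? "" : baseName.substring(dot);
        // A short random part guards against two files created in the same millisecond.
        String random = UUID.randomUUID().toString().substring(0, 8);
        return stem + "_" + STAMP.format(now) + "_" + random + ext;
    }
}
```

Calling `UniqueCsvName.uniqueName("aaa.csv", LocalDateTime.now())` before writing each file would keep the names from repeating, so Flume's default rename would no longer collide.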

First, download the Flume source code, import it, and compile it

The version I am using here is Flume 1.9.

1. Download

    Flume binary package: package download

    Flume source: source download

2. Import into IDEA

After extracting the source package, the directory listing looks like this:

image

IDEA import project

image

image

Keep clicking Next:

image

After a successful import, the project structure looks like this:

image

We can see that the project is made up of individual modules.

3. Compile

Run the command: mvn clean install -Dmaven.test.skip=true

image

BUILD SUCCESS means the compilation succeeded, and we can move on to the next step.

Second, modify the source code

We need to find the class in Flume that moves the source file after its data has been collected. Classes like this generally live in the flume-ng-core module; to pin down exactly which class it is, we can look at Flume's collection log:

image

From this log we can see that Flume moves the source file after collecting it, and the rename happens at that point. So we can locate the class in the source code with a global search for the log message:

image

The red box marks the place where Flume renames the source file after it has finished collecting the data. The default suffix is ".COMPLETED". To make each renamed file unique, I add a timestamp here:

image
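
Since the screenshot is not reproduced here, the following is only a sketch of what such a change can look like, not the exact edit made in the Flume class. The class `CompletedNameSketch`, the method `completedFile`, and the timestamp pattern are all hypothetical; only the ".COMPLETED" suffix and the idea of adding a timestamp come from the text above.

```java
import java.io.File;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that builds a unique destination name for a collected file. */
final class CompletedNameSketch {

    // Flume's default suffix for files that have been fully collected.
    static final String COMPLETED_SUFFIX = ".COMPLETED";

    // Millisecond-resolution timestamp; the pattern is an illustrative choice.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    /** Maps aaa.csv to a name like aaa.csv.20200123101530123.COMPLETED. */
    static File completedFile(File source, LocalDateTime now) {
        // The timestamp goes before the suffix, so the name still ends with .COMPLETED.
        return new File(source.getPath() + "." + STAMP.format(now) + COMPLETED_SUFFIX);
    }
}
```

In this sketch the timestamp is placed in front of the suffix on purpose. As far as I know, the spooling directory source decides which files to skip by checking whether the name ends with the completed suffix, so a timestamp appended after ".COMPLETED" could cause finished files to be picked up again. Treat that as something to verify against the source you compiled.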

After making the change, recompile the class, find the corresponding class files under the target directory, and copy them into the flume-ng-core jar in the lib directory of the production Flume installation, replacing the originals. Of course, you could also recompile and repackage the entire project, but that is somewhat more trouble.

image

Open the jar above with 360 Zip or another archive tool and overwrite the old class files with the newly compiled ones.
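
If you would rather not do this by hand in an archive tool, the same replacement can be scripted. The sketch below uses the JDK's built-in zip file system to overwrite one entry in a jar; the class `JarPatchSketch` and the method `replaceEntry` are hypothetical names, and the jar path and entry name you pass in depend on your own installation.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper that overwrites one class file inside an existing jar. */
final class JarPatchSketch {

    /**
     * Replaces the entry named entryName (for example "some/package/SomeClass.class")
     * in the given jar with the contents of classFile.
     */
    static void replaceEntry(Path jar, Path classFile, String entryName) throws IOException {
        // The cast selects the (Path, ClassLoader) overload on newer JDKs.
        try (FileSystem zip = FileSystems.newFileSystem(jar, (ClassLoader) null)) {
            Path target = zip.getPath(entryName);
            if (!Files.exists(target)) {
                // Guards against a mistyped entry name silently adding a new file.
                throw new IllegalArgumentException("No such entry in jar: " + entryName);
            }
            Files.copy(classFile, target, StandardCopyOption.REPLACE_EXISTING);
        } // Closing the zip file system writes the change back to the jar.
    }
}
```

Stop Flume before patching the jar, and back up the original first. If the modified class has inner or anonymous classes, the compiler produces extra class files with `$` in their names, and those need to be replaced in the same way.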

Then restart Flume. You can see that the renamed files now have a suffix like this:

image

The change works as intended.


Origin www.cnblogs.com/Gxiaobai/p/12230022.html