Pitfalls encountered while developing an ORC file Sink

Recently I developed an HDFS-based sink that writes ORC files to HDFS. I ran into a few pitfalls along the way; below I record the design ideas and the issues encountered, one by one.

1. Development approach
The first implementation plan was: take data from the channel -> classify the data in the sink -> write the classified data to the corresponding ORC file -> close the file. The key technical points are as follows:

1.1 Managing HDFS operations with thread pools
For the step that writes classified data to ORC files, I created two thread pools. One manages every kind of HDFS file operation: creating, writing, and closing ORC files on HDFS. The other manages file rolling: at certain moments, for example when the number of records written to a file reaches an upper limit or the file has been idle for a certain time, it closes the current file and creates a new one.

Here is the code logic of the thread pool that executes HDFS operations:

// Create the thread pool that executes HDFS operation tasks
callTimeoutPool = Executors.newFixedThreadPool(threadsPoolSize,
        new ThreadFactoryBuilder().setNameFormat(timeoutName).build());
 
// Run a task on callTimeoutPool; the task is wrapped in a Callable and submit() returns a Future
private <T> T callWithTimeout(final CallRunner<T> callRunner)
      throws IOException, InterruptedException {
    Future<T> future = callTimeoutPool.submit(new Callable<T>() {
      ...
    });
    ...
}
 
// Example: writing one message to the ORC file
callWithTimeout(new CallRunner<Void>() {
        @Override
        public Void call() throws Exception {
          writer.append(message);   // HDFS operation: write data
          return null;
        }
      });
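
The second pool mentioned above handles file rolling. Its code is not shown in this post, so the following is only a rough sketch of the idea; rollTimerPool, lastWriteTime, idleTimeoutMillis, eventCounter, maxEventsPerFile, checkIntervalMillis and closeCurrentFileAndRollNew() are hypothetical names, not the original sink's fields.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.ThreadFactoryBuilder;
 
// Hypothetical rolling pool: periodically check idle time and record count, then roll the file
ScheduledExecutorService rollTimerPool = Executors.newScheduledThreadPool(1,
        new ThreadFactoryBuilder().setNameFormat("orc-roll-timer-%d").build());
 
rollTimerPool.scheduleWithFixedDelay(new Runnable() {
    @Override
    public void run() {
        long idleMillis = System.currentTimeMillis() - lastWriteTime;   // lastWriteTime: hypothetical field
        if (idleMillis >= idleTimeoutMillis || eventCounter >= maxEventsPerFile) {
            closeCurrentFileAndRollNew();   // hypothetical helper: close the current ORC file and open a new one
        }
    }
}, checkIntervalMillis, checkIntervalMillis, TimeUnit.MILLISECONDS);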
1.2 Implementing the ORC file writer
Now that the data has been divided into categories, how do we convert it into ORC format and write it to HDFS? First, when creating the ORC file you need a TypeDescription that describes the file's fields; then you create a VectorizedRowBatch and write the records into the ORC file batch by batch. This step is fairly simple and there are plenty of examples online, so I won't describe it in detail.

import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;
...
TypeDescription schema = TypeDescription.createStruct();
... // add the schema's fields
VectorizedRowBatch batch = schema.createRowBatch(maxFlushSize);
OrcFile.WriterOptions writerOptions = OrcFile.writerOptions(conf).setSchema(schema);
Writer writer = OrcFile.createWriter(destPath, writerOptions);
... // add messages into the batch
writer.addRowBatch(batch);
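
To make the elided steps concrete, here is a minimal, self-contained sketch that defines a schema, puts one row into the batch, and hands it to the writer. The two-column schema (ts, msg), the output path and the batch size are assumptions for illustration only, not the original sink's settings.

import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;
 
public class OrcWriteDemo {
  public static void main(String[] args) throws Exception {
    // Assumed schema: struct<ts:bigint,msg:string>
    TypeDescription schema = TypeDescription.createStruct()
            .addField("ts", TypeDescription.createLong())
            .addField("msg", TypeDescription.createString());
 
    Configuration conf = new Configuration();
    Writer writer = OrcFile.createWriter(new Path("/tmp/demo.orc"),   // illustrative path
            OrcFile.writerOptions(conf).setSchema(schema));
 
    VectorizedRowBatch batch = schema.createRowBatch(1024);           // 1024 = assumed batch size
    LongColumnVector tsCol = (LongColumnVector) batch.cols[0];
    BytesColumnVector msgCol = (BytesColumnVector) batch.cols[1];
 
    int row = batch.size++;                                           // next free slot in the batch
    tsCol.vector[row] = System.currentTimeMillis();
    msgCol.setVal(row, "hello orc".getBytes(StandardCharsets.UTF_8));
 
    if (batch.size == batch.getMaxSize()) {                           // batch full: hand it to the writer
      writer.addRowBatch(batch);
      batch.reset();
    }
    writer.addRowBatch(batch);                                        // flush any remaining rows
    writer.close();
  }
}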
Second, the core of the ORC write path lives in the WriterImpl and MemoryManagerImpl classes. WriterImpl's addRowBatch method passes the number of records written on to MemoryManagerImpl, whose addedRow method accumulates that count. Once more than 5000 rows have accumulated, it calls back into WriterImpl's checkMemory, which checks how many bytes the buffered data occupies; only when that exceeds the ORC stripe size limit (64 MB by default) is the data actually flushed to HDFS.

//WriterImpl.java
 
public void addRowBatch(VectorizedRowBatch batch) throws IOException {
    ... // load the batch into the writer's internal buffers
    memoryManager.addedRow(batch.size);
}
 
public boolean checkMemory(double newScale) throws IOException {
    long limit = (long) Math.round(adjustedStripeSize * newScale);
    long size = treeWriter.estimateMemory();
    if (LOG.isDebugEnabled()) {
      LOG.debug("ORC writer " + physicalWriter + " size = " + size +
          " limit = " + limit);
    }
    if (size > limit) {
      flushStripe();  //flush data to hdfs
      return true;
    }
    return false;
  }
//MemoryManagerImpl.java
 
public void addedRow(int rows) throws IOException {
    rowsAddedSinceCheck += rows;
    if (rowsAddedSinceCheck >= ROWS_BETWEEN_CHECKS) {  // ROWS_BETWEEN_CHECKS = 5000
      notifyWriters();
    }
  }
 
public void notifyWriters() throws IOException {
    ...
    for (WriterInfo writer : writerList.values()) {
      boolean flushed = writer.callback.checkMemory(currentScale);  // callback into WriterImpl.checkMemory
      ...
    }
 }
1.3 The first pitfall
This is where I hit the first pitfall. Anyone familiar with Flume (or who has read an article or two about it) knows that transactions are the mechanism Flume relies on for reliability.

In the sink, once a batch of messages has been processed without incident, the transaction is committed. In plain terms, this step tells the channel that the sink has safely taken care of the data: no matter what problems the program runs into afterwards, the handling of this data will not be affected, so the channel can delete it from its cache.

//sink.java
public Status process() throws EventDeliveryException {
    Channel channel = getChannel();
    Transaction transaction = channel.getTransaction();
    transaction.begin();   // start the transaction
    for (txnEventCount = 0; txnEventCount < batchSize; txnEventCount++) {
       ... // handle messages
    }
    transaction.commit();  // commit the transaction
}
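
The excerpt above leaves out the failure path. For comparison, a typical Flume sink wraps the batch in try/catch/finally so that a failed batch is rolled back and stays in the channel; the sketch below follows that common pattern (a method of a class extending org.apache.flume.sink.AbstractSink) and is not copied from the original sink. batchSize is assumed to be a configured field.

import org.apache.flume.Channel;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
 
// Sketch: commit on success, roll back on failure so the events stay in the channel
public Status process() throws EventDeliveryException {
    Channel channel = getChannel();
    Transaction transaction = channel.getTransaction();
    transaction.begin();                      // start the transaction
    try {
        for (int txnEventCount = 0; txnEventCount < batchSize; txnEventCount++) {
            Event event = channel.take();     // pull one event from the channel
            if (event == null) {
                break;                        // channel is drained for now
            }
            // ... classify the event and append it to the matching ORC writer
        }
        transaction.commit();                 // tell the channel the batch is safe to forget
        return Status.READY;
    } catch (Throwable t) {
        transaction.rollback();               // keep the batch in the channel for a retry
        throw new EventDeliveryException("Failed to deliver batch", t);
    } finally {
        transaction.close();
    }
}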
However, when writing ORC files you will find that even after the transaction has been committed, the data may still be sitting in memory. As described in section 1.2, a writer only flushes to the ORC file once it has accumulated 5000 or more rows and the buffered bytes exceed one stripe. With the original logic, if Flume is forcibly killed (a normal shutdown does not lose data), any committed batches still held in memory are lost for good, which breaks Flume's reliability guarantee.

This problem bothered me for quite a while. The workaround I eventually used was to raise the maximum number of records per batch, remove the 5000-row check, and lift the stripe size limit, so that every batch can be completely flushed to the file when it is committed. But this is only a stopgap; is there a better way to do it?
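
For what it's worth, another direction would be to shrink the thresholds instead of bypassing them in the sink: OrcFile.WriterOptions exposes stripeSize() and bufferSize(), and also lets you plug in your own MemoryManager via memory(). The following is only a sketch under those assumptions, not something I have validated end to end; the schema string and sizes are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
 
// Sketch: tune the writer so stripes are flushed much earlier than the 64 MB default
Configuration conf = new Configuration();
TypeDescription schema = TypeDescription.fromString("struct<ts:bigint,msg:string>");
 
OrcFile.WriterOptions options = OrcFile.writerOptions(conf)
        .setSchema(schema)
        .stripeSize(4L * 1024 * 1024)   // smaller stripe => checkMemory flushes sooner
        .bufferSize(64 * 1024);         // smaller compression buffers
 
// If the ORC version in use exposes Writer.writeIntermediateFooter(), calling it after
// each committed batch is another way to push buffered rows out to HDFS.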
 

Source: blog.csdn.net/wjandy0211/article/details/93197040