Request Processing in Ozone SCM HA Mode

Preface


In the previous article, I explained the design principles of Ozone SCM HA. The community has not yet finished this feature, but most of the main functionality is in place. This article briefly walks through request processing in SCM HA mode, which differs slightly from the earlier Ozone OM HA implementation (for how OM HA works, see the article "Ozone OM service HA principle analysis"). With OM HA already available, adding SCM HA will greatly improve the overall stability of an Ozone system.

The difference between SCM HA and OM HA


Compared with the OM service, the SCM service provides the underlying block service and has more to manage and maintain, including Blocks, Containers, and Pipelines. OM focuses instead on key-value (KV) metadata.

In SCM HA, the metadata to be updated covers the three types of data mentioned above. Ozone already uses Apache Ratis for data consistency, so SCM uses the same mechanism to synchronize data between the Leader SCM and each Follower SCM.

SCM HA request processing based on InvocationHandler


This section briefly describes how SCM HA processes requests.

SCM HA implements a custom InvocationHandler class that adds a processing step before each method invocation. The general flow is as follows:

1) SCM receives a method call, and the InvocationHandler checks whether the method is marked with the @Replicate annotation.
2) If the method is annotated with @Replicate, the request is submitted through Ratis; otherwise it is only applied locally.
3) For a request that goes through Ratis, the SCM metadata update is first written into an SCM Transaction Buffer rather than directly to disk. The buffer is flushed only when SCM takes a snapshot. This speeds up SCM metadata updates and is similar to OM's table + double buffer design.
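Steps 1 and 2 can be illustrated with a small, self-contained sketch built on the JDK's dynamic proxy support. Everything in it (the nested Replicate annotation, PipelineOps, LocalPipelineOps, Handler, and the route log) is a hypothetical stand-in rather than Ozone code, and the "replicated" branch only records the decision instead of talking to Ratis.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ReplicateDispatchSketch {

  /** Hypothetical stand-in for the @Replicate marker. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Replicate { }

  /** Hypothetical manager interface: one replicated write, one local read. */
  interface PipelineOps {
    @Replicate
    void addPipeline(String id);

    int pipelineCount();
  }

  /** Plain in-memory implementation that the proxy wraps. */
  static class LocalPipelineOps implements PipelineOps {
    private final List<String> pipelines = new ArrayList<>();

    @Override
    public void addPipeline(String id) {
      pipelines.add(id);
    }

    @Override
    public int pipelineCount() {
      return pipelines.size();
    }
  }

  /** Chooses a route per call by inspecting the interface method. */
  static class Handler implements InvocationHandler {
    private final Object local;
    private final List<String> route;

    Handler(Object local, List<String> route) {
      this.local = local;
      this.route = route;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args)
        throws Throwable {
      try {
        if (method.isAnnotationPresent(Replicate.class)) {
          // A real handler would submit the call to the consensus layer
          // here; this sketch only records which route was chosen.
          route.add("replicated:" + method.getName());
        } else {
          route.add("local:" + method.getName());
        }
        return method.invoke(local, args);
      } catch (InvocationTargetException e) {
        // Surface the target's own exception to the caller.
        throw e.getCause();
      }
    }
  }

  static PipelineOps newProxy(List<String> route) {
    return (PipelineOps) Proxy.newProxyInstance(
        PipelineOps.class.getClassLoader(),
        new Class<?>[] {PipelineOps.class},
        new Handler(new LocalPipelineOps(), route));
  }

  public static void main(String[] args) {
    List<String> route = new ArrayList<>();
    PipelineOps ops = newProxy(route);
    ops.addPipeline("p1");
    int count = ops.pipelineCount();
    System.out.println(count + " " + route);
  }
}
```

Callers only ever see the PipelineOps interface, so the choice between the replicated path and the local path stays invisible to them.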

The relevant code is shown below. First, here is how the Replicate annotation is applied, taking pipeline updates as an example.

/**
 * Manages the state of pipelines in SCM.
 */
public interface PipelineStateManagerV2 {

  /**
   * Adding pipeline would be replicated to Ratis.
   * @param pipelineProto
   * @throws IOException
   */
  @Replicate
  void addPipeline(HddsProtos.Pipeline pipelineProto) throws IOException;

  /**
   * Removing pipeline would be replicated to Ratis.
   * @param pipelineIDProto
   * @return Pipeline removed
   * @throws IOException
   */
  @Replicate
  void removePipeline(HddsProtos.PipelineID pipelineIDProto)
      throws IOException;

  /**
   * Updating pipeline state would be replicated to Ratis.
   * @param pipelineIDProto
   * @param newState
   * @throws IOException
   */
  @Replicate
  void updatePipelineState(HddsProtos.PipelineID pipelineIDProto,
                           HddsProtos.PipelineState newState)
      throws IOException;
 ...
}

Replicate means that these methods must also be executed on the Follower SCMs, so that SCM metadata stays consistent across all SCM instances.
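The annotation's own declaration is not shown in this article. For a reflective check such as Method#isAnnotationPresent to see a marker annotation, Java requires it to have RUNTIME retention. The following is a minimal sketch under that assumption; the nested Replicate, BuildTimeOnly, and StateManager types are hypothetical and only illustrate the retention rule.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

class ReplicateAnnotationSketch {

  /** Hypothetical marker: RUNTIME retention keeps it visible to reflection. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Replicate { }

  /** Contrast case: CLASS retention is not visible through reflection. */
  @Retention(RetentionPolicy.CLASS)
  @Target(ElementType.METHOD)
  @interface BuildTimeOnly { }

  /** Hypothetical manager interface with two replicated methods. */
  interface StateManager {
    @Replicate
    void addPipeline(String pipeline);

    @Replicate
    void removePipeline(String pipelineId);

    @BuildTimeOnly
    String getPipeline(String pipelineId);
  }

  /** Returns the names of all methods marked with @Replicate, sorted. */
  static Set<String> replicatedMethods(Class<?> iface) {
    Set<String> names = new TreeSet<>();
    for (Method m : iface.getMethods()) {
      if (m.isAnnotationPresent(Replicate.class)) {
        names.add(m.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(replicatedMethods(StateManager.class));
    Method read = StateManager.class.getMethod("getPipeline", String.class);
    // The CLASS-retention marker cannot be observed at run time.
    System.out.println(read.isAnnotationPresent(BuildTimeOnly.class));
  }
}
```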

Next, a dynamic proxy is set up for the PipelineStateManagerV2 interface.

    public PipelineStateManagerV2 build() throws IOException {
      Preconditions.checkNotNull(pipelineStore);

      final PipelineStateManagerV2 pipelineStateManager =
          new PipelineStateManagerV2Impl(pipelineStore, nodeManager);

      final SCMHAInvocationHandler invocationHandler =
          new SCMHAInvocationHandler(SCMRatisProtocol.RequestType.PIPELINE,
              pipelineStateManager, scmRatisServer);

      return (PipelineStateManagerV2) Proxy.newProxyInstance(
          SCMHAInvocationHandler.class.getClassLoader(),
          new Class<?>[]{PipelineStateManagerV2.class}, invocationHandler);
    }

SCMHAInvocationHandler contains the check for the Replicate annotation:

/**
 * InvocationHandler which checks for {@link Replicate} annotation and
 * dispatches the request to Ratis Server.
 */
public class SCMHAInvocationHandler implements InvocationHandler {

  private static final Logger LOG = LoggerFactory
      .getLogger(SCMHAInvocationHandler.class);

  private final RequestType requestType;
  private final Object localHandler;
  private final SCMRatisServer ratisHandler;

  /**
   * TODO.
   */
  public SCMHAInvocationHandler(final RequestType requestType,
                                final Object localHandler,
                                final SCMRatisServer ratisHandler) {
    this.requestType = requestType;
    this.localHandler = localHandler;
    this.ratisHandler = ratisHandler;
  }

  @Override
  public Object invoke(final Object proxy, final Method method,
                       final Object[] args) throws Throwable {
    try {
      long startTime = Time.monotonicNow();
      // Check whether the method carries the @Replicate annotation to decide
      // whether this request is applied through Ratis or only locally.
      final Object result = method.isAnnotationPresent(Replicate.class) ?
          invokeRatis(method, args) : invokeLocal(method, args);
      LOG.debug("Call: {} took {} ms", method, Time.monotonicNow() - startTime);
      return result;
    } catch(InvocationTargetException iEx) {
      throw iEx.getCause();
    }
  }

  /**
   * TODO.
   */
  private Object invokeLocal(Method method, Object[] args)
      throws InvocationTargetException, IllegalAccessException {
    LOG.trace("Invoking method {} on target {}", method, localHandler);
    // Invoke on the wrapped local handler, not on the Method object itself.
    return method.invoke(localHandler, args);
  }

  /**
   * TODO.
   */
  private Object invokeRatis(Method method, Object[] args)
      throws Exception {
    LOG.trace("Invoking method {} on target {}", method, ratisHandler);
    final SCMRatisResponse response = ratisHandler.submitRequest(
        SCMRatisRequest.of(requestType, method.getName(), args));
    if (response.isSuccess()) {
      return response.getResult();
    }
    // Should we unwrap and throw proper exception from here?
    throw response.getException();
  }
}

During an SCM metadata update, the data is buffered first and is not persisted to the store directly. The code is as follows:

  public void updatePipelineState(
      HddsProtos.PipelineID pipelineIDProto, HddsProtos.PipelineState newState)
      throws IOException {
    PipelineID pipelineID = PipelineID.getFromProtobuf(pipelineIDProto);
    Pipeline.PipelineState oldState =
        getPipeline(pipelineID).getPipelineState();
    lock.writeLock().lock();
    try {
      ...
      if (pipelineStore != null) {
        // Update the pipeline information in memory.
        pipelineStateMap.updatePipelineState(pipelineID,
            Pipeline.PipelineState.fromProtobuf(newState));
        // The update is then recorded in the BatchOperation held by the
        // SCM transactionBuffer instead of being written to the store.
        pipelineStore.putWithBatch(transactionBuffer.getCurrentBatchOperation(),
            pipelineID, getPipeline(pipelineID));
      }
    } catch (IOException ex) {
      ...;
    }
  }

BatchOperation maintains a list of operations that records the pending data updates:

/**
 * An utility class to store a batch of DB write operations.
 */
public class BatchOperation {

  /**
   * Enum for write operations.
   */
  public enum Operation {
    DELETE, PUT
  }

  private List<SingleOperation> operations =
      Lists.newArrayList();
...
}

Only when SCM, running in HA mode, takes its periodic snapshot are the operations held in BatchOperation committed to the underlying store.

The SCMStateMachine#takeSnapshot method:

  @Override
  public long takeSnapshot() throws IOException {
    long startTime = Time.monotonicNow();
    TermIndex lastTermIndex = getLastAppliedTermIndex();
    long lastAppliedIndex = lastTermIndex.getIndex();
    SCMTransactionInfo lastAppliedTrxInfo =
        SCMTransactionInfo.fromTermIndex(lastTermIndex);
    if (transactionBuffer.getLatestTrxInfo()
        .compareTo(lastAppliedTrxInfo) < 0) {
      transactionBuffer.updateLatestTrxInfo(
          SCMTransactionInfo.builder()
              .setCurrentTerm(lastTermIndex.getTerm())
              .setTransactionIndex(lastTermIndex.getIndex())
              .build());
      transactionBuffer.setLatestSnapshot(
          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
    } else {
      lastAppliedIndex =
          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
    }
    // Flush the data buffered in transactionBuffer.
    transactionBuffer.flush();
    LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms",
        lastAppliedIndex, Time.monotonicNow() - startTime);
    return lastAppliedIndex;
  }

The SCMDBTransactionBuffer#flush method:

  @Override
  public void flush() throws IOException {
    // write latest trx info into trx table in the same batch
    Table<String, SCMTransactionInfo> transactionInfoTable
        = metadataStore.getTransactionInfoTable();
    transactionInfoTable.putWithBatch(currentBatchOperation,
        TRANSACTION_INFO_KEY, latestTrxInfo);
    // Commit the update operations held in BatchOperation to the
    // underlying store.
    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
    currentBatchOperation.close();
    this.latestSnapshot = latestTrxInfo.toSnapshotInfo();
    // reset batch operation
    currentBatchOperation = metadataStore.getStore().initBatchOperation();

    DeletedBlockLog deletedBlockLog = scm.getScmBlockManager()
        .getDeletedBlockLog();
    Preconditions.checkArgument(
        deletedBlockLog instanceof DeletedBlockLogImplV2);
    ((DeletedBlockLogImplV2) deletedBlockLog).onFlush();
  }
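The buffer-then-flush pattern in the two methods above can be reduced to a small, self-contained sketch. It is only an illustration under stated assumptions: Store, Buffer, TRX_INFO_KEY, and the string-based key/value layout are hypothetical and stand in for the real store, the transaction buffer, and the transaction info table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TransactionBufferSketch {

  /** Hypothetical stand-in for the persistent store. */
  static class Store {
    final Map<String, String> table = new HashMap<>();
    int commits = 0;

    /** Applies every buffered put in a single commit. */
    void commitBatch(List<String[]> batch) {
      for (String[] op : batch) {
        table.put(op[0], op[1]);
      }
      commits++;
    }
  }

  /** Collects updates in memory until flush() is called. */
  static class Buffer {
    // Hypothetical key under which the last applied index is stored.
    static final String TRX_INFO_KEY = "#trxInfo";

    private final Store store;
    private List<String[]> currentBatch = new ArrayList<>();
    private long latestIndex = -1;

    Buffer(Store store) {
      this.store = store;
    }

    void put(String key, String value) {
      currentBatch.add(new String[] {key, value});
    }

    void updateLatestIndex(long index) {
      latestIndex = index;
    }

    void flush() {
      // Record the last applied index in the same batch as the data.
      put(TRX_INFO_KEY, Long.toString(latestIndex));
      store.commitBatch(currentBatch);
      // Start a fresh batch for the next round of updates.
      currentBatch = new ArrayList<>();
    }
  }

  /** Mirrors the snapshot step: remember the index, then flush. */
  static long takeSnapshot(Buffer buffer, long lastAppliedIndex) {
    buffer.updateLatestIndex(lastAppliedIndex);
    buffer.flush();
    return lastAppliedIndex;
  }

  public static void main(String[] args) {
    Store store = new Store();
    Buffer buffer = new Buffer(store);
    buffer.put("pipeline-1", "OPEN");
    buffer.put("pipeline-1", "CLOSED");
    System.out.println("before snapshot: " + store.table);
    takeSnapshot(buffer, 2L);
    System.out.println("after snapshot: " + store.table);
  }
}
```

Writing the index in the same batch as the data means the store never holds updates without the matching index, which is the same reason the real flush method puts the latest transaction info into the batch before committing it.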

SCM HA request processing diagram


The SCM HA request processing diagram is as follows:

[Figure: SCM HA request processing flow]

In the diagram above, ContainerStateManagerV2, PipelineManagerV2 and DeletedBlockLogStateManager handle updates to SCM Container, Pipeline and Block information respectively, so SCM requests are finally dispatched to these 3 classes.
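Since each replicated request carries a request type, a method name, and arguments (see SCMRatisRequest.of in the handler code), the final dispatch to a per-type manager could be sketched as below. This is an assumption about the general shape, not the actual SCM state machine code: Router, Request, and the two handler classes are hypothetical, and only the PIPELINE enum value appears in the code shown in this article.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class RequestRoutingSketch {

  /** PIPELINE appears in the article's code; the other names are assumed. */
  enum RequestType { PIPELINE, CONTAINER, BLOCK }

  /** Hypothetical request: target type, method name, and arguments. */
  static class Request {
    final RequestType type;
    final String methodName;
    final Object[] args;

    Request(RequestType type, String methodName, Object... args) {
      this.type = type;
      this.methodName = methodName;
      this.args = args;
    }
  }

  /** Hypothetical manager for pipeline updates. */
  static class PipelineHandler {
    final List<String> pipelines = new ArrayList<>();

    public void addPipeline(String id) {
      pipelines.add(id);
    }
  }

  /** Hypothetical manager for container updates. */
  static class ContainerHandler {
    final List<String> containers = new ArrayList<>();

    public void addContainer(String id) {
      containers.add(id);
    }
  }

  /** Looks up the handler by type, then the method by name and arity. */
  static class Router {
    private final Map<RequestType, Object> handlers =
        new EnumMap<>(RequestType.class);

    void register(RequestType type, Object handler) {
      handlers.put(type, handler);
    }

    Object apply(Request request) throws Exception {
      Object handler = handlers.get(request.type);
      if (handler == null) {
        throw new IllegalArgumentException("No handler for " + request.type);
      }
      for (Method m : handler.getClass().getMethods()) {
        if (m.getName().equals(request.methodName)
            && m.getParameterCount() == request.args.length) {
          return m.invoke(handler, request.args);
        }
      }
      throw new NoSuchMethodException(request.methodName);
    }
  }

  public static void main(String[] args) throws Exception {
    PipelineHandler pipelines = new PipelineHandler();
    ContainerHandler containers = new ContainerHandler();
    Router router = new Router();
    router.register(RequestType.PIPELINE, pipelines);
    router.register(RequestType.CONTAINER, containers);

    router.apply(new Request(RequestType.PIPELINE, "addPipeline", "p1"));
    router.apply(new Request(RequestType.CONTAINER, "addContainer", "c1"));
    System.out.println(pipelines.pipelines + " " + containers.containers);
  }
}
```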

Reference link


[1] https://blog.csdn.net/Androidlushangderen/article/details/105669525
[2] https://blog.csdn.net/Androidlushangderen/article/details/103997315

Origin blog.csdn.net/Androidlushangderen/article/details/113799763