[Framework Analysis] Hadoop System Analysis (6): SecondaryNameNode

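The SecondaryNameNode keeps a checkpoint copy of the NameNode's metadata: it periodically (every hour by default) merges the edit log and the fsimage into a new checkpoint. If the NameNode fails and cannot be restarted, the checkpoint files prepared by the SNN can be used for recovery by starting the NameNode with the -importCheckpoint option.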

When started without arguments, it runs the SecondaryNameNode service, which is the default mode. The entry point in org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode looks like this:

    StringUtils.startupShutdownMessage(SecondaryNameNode.class, argv, LOG);
    Configuration tconf = new Configuration();
    if (argv.length >= 1) {
      // With arguments: run the requested command once, then exit
      SecondaryNameNode secondary = new SecondaryNameNode(tconf);
      int ret = secondary.processArgs(argv);
      System.exit(ret);
    }

    // Create a never ending daemon
    // Without arguments, start a long-running thread that performs periodic checkpoints
    Daemon checkpointThread = new Daemon(new SecondaryNameNode(tconf));
    // Runs SecondaryNameNode.run in its own thread
    checkpointThread.start();

Constructing a SecondaryNameNode object calls SecondaryNameNode.initialize, which mainly sets up the following:

  1. Connect to the RPC server that the NameNode has already started (dfs.namenode.servicerpc-address) so the two can communicate
        nameNodeAddr = NameNode.getServiceAddress(conf, true);

        this.conf = conf;
        this.namenode = (NamenodeProtocol) RPC.waitForProxy(NamenodeProtocol.class,
                NamenodeProtocol.versionID, nameNodeAddr, conf);
  2. Initialize the checkpoint directories and the checkpoint frequency (fs.checkpoint.period and fs.checkpoint.size)
        // initialize checkpoint directories
        fsName = getInfoServer();
        // Read fs.checkpoint.dir as the checkpoint directory; defaults to /tmp/hadoop/dfs/namesecondary
        checkpointDirs = FSImage.getCheckpointDirs(conf, "/tmp/hadoop/dfs/namesecondary");
        // Read fs.checkpoint.edits.dir as the checkpoint edits directory; defaults to /tmp/hadoop/dfs/namesecondary
        checkpointEditsDirs = FSImage.getCheckpointEditsDirs(conf, "/tmp/hadoop/dfs/namesecondary");
        // Set up the checkpoint and checkpoint edits directories, creating them if they do not exist
        checkpointImage = new CheckpointStorage();
        checkpointImage.recoverCreate(checkpointDirs, checkpointEditsDirs);

        // Initialize other scheduling parameters from the configuration
        // Code defaults: a checkpoint interval of 3600 seconds (1 hour) and an edit log size threshold of 4 MB (4194304 bytes)
        checkpointPeriod = conf.getLong("fs.checkpoint.period", 3600);
        checkpointSize = conf.getLong("fs.checkpoint.size", 4194304);
  3. Start the HTTP server
        int tmpInfoPort = infoSocAddr.getPort();
        infoServer = new HttpServer("secondary", infoBindAddress, tmpInfoPort,
            tmpInfoPort == 0, conf,
            SecurityUtil.getAdminAcls(conf, DFSConfigKeys.DFS_ADMIN));

        if(UserGroupInformation.isSecurityEnabled()) {
          System.setProperty("https.cipherSuites",
              Krb5AndCertsSslSocketConnector.KRB5_CIPHER_SUITES.get(0));
          InetSocketAddress secInfoSocAddr =
            NetUtils.createSocketAddr(infoBindAddress + ":"+ conf.get(
              "dfs.secondary.https.port", infoBindAddress + ":" + 0));
          imagePort = secInfoSocAddr.getPort();
          infoServer.addSslListener(secInfoSocAddr, conf, false, true);
        }

        infoServer.setAttribute("name.system.image", checkpointImage);
        infoServer.setAttribute(JspHelper.CURRENT_CONF, conf);
        infoServer.addInternalServlet("getimage", "/getimage", GetImageServlet.class, true);
        infoServer.start();
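The "value or default" lookup used for the scheduling parameters in step 2 can be sketched in isolation. This is a minimal illustration that assumes java.util.Properties as a stand-in for Hadoop's Configuration; the class CheckpointSettings and its helper are hypothetical names, and only the two keys and their default values come from the code above.

```java
import java.util.Properties;

// Hypothetical holder for the two scheduling settings; not a Hadoop class.
final class CheckpointSettings {
    static final long DEFAULT_PERIOD_SECONDS = 3600L;  // fallback for fs.checkpoint.period
    static final long DEFAULT_SIZE_BYTES = 4194304L;   // fallback for fs.checkpoint.size (4 MB)

    final long periodSeconds;
    final long sizeBytes;

    CheckpointSettings(long periodSeconds, long sizeBytes) {
        this.periodSeconds = periodSeconds;
        this.sizeBytes = sizeBytes;
    }

    // Read both keys, falling back to the defaults when a key is absent.
    static CheckpointSettings from(Properties props) {
        return new CheckpointSettings(
                getLong(props, "fs.checkpoint.period", DEFAULT_PERIOD_SECONDS),
                getLong(props, "fs.checkpoint.size", DEFAULT_SIZE_BYTES));
    }

    // Stand-in for a getLong-style lookup: parse the value if present, else use the default.
    private static long getLong(Properties props, String key, long defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("fs.checkpoint.period", "600"); // checkpoint every 10 minutes
        CheckpointSettings s = CheckpointSettings.from(props);
        System.out.println("period=" + s.periodSeconds + "s size=" + s.sizeBytes + "B");
    }
}
```

In this sketch only the period is overridden, so the size threshold falls back to the 4 MB code default.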

Once initialization completes, a dedicated thread runs SecondaryNameNode.run(), which calls SecondaryNameNode.doWork(); doWork() contains the checkpoint loop.

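By default doWork wakes up every 5 minutes to check whether a checkpoint is due. If the edit log has reached checkpointSize, or at least checkpointPeriod has passed since the last checkpoint, it calls SecondaryNameNode.doCheckpoint. The annotated body of doCheckpoint appears below.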
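The trigger condition just described can be sketched as a small pure function. This is an illustration under stated assumptions, not the Hadoop source: CheckpointTrigger and shouldCheckpoint are hypothetical names, times are in milliseconds, and the comparisons are assumed to be inclusive (>=).

```java
// Hypothetical helper, not part of Hadoop: models the check doWork performs on each wake-up.
final class CheckpointTrigger {

    // True when the edit log is large enough OR enough time has passed since the last checkpoint.
    // Assumption of this sketch: both comparisons are inclusive.
    static boolean shouldCheckpoint(long nowMillis, long lastCheckpointMillis,
                                    long editLogBytes,
                                    long checkpointPeriodSeconds, long checkpointSizeBytes) {
        boolean editLogLargeEnough = editLogBytes >= checkpointSizeBytes;
        boolean periodElapsed =
                nowMillis >= lastCheckpointMillis + 1000L * checkpointPeriodSeconds;
        return editLogLargeEnough || periodElapsed;
    }

    public static void main(String[] args) {
        long period = 3600L;                 // fs.checkpoint.period default, in seconds
        long size = 4194304L;                // fs.checkpoint.size code default, in bytes
        long pollMillis = 5 * 60 * 1000L;    // the 5-minute check interval
        long last = 0L;                      // time of the last checkpoint

        // Simulated edit log sizes observed at three successive polls
        long[] editLogSizes = {1_000L, 5_000_000L, 2_000L};
        for (int i = 0; i < editLogSizes.length; i++) {
            long now = (i + 1) * pollMillis;
            boolean run = shouldCheckpoint(now, last, editLogSizes[i], period, size);
            System.out.println("poll " + (i + 1) + ": checkpoint=" + run);
            if (run) {
                last = now; // a checkpoint resets the period timer
            }
        }
    }
}
```

In the simulated run only the second poll triggers a checkpoint, because the edit log has grown beyond 4 MB even though less than an hour has passed. The doCheckpoint body from the Hadoop source follows.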

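    // Do the required initialization of the merge work area.
    // Preparation before the checkpoint starts mainly consists of:
    // 1. Unlock all checkpoint directories
    // 2. Close the checkpoint's edit log file
    // 3. Verify that the checkpoint and checkpoint edits directories are healthy
    // 4. Free up the current directory under each checkpoint directory by renaming it to lastcheckpoint.tmp
    startCheckpoint();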

    // Tell the namenode to start logging transactions in a new edit file
    // Returns a token that would be used to upload the merged image.
    // The NameNode is notified that a checkpoint is starting, opens a stream on edits.new,
    // and hands back its checkpoint signature
    CheckpointSignature sig = (CheckpointSignature)namenode.rollEditLog();

    // error simulation code for junit test
    if (ErrorSimulator.getErrorSimulation(0)) {
      throw new IOException("Simulating error0 after creating edits.new");
    }
    // Download the fsimage and edit log files from the NameNode
    downloadCheckpointFiles(sig);   // Fetch fsimage and edits
    // Merge fsimage and the edit log (load both into memory, then save the namespace)
    doMerge(sig);                   // Do the merge

    //
    // Upload the new image into the NameNode. Then tell the Namenode
    // to make this new uploaded image as the most current image.
    // Upload the merged checkpoint image to the NameNode
    putFSImage(sig);

    // error simulation code for junit test
    if (ErrorSimulator.getErrorSimulation(1)) {
      throw new IOException("Simulating error1 after uploading new image to NameNode");
    }
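    // Put the merged files back into service on the NameNode:
    // 1. Rename fsimage.ckpt to fsimage and delete the old fsimage
    // 2. Rename edits.new to edits and delete the old edits
    // 3. Reopen the edit log file
    namenode.rollFsImage();
    // Delete the existing previous.checkpoint,
    // then rename lastcheckpoint.tmp to previous.checkpoint
    checkpointImage.endCheckpoint();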

    LOG.warn("Checkpoint done. New Image Size: " + checkpointImage.getFsImageName().length());
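To make the merge step concrete, here is a toy model of what "load the image and the edit log into memory, then save the namespace" means. It is only a sketch: the real fsimage and edit log formats are far richer, and ToyCheckpoint, EditOp and OpType are hypothetical names invented for this illustration. The image is modelled as a set of paths and the edit log as an ordered list of create/delete operations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model, not Hadoop code: the "image" is a set of paths, the "edit log" a list of operations.
final class ToyCheckpoint {
    enum OpType { CREATE, DELETE }

    static final class EditOp {
        final OpType type;
        final String path;

        EditOp(OpType type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    // Replay the edits, in order, on top of a copy of the image; the result is the new image.
    static SortedSet<String> merge(SortedSet<String> image, List<EditOp> edits) {
        SortedSet<String> merged = new TreeSet<>(image); // leave the input image untouched
        for (EditOp op : edits) {
            if (op.type == OpType.CREATE) {
                merged.add(op.path);
            } else {
                merged.remove(op.path);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SortedSet<String> image = new TreeSet<>(Arrays.asList("/a", "/b"));
        List<EditOp> edits = new ArrayList<>();
        edits.add(new EditOp(OpType.CREATE, "/c"));
        edits.add(new EditOp(OpType.DELETE, "/a"));
        System.out.println(merge(image, edits)); // prints [/b, /c]
    }
}
```

Once the merged image is saved, the edits that produced it no longer need to be replayed at startup, which is why the NameNode can switch to a fresh edits file after the checkpoint.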

[Figure: sequence diagram of the checkpoint flow described above]


Other SecondaryNameNode arguments

  1. -geteditsize
    Prints the current size of the edit log on the NameNode
  2. -checkpoint
    Runs SecondaryNameNode.doCheckpoint() immediately if the edit log has reached checkpointSize, or if the "force" argument is supplied
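The decision made for these two commands can be sketched as follows. This is not the Hadoop processArgs implementation: SnnCommand, Action and decide are hypothetical names, the sketch accepts the command with or without a leading dash as a convenience, and the size comparison is assumed to be inclusive.

```java
// Hypothetical sketch of the command decision; not the Hadoop implementation of processArgs.
final class SnnCommand {
    enum Action { PRINT_EDIT_SIZE, RUN_CHECKPOINT, SKIP_CHECKPOINT, UNKNOWN }

    static Action decide(String[] args, long editLogBytes, long checkpointSizeBytes) {
        if (args.length == 0) {
            return Action.UNKNOWN;
        }
        // Assumption of this sketch: tolerate both "checkpoint" and "-checkpoint"
        String cmd = args[0].startsWith("-") ? args[0].substring(1) : args[0];
        if ("geteditsize".equals(cmd)) {
            return Action.PRINT_EDIT_SIZE;
        }
        if ("checkpoint".equals(cmd)) {
            boolean forced = args.length == 2 && "force".equals(args[1]);
            boolean largeEnough = editLogBytes >= checkpointSizeBytes;
            return (forced || largeEnough) ? Action.RUN_CHECKPOINT : Action.SKIP_CHECKPOINT;
        }
        return Action.UNKNOWN;
    }

    public static void main(String[] args) {
        long size = 4194304L; // fs.checkpoint.size code default
        System.out.println(decide(new String[] {"-checkpoint"}, 100L, size));
        System.out.println(decide(new String[] {"-checkpoint", "force"}, 100L, size));
    }
}
```

With a 100-byte edit log the plain checkpoint command is skipped in this sketch, while adding "force" runs it regardless of size.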

Reposted from blog.csdn.net/shorn/article/details/7891300