Notes on the Yarn Source Code, Prequel: NodeManager (Part 1), Initialization

This post covers the NodeManager, focusing on its initialization.

As before, we trace back from the start-yarn.sh script and find that the class it ultimately launches is NodeManager:

package org.apache.hadoop.yarn.server.nodemanager;

public static void main(String[] args) {
		Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler());
		StringUtils.startupShutdownMessage(NodeManager.class, args, LOG);
		NodeManager nodeManager = new NodeManager();
		Configuration conf = new YarnConfiguration();
		nodeManager.initAndStartNodeManager(conf, false);
	}

Let's start reading from the main method:

		NodeManager nodeManager = new NodeManager();

This line constructs the NodeManager. The constructor goes through the parent class CompositeService and ultimately reaches AbstractService:

/**
   * Construct the service.
   * @param name service name
   */
  public AbstractService(String name) {
    this.name = name;
    stateModel = new ServiceStateModel(name);
  }

/**
 * Implements the service state model.
 */
@Public
@Evolving
public class ServiceStateModel

/**
   * Create the service state model in the {@link Service.STATE#NOTINITED}
   * state.
   */
  public ServiceStateModel(String name) {
    this(name, Service.STATE.NOTINITED);
  }

/** Constructed but not initialized */
		NOTINITED(0, "NOTINITED"),

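During construction, the NodeManager is given a state model whose initial state is STATE.NOTINITED: constructed but not yet initialized. State deserves close attention here, because the core of Yarn is an asynchronous processing mechanism driven by state transitions.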
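To make the idea of a state model concrete, here is a minimal sketch of one. It is not Hadoop's ServiceStateModel: the class name StateModelSketch and the transition table are assumptions for illustration, kept to the four lifecycle states a service passes through.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Simplified sketch of a service state model; NOT Hadoop's ServiceStateModel.
class StateModelSketch {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    // Assumed transition table: the legal target states for each current state.
    private static final Map<State, EnumSet<State>> LEGAL = new EnumMap<>(State.class);
    static {
        LEGAL.put(State.NOTINITED, EnumSet.of(State.INITED, State.STOPPED));
        LEGAL.put(State.INITED, EnumSet.of(State.INITED, State.STARTED, State.STOPPED));
        LEGAL.put(State.STARTED, EnumSet.of(State.STARTED, State.STOPPED));
        LEGAL.put(State.STOPPED, EnumSet.of(State.STOPPED));
    }

    // A freshly constructed service is "constructed but not initialized".
    private State state = State.NOTINITED;

    synchronized State getState() {
        return state;
    }

    // Moves to the proposed state if the transition is legal; returns the old state.
    synchronized State enterState(State proposed) {
        if (!LEGAL.get(state).contains(proposed)) {
            throw new IllegalStateException("Illegal transition " + state + " -> " + proposed);
        }
        State old = state;
        state = proposed;
        return old;
    }

    public static void main(String[] args) {
        StateModelSketch model = new StateModelSketch();
        System.out.println(model.getState());
        model.enterState(State.INITED);
        model.enterState(State.STARTED);
        System.out.println(model.getState());
    }
}
```

Returning the previous state from enterState is what lets a caller tell "I just made this transition" apart from "someone already did".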

Next, the configuration is created:

		Configuration conf = new YarnConfiguration();

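There is not much to say here; we will look at individual settings when they are used. YarnConfiguration defines a large number of parameters. The lines so far all look simple, so the main act must be this one: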

		nodeManager.initAndStartNodeManager(conf, false);

private void initAndStartNodeManager(Configuration conf, boolean hasToReboot) {
		try {

			// Remove the old hook if we are rebooting.
			if (hasToReboot && null != nodeManagerShutdownHook) {
				ShutdownHookManager.get().removeShutdownHook(nodeManagerShutdownHook);
			}

			nodeManagerShutdownHook = new CompositeServiceShutdownHook(this);
			ShutdownHookManager.get().addShutdownHook(nodeManagerShutdownHook, SHUTDOWN_HOOK_PRIORITY);
			// System exit should be called only when NodeManager is instantiated from
			// main() funtion
			this.shouldExitOnShutdownEvent = true;
			this.init(conf);
			this.start();
		} catch (Throwable t) {
			LOG.fatal("Error starting NodeManager", t);
			System.exit(-1);
		}
	}

Sure enough. We will skip the reboot check and the shutdown hook at the top and go straight to the init method.

@Override
	public void init(Configuration conf) {
    if (conf == null) {
      throw new ServiceStateException("Cannot initialize service "
                                      + getName() + ": null configuration");
    }
    if (isInState(STATE.INITED)) {
      return;
    }
    synchronized (stateChangeLock) {
      if (enterState(STATE.INITED) != STATE.INITED) {
        setConfig(conf);
        try {
          serviceInit(config);
          if (isInState(STATE.INITED)) {
            //if the service ended up here during init,
            //notify the listeners
            notifyListeners();
          }
        } catch (Exception e) {
          noteFailure(e);
          ServiceOperations.stopQuietly(LOG, this);
          throw ServiceStateException.convert(e);
        }
      }
    }
  }

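The init call ends up in AbstractService's init method, and the real work happens in serviceInit, which NodeManager overrides. Note that every check here passes, because the service starts out in the NOTINITED state.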
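The shape of init is a template method: a fixed outer routine guards the state and calls a subclass hook exactly once. The sketch below shows only that shape. InitOnceSketch and its members are hypothetical names, java.util.Properties stands in for Configuration, and the failure handling is much simpler than in the snippet above.

```java
import java.util.Properties;

// Sketch of an init() template method that runs its hook at most once.
class InitOnceSketch {
    enum State { NOTINITED, INITED }

    abstract static class Service {
        private final Object stateChangeLock = new Object();
        private State state = State.NOTINITED;

        // Subclasses do their real setup here, as NodeManager does in serviceInit.
        protected abstract void serviceInit(Properties conf) throws Exception;

        public void init(Properties conf) {
            if (conf == null) {
                throw new IllegalStateException("Cannot initialize service: null configuration");
            }
            synchronized (stateChangeLock) {
                if (state == State.INITED) {
                    return; // already initialized: a repeated init() is a no-op
                }
                state = State.INITED;
                try {
                    serviceInit(conf);
                } catch (Exception e) {
                    throw new IllegalStateException("init failed", e);
                }
            }
        }
    }

    // Calls init() the given number of times and reports how often the hook ran.
    static int countHookRuns(int initCalls) {
        final int[] runs = {0};
        Service svc = new Service() {
            @Override
            protected void serviceInit(Properties conf) {
                runs[0]++;
            }
        };
        Properties conf = new Properties();
        for (int i = 0; i < initCalls; i++) {
            svc.init(conf);
        }
        return runs[0];
    }

    public static void main(String[] args) {
        System.out.println(countHookRuns(3));
    }
}
```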

Look at the argument passed to serviceInit inside it: it is AbstractService's internal config field rather than the conf parameter. So where does that field get its value?

protected void serviceInit(Configuration conf) throws Exception {
    if (conf != config) {
      LOG.debug("Config has been overridden during init");
      setConfig(conf);
    }
  }

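So this is how our YarnConfiguration becomes AbstractService's internal Configuration: init stores it with setConfig(conf) before calling serviceInit(config), and the base serviceInit above stores it again only if a subclass passed in a different object.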

Now for NodeManager's own serviceInit method:

@Override
	protected void serviceInit(Configuration conf) throws Exception {

		conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);

		rmWorkPreservingRestartEnabled = conf.getBoolean(YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
				YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);

		initAndStartRecoveryStore(conf);

		NMContainerTokenSecretManager containerTokenSecretManager = new NMContainerTokenSecretManager(conf, nmStore);

		NMTokenSecretManagerInNM nmTokenSecretManager = new NMTokenSecretManagerInNM(nmStore);

		recoverTokens(nmTokenSecretManager, containerTokenSecretManager);

		this.aclsManager = new ApplicationACLsManager(conf);

		ContainerExecutor exec = ReflectionUtils.newInstance(conf.getClass(YarnConfiguration.NM_CONTAINER_EXECUTOR,
				DefaultContainerExecutor.class, ContainerExecutor.class), conf);
		try {
			exec.init();
		} catch (IOException e) {
			throw new YarnRuntimeException("Failed to initialize container executor", e);
		}
		DeletionService del = createDeletionService(exec);
		addService(del);

		// NodeManager level dispatcher
		this.dispatcher = new AsyncDispatcher();

		nodeHealthChecker = new NodeHealthCheckerService();
		addService(nodeHealthChecker);
		dirsHandler = nodeHealthChecker.getDiskHandler();

		this.context = createNMContext(containerTokenSecretManager, nmTokenSecretManager, nmStore);

		nodeStatusUpdater = createNodeStatusUpdater(context, dispatcher, nodeHealthChecker);

		NodeResourceMonitor nodeResourceMonitor = createNodeResourceMonitor();
		addService(nodeResourceMonitor);

		containerManager = createContainerManager(context, exec, del, nodeStatusUpdater, this.aclsManager, dirsHandler);
		addService(containerManager);
		((NMContext) context).setContainerManager(containerManager);

		WebServer webServer = createWebServer(context, containerManager.getContainersMonitor(), this.aclsManager,
				dirsHandler);
		addService(webServer);
		((NMContext) context).setWebServer(webServer);

		dispatcher.register(ContainerManagerEventType.class, containerManager);
		dispatcher.register(NodeManagerEventType.class, this);
		addService(dispatcher);

		DefaultMetricsSystem.initialize("NodeManager");

		// StatusUpdater should be added last so that it get started last
		// so that we make sure everything is up before registering with RM.
		addService(nodeStatusUpdater);
		((NMContext) context).setNodeStatusUpdater(nodeStatusUpdater);

		super.serviceInit(conf);
		// TODO add local dirs to del
	}

It is long, so let's peel it apart one piece at a time.
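One pattern is worth having in mind before reading the pieces: addService only registers a child service, and the children are initialized and started later, in the order they were added. That is why the source comment insists that the status updater be added last. The following is a stripped-down sketch of that behaviour, with hypothetical names (CompositeSketch, Service, Composite), not the real CompositeService.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a composite service: children are registered first, then driven in order.
class CompositeSketch {
    // Hypothetical minimal service interface.
    interface Service {
        void init();
        void start();
    }

    static class Composite implements Service {
        private final List<Service> children = new ArrayList<>();

        // Registers the child only; nothing is initialized yet.
        void addService(Service s) {
            children.add(s);
        }

        @Override
        public void init() {
            for (Service s : children) {
                s.init();
            }
        }

        @Override
        public void start() {
            for (Service s : children) {
                s.start();
            }
        }
    }

    // A child that records its lifecycle calls in a shared log.
    static Service named(final String name, final List<String> log) {
        return new Service() {
            @Override
            public void init() {
                log.add("init " + name);
            }

            @Override
            public void start() {
                log.add("start " + name);
            }
        };
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        Composite nm = new Composite();
        nm.addService(named("deletionService", log));
        nm.addService(named("containerManager", log));
        nm.addService(named("nodeStatusUpdater", log)); // added last, so started last
        nm.init();
        nm.start();
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```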

First comes initAndStartRecoveryStore:

private void initAndStartRecoveryStore(Configuration conf) throws IOException {
		boolean recoveryEnabled = conf.getBoolean(YarnConfiguration.NM_RECOVERY_ENABLED,
				YarnConfiguration.DEFAULT_NM_RECOVERY_ENABLED);
		if (recoveryEnabled) {
			FileSystem recoveryFs = FileSystem.getLocal(conf);
			String recoveryDirName = conf.get(YarnConfiguration.NM_RECOVERY_DIR);
			if (recoveryDirName == null) {
				throw new IllegalArgumentException(
						"Recovery is enabled but " + YarnConfiguration.NM_RECOVERY_DIR + " is not set.");
			}
			Path recoveryRoot = new Path(recoveryDirName);
			recoveryFs.mkdirs(recoveryRoot, new FsPermission((short) 0700));
			nmStore = new NMLeveldbStateStoreService();
		} else {
			nmStore = new NMNullStateStoreService();
		}
		nmStore.init(conf);
		nmStore.start();
	}

By default recoveryEnabled is false, so we go straight to the else branch. Its init call still ends up in the store's own serviceInit method:

/** Initialize the state storage */
	@Override
	public void serviceInit(Configuration conf) throws IOException {
		initStorage(conf);
	}

For the null store, initStorage does nothing, and the startStorage method called from start has no implementation either.
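This is the null object pattern: the caller always holds a store and never checks whether recovery is on. A small sketch of the idea follows. The interface and both classes are hypothetical, and the in-memory store merely stands in for a persistent one such as the LevelDB-backed store.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of picking a do-nothing store when recovery is disabled.
class StateStoreSketch {
    // Hypothetical store interface, far smaller than NMStateStoreService.
    interface StateStore {
        void storeToken(String token);
        int storedCount();
    }

    // Null object: accepts every call and keeps nothing.
    static class NullStore implements StateStore {
        @Override
        public void storeToken(String token) {
            // intentionally empty
        }

        @Override
        public int storedCount() {
            return 0;
        }
    }

    // Stand-in for a persistent store; keeps tokens in memory only.
    static class InMemoryStore implements StateStore {
        private final List<String> tokens = new ArrayList<>();

        @Override
        public void storeToken(String token) {
            tokens.add(token);
        }

        @Override
        public int storedCount() {
            return tokens.size();
        }
    }

    // Mirrors the if/else in initAndStartRecoveryStore.
    static StateStore create(boolean recoveryEnabled) {
        return recoveryEnabled ? new InMemoryStore() : new NullStore();
    }

    public static void main(String[] args) {
        StateStore store = create(false);
        store.storeToken("token-1"); // safe to call, silently dropped
        System.out.println(store.storedCount());
    }
}
```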

Moving on:

ContainerExecutor exec = ReflectionUtils.newInstance(conf.getClass(YarnConfiguration.NM_CONTAINER_EXECUTOR,
				DefaultContainerExecutor.class, ContainerExecutor.class), conf);
		try {
			exec.init();
		} catch (IOException e) {
			throw new YarnRuntimeException("Failed to initialize container executor", e);
		}

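Containers come up constantly in the interaction between the RM and the NM, so the NodeManager needs, from the moment it initializes, a component that can actually run them on this node. That component is the ContainerExecutor, which launches and cleans up container processes. The implementation class is read from the configuration and defaults to DefaultContainerExecutor.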
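The conf.getClass plus ReflectionUtils.newInstance combination is a "class name in config, with a default" lookup. Here is a plain-JDK sketch of the same idea. The property key, the interface, and both implementations are made up for illustration, and java.util.Properties stands in for Configuration.

```java
import java.util.Properties;

// Sketch of loading an implementation class named in configuration, with a default.
class ExecutorLoaderSketch {
    // Hypothetical stand-in for the ContainerExecutor abstraction.
    interface Executor {
        String describe();
    }

    public static class DefaultExecutor implements Executor {
        @Override
        public String describe() {
            return "default";
        }
    }

    public static class CustomExecutor implements Executor {
        @Override
        public String describe() {
            return "custom";
        }
    }

    // Hypothetical configuration key, not a real Yarn property.
    static final String KEY = "sketch.container-executor.class";

    static Executor load(Properties conf) {
        String className = conf.getProperty(KEY);
        try {
            Class<? extends Executor> cls;
            if (className == null) {
                cls = DefaultExecutor.class; // nothing configured: use the default
            } else {
                // asSubclass rejects classes that do not implement the interface.
                cls = Class.forName(className).asSubclass(Executor.class);
            }
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create executor " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(load(conf).describe());
        conf.setProperty(KEY, CustomExecutor.class.getName());
        System.out.println(load(conf).describe());
    }
}
```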

// NodeManager level dispatcher
		this.dispatcher = new AsyncDispatcher();

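The comment is clear: this is the NodeManager-level dispatcher. Naturally there are dispatchers at other levels too; this one mainly manages the events that the NodeManager itself has to handle.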
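For readers who have not met this kind of dispatcher, the idea is a queue drained by one thread, with handlers looked up by the enum class of each event's type. The sketch below illustrates that idea only. It is not AsyncDispatcher, and apart from the two event-type enum names, which are borrowed from the code above with made-up constants, every name in it is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of an asynchronous dispatcher keyed by event-type enum class.
class DispatcherSketch {
    interface Event { Enum<?> getType(); }
    interface EventHandler { void handle(Event event); }

    // Illustrative constants only.
    enum NodeManagerEventType { SHUTDOWN, RESYNC }
    enum ContainerManagerEventType { FINISH_APPS, FINISH_CONTAINERS }

    static class Dispatcher {
        private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
        private final Map<Class<?>, EventHandler> handlers = new ConcurrentHashMap<>();
        private volatile boolean stopped = false;
        private final Thread loop = new Thread(this::drain, "dispatcher-sketch");

        void register(Class<? extends Enum<?>> eventType, EventHandler handler) {
            handlers.put(eventType, handler);
        }

        void start() { loop.start(); }

        // Producers return immediately; the handler runs later on the loop thread.
        void post(Event event) { queue.add(event); }

        private void drain() {
            try {
                while (!stopped || !queue.isEmpty()) {
                    Event event = queue.poll(50, TimeUnit.MILLISECONDS);
                    if (event == null) {
                        continue;
                    }
                    EventHandler h = handlers.get(event.getType().getDeclaringClass());
                    if (h != null) {
                        h.handle(event);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Lets queued events finish, then waits for the loop thread to exit.
        void stop() {
            stopped = true;
            try {
                loop.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    static List<String> demo() {
        List<String> seen = new CopyOnWriteArrayList<>();
        Dispatcher d = new Dispatcher();
        d.register(NodeManagerEventType.class, e -> seen.add("nm:" + e.getType()));
        d.register(ContainerManagerEventType.class, e -> seen.add("cm:" + e.getType()));
        d.start();
        d.post(() -> NodeManagerEventType.RESYNC);
        d.post(() -> ContainerManagerEventType.FINISH_APPS);
        d.stop();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because a single thread drains the queue, events are handled in the order they were posted, which keeps handlers free of locking against each other.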

nodeHealthChecker = new NodeHealthCheckerService();
		addService(nodeHealthChecker);
		dirsHandler = nodeHealthChecker.getDiskHandler();

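nodeHealthChecker checks the health of the NM node. addService is called here so that the service gets initialized later, together with all the others, in a single pass. So we need to look at its serviceInit method: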

@Override
	protected void serviceInit(Configuration conf) throws Exception {
		if (NodeHealthScriptRunner.shouldRun(conf)) {
			nodeHealthScriptRunner = new NodeHealthScriptRunner();
			addService(nodeHealthScriptRunner);
		}
		addService(dirsHandler);
		super.serviceInit(conf);
	}

/*
   * Method which initializes the values for the script path and interval time.
   */
  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    this.conf = conf;
    this.nodeHealthScript = 
        conf.get(YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_PATH);
    this.intervalTime = conf.getLong(YarnConfiguration.NM_HEALTH_CHECK_INTERVAL_MS,
        YarnConfiguration.DEFAULT_NM_HEALTH_CHECK_INTERVAL_MS);
    this.scriptTimeout = conf.getLong(
        YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_TIMEOUT_MS,
        YarnConfiguration.DEFAULT_NM_HEALTH_CHECK_SCRIPT_TIMEOUT_MS);
    String[] args = conf.getStrings(YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_OPTS,
        new String[] {});
    timer = new NodeHealthMonitorExecutor(args);
    super.serviceInit(conf);
  }

public NodeHealthMonitorExecutor(String[] args) {
      ArrayList<String> execScript = new ArrayList<String>();
      execScript.add(nodeHealthScript);
      if (args != null) {
        execScript.addAll(Arrays.asList(args));
      }
      shexec = new ShellCommandExecutor(execScript
          .toArray(new String[execScript.size()]), null, null, scriptTimeout);
    }

Tracing it all the way through, we find that it sets up a script to be run on a timer, which periodically checks the health of the NM.
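The "run a check on a timer and remember the latest result" idea can be sketched with java.util.Timer. This is an illustration under assumptions, not the NodeHealthScriptRunner code: the check is a plain BooleanSupplier rather than a shell script, and HealthTimerSketch, HealthMonitor, and runChecks are hypothetical names.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Sketch of a periodic health check that records the most recent verdict.
class HealthTimerSketch {
    static class HealthMonitor {
        private final Timer timer = new Timer("health-sketch", true); // daemon thread
        private final AtomicBoolean healthy = new AtomicBoolean(true);

        // Runs the check immediately and then every intervalMs milliseconds.
        void start(final BooleanSupplier check, long intervalMs, final CountDownLatch ran) {
            timer.scheduleAtFixedRate(new TimerTask() {
                @Override
                public void run() {
                    healthy.set(check.getAsBoolean());
                    ran.countDown();
                }
            }, 0, intervalMs);
        }

        boolean isHealthy() {
            return healthy.get();
        }

        void stop() {
            timer.cancel();
        }
    }

    // Waits until the check has run the given number of times, then reports the verdict.
    static boolean runChecks(BooleanSupplier check, int runs) {
        CountDownLatch ran = new CountDownLatch(runs);
        HealthMonitor monitor = new HealthMonitor();
        monitor.start(check, 10, ran);
        try {
            if (!ran.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("checks did not run in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            monitor.stop();
        }
        return monitor.isHealthy();
    }

    public static void main(String[] args) {
        System.out.println(runChecks(() -> true, 3));
        System.out.println(runChecks(() -> false, 3));
    }
}
```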

		this.context = createNMContext(containerTokenSecretManager, nmTokenSecretManager, nmStore);

This line looks simple, but it gathers a lot into one place. What gets constructed inside matters, so let's take a look:

public NMContext(NMContainerTokenSecretManager containerTokenSecretManager,
				NMTokenSecretManagerInNM nmTokenSecretManager, LocalDirsHandlerService dirsHandler,
				ApplicationACLsManager aclsManager, NMStateStoreService stateStore) {
			this.containerTokenSecretManager = containerTokenSecretManager;
			this.nmTokenSecretManager = nmTokenSecretManager;
			this.dirsHandler = dirsHandler;
			this.aclsManager = aclsManager;
			this.nodeHealthStatus.setIsNodeHealthy(true);
			this.nodeHealthStatus.setHealthReport("Healthy");
			this.nodeHealthStatus.setLastHealthReportTime(System.currentTimeMillis());
			this.stateStore = stateStore;
		}

When we discussed the structure of the RM, we met rmContext, its chief steward. NMContext plays the same role for each NM.

		nodeStatusUpdater = createNodeStatusUpdater(context, dispatcher, nodeHealthChecker);

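This is why people say Hadoop's code is clearly written: one glance tells you that this class periodically reports the NM node's status. Since this service eventually gets added to the serviceList, we should look at its initialization logic: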

@Override
	protected void serviceInit(Configuration conf) throws Exception {
		int memoryMb = conf.getInt(YarnConfiguration.NM_PMEM_MB, YarnConfiguration.DEFAULT_NM_PMEM_MB);
		float vMemToPMem = conf.getFloat(YarnConfiguration.NM_VMEM_PMEM_RATIO,
				YarnConfiguration.DEFAULT_NM_VMEM_PMEM_RATIO);
		int virtualMemoryMb = (int) Math.ceil(memoryMb * vMemToPMem);

		int virtualCores = conf.getInt(YarnConfiguration.NM_VCORES, YarnConfiguration.DEFAULT_NM_VCORES);

		this.totalResource = Resource.newInstance(memoryMb, virtualCores);
		metrics.addResource(totalResource);
		this.tokenKeepAliveEnabled = isTokenKeepAliveEnabled(conf);
		this.tokenRemovalDelayMs = conf.getInt(YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS,
				YarnConfiguration.DEFAULT_RM_NM_EXPIRY_INTERVAL_MS);

		this.minimumResourceManagerVersion = conf.get(YarnConfiguration.NM_RESOURCEMANAGER_MINIMUM_VERSION,
				YarnConfiguration.DEFAULT_NM_RESOURCEMANAGER_MINIMUM_VERSION);

		// Default duration to track stopped containers on nodemanager is 10Min.
		// This should not be assigned very large value as it will remember all the
		// containers stopped during that time.
		durationToTrackStoppedContainers = conf.getLong(YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS, 600000);
		if (durationToTrackStoppedContainers < 0) {
			String message = "Invalid configuration for " + YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS
					+ " default " + "value is 10Min(600000).";
			LOG.error(message);
			throw new YarnException(message);
		}
		if (LOG.isDebugEnabled()) {
			LOG.debug(YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS + " :" + durationToTrackStoppedContainers);
		}
		super.serviceInit(conf);
		LOG.info("Initialized nodemanager for " + nodeId + ":" + " physical-memory=" + memoryMb + " virtual-memory="
				+ virtualMemoryMb + " virtual-cores=" + virtualCores);
	}

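A lot here is loaded from YarnConfiguration. This tells us why an NM by default offers only 8 GB of memory to containers, where the 2.1 ratio of virtual to physical memory comes from, and why it uses 8 cores by default. These are the resources the NodeManager actually claims and can hand out to containers.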
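As a quick check on those defaults, the virtual memory figure in the log line can be worked out by hand: 8192 MB times a ratio of 2.1, rounded up, gives 17204 MB. The snippet below repeats the same expression as serviceInit, with VmemArithmetic as a hypothetical wrapper class.

```java
// Reproduces the virtual-memory calculation from serviceInit for the default values.
class VmemArithmetic {
    // Same expression as in the source: multiply as float, then round up.
    static int virtualMemoryMb(int memoryMb, float vMemToPMem) {
        return (int) Math.ceil(memoryMb * vMemToPMem);
    }

    public static void main(String[] args) {
        int memoryMb = 8 * 1024; // 8 GB of physical memory, in MB
        float ratio = 2.1f;      // virtual-to-physical memory ratio
        System.out.println(virtualMemoryMb(memoryMb, ratio)); // prints 17204
    }
}
```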

Next comes the part meant for monitoring the NM's resources, along with its serviceInit method:

NodeResourceMonitor nodeResourceMonitor = createNodeResourceMonitor();
		addService(nodeResourceMonitor);

I half suspect this is a bug, because the class is never really used: it is added as a service, but nothing inside it gets initialized.

Next, the container manager:

containerManager = createContainerManager(context, exec, del, nodeStatusUpdater, this.aclsManager, dirsHandler);
		addService(containerManager);
		((NMContext) context).setContainerManager(containerManager);

Let's look at how it is set up, picking out the key code:

// ContainerManager level dispatcher.
		dispatcher = new AsyncDispatcher();

It has its own dispatcher internally, which handles the following events:

dispatcher.register(ContainerEventType.class, new ContainerEventDispatcher());
		dispatcher.register(ApplicationEventType.class, new ApplicationEventDispatcher());
		dispatcher.register(LocalizationEventType.class, rsrcLocalizationSrvc);
		dispatcher.register(AuxServicesEventType.class, auxiliaryServices);
		dispatcher.register(ContainersMonitorEventType.class, containersMonitor);
		dispatcher.register(ContainersLauncherEventType.class, containersLauncher);

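Requests to set up and start containers on this node all go through this class. The most important part is the container launch itself; that code lives in ContainersLauncher, which we won't go into here.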

Most of this class's important setup is already done at construction time, so we will skip its serviceInit method:

WebServer webServer = createWebServer(context, containerManager.getContainersMonitor(), this.aclsManager,
				dirsHandler);
		addService(webServer);

As we know, the NM has its own web app for monitoring, and this is where it gets created:

public WebServer(Context nmContext, ResourceView resourceView, ApplicationACLsManager aclsManager,
			LocalDirsHandlerService dirsHandler) {
		super(WebServer.class.getName());
		this.nmContext = nmContext;
		this.nmWebApp = new NMWebApp(resourceView, aclsManager, dirsHandler);
	}

Its serviceInit is empty, so we skip it.

dispatcher.register(ContainerManagerEventType.class, containerManager);
		dispatcher.register(NodeManagerEventType.class, this);
		addService(dispatcher);

		DefaultMetricsSystem.initialize("NodeManager");

		// StatusUpdater should be added last so that it get started last
		// so that we make sure everything is up before registering with RM.
		addService(nodeStatusUpdater);
		((NMContext) context).setNodeStatusUpdater(nodeStatusUpdater);

The remaining code is shown above, and we won't analyze it further.

The next post will cover how these services are started.
