Understanding how CGroups limit CPU under YARN

How do the CGroup limits that YARN sets actually restrict CPU usage?

CGroup limit on CPU

cpushares isolation: CPU time is allocated flexibly according to weight ratios. When the CPU is idle, a cgroup that needs CPU can take all of the remaining CPU time, so the resources are fully used. When other cgroups also need CPU, each cgroup is guaranteed a minimum share of CPU time in proportion to its weight, which is what provides the isolation.
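To make the weight ratio concrete, here is a minimal sketch of how the guaranteed fraction follows from the shares values when every cgroup is busy. The class name, method name, and container names are hypothetical and only serve the illustration; the kernel scheduler does the real work.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CpuSharesSketch {

    // Fraction of CPU time each cgroup is guaranteed when all cgroups are
    // competing: its own shares divided by the sum of all shares.
    static Map<String, Double> guaranteedFractions(Map<String, Integer> shares) {
        long total = 0;
        for (int s : shares.values()) {
            total += s;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : shares.entrySet()) {
            result.put(e.getKey(), e.getValue() / (double) total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical containers: 1, 2 and 1 vcores at 1024 shares per vcore.
        Map<String, Integer> shares = new LinkedHashMap<>();
        shares.put("container_a", 1024);
        shares.put("container_b", 2048);
        shares.put("container_c", 1024);
        guaranteedFractions(shares)
            .forEach((name, f) -> System.out.println(name + " -> " + f));
    }
}
```

With these values the three cgroups are guaranteed 25%, 50% and 25% of the CPU time under contention. If container_b goes idle, the other two can use its time as well, which is the "soft" nature of this limit.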

cpuset isolation: resources are isolated by assigning cores. The smallest unit that can be allocated is a whole core, so it cannot provide finer-grained isolation, but workloads isolated this way interfere with each other the least. Note that when hyper-threading is enabled on the server, the assigned cores must be chosen carefully, because two logical cores on the same physical core share its execution resources; otherwise the performance gap between different cgroups can be relatively large.
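As a small illustration of core-granular allocation, the sketch below expands a CPU list in the comma-and-range notation used by cpuset.cpus (for example "0-3,8") into individual logical core ids. The class and method names are hypothetical, and the sketch does not check which logical cores are hyper-threading siblings; that would have to be looked up from the machine's topology separately.

```java
import java.util.ArrayList;
import java.util.List;

final class CpusetListSketch {

    // Expands a list such as "0-3,8" into [0, 1, 2, 3, 8].
    static List<Integer> parseCpuList(String spec) {
        List<Integer> cpus = new ArrayList<>();
        for (String raw : spec.trim().split(",")) {
            String part = raw.trim();
            if (part.isEmpty()) {
                continue;
            }
            int dash = part.indexOf('-');
            if (dash < 0) {
                // Single core id
                cpus.add(Integer.parseInt(part));
            } else {
                // Inclusive range "start-end"
                int start = Integer.parseInt(part.substring(0, dash));
                int end = Integer.parseInt(part.substring(dash + 1));
                for (int cpu = start; cpu <= end; cpu++) {
                    cpus.add(cpu);
                }
            }
        }
        return cpus;
    }

    public static void main(String[] args) {
        System.out.println(parseCpuList("0-3,8")); // [0, 1, 2, 3, 8]
    }
}
```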

cpuquota isolation: this gives a finer-grained way to allocate resources than cpuset and enforces an upper bound on the proportion of CPU a cgroup can use, which amounts to a hard limit on CPU resources.
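The hard limit is expressed as a quota of CPU time per scheduling period, so the cap in "cores" is simply quota divided by period. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical, and the 100000 microsecond period in the examples is just a commonly used value.

```java
final class CpuQuotaSketch {

    // Upper bound on CPU usage, measured in cores, implied by a quota/period
    // pair in microseconds. A negative quota means "no limit".
    static double cpuCap(long quotaUs, long periodUs) {
        if (periodUs <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        if (quotaUs < 0) {
            return Double.POSITIVE_INFINITY;
        }
        return quotaUs / (double) periodUs;
    }

    public static void main(String[] args) {
        System.out.println(cpuCap(50_000, 100_000));  // half a core
        System.out.println(cpuCap(200_000, 100_000)); // two cores
        System.out.println(cpuCap(-1, 100_000));      // unlimited
    }
}
```

Unlike cpushares, this cap applies even when the rest of the machine is idle.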

Cgroups isolation for Containers in YARN

Hadoop 2.7.3 already supports a hard limit on CPU.

Call path: LinuxContainerExecutor.launchContainer() -> resourcesHandler.preExecute(containerId, container.getResource())

  /*
   * LCE Resources Handler interface
   */

  public void preExecute(ContainerId containerId, Resource containerResource)
              throws IOException {
    setupLimits(containerId, containerResource);
  }

  private void setupLimits(ContainerId containerId,
                           Resource containerResource) throws IOException {
    String containerName = containerId.toString();

    if (isCpuWeightEnabled()) {
      // Number of vcores requested by the container
      int containerVCores = containerResource.getVirtualCores();

      // Create the cgroup path /cpu/hadoop/cgroup/{containerName} for this containerName
      createCgroup(CONTROLLER_CPU, containerName);

      // cpuShares = 1024 * vcores requested by the container
      int cpuShares = CPU_DEFAULT_WEIGHT * containerVCores;

      // Apply cpushares isolation; the weight is containerVCores * the default coefficient (a time-slice ratio)
      updateCgroup(CONTROLLER_CPU, containerName, "shares",
          String.valueOf(cpuShares));

      // If the hard limit is enabled, also set the corresponding cpuquota hard limit
      if (strictResourceUsageMode) {

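        // nodeVCores: total number of vcores offered by this NM machine;
        // containerVCores: number of logical virtual cores requested by the container.
        // When CPU is oversubscribed, each request needs a strictly defined real hard limit.
        if (nodeVCores != containerVCores) {

          // containerCPU is expressed in physical cores here, not virtual cores!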
          float containerCPU =
              (containerVCores * yarnProcessors) / (float) nodeVCores;

          // Get the cpuPeriod (limits[0]) and cpuQuota (limits[1]) for the amount of CPU in containerCPU
          int[] limits = getOverallLimits(containerCPU);
          updateCgroup(CONTROLLER_CPU, containerName, CPU_PERIOD_US,
              String.valueOf(limits[0]));
          updateCgroup(CONTROLLER_CPU, containerName, CPU_QUOTA_US,
              String.valueOf(limits[1]));
        }
      }
    }
  }
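To see what setupLimits computes for a concrete container, here is a minimal standalone sketch of the two formulas above. The class name and the sample numbers (a node where YARN may use 16 physical processors and advertises 32 vcores, and a container asking for 4 vcores) are assumptions made for illustration only.

```java
final class ContainerCpuSketch {

    // Same constant value as in the comment above: 1024 shares per vcore.
    static final int CPU_DEFAULT_WEIGHT = 1024;

    // Soft limit: the cpu.shares weight for the container.
    static int cpuShares(int containerVCores) {
        return CPU_DEFAULT_WEIGHT * containerVCores;
    }

    // Hard limit input: the container's vcores converted to physical cores.
    static float containerCpu(int containerVCores, float yarnProcessors,
                              int nodeVCores) {
        return (containerVCores * yarnProcessors) / (float) nodeVCores;
    }

    public static void main(String[] args) {
        int containerVCores = 4;    // assumed request
        float yarnProcessors = 16f; // assumed physical processors for YARN
        int nodeVCores = 32;        // assumed vcores advertised by the node

        System.out.println(cpuShares(containerVCores)); // 4096
        System.out.println(
            containerCpu(containerVCores, yarnProcessors, nodeVCores)); // 2.0
    }
}
```

In this example each vcore is worth half a physical core, so a 4-vcore container gets a weight of 4096 and, in strict mode, a hard cap equivalent to 2 physical cores.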



  int[] getOverallLimits(float yarnProcessorsArg) {
    return CGroupsCpuResourceHandlerImpl.getOverallLimits(yarnProcessorsArg);
  }

  @VisibleForTesting
  @InterfaceAudience.Private
  public static int[] getOverallLimits(float yarnProcessors) {

    int[] ret = new int[2];

    if (yarnProcessors < 0.01f) {
      throw new IllegalArgumentException("Number of processors can't be <= 0.");
    }

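    // Hadoop sets the maximum CPU quota per machine to 1000 * 1000 = 1M microseconds, divided up by the number of cores
    int quotaUS = MAX_QUOTA_US;
    // periodUS: the period such that quotaUS / periodUS equals the requested number of cores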
    int periodUS = (int) (MAX_QUOTA_US / yarnProcessors);
    if (yarnProcessors < 1.0f) {
      periodUS = MAX_QUOTA_US;
      quotaUS = (int) (periodUS * yarnProcessors);
      if (quotaUS < MIN_PERIOD_US) {
        LOG.warn("The quota calculated for the cgroup was too low."
            + " The minimum value is " + MIN_PERIOD_US
            + ", calculated value is " + quotaUS
            + ". Setting quota to minimum value.");
        quotaUS = MIN_PERIOD_US;
      }
    }

    // cfs_period_us can't be less than 1000 microseconds
    // if the value of periodUS is less than 1000, we can't really use cgroups
    // to limit cpu
    if (periodUS < MIN_PERIOD_US) {
      LOG.warn("The period calculated for the cgroup was too low."
          + " The minimum value is " + MIN_PERIOD_US
          + ", calculated value is " + periodUS
          + ". Using all available CPU.");
      periodUS = MAX_QUOTA_US;
      quotaUS = -1;
    }

    ret[0] = periodUS;
    ret[1] = quotaUS;
    return ret;
  }
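The branches of getOverallLimits are easier to follow with numbers. The sketch below is a simplified standalone copy of the method with the logging removed, so it can be run outside Hadoop. The constant values are assumptions taken from the comments above (MAX_QUOTA_US = 1000 * 1000, MIN_PERIOD_US = 1000), and the class name is hypothetical.

```java
import java.util.Arrays;

final class OverallLimitsSketch {

    static final int MAX_QUOTA_US = 1000 * 1000; // assumed from the comment
    static final int MIN_PERIOD_US = 1000;       // assumed from the comment

    // Returns {periodUS, quotaUS}; quota / period approximates yarnProcessors.
    static int[] getOverallLimits(float yarnProcessors) {
        if (yarnProcessors < 0.01f) {
            throw new IllegalArgumentException(
                "Number of processors can't be <= 0.");
        }
        int quotaUS = MAX_QUOTA_US;
        int periodUS = (int) (MAX_QUOTA_US / yarnProcessors);
        if (yarnProcessors < 1.0f) {
            // Less than one core: keep the longest period, shrink the quota.
            periodUS = MAX_QUOTA_US;
            quotaUS = Math.max((int) (periodUS * yarnProcessors), MIN_PERIOD_US);
        }
        if (periodUS < MIN_PERIOD_US) {
            // Period too short for cgroups: fall back to no hard limit.
            periodUS = MAX_QUOTA_US;
            quotaUS = -1;
        }
        return new int[] {periodUS, quotaUS};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(getOverallLimits(2.0f)));  // [500000, 1000000]
        System.out.println(Arrays.toString(getOverallLimits(0.5f)));  // [1000000, 500000]
        System.out.println(Arrays.toString(getOverallLimits(2000f))); // [1000000, -1]
    }
}
```

For 2 cores the quota stays at the maximum and the period is halved; for half a core the period stays at the maximum and the quota is halved; and for a core count so large that the period would drop below 1000 microseconds, the quota becomes -1, meaning no hard limit is applied.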

Follow-up

How is memory limited?

Reference: https://blog.csdn.net/liukuan73/article/details/53358423
