[Repost] Resource Localization in YARN

When an application runs on YARN, the YARN client submits it to the ResourceManager and uploads the resources the application needs to HDFS. The ResourceManager then starts the ApplicationMaster, and the ApplicationMaster asks the NodeManagers to start containers that perform the actual computation. Before a container starts, the resources it depends on must be downloaded from HDFS; these include the application jar, dependent jars, and other files. This process is called resource localization.

This article describes resource localization.

Related concepts

Localization
Localization is the process of downloading resources from HDFS to the local node. Once resources are localized, a container no longer has to read them from HDFS every time; it reads the local copy directly, which is more efficient.

Local resource (LocalResource)
A local resource is a resource that a container needs at runtime, such as a file or a dependent library, and that is stored on HDFS. The NodeManager is responsible for localizing these resources before it starts the container. For an application, a local resource is described by:

  • URL: the HDFS address from which the resource is downloaded
  • Size: the size of the resource
  • Timestamp: the time at which the resource was created on HDFS
  • LocalResourceType: the type of the resource, which tells the NodeManager how to localize it; one of FILE, ARCHIVE, or PATTERN
  • Pattern: the pattern used to select which entries are extracted from an archive (takes effect only when LocalResourceType is PATTERN)
  • LocalResourceVisibility: who on the NodeManager (which users and applications) can see the resource after it is localized; one of PUBLIC, PRIVATE, or APPLICATION
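
To make the fields above concrete, here is a minimal Python sketch of a local-resource record. It is only an illustrative model, not the Hadoop API: the class names LocalResourceSpec, ResourceType, and Visibility, the helper effective_pattern, and the example URL and values are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResourceType(Enum):
    FILE = "FILE"
    ARCHIVE = "ARCHIVE"
    PATTERN = "PATTERN"


class Visibility(Enum):
    PUBLIC = "PUBLIC"
    PRIVATE = "PRIVATE"
    APPLICATION = "APPLICATION"


@dataclass(frozen=True)
class LocalResourceSpec:
    """Hypothetical model of the fields that describe a local resource."""
    url: str                      # HDFS address the resource is downloaded from
    size: int                     # size of the resource in bytes
    timestamp: int                # timestamp of the resource on HDFS
    type: ResourceType
    visibility: Visibility
    pattern: Optional[str] = None

    def effective_pattern(self) -> Optional[str]:
        # The pattern only takes effect when the type is PATTERN.
        if self.type is ResourceType.PATTERN:
            return self.pattern
        return None


# Example values are made up for illustration.
job_jar = LocalResourceSpec(
    url="hdfs://namenode/apps/job.jar",
    size=1024,
    timestamp=1500000000000,
    type=ResourceType.PATTERN,
    visibility=Visibility.APPLICATION,
    pattern=r"lib/.*",
)
```

The record is frozen because a localized resource is meant to be read-only: once it is described and downloaded, nothing about it should change.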

NOTE: A local resource is not a resource that already sits on the local disk; it is a resource that has to be downloaded from HDFS to the local disk.

So what kind of resources can a container ask to localize? Any file, provided the container treats it as read-only.
Typical examples that are well suited to being local resources:

  1. Code libraries the container needs at startup, such as jar files
  2. Configuration files the container needs at startup
  3. Static file directories

Some dynamic resources are not suitable as local resources, for example: resources that other components may need to update while the container is running, files that the application itself updates directly, and files through which the application shares changing state with other services.

ResourceLocalizationService
ResourceLocalizationService is a service inside the NodeManager that is mainly responsible for downloading and managing the resources containers need. It balances downloads across all available disks and strictly controls access permissions on the downloaded resources.

DeletionService
DeletionService is also a service inside the NodeManager; it deletes local directories when it is instructed to.

Localizer
A Localizer is the thread or process that actually performs resource localization. There are two types: PublicLocalizer, which downloads resources with PUBLIC visibility, and ContainerLocalizer, which downloads resources with PRIVATE or APPLICATION visibility.

LocalCache
The NodeManager maintains a LocalCache of all files that have been downloaded to the local disk. Each resource is uniquely identified by the HDFS address that was specified when it was downloaded.

Additional concepts

LOCALRESOURCE TIMESTAMPS

The timestamp reflects the version of a local resource. When the NodeManager downloads a resource, it checks the timestamp to make sure the file's contents stay the same for the lifetime of the application.
With the timestamp, YARN can detect that a resource has changed; in that case the container fails, which avoids inconsistency. Once the NodeManager has localized a resource from HDFS to the local disk, the local copy has no further connection to the source file; the original URL is recorded only to identify the copy uniquely. Even if the source file changes afterwards, the NodeManager does not track the change and does not download the file again.
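
The check described above can be sketched as a simple comparison. This is a minimal sketch, assuming the NodeManager compares the timestamp carried in the request with the source file's current timestamp on HDFS at download time; the names verify_timestamp and ResourceChangedError and the example URL are hypothetical, not Hadoop identifiers.

```python
class ResourceChangedError(Exception):
    """Raised when the source file no longer matches the requested timestamp."""


def verify_timestamp(url, requested_ts, source_ts):
    """Return normally only if the source still has the requested timestamp.

    requested_ts: timestamp specified by the ApplicationMaster (or client).
    source_ts:    timestamp the file currently has on HDFS.
    """
    if source_ts != requested_ts:
        # Failing here makes the container fail instead of running
        # against a different version of the file.
        raise ResourceChangedError(
            f"{url} changed on the source file system "
            f"(expected {requested_ts}, found {source_ts})"
        )


# Unchanged resource: the download may proceed.
verify_timestamp("hdfs://namenode/apps/job.jar", 1500000000000, 1500000000000)
```

Note that the check only runs when a resource is downloaded; a copy that is already in the local cache is not checked against the source again.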

Note that when a container starts, it is the ApplicationMaster that specifies the resource timestamps to the NodeManager. The container that runs the ApplicationMaster itself also needs resource timestamps, and in that case the client specifies them. Taking MapReduce on YARN as an example, the JobClient decides the timestamps of the resources the ApplicationMaster needs, and the ApplicationMaster then decides the timestamps of the resources the map and reduce tasks need.

LOCALRESOURCE TYPES

As mentioned in the previous section, LocalResourceType can be FILE, ARCHIVE, or PATTERN. The three types mean the following:
FILE is an ordinary file, either text or binary.
ARCHIVE is an archive that the NodeManager recognizes and extracts automatically, such as jar, tar, tar.gz, and zip files.
PATTERN is a mix of FILE and ARCHIVE. The downloaded source archive is kept, and only the entries that match the pattern are extracted and kept on the local file system, in the same directory as the source archive. The pattern decides which entries are extracted from the archive and which are not. Currently only jar files support PATTERN; all other archives are treated as ordinary ARCHIVE resources.
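
The difference between the three types can be illustrated with a small, self-contained Python sketch. It is not the NodeManager's code: the function localize, the helper make_jar, and the use of an in-memory zip to stand in for a jar downloaded from HDFS are all assumptions made for this example, and the pattern is assumed to be a regular expression that must match the whole entry name.

```python
import io
import os
import re
import tempfile
import zipfile


def localize(name, data, rtype, dest_dir, pattern=None):
    """Place one downloaded resource under dest_dir according to its type.

    Returns the sorted relative paths of the files that end up on disk.
    """
    os.makedirs(dest_dir, exist_ok=True)
    if rtype == "FILE":
        # Ordinary file: store it unchanged.
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(data)
    elif rtype == "ARCHIVE":
        # Archive: extract everything; the archive itself is not kept.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.extractall(dest_dir)
    elif rtype == "PATTERN":
        if pattern is None:
            raise ValueError("PATTERN resources need a pattern")
        # Keep the source archive and extract only matching entries next to it.
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(data)
        regex = re.compile(pattern)
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for member in zf.namelist():
                if regex.fullmatch(member):
                    zf.extract(member, dest_dir)
    else:
        raise ValueError(f"unknown resource type: {rtype}")

    found = []
    for root, _dirs, files in os.walk(dest_dir):
        for fname in files:
            rel = os.path.relpath(os.path.join(root, fname), dest_dir)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def make_jar(entries):
    """Build a small zip (a jar is a zip) in memory from {path: text}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, content in entries.items():
            zf.writestr(path, content)
    return buf.getvalue()


jar = make_jar({"lib/dep.txt": "dependency", "conf/site.xml": "<configuration/>"})
with tempfile.TemporaryDirectory() as d:
    print(localize("app.jar", jar, "PATTERN", d, pattern=r"lib/.*"))
```

With the pattern lib/.* only the entry under lib/ is extracted, and app.jar itself stays in the same directory.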

 

LOCALRESOURCE VISIBILITIES

As mentioned above, LocalResourceVisibility has three values: PUBLIC, PRIVATE, and APPLICATION.

PUBLIC resources can be accessed by any container of any application of any user. Typical PUBLIC resources are files on HDFS that anyone can read; they keep the same access rights after localization. If a resource is PUBLIC and a container requests it (the container may belong to the current attempt or to any application of any other user), then as long as the resource has not been deleted from the LocalCache, it is used directly from the LocalCache without being downloaded again.

PUBLIC resources are stored in the <local-dir>/filecache directory on the NodeManager's local disk. All files in this directory are owned by the user that started the NodeManager process, and all users have read access, so every container running on this NodeManager can share them.

PRIVATE resources are shared only among applications of the same user on the current node. They are stored in the <local-dir>/usercache/$username/filecache directory on the NodeManager's local disk. The files are owned by the user who started the application, and other users cannot access them. As with PUBLIC resources, once a resource is localized no user has write permission, not even the user who submitted the job. This prevents a malicious container from modifying the files.

APPLICATION resources are shared only among containers of the same application on the current node. They are stored in the <local-dir>/usercache/$username/appcache/<app-id>/ directory on the NodeManager's local disk. The files are owned by the user who submitted the application and are read-only.
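
The directory layout described above can be summarized as a small lookup. This is a sketch that only restates the paths given in this article; the function name cache_dir and the example local directory, user name, and application id are hypothetical.

```python
def cache_dir(local_dir, visibility, user=None, app_id=None):
    """Return the local cache directory for a resource of the given visibility."""
    if visibility == "PUBLIC":
        # Shared by every user and application on the node.
        return f"{local_dir}/filecache"
    if visibility == "PRIVATE":
        if user is None:
            raise ValueError("PRIVATE resources need a user")
        # Shared by all applications of one user.
        return f"{local_dir}/usercache/{user}/filecache"
    if visibility == "APPLICATION":
        if user is None or app_id is None:
            raise ValueError("APPLICATION resources need a user and an app id")
        # Shared only by the containers of one application.
        return f"{local_dir}/usercache/{user}/appcache/{app_id}"
    raise ValueError(f"unknown visibility: {visibility}")


# Example values are made up for illustration.
print(cache_dir("/data/nm-local", "PUBLIC"))
print(cache_dir("/data/nm-local", "PRIVATE", user="alice"))
print(cache_dir("/data/nm-local", "APPLICATION", user="alice", app_id="application_0001"))
```

The nesting mirrors the sharing scope: the narrower the visibility, the deeper the directory sits under the user's own subtree.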

Note that, as with timestamps, the visibility of a local resource is specified by the ApplicationMaster; the NodeManager makes no decision about visibility. The container that runs the ApplicationMaster also needs resource visibilities, and in that case the client specifies them. Taking MapReduce on YARN as an example, the JobClient decides the visibility of the resources the ApplicationMaster needs, and the ApplicationMaster then decides the visibility of the resources the map and reduce tasks need.

Localization Process

PUBLIC resources are localized by PublicLocalizer. The NodeManager runs a thread pool of PublicLocalizers whose size is set by yarn.nodemanager.localizer.fetch.thread-count; the pool size determines the maximum number of threads that download PUBLIC resources in parallel. When localizing a PUBLIC resource, PublicLocalizer checks the resource's permissions on HDFS to verify that it really is PUBLIC, and it refuses to localize any resource that does not qualify. PublicLocalizer downloads resources from HDFS securely by using the credentials passed in the ContainerLaunchContext.
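
The permission check can be sketched as follows. This is a minimal sketch under a stated assumption: a resource counts as PUBLIC when the file is readable by "others" and every ancestor directory up to the root is executable (traversable) by "others". The function is_public and the modes dictionary, which stands in for permission lookups on HDFS, are hypothetical.

```python
import posixpath
import stat


def is_public(path, modes):
    """Check whether `path` is readable by everyone.

    modes maps each absolute path (the file and all of its ancestor
    directories, including "/") to its permission bits, e.g. 0o644.
    """
    # The file itself must be world-readable.
    if not modes[path] & stat.S_IROTH:
        return False
    # Every ancestor directory must be world-executable.
    parent = posixpath.dirname(path)
    while True:
        if not modes[parent] & stat.S_IXOTH:
            return False
        if parent == "/":
            return True
        parent = posixpath.dirname(parent)


modes = {"/": 0o755, "/apps": 0o755, "/apps/lib": 0o755, "/apps/lib/dep.jar": 0o644}
print(is_public("/apps/lib/dep.jar", modes))
```

Checking the ancestors matters because a world-readable file inside a directory that other users cannot traverse is not actually reachable by them.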

PRIVATE and APPLICATION resources are localized by ContainerLocalizer, which works differently from PublicLocalizer. PublicLocalizer runs directly in a thread pool inside the NodeManager, whereas ContainerLocalizer, for security reasons, does not run inside the NodeManager process but in a separate process.

Each ContainerLocalizer process is managed by a LocalizerRunner, which is a thread inside the NodeManager. Whenever a container has resources that have not yet been downloaded, the container triggers a LocalizerRunner. The details are as follows:

When a container first requests a PRIVATE or APPLICATION resource and the resource is not found in LocalResourcesTracker, the resource is added to the pending-resources list. A LocalizerRunner thread is then created if anything actually needs to be downloaded, and the required resources are added to the pending-resources list that the LocalizerRunner maintains.

When the NodeManager runs in secure mode, resources must be localized as the user who submitted the application rather than as the user who started the NodeManager. The LocalizerRunner therefore starts the LinuxContainerExecutor (LCE) process with the identity of the application's submitter, and the LCE then runs the ContainerLocalizer to download the resources. Once started, the ContainerLocalizer keeps sending heartbeats to the NodeManager. Through the heartbeat, the LocalizerRunner either assigns the ContainerLocalizer a resource to download or tells the ContainerLocalizer process to stop, and the ContainerLocalizer reports its download progress to the LocalizerRunner. If a download fails, the resource is removed from LocalResourcesTracker and the container eventually fails. If the download succeeds, the LocalizerRunner hands the ContainerLocalizer another resource through the heartbeat, until all resources are downloaded.
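
The heartbeat exchange described above can be simulated in a few lines. This is only a single-process sketch of the protocol, not the real RPC: the function run_localizer, the download callback, and the "DIE" and "FAILED:..." result strings are assumptions made for this example.

```python
from collections import deque


def run_localizer(pending, download):
    """Simulate the heartbeat loop between LocalizerRunner and ContainerLocalizer.

    pending:  resource names in the LocalizerRunner's pending-resources list.
    download: callable(name) -> bool, stands in for the actual download.
    Returns (localized resources, final outcome).
    """
    queue = deque(pending)
    localized = []
    current = None   # resource the localizer worked on since the last heartbeat
    status = None    # download result reported in the next heartbeat

    while True:
        # Heartbeat: the localizer reports the status of its last download.
        if status is False:
            # A failed download ends localization; the container will fail.
            return localized, f"FAILED:{current}"
        if status is True:
            localized.append(current)

        # Heartbeat response from the LocalizerRunner.
        if not queue:
            # Nothing left to download: tell the localizer to stop.
            return localized, "DIE"
        current = queue.popleft()      # hand out the next resource
        status = download(current)     # the localizer downloads it


print(run_localizer(["job.jar", "job.xml"], lambda name: True))
```

Handing out one resource per heartbeat keeps the LocalizerRunner in control: it can stop the localizer at any heartbeat and always knows which download a failure belongs to.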

Local resource life cycle

Because local resources differ in who can access them, resources with different LocalResourceVisibility values are retained locally for different lengths of time.

  • PUBLIC resources are shared among all applications of all users, so they are not deleted when a container or an application finishes. They are removed only when the local cache directory reaches its size threshold, which is controlled by yarn.nodemanager.localizer.cache.target-size-mb.
  • PRIVATE resources have the same life cycle as PUBLIC resources.
  • APPLICATION resources are removed immediately after the application finishes.

Localization-related configuration

yarn-site.xml contains several settings related to resource localization.

  • yarn.nodemanager.local-dirs: the local directories in which localized resources are stored; several directories on different disks can be listed, separated by commas.
  • yarn.nodemanager.local-cache.max-files-per-directory: the maximum number of localized files per directory; PUBLIC, PRIVATE, and APPLICATION resources are counted separately.
  • yarn.nodemanager.localizer.address: the RPC address on which ResourceLocalizationService listens for requests from the localizers.
  • yarn.nodemanager.localizer.client.thread-count: the number of ResourceLocalizationService threads that handle requests from the localizers. The default is 5.
  • yarn.nodemanager.localizer.fetch.thread-count: the number of PublicLocalizer threads that localize PUBLIC resources. The default is 4.
  • yarn.nodemanager.delete.thread-count: the number of DeletionService threads that delete files. The default is 4.
  • yarn.nodemanager.localizer.cache.target-size-mb: the maximum disk space, in MB, that localized resources may occupy. The limit covers more than just APPLICATION resources.
  • yarn.nodemanager.localizer.cache.cleanup.interval-ms: the interval, in milliseconds, at which disk usage is checked. After each interval, if the cache occupies more disk space than the configured threshold, unused resources are deleted.

An unused resource is one that no running container references. Each time a container requests a resource, the container is added to that resource's reference list, and it is removed from the list only after the container finishes. When the reference count drops to 0, the resource can be deleted.
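
The reference counting and the periodic cleanup can be sketched together. This is a minimal sketch, not the NodeManager's implementation: the functions request, release, and cleanup and the dictionary layout are hypothetical, and the sketch assumes that cleanup removes unreferenced resources in insertion order and stops as soon as the cache fits the target size.

```python
def request(cache, name, size_mb, container_id):
    """A container requests a resource: add the container to its reference list."""
    entry = cache.setdefault(name, {"size_mb": size_mb, "refs": set()})
    entry["refs"].add(container_id)


def release(cache, name, container_id):
    """The container has finished: drop its reference."""
    cache[name]["refs"].discard(container_id)


def cleanup(cache, target_size_mb):
    """Delete unreferenced resources while the cache exceeds the target size.

    Returns the names of the removed resources.
    """
    total = sum(entry["size_mb"] for entry in cache.values())
    removed = []
    for name in list(cache):
        if total <= target_size_mb:
            break
        if not cache[name]["refs"]:      # reference count is 0
            total -= cache[name]["size_mb"]
            del cache[name]
            removed.append(name)
    return removed


cache = {}
request(cache, "a.jar", 60, "container_1")
request(cache, "b.jar", 60, "container_2")
release(cache, "a.jar", "container_1")   # container_1 has finished
print(cleanup(cache, target_size_mb=100))
```

A resource that is still referenced is never removed, even when the cache is over its target; the cache can therefore stay above the threshold until enough containers finish.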

 

 

Reference:

http://bigdatadecode.club/YARN-Resource-Localization.html

https://zh.hortonworks.com/blog/management-of-application-dependencies-in-yarn/  Management of Application Dependencies in YARN
https://zh.hortonworks.com/blog/resource-localization-in-yarn-deep-dive/  Resource Localization in YARN: Deep Dive

https://stackoverflow.com/questions/32082723/make-yarn-clean-up-appcache-before-retry/42938399  Make YARN clean up appcache before retry

 


Origin www.cnblogs.com/piperck/p/11237203.html