Lease management
Foreword
HDFS files are write-once-read-many, and clients cannot write to the same file in parallel, so a mechanism is needed to enforce mutually exclusive writes to HDFS files. HDFS provides the lease mechanism for this purpose. The lease is a very important concept in HDFS: it is a contract in which the Namenode grants the lease holder (LeaseHolder, generally the client) permission to write a file for a specified period of time.
In HDFS, when a client writes a file, it must first apply for a lease from the lease manager (LeaseManager). Once the lease is granted, the client becomes the lease holder and has exclusive write access to the HDFS file. Other clients cannot open the file for writing while the lease is valid.
The Namenode's lease manager saves the mappings between HDFS files and leases and between leases and lease holders. It also checks regularly whether any of the leases it maintains have expired, and it forcibly reclaims expired leases. The lease holder therefore needs to renew the lease periodically (renew) to keep its exclusive lock on the file. When the client has finished writing and closes the file, the lease must be released in the lease manager.
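As a rough illustration of the apply, renew, and release cycle described above, here is a minimal toy model in Java. It is not HDFS code: the class ToyLeaseTable, its method names, and the single timeout are all hypothetical. Letting a new writer take over an expired lease at acquire time is a simplification of the real soft-limit and hard-limit handling.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not HDFS code): one writer per path, leases expire unless renewed. */
class ToyLeaseTable {
    /** Who holds the lease on a path and when it was last renewed. */
    private static final class Entry {
        final String holder;
        long lastRenewedMs;

        Entry(String holder, long nowMs) {
            this.holder = holder;
            this.lastRenewedMs = nowMs;
        }
    }

    private final Map<String, Entry> byPath = new HashMap<>();
    private final long timeoutMs;

    ToyLeaseTable(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Grants the lease unless another holder still has an unexpired lease on the path. */
    synchronized boolean acquire(String path, String holder, long nowMs) {
        Entry e = byPath.get(path);
        if (e != null && !e.holder.equals(holder) && nowMs - e.lastRenewedMs <= timeoutMs) {
            return false; // another client still holds a valid lease
        }
        byPath.put(path, new Entry(holder, nowMs));
        return true;
    }

    /** Renews the lease; returns false if the caller is not the current holder. */
    synchronized boolean renew(String path, String holder, long nowMs) {
        Entry e = byPath.get(path);
        if (e == null || !e.holder.equals(holder)) {
            return false;
        }
        e.lastRenewedMs = nowMs;
        return true;
    }

    /** Releases the lease when the holder closes the file. */
    synchronized void release(String path, String holder) {
        Entry e = byPath.get(path);
        if (e != null && e.holder.equals(holder)) {
            byPath.remove(path);
        }
    }
}
```

The clock value is passed in explicitly so that the behavior can be checked without waiting for real time to pass.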
LeaseManager.Lease
An HDFS client can have multiple HDFS files open for reading and writing at the same time. To simplify management, all files opened by one client are organized into a single record in the lease manager, represented by the class LeaseManager.Lease.
The Lease class defines the following fields: holder saves the information of the client, that is, the lease holder; paths saves the paths of all HDFS files opened by the client; and lastUpdate saves the time at which the lease was last renewed.
Three of the more important methods in the Lease class need explanation. LeaseManager calls these methods to manage a Lease.
- renew: updates the lease's last renewal time, lastUpdate.
- expiredSoftLimit: checks whether the lease has exceeded the soft limit (softLimit). The soft limit is the lease timeout that applies while a file is being written. The default is 60 seconds and cannot be configured.
- expiredHardLimit: checks whether the lease has exceeded the hard limit (hardLimit). The hard limit is the time after which a file that was not closed properly has its lease forcibly reclaimed. The default is 60 minutes and cannot be configured.
LeaseManager has an internal class, Monitor, that periodically checks whether leases have been renewed. When the hard limit is exceeded, the lease recovery mechanism is triggered.
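The fields and methods described above can be pictured with the following simplified sketch. It is not the Hadoop source: the class name LeaseSketch is hypothetical, and passing the current time in as a parameter is an assumption made so the expiry checks are easy to exercise. The two limits are the 60-second and 60-minute values quoted in the text.

```java
import java.util.Set;
import java.util.TreeSet;

/** Simplified sketch of a lease record; not the actual LeaseManager.Lease class. */
class LeaseSketch {
    // Limits quoted in the text: 60 seconds (soft) and 60 minutes (hard).
    static final long SOFT_LIMIT_MS = 60L * 1000;
    static final long HARD_LIMIT_MS = 60L * 60 * 1000;

    final String holder;                         // lease holder (client name)
    final Set<String> paths = new TreeSet<>();   // files opened by this holder
    long lastUpdate;                             // last renewal time in milliseconds

    LeaseSketch(String holder, long nowMs) {
        this.holder = holder;
        this.lastUpdate = nowMs;
    }

    /** Records a renewal at the given time. */
    void renew(long nowMs) {
        lastUpdate = nowMs;
    }

    /** True once more than the soft limit has passed since the last renewal. */
    boolean expiredSoftLimit(long nowMs) {
        return nowMs - lastUpdate > SOFT_LIMIT_MS;
    }

    /** True once more than the hard limit has passed since the last renewal. */
    boolean expiredHardLimit(long nowMs) {
        return nowMs - lastUpdate > HARD_LIMIT_MS;
    }
}
```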
Lease Manager
LeaseManager is the class that maintains all lease operations in the Namenode. It not only saves the information of all leases in HDFS but also provides methods for adding, deleting, modifying, and querying leases. It also maintains a thread, Monitor, that regularly checks whether leases have timed out. When the lease on a file has not been renewed for a long time (beyond the hard limit), LeaseManager triggers the lease recovery mechanism and then closes the file.
LeaseManager uses three fields, leases, sortedLeases, and sortedLeasesByPath, to save all leases in the Namenode; the lmthread field to save the lease check thread; the softLimit field to save the soft limit (60 seconds by default, not configurable); and the hardLimit field to save the hard limit (60 minutes by default, not configurable).
Let's take a look at the three fields in which LeaseManager saves leases: leases, sortedLeases, and sortedLeasesByPath.
- leases: saves the mapping between lease holders and leases.
- sortedLeases: saves all leases in the LeaseManager ordered by lease renewal time; leases with the same renewal time are ordered lexicographically by lease holder.
- sortedLeasesByPath: saves the mapping between file paths and leases, ordered lexicographically by path.
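One way to picture these three indexes is with standard Java collections. The sketch below is an illustration under assumptions, not the Hadoop implementation: the class LeaseIndexSketch, the nested Lease type, and the add method are hypothetical, and the choice of HashMap, TreeSet, and TreeMap is mine.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration of the three lease indexes described in the text. */
class LeaseIndexSketch {
    static final class Lease {
        final String holder;
        final long lastUpdate;

        Lease(String holder, long lastUpdate) {
            this.holder = holder;
            this.lastUpdate = lastUpdate;
        }
    }

    // leases: lease holder -> lease
    final Map<String, Lease> leases = new HashMap<>();

    // sortedLeases: ordered by renewal time, ties broken by holder name
    final TreeSet<Lease> sortedLeases = new TreeSet<>(
            Comparator.comparingLong((Lease l) -> l.lastUpdate).thenComparing(l -> l.holder));

    // sortedLeasesByPath: path -> lease, in lexicographic path order
    final TreeMap<String, Lease> sortedLeasesByPath = new TreeMap<>();

    /** Registers a path under the holder's lease, creating the lease on first use. */
    Lease add(String holder, String path, long nowMs) {
        Lease lease = leases.get(holder);
        if (lease == null) {
            lease = new Lease(holder, nowMs);
            leases.put(holder, lease);
            sortedLeases.add(lease);
        }
        // Renewal of an existing lease is deliberately left out of this sketch.
        sortedLeasesByPath.put(path, lease);
        return lease;
    }
}
```

Keeping a time-ordered set alongside the lookup maps means the oldest lease is always available at the front of sortedLeases, which is what a periodic expiry check needs.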
Now that we understand the fields defined by LeaseManager, let's look at the methods LeaseManager provides for adding, deleting, renewing, and recovering leases.
Adding a lease - addLease
When a client creates a file or appends to a file, FSNamesystem.startFileInternal() or appendFileInternal() calls LeaseManager.addLease() to add a lease on the HDFS file for the client. The addLease() method has two parameters: holder saves the information of the lease holder, and src saves the path of the file being created or appended. These two parameters correspond to the clientName and src parameters of ClientProtocol.create() or append(), respectively. The implementation of addLease() is simple: it first obtains the lease through getLease(), constructing a new one if none exists, and then adds the lease information to the data structures that LeaseManager defines for saving leases. (In the listing below, the file is identified by its inode ID rather than by src.)
synchronized Lease addLease(String holder, long inodeId) {
  Lease lease = getLease(holder);
  if (lease == null) {
    // Construct the Lease object
    lease = new Lease(holder);
    // Add the Lease object to the LeaseManager.leases field
    leases.put(holder, lease);
  } else {
    renewLease(lease);
  }
  leasesById.put(inodeId, lease);
  lease.files.add(inodeId);
  return lease;
}
addLease() is also called in two other cases. The first is when the Namenode reads the fsimage file and the fsimage records that an HDFS file is under construction. The Namenode must then rebuild the file under construction, add the INode object for the file to the file system directory tree, and add the lease information to LeaseManager. The second is when the Namenode reads the editlog and the editlog records an OP_ADD operation, that is, a file creation. After the Namenode creates the INode object and adds it to the file system directory tree, it must add the lease information to LeaseManager.
Lease Check - FSNamesystem.checkLease
As shown in Figure 3-51, once a client has opened an HDFS file and a lease has been added, the lease must be checked whenever the client calls abandonBlock() to give up a newly allocated data block (when the client fails to set up the data pipeline), calls getAdditionalBlock() to request a new data block (when the client has finished writing the previous block), or completes the file through completeFileInternal() (when the client has finished writing the HDFS file and commits it).
The check is performed by FSNamesystem.checkLease(), which verifies that the HDFS file exists, that the INode is not a directory, that the file is under construction, that the file has not been deleted, and that the lease holder passed in matches the actual lease holder. If any of these checks fails, a LeaseExpiredException is thrown.
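The sequence of checks can be sketched as below. This is an assumption-laden illustration rather than the FSNamesystem code: FileView is a hypothetical flattened stand-in for an INode, the nested LeaseExpiredException is a local stand-in for the Hadoop exception of the same name, and the order of the checks and the messages are my own.

```java
import java.io.IOException;

/** Hypothetical sketch of the lease checks listed in the text. */
class CheckLeaseSketch {
    /** Local stand-in for Hadoop's LeaseExpiredException so the sketch compiles on its own. */
    static class LeaseExpiredException extends IOException {
        LeaseExpiredException(String message) {
            super(message);
        }
    }

    /** Hypothetical flattened view of an INode; not a Hadoop type. */
    static final class FileView {
        final boolean directory;
        final boolean underConstruction;
        final boolean deleted;
        final String leaseHolder;

        FileView(boolean directory, boolean underConstruction, boolean deleted, String leaseHolder) {
            this.directory = directory;
            this.underConstruction = underConstruction;
            this.deleted = deleted;
            this.leaseHolder = leaseHolder;
        }
    }

    /** Throws if any check fails; a null file means the path does not exist. */
    static void checkLease(String src, String holder, FileView file) throws LeaseExpiredException {
        if (file == null) {
            throw new LeaseExpiredException("No lease on " + src + ": file does not exist");
        }
        if (file.directory) {
            throw new LeaseExpiredException("No lease on " + src + ": path is a directory");
        }
        if (file.deleted) {
            throw new LeaseExpiredException("No lease on " + src + ": file has been deleted");
        }
        if (!file.underConstruction) {
            throw new LeaseExpiredException("No lease on " + src + ": file is not under construction");
        }
        if (!holder.equals(file.leaseHolder)) {
            throw new LeaseExpiredException("Lease mismatch on " + src + ": owned by "
                    + file.leaseHolder + " but accessed by " + holder);
        }
    }
}
```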
Lease renewal - renewLease
When a client opens a file for writing or appending, LeaseManager saves the client's lease on the file. The client starts a LeaseRenewer that renews the lease periodically to prevent it from expiring.
The lease renewal request is handled by FSNamesystem.renewLease(), which eventually calls LeaseManager.renewLease(). renewLease() first removes the lease from sortedLeases, then updates the lease's last renewal time, and then adds the lease back to sortedLeases. The reason is that sortedLeases is a collection sorted by last renewal time, so every renewal changes the lease's position in sortedLeases.
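The remove, update, re-add sequence matters for any sorted collection whose ordering depends on a mutable field. The sketch below shows the pattern with a java.util.TreeSet; the class RenewSketch and its Lease type are hypothetical, and using TreeSet for sortedLeases is an assumption for the example.

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical sketch of renewing a lease that lives in a time-ordered set. */
class RenewSketch {
    static final class Lease {
        final String holder;
        long lastUpdate;

        Lease(String holder, long lastUpdate) {
            this.holder = holder;
            this.lastUpdate = lastUpdate;
        }
    }

    // Ordered by renewal time, ties broken by holder name.
    final TreeSet<Lease> sortedLeases = new TreeSet<>(
            Comparator.comparingLong((Lease l) -> l.lastUpdate).thenComparing(l -> l.holder));

    void renewLease(Lease lease, long nowMs) {
        // Remove while the sort key still has its old value, so the set can find the element.
        sortedLeases.remove(lease);
        lease.lastUpdate = nowMs;
        // Re-insert so the lease lands at the position matching its new renewal time.
        sortedLeases.add(lease);
    }
}
```

If lastUpdate were changed while the lease was still in the set, the tree would keep the element at its old position, and later lookups or removals could fail to find it.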
Remove lease - removeLease
Leases in LeaseManager are deleted in two situations.
- When the Namenode closes an HDFS file under construction, it calls FSNamesystem.finalizeINodeFileUnderConstruction() to convert the INode from the under-construction state to the non-construction state. Because the client has finished writing the file, the file's lease must also be deleted from LeaseManager; removeLease() is called to delete it.
- When a directory tree is deleted and it contains files that are still open, that is, when the client removes an open HDFS file from the file system directory tree, removeLeaseWithPrefixPath() is called to delete the lease from LeaseManager.
The implementations of removeLease() and removeLeaseWithPrefixPath() are relatively simple: they just delete the lease information from the data structures in LeaseManager that save leases.
Lease Check - Monitor thread
In addition to providing operations for adding, deleting, modifying, and querying leases, the lease manager also checks all leases regularly. For a file whose lease has not been renewed for a long time, LeaseManager recovers the lease and then closes the file. Under what circumstances does a lease expire? HDFS is a distributed system, and a client may well fail after opening a file. A failed client can neither renew the lease nor release it after writing, so the lease expires.
The periodic lease check is performed by Monitor, an internal class of LeaseManager. Monitor is a thread class whose run() method calls LeaseManager.checkLeases() every 2 seconds to check the leases.
As introduced in the previous section, LeaseManager has two time limits: the soft limit is the lease timeout that applies while a file is being written, and the hard limit is used to judge whether a file was left unclosed because of a failure. The checkLeases() method uses the hard limit (60 minutes) to decide whether lease recovery is required.
checkLeases() iterates over the leases managed by LeaseManager and finds all leases that have gone unrenewed beyond the hard limit. Because a lease records all HDFS files opened by its client, checkLeases() traverses all files on each such lease and calls FSNamesystem.internalReleaseLease() to recover the lease.
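The loop described above can be approximated as follows. This is a simplified model and not the Hadoop source: the class CheckLeasesSketch, its Lease type, and the method pathsToRecover are hypothetical, and the sketch only collects the paths that would be handed to internalReleaseLease(). Stopping at the first lease that is still within the hard limit is an assumption that relies on the set being ordered by renewal time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of a hard-limit scan over leases ordered by renewal time. */
class CheckLeasesSketch {
    static final long HARD_LIMIT_MS = 60L * 60 * 1000; // 60 minutes, as stated in the text

    static final class Lease {
        final String holder;
        final long lastUpdate;
        final List<String> paths;

        Lease(String holder, long lastUpdate, List<String> paths) {
            this.holder = holder;
            this.lastUpdate = lastUpdate;
            this.paths = paths;
        }
    }

    /** Creates an empty set ordered by renewal time, ties broken by holder name. */
    static TreeSet<Lease> newSortedLeases() {
        return new TreeSet<>(
                Comparator.comparingLong((Lease l) -> l.lastUpdate).thenComparing(l -> l.holder));
    }

    /** Returns every path held under a lease that has exceeded the hard limit at nowMs. */
    static List<String> pathsToRecover(TreeSet<Lease> sortedLeases, long nowMs) {
        List<String> result = new ArrayList<>();
        for (Lease lease : sortedLeases) {            // oldest renewal first
            if (nowMs - lease.lastUpdate <= HARD_LIMIT_MS) {
                break;                                // every later lease was renewed more recently
            }
            result.addAll(lease.paths);               // each file on the lease needs recovery
        }
        return result;
    }
}
```

In a Monitor-style thread, a method like this would be invoked on a fixed interval (every 2 seconds according to the text), and each returned path would then go through lease recovery.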
Lease recovery - Monitor thread initiates
Lease recovery for an HDFS file is implemented by FSNamesystem.internalReleaseLease(). This method recovers the lease of a file that is already open and closes the file. internalReleaseLease() returns true if the file was closed successfully, and false if it only triggered a lease recovery operation. Because lease recovery applies to open files that are under construction, internalReleaseLease() examines the state of every data block in the file and throws an exception directly if a state is abnormal. In checkLeases(), a lease for which the call to FSNamesystem.internalReleaseLease() throws an exception is deleted directly with removeLease().
When the file is under construction, there are three situations in which the file can be closed directly and true is returned.
- All data blocks of the file are in the COMPLETE state; that is, the client failed before closing the file and releasing the lease. In this case internalReleaseLease() can directly call finalizeINodeFileUnderConstruction() to close the file and delete the lease.
- The last data block of the file is in the committed state (COMMITTED) and has at least one valid replica. In this case finalizeINodeFileUnderConstruction() can be called directly to close the file and delete the lease.
- The last data block of the file is under construction, but its length is 0 and no Datanode has reported receiving it. In this case the client most likely failed before writing any data to the data pipeline. The last, unwritten data block can be deleted, and then finalizeINodeFileUnderConstruction() is called to close the file and delete the lease.
When the last data block is in the UNDER_RECOVERY or UNDER_CONSTRUCTION state and data has already been written to it, a new timestamp is generated as the recoveryId, initializeBlockRecovery() is called to trigger lease recovery, and the lease holder of the file is changed to "HDFS_NameNode".
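The cases above amount to a small decision table over block states. The sketch below is not the internalReleaseLease() source; it is a hypothetical model (class ReleaseDecisionSketch, enums BlockState and Action, method decide) that only returns which action applies. Treating an unfinished non-last block, or a COMMITTED last block without a valid replica, as an error raised with IllegalStateException is an assumption standing in for the "abnormal state" exception mentioned in the text.

```java
import java.util.List;

/** Hypothetical decision table for the lease-release cases described in the text. */
class ReleaseDecisionSketch {
    enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION, UNDER_RECOVERY }

    enum Action { CLOSE_FILE, REMOVE_LAST_BLOCK_AND_CLOSE, START_BLOCK_RECOVERY }

    /**
     * @param blocks            states of the file's blocks, in file order
     * @param lastBlockLength   bytes recorded for the last block
     * @param lastBlockReplicas number of Datanodes that reported the last block
     */
    static Action decide(List<BlockState> blocks, long lastBlockLength, int lastBlockReplicas) {
        // Case 1: every block is complete (this also covers a file with no blocks).
        if (blocks.stream().allMatch(s -> s == BlockState.COMPLETE)) {
            return Action.CLOSE_FILE;
        }
        // Assumption: only the last block may be unfinished.
        for (int i = 0; i < blocks.size() - 1; i++) {
            if (blocks.get(i) != BlockState.COMPLETE) {
                throw new IllegalStateException("block " + i + " is in state " + blocks.get(i));
            }
        }
        BlockState last = blocks.get(blocks.size() - 1);
        switch (last) {
            case COMMITTED:
                // Case 2: committed and at least one valid replica.
                if (lastBlockReplicas >= 1) {
                    return Action.CLOSE_FILE;
                }
                throw new IllegalStateException("committed block has no valid replica");
            case UNDER_CONSTRUCTION:
                // Case 3: nothing was ever written to the last block.
                if (lastBlockLength == 0 && lastBlockReplicas == 0) {
                    return Action.REMOVE_LAST_BLOCK_AND_CLOSE;
                }
                return Action.START_BLOCK_RECOVERY;
            case UNDER_RECOVERY:
                return Action.START_BLOCK_RECOVERY;
            default:
                throw new IllegalStateException("unexpected state " + last);
        }
    }
}
```

The first two actions correspond to the situations where the file can be closed and true is returned; START_BLOCK_RECOVERY corresponds to the situation where only recovery is triggered and false is returned.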
Next, let's look at the lease recovery process. initializeBlockRecovery() is called on the data block that needs recovery. This method traverses all data nodes that hold a replica of the block and selects the data node that reported most recently as the primary recovery node. The Namenode then sends the lease recovery command to that node.
The lease recovery command is carried to the primary recovery data node in a heartbeat response. The lease recovery process on the primary recovery data node is shown in Figure 3-52. After the primary recovery data node receives the command, it calls Datanode.recoverBlock() to start lease recovery. This method first collects replica information from the data nodes in the data pipeline that take part in lease recovery, through InterDatanodeProtocol.initReplicaRecovery(); the replica information is returned to the primary recovery node as a ReplicaRecoveryInfo object. The primary recovery node then selects the best state among all replicas of the data block as the target state to which all replicas are restored. Next, the primary recovery node calls InterDatanodeProtocol.updateReplicaUnderRecovery() to synchronize the replicas of the data block on all Datanodes to the target state. After the synchronization is complete, the replica lengths and timestamps on these data nodes are consistent. Finally, the primary recovery node calls DatanodeProtocol.commitBlockSynchronization() to report the result of this lease recovery to the Namenode.
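To make the "select the best state, then bring every replica to it" step more concrete, here is a small hypothetical sketch. It is not the Datanode code: the class SyncTargetSketch and its types are invented for illustration. Two rules are assumptions on my part: that "best" means the most finalized state in the order FINALIZED, RBW, RWR, and that the target length is the shortest length among replicas in that state.

```java
import java.util.List;

/** Hypothetical sketch of choosing a common target for replicas under recovery. */
class SyncTargetSketch {
    /** Replica states from most to least finalized; a simplified stand-in, not Hadoop's enum. */
    enum State { FINALIZED, RBW, RWR }

    static final class Replica {
        final String datanode;
        final State state;
        final long length;

        Replica(String datanode, State state, long length) {
            this.datanode = datanode;
            this.state = state;
            this.length = length;
        }
    }

    static final class Target {
        final State state;
        final long length;

        Target(State state, long length) {
            this.state = state;
            this.length = length;
        }
    }

    /** Picks the best state present and the shortest length among replicas in that state. */
    static Target chooseTarget(List<Replica> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas to recover");
        }
        State best = replicas.get(0).state;
        for (Replica r : replicas) {
            if (r.state.compareTo(best) < 0) {   // enum order: earlier constant is better
                best = r.state;
            }
        }
        long length = Long.MAX_VALUE;
        for (Replica r : replicas) {
            if (r.state == best) {
                length = Math.min(length, r.length);
            }
        }
        return new Target(best, length);
    }
}
```

Choosing the shortest length means every participating replica can be truncated to the target, so no replica needs data it never received.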
Now let's look at commitBlockSynchronization(), the method that commits the data block synchronization. It synchronizes the block information on the Namenode with the replica information on the data nodes that performed lease recovery. The method has many parameters: lastblock is the data block being recovered, newgenerationstamp is the new timestamp after lease recovery, newlength is the replica length after lease recovery, closeFile indicates whether to close the HDFS file that the data block belongs to, deleteblock indicates whether to delete the data block directly, newtargets holds the list of data nodes that store a replica of the block after lease recovery, and newtargetstorages holds the storage information of those data nodes.
During data block recovery, if no replica of the block exists in the data pipeline, or all replicas have length 0, the data block can be deleted directly without performing lease recovery. In this case the primary recovery node sets the deleteblock field to true, and commitBlockSynchronization() deletes the data block from the Namenode and then closes the file. Otherwise, commitBlockSynchronization() updates the data block: it updates the block's timestamp and length on the Namenode so that the block information on the Namenode is consistent with the replicas on the Datanodes after lease recovery. Because only some of the data nodes may have taken part in lease recovery, the block information in DatanodeStorageInfo and in INodeFile must also be updated.
After commitBlockSynchronization() completes, the block information on the Namenode is consistent with the replica information on the Datanodes, and the whole lease recovery process is finished.
Lease recovery - initiated by other means
Besides being initiated by the LeaseManager.Monitor thread, lease recovery can be triggered in three other situations, shown in Figure 3-53, in which FSNamesystem.recoverLeaseInternal() is called. The difference among the three lies in the value of the force parameter. When the client initiates lease recovery through the remote method ClientProtocol.recoverLease(), the call is ultimately handled by FSNamesystem.recoverLeaseInternal() with force set to true. That is, internalReleaseLease() is called to forcibly close the file and release the lease, without checking whether the lease has exceeded the soft limit.
When the client opens a file for writing through startFileInternal(), recoverLeaseInternal() is called to check whether another client already has the file open, so that multiple clients cannot write the file at the same time. If the file is open by another client, recoverLeaseInternal() throws AlreadyBeingCreatedException. Here force is set to false, so the method checks whether the original lease holder has exceeded the soft limit (softLimit). If it has, lease recovery is performed to release the lease and close the file in preparation for the new write.
Opening a file for appending through appendFileInternal() works on the same principle as startFileInternal(): recoverLeaseInternal() is called to check whether another client already has the file open.
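The role of the force flag can be summarized in a few lines. The sketch below is a hypothetical model, not FSNamesystem.recoverLeaseInternal(): the class RecoverLeaseSketch and the method recoverLease are invented, the nested AlreadyBeingCreatedException is a local stand-in for the Hadoop exception of the same name, and details such as the same client reopening its own file are left out.

```java
import java.io.IOException;

/** Hypothetical sketch of how the force flag changes lease recovery. */
class RecoverLeaseSketch {
    static final long SOFT_LIMIT_MS = 60L * 1000; // 60 seconds, as stated in the text

    /** Local stand-in for Hadoop's AlreadyBeingCreatedException. */
    static class AlreadyBeingCreatedException extends IOException {
        AlreadyBeingCreatedException(String message) {
            super(message);
        }
    }

    /**
     * Returns true if an existing lease was released, false if no one held a lease.
     *
     * @param currentHolder holder of the existing lease, or null if the file is not open
     * @param lastUpdateMs  last renewal time of the existing lease
     * @param force         true for an explicit recoverLease request, false for create or append
     */
    static boolean recoverLease(String currentHolder, long lastUpdateMs, boolean force, long nowMs)
            throws AlreadyBeingCreatedException {
        if (currentHolder == null) {
            return false;                          // nothing to recover
        }
        if (force) {
            return true;                           // release without looking at the soft limit
        }
        if (nowMs - lastUpdateMs > SOFT_LIMIT_MS) {
            return true;                           // old holder went quiet: take the lease over
        }
        throw new AlreadyBeingCreatedException(
                "file is already being written by " + currentHolder);
    }
}
```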
This article combines material from "Hadoop 2.X HDFS Source Code Analysis" with my own understanding.