OneFlow source code analysis: Tensor storage management in Eager mode


Author | Zheng Jianhua

1 Storage management for different Tensor types

The storage of a Lazy Tensor is managed by objects such as the Runtime and Actors. Once the static graph is compiled, the number of objects and the amount of storage space required are fixed. The Runtime and related components allocate storage during initialization and reclaim the resources on exit.

In Eager mode, a Global Tensor can be regarded as a distributed wrapper around Local Tensors: the local data of an EagerGlobalTensorImpl is an EagerLocalTensorImpl object. Examining EagerLocalTensorImpl is therefore enough to understand tensor storage management in eager mode.

The sample code for reference is as follows:

 
  
import numpy as np
import oneflow as flow

a = np.random.randn(1, 4)
flow.tensor(a, device=flow.device("cpu"), dtype=flow.float)

2 Relationships among tensor-storage-related classes

The storage-related class relationships of EagerLocalTensorImpl are shown below.

Following the execution of the sample code, we can see when and how the objects in the diagram are constructed, who holds the storage, and how it is allocated and released.

[Figure: storage-related class relationships of EagerLocalTensorImpl]

3 Allocating storage for a Tensor through virtual machine instructions

The tensor constructor is registered as PyTensorObject_init via the Python C API, and functional::_legacy_tensor_ctor forwards the call according to the argument signature.

For the sample code, the call resolves to TensorWithDataFunctor, which invokes MakeLocalTensorFromData to construct the tensor. Inside this function, storage is allocated via functional::Empty and EmptyFunctor. EmptyFunctor stores the relevant attributes in attrs and then calls OpInterpUtil::Dispatch; the storage itself is allocated while the vm instruction is being prepared for execution.

The tensor returned by EmptyFunctor has only storage space and no data; the data is copied in afterwards by CopyLocalTensorFromUntypedArray.

3.1 Construction of storage related objects

Because this is a local tensor in eager mode, OpInterpUtil::Dispatch forwards to NaiveInterpret for execution. For the sample code, the input parameters of this function are as follows:

  • inputs is an empty array

  • outputs has only one element and is a null pointer

Because the tensor pointers in outputs are all null, an EagerLocalTensorImpl object needs to be created whose one::TensorStorage member variable is a null pointer.

Because the elements of output_eager_blob_objects have not been initialized, tensor_impl->InitEagerBlobObject is called to initialize them. Since tensor_storage_ is still empty, the process does the following:

  • Create a vm::TensorStorage object

  • Create an EagerBlobObject object

  • set_eager_blob_object

    • UpdateTensorStorage

      • Create a one::TensorStorage object

      • Set the callback function for tensor storage release

Creating the objects above only records the relevant information; it does not allocate any tensor storage.

Note that the callback function registered with one::TensorStorage is stored in the member variable releaser_hook_; this function will later release the tensor through a virtual machine instruction.

3.2 Allocating tensor storage during instruction execution

The process of allocating tensor storage is as follows:

  • vm::Instruction::Compute

  • vm::InstructionPolicy::ComputeIf

  • vm::OpCallInstructionPolicy::Compute

  • OpCallInstructionUtil::Compute

  • get memory allocator

  • OpCallInstructionUtil::AllocateOutputBlobsMemory

  • blob_object->TryAllocateBlobBodyMemory

  • allocator->Allocate

In EagerBlobObject::TryAllocateBlobBodyMemory, the address returned by the allocator is assigned to dptr; dptr and the Free function together form a smart pointer that is assigned to the blob_dptr_ member of vm::TensorStorage.

4 Releasing Tensor storage through virtual machine instructions

As mentioned in section 3.1, while initializing the EagerBlobObject and creating one::TensorStorage, EagerLocalTensorImpl registers a callback function for releasing the tensor. The callback is stored in the releaser_hook_ variable and is invoked when one::TensorStorage is destructed. Putting this information together, destructing one::TensorStorage performs roughly the following:

 
  
vm::InstructionList instruction_list;
InstructionsBuilder instructions_builder(&instruction_list);

// JUST(Build(&instructions_builder));
if (eager_blob_object->producer_stream().has_value()) {
  JUST(instructions_builder.ReleaseTensor(eager_blob_object));
}

JUST(vm::Run(instructions_builder.mut_instruction_list()));

In InstructionsBuilder::ReleaseTensor, if other streams have recently used the eager_blob_object, they are synchronized via SoftSyncStreamBetween. This resolves the storage dependency problem.

Under normal circumstances, the storage is released through the tensor's producer_stream: the corresponding vm::Stream object is obtained from it, and a release instruction (holding eager_blob_object and vm_stream) is constructed accordingly. For the sample code, the instruction type is FastReleaseTensorInstructionPolicy, whose Compute method executes the actual storage release logic. The process is as follows:

  • ReleaseTensorInstructionPolicy::Release()

  • eager_blob_object->DeallocateBlobDataPtr()

  • tensor_storage_->Release()

  • tensor_storage_->_Release()

  • blob_dptr_.reset()

    • The smart pointer is reset, and the Free method specified when allocating storage is called

5 Storage management for scenarios such as reshape

In scenarios such as reshape, slice, and transpose, the EagerLocalTensorImpl constructor is called with the input's tensor_storage, so the new tensor's tensor_storage_ is not empty. When InitEagerBlobObject executes, only an EagerBlobObject is created to provide information such as shape and stride; no new one::TensorStorage is created, and the input's storage is reused.

6 Can the two TensorStorage types be merged?

Why does one::TensorStorage, when destructed, trigger the saved callback to release the storage held in vm::TensorStorage?

one::TensorStorage only adds a releaser on top of vm::TensorStorage; can these two Storage types be merged?

Under the current design, they cannot. one::TensorStorage::releaser_hook_ holds a smart pointer to EagerBlobObject, and EagerBlobObject in turn holds a smart pointer to vm::TensorStorage. If the two Storage types were merged into one, there would be a circular reference: the objects could never be destructed, resulting in a memory leak.

Therefore, vm::TensorStorage is just simple storage that can be shared among multiple tensors; EagerBlobObject contains both the storage and per-tensor information such as shape, stride, and data_type; and one::TensorStorage is introduced to break the cycle and is responsible for releasing the storage.

7 Appendix

GDB breakpoint example

 
  
break oneflow::one::MakeLocalTensorFromData
break oneflow::one::NaiveInterpret
break oneflow::vm::VirtualMachineEngine::DispatchInstruction
break oneflow::vm::OpCallInstructionUtil::Compute
break oneflow::vm::OpCallInstructionUtil::AllocateOutputBlobsMemory
break oneflow::vm::EagerBlobObject::TryAllocateBlobBodyMemory
break oneflow::vm::ReleaseTensorInstructionPolicy::Release
break oneflow/core/eager/eager_blob_object.cpp:107


Welcome to Star and try OneFlow: https://github.com/Oneflow-Inc/oneflow/


Origin: https://blog.csdn.net/OneFlow_Official/article/details/130256963