Measuring the running time of pytorch code snippets

2020/6/12 FesianXu @ Tencent internship

Preface

In pytorch, we often need to measure the running time of a particular code segment, class, function, or operator in order to locate the speed bottleneck of the entire model. This article introduces some common methods.

∇ Contact:
E-mail: [email protected]
QQ: 973 926 198
GitHub: https://github.com/FesianXu


Generally speaking, there are two major ways to time code segments:

  1. timeit-style timing: record the start time and end time around the code, then take the difference.
  2. profile: profiling tools built into pytorch or provided by third parties.

timeit

This approach works along the lines of the following code:

import time

begin = time.perf_counter()  # time.clock() was deprecated and removed in Python 3.8
run_main_code()
end = time.perf_counter()

print(end - begin)
# elapsed time in seconds
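For one-off CPU measurements, Python's standard-library timeit module handles the loop repetition and clock selection for you. A minimal sketch (the squaring workload here is just an illustrative placeholder, not from the original article):

```python
import timeit

# timeit runs the statement `number` times and returns the total
# elapsed time in seconds, using time.perf_counter under the hood.
elapsed = timeit.timeit(lambda: sum(i * i for i in range(1000)), number=100)
print(f"{elapsed:.6f} s for 100 runs")
```

Averaging over many runs this way smooths out scheduler noise that a single start/end measurement would suffer from.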

However, pytorch code often runs on the GPU, and GPU operations are asynchronous: a kernel launch returns to the host before the computation actually finishes, so a naive timeit-style measurement cannot capture the true runtime. We therefore generally need to use pytorch's built-in timing and synchronization tools, as in [1]:

import torch

x = torch.randn(1000, 1000, device='cuda')
y = torch.randn(1000, 1000, device='cuda')

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

start.record()
z = x + y
end.record()

# Waits for everything to finish running before reading the timer
torch.cuda.synchronize()

print(start.elapsed_time(end))  # elapsed time in milliseconds

profile

The timeit approach is just about tolerable for testing small snippets, but it quickly becomes tedious for large-scale measurements. You can of course write a decorator to avoid inserting timing code by hand everywhere, but even that is not the best solution.
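As a sketch of the decorator idea mentioned above (standard library only; the function name `run_main_code` is reused from the earlier example for illustration):

```python
import functools
import time

def timed(fn):
    """Wrap fn so that each call prints its wall-clock duration."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        begin = time.perf_counter()
        result = fn(*args, **kwargs)
        print(f"{fn.__name__}: {time.perf_counter() - begin:.6f} s")
        return result
    return wrapper

@timed
def run_main_code():
    return sum(i * i for i in range(10000))

run_main_code()
```

This removes the repeated start/stop boilerplate, but every function you care about still has to be decorated by hand, which is why a profiler is ultimately more convenient.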

Fortunately, pytorch comes with a profiling tool that measures the time spent in each part of a model, on both cpu and gpu, which is very practical. This tool is called profile [3]; it lives in torch.autograd, and its usage is very simple:

import torch

x = torch.randn((1, 1), requires_grad=True)
with torch.autograd.profiler.profile(enabled=True) as prof:
    for _ in range(100):  # any normal python code, really!
        y = x ** 2
print(prof.key_averages().table(sort_by="self_cpu_time_total"))

Reference

[1]. https://discuss.pytorch.org/t/how-to-measure-time-in-pytorch/26964/2
[2]. https://blog.csdn.net/LoseInVain/article/details/82055524
[3]. https://pytorch.org/docs/stable/autograd.html?highlight=autograd%20profiler#torch.autograd.profiler.profile

Origin blog.csdn.net/LoseInVain/article/details/106713550