A problem in HLS

When accelerating a convolutional neural network with HLS, the input and output feature maps and the weights are split into blocks (loop tiling), so the addresses become discontinuous when the data is read. After adding the pipeline directive to the innermost loop, the synthesis report shows that pipelining succeeded (II=1). However, a careful look at the C/RTL co-simulation shows that the actual data transfer does not achieve the expected pipelined reads: the number of clock cycles spent reading data in simulation is far larger than the number given in the synthesis report. I don't know whether this is a bug in HLS, but it greatly increases the data transfer time, so it is a problem that urgently needs solving. I don't know if there is a good solution.
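To make the access pattern concrete, here is a minimal sketch of a tiled read. The sizes `H`, `W`, `TR`, `TC` and the functions `load_tile` and `count_address_runs` are hypothetical names chosen for illustration, not taken from the original design, and `ddr` is assumed to be mapped to an AXI master port by the top-level function. The innermost loop is pipelined and reads contiguous addresses, but the address jumps by `W` at every new tile row. One possible explanation for the gap between the report and the co-simulation (an assumption, not a confirmed diagnosis) is that each such jump starts a separate memory request, so the external read latency is paid once per row instead of once per tile.

```cpp
#include <cassert>

// Hypothetical sizes, chosen only for illustration.
const int H  = 8;  // feature-map height
const int W  = 8;  // feature-map width
const int TR = 4;  // tile rows
const int TC = 4;  // tile columns

// Copy one TR x TC tile, starting at (row0, col0), from a row-major
// feature map in external memory into an on-chip buffer.
void load_tile(const float *ddr, float tile[TR][TC], int row0, int col0) {
    // Precondition: the tile must lie inside the feature map.
    assert(row0 >= 0 && col0 >= 0 && row0 + TR <= H && col0 + TC <= W);
    for (int r = 0; r < TR; ++r) {
        // The start address jumps by W between rows, so the read
        // stream is broken here whenever TC < W.
        const float *row = ddr + (row0 + r) * W + col0;
        for (int c = 0; c < TC; ++c) {
#pragma HLS PIPELINE II=1
            tile[r][c] = row[c];  // contiguous within one tile row
        }
    }
}

// Count how many contiguous address runs the tile read touches.
int count_address_runs(int row0, int col0) {
    int runs = 0;
    int prev = -2;  // sentinel that is never adjacent to a valid address
    for (int r = 0; r < TR; ++r) {
        for (int c = 0; c < TC; ++c) {
            int addr = (row0 + r) * W + col0 + c;
            if (addr != prev + 1) ++runs;  // gap in addresses: new run
            prev = addr;
        }
    }
    return runs;
}
```

With these sizes the tile read consists of `TR` separate address runs of `TC` words each. If that is indeed the cause, making each run longer (for example a tile as wide as the feature map, or a data layout in which a tile is stored contiguously) would be one direction to try.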

Origin blog.csdn.net/qq_40268672/article/details/105738969