Notes on a cache problem in a bootloader

Background

I recently ported a decompression algorithm to the bootloader of an ARMv7 board. The port itself went fairly smoothly, but once it was finished I found that, although the function worked correctly, it was far slower than expected. Decompressing the same data took about 10 times as long as it did in U-Boot.

Initial diagnosis

A gap on the order of 10x made me suspect the cache first, but I wanted to rule out the other likely factors before digging in. The things most directly related to performance are the CPU and the DDR.

The DDR driver is exactly the same as in U-Boot, so DDR was ruled out first.

Next, the CPU. After power-on the clock is set by the BootROM built into the chip, and that default is fairly low, but the bootloader code does adjust the CPU clock and raises it to 1 GHz. To confirm that the change actually takes effect, I tried lowering the CPU frequency a little and found that decompression did get slower, which shows the CPU clock configuration is being applied. In any case, even if the CPU setting had not taken effect, it should not cause a tenfold performance gap.

So my attention turned to the cache. Judging from the code, the MMU, the DCache and the ICache were all enabled. Since they were supposedly enabled, I needed a way to confirm that they were really having an effect. A simple way is to deliberately leave one disabled and see whether performance changes.

I modified the code and measured the decompression time with the ICache enabled and disabled, and with the DCache enabled and disabled. The results showed that the ICache was working but the DCache was not: switching the DCache on or off made no difference to the decompression time. So the problem was definitely with the DCache.
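The on/off experiment above comes down to toggling two bits in the ARMv7 system control register (SCTLR). The following is only a minimal sketch of how that could look, not the code from this bootloader: the helper names (sctlr_set_caches, select_cache_config) are made up for illustration, and the register access part is only compiled when building for 32-bit ARM. The bit positions (M = bit 0, C = bit 2, I = bit 12) are the architectural ones.

```c
#include <stdint.h>

/* ARMv7-A SCTLR control bits (architectural positions). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Return `sctlr` with the I and C bits forced to the requested state.
 * Pure bit manipulation, so it can be checked on any host.
 * (Hypothetical helper name.) */
static inline uint32_t sctlr_set_caches(uint32_t sctlr, int icache_on, int dcache_on)
{
    sctlr &= ~(SCTLR_I | SCTLR_C);
    if (icache_on)
        sctlr |= SCTLR_I;
    if (dcache_on)
        sctlr |= SCTLR_C;
    return sctlr;
}

#if defined(__arm__)
/* Read SCTLR through CP15. */
static inline uint32_t sctlr_read(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}

/* Write SCTLR through CP15 and synchronize. */
static inline void sctlr_write(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" : : "r"(v) : "memory");
}

/* Hypothetical experiment hook: call before timing each decompression run. */
static inline void select_cache_config(int icache_on, int dcache_on)
{
    sctlr_write(sctlr_set_caches(sctlr_read(), icache_on, dcache_on));
}
#endif
```

Timing the same decompression under each of the four combinations then shows which cache actually contributes. If a write-back DCache holds dirty data, it should be cleaned before the C bit is cleared.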

Cache configuration

At this point I remembered an earlier problem where the cache had also failed to take effect. That one turned out to require setting the SMP bit, so I added the corresponding setup code here as well, but it did not solve the problem.

After more googling and reading up on how the cache works, my attention turned to the MMU page table settings.
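For reference, here is a rough sketch of what setting the SMP bit can look like. It assumes a core where the SMP bit is bit 6 of the auxiliary control register (ACTLR), which is the case on cores such as the Cortex-A7 and Cortex-A9; the post does not say which core is used, so check the core's technical reference manual. The function names are hypothetical, and the register access only builds for 32-bit ARM.

```c
#include <stdint.h>

/* Assumption: ACTLR.SMP is bit 6 (true for e.g. Cortex-A7 and Cortex-A9). */
#define ACTLR_SMP (1u << 6)

/* Pure helpers, usable on any host for checking the bit logic. */
static inline uint32_t actlr_with_smp(uint32_t actlr)
{
    return actlr | ACTLR_SMP;
}

static inline int actlr_smp_is_set(uint32_t actlr)
{
    return (actlr & ACTLR_SMP) != 0;
}

#if defined(__arm__)
/* Set ACTLR.SMP if it is not set yet. Intended to run early,
 * before the MMU and caches are enabled. (Hypothetical name.) */
static inline void enable_smp_bit(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
    if (!actlr_smp_is_set(v)) {
        v = actlr_with_smp(v);
        __asm__ volatile("mcr p15, 0, %0, c1, c0, 1\n\tisb" : : "r"(v) : "memory");
    }
}
#endif
```

Depending on the security configuration, writing ACTLR may only be permitted from the secure state.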

Simply put, when you enable the MMU you have to give it a page table that tells it how virtual addresses map to physical addresses. Each entry in the page table also carries a number of attributes, including access permissions and cache settings.
For register addresses, the cache is generally left disabled, so that register reads and writes are not affected by it. For other, normal memory addresses, the cache is usually enabled to improve efficiency. When the cache is enabled you also have to choose its mode: write-through or write-back. With write-through, data is written to both the cache and main memory, so the two are always consistent. With write-back, data is written only to the cache and the line is marked dirty; it is written to main memory only when that cache line is evicted.
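To make the "attributes in each entry" idea concrete, here is a small sketch of how a 1 MiB section entry could be encoded in the ARMv7 short-descriptor format. It assumes TEX remap is disabled (SCTLR.TRE = 0), domain 0, and full read/write access; the enum and function names are invented for this example and are not taken from the bootloader in question.

```c
#include <stdint.h>

/* Short-descriptor "section" entry (1 MiB), TEX remap disabled. */
#define SECT_TYPE    (2u << 0)            /* bits[1:0] = 0b10: section */
#define SECT_B       (1u << 2)            /* bufferable */
#define SECT_C       (1u << 3)            /* cacheable */
#define SECT_AP_RW   (3u << 10)           /* AP[1:0] = 0b11: full access */
#define SECT_TEX(x)  ((uint32_t)(x) << 12)

/* Hypothetical names for the memory types discussed in the text. */
enum mem_type {
    MEM_DEVICE,           /* TEX=000 C=0 B=1: shareable device, uncached  */
    MEM_WRITE_THROUGH,    /* TEX=000 C=1 B=0: write-through, no allocate  */
    MEM_WRITE_BACK,       /* TEX=000 C=1 B=1: write-back, no write-alloc  */
    MEM_WRITE_BACK_ALLOC  /* TEX=001 C=1 B=1: write-back, write-allocate  */
};

/* Build a section descriptor for the 1 MiB region containing `phys`. */
static inline uint32_t section_desc(uint32_t phys, enum mem_type type)
{
    uint32_t d = (phys & 0xFFF00000u) | SECT_AP_RW | SECT_TYPE;

    switch (type) {
    case MEM_DEVICE:           d |= SECT_B;                             break;
    case MEM_WRITE_THROUGH:    d |= SECT_C;                             break;
    case MEM_WRITE_BACK:       d |= SECT_C | SECT_B;                    break;
    case MEM_WRITE_BACK_ALLOC: d |= SECT_TEX(1) | SECT_C | SECT_B;      break;
    }
    return d;
}
```

In this encoding, write-through and write-back differ only in the B bit, which is why such a misconfiguration is easy to overlook when reading a table setup routine.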

Checking the actual page table, I found that it was set to write-through. Write-through requires every write to really go out to main memory, so it is naturally slow. I quickly changed it to write-back and tested again. Sure enough, decompression speed improved dramatically.

In my case, the code itself runs from SRAM, while the source data to be decompressed and the decompressed output both live in DRAM. After setting the DRAM address range to write-back, speed improved by roughly 3x. After also setting the SRAM address range to write-back, speed improved by roughly another 10x. The cumulative improvement was about 28x, and I could only marvel at how good a thing the cache really is.
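The change described above amounts to rewriting the table entries that cover the DRAM and SRAM ranges. A minimal sketch, assuming an identity-mapped first-level table of 4096 one-megabyte sections, domain 0, full access, and TEX remap disabled. The function name map_range and the addresses in the usage note are placeholders, not the real memory map of this board.

```c
#include <stdint.h>

#define NUM_SECTIONS 4096u                     /* 4 GiB / 1 MiB */
#define ATTR_WT      (1u << 3)                 /* C=1 B=0: write-through */
#define ATTR_WB      ((1u << 3) | (1u << 2))   /* C=1 B=1: write-back    */

/* Identity-map [base, base + size) as 1 MiB sections with the given
 * cache attribute. Ranges smaller than 1 MiB still occupy a whole
 * section. (Hypothetical helper.) */
static inline void map_range(uint32_t *table, uint32_t base,
                             uint32_t size, uint32_t attr)
{
    uint32_t first, last, i;

    if (size == 0)
        return;

    first = base >> 20;
    last = (uint32_t)(((uint64_t)base + size - 1) >> 20);

    for (i = first; i <= last && i < NUM_SECTIONS; i++)
        table[i] = (i << 20)      /* section base address   */
                 | (3u << 10)     /* AP[1:0] = full access  */
                 | attr           /* C and B bits           */
                 | 2u;            /* descriptor type: section */
}
```

With made-up addresses, a call such as map_range(table, 0x40000000, 0x10000000, ATTR_WB) would cover 256 MiB of DRAM, and map_range(table, 0x00020000, 0x8000, ATTR_WB) would cover a small SRAM. On real hardware the table has to be 16 KiB aligned, and the TLB should be invalidated after entries are changed while the MMU is on.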

Incidentally, the SMP bit I added at the beginning really is needed. If you find that the DCache has no effect, check this setting. On an earlier problem it cost me several days before I finally dug this piece of configuration out of U-Boot.

Flushing the cache

With that done, decompression was fast, but the change also brought other problems. For example, my system would no longer boot: it hung as soon as the bootloader jumped to it. Thinking it over, this had to be caused by the cache and main memory becoming inconsistent after the switch to write-back.
In a full operating system you would control the cache precisely, cleaning what needs to be written back and invalidating what needs to be invalidated. My scenario is much simpler, though, and the bootloader is bare-bones, so I kept it simple: I ported some cache flushing code that just flushes the entire DCache, and called it at a few key places. Sure enough, the boot process went back to normal.
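Flushing the whole DCache on ARMv7 is usually done by walking every set and way of the cache. The sketch below is not the code I ported; it is a minimal illustration with made-up function names. It derives the geometry from a CCSIDR value and builds the operand for each set/way operation. The loop is plain C so the operand sequence can be checked off-target, and the actual clean-and-invalidate instruction (DCCISW) is only compiled for 32-bit ARM. It handles one cache level; a system with an outer cache would need more.

```c
#include <stdint.h>

typedef void (*setway_op_t)(uint32_t operand);

/* Number of leading zero bits in a 32-bit value; returns 32 for 0. */
static inline uint32_t clz32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Call `op` once per set/way of the cache described by `ccsidr`.
 * `level` is the zero-based cache level. Returns the number of calls. */
static inline uint32_t dcache_for_each_setway(uint32_t ccsidr, uint32_t level,
                                              setway_op_t op)
{
    uint32_t line_shift = (ccsidr & 0x7u) + 4;      /* log2(line size in bytes) */
    uint32_t max_way = (ccsidr >> 3) & 0x3FFu;      /* number of ways - 1 */
    uint32_t max_set = (ccsidr >> 13) & 0x7FFFu;    /* number of sets - 1 */
    uint32_t way_shift = clz32(max_way);            /* way field sits at the top */
    uint32_t count = 0, way, set;

    for (way = 0; way <= max_way; way++) {
        for (set = 0; set <= max_set; set++) {
            uint32_t operand = (set << line_shift) | (level << 1);
            if (max_way != 0)                       /* avoid a shift by 32 */
                operand |= way << way_shift;
            op(operand);
            count++;
        }
    }
    return count;
}

/* Host-side recorder for dry runs: remembers the last operand issued. */
static uint32_t last_operand;
static void record_operand(uint32_t operand) { last_operand = operand; }

#if defined(__arm__)
/* DCCISW: clean and invalidate one data cache line by set/way. */
static void dccisw(uint32_t operand)
{
    __asm__ volatile("mcr p15, 0, %0, c7, c14, 2" : : "r"(operand) : "memory");
}

/* Clean and invalidate the whole L1 data cache. (Hypothetical name.) */
static inline void flush_l1_dcache(void)
{
    uint32_t ccsidr;
    __asm__ volatile("mcr p15, 2, %0, c0, c0, 0\n\tisb" : : "r"(0u)); /* CSSELR = L1 data */
    __asm__ volatile("mrc p15, 1, %0, c0, c0, 0" : "=r"(ccsidr));     /* read CCSIDR */
    dcache_for_each_setway(ccsidr, 0, dccisw);
    __asm__ volatile("dsb\n\tisb" : : : "memory");
}
#endif
```

A natural place to call such a routine is right before jumping to the next stage, so that everything the bootloader wrote has reached main memory by the time the system starts running.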

Origin www.cnblogs.com/zqb-all/p/11443127.html