The father of Linux blasts AMD: angrily slams fTPM as "stupid" and "broken stuff"

Compiled by | Zheng Liyuan

Produced by | CSDN (ID: CSDNnews)

Developers may still remember the shock when Linus Torvalds, the father of Linux, announced in May 2020 that, for the first time in 15 years, he had left Intel behind and switched to a new computer with an AMD processor:

In fact, the most exciting thing for me this week is that I upgraded my main machine, and for the first time in about 15 years my desktop is not Intel-based. With an AMD Threadripper 3970X in the new computer, my "allmodconfig" test builds are now three times faster than before.

Clearly, Linus was quite satisfied at the time with the move from Intel to AMD, and his excitement also drew many developers to the AMD camp.

Three years later, however, Linus, who had seemed to be an AMD supporter, voiced strong dissatisfaction with AMD's fTPM feature: "Let's disable the stupid fTPM hwrnd!"

(picture from wccftech)

AMD fTPM causing system stuttering

The fTPM feature Linus mentioned should be familiar to anyone who has followed Windows 11 over the past two years: it is the high hardware bar that many users have repeatedly complained about.

Before Microsoft set "TPM 2.0" as a minimum hardware requirement for Windows 11, most people had probably never heard of TPM. A TPM (Trusted Platform Module) is a module built to the specifications of the Trusted Computing Group, an international industry standards organization. It can be a discrete hardware chip (dTPM) or a module emulated in firmware (fTPM). Whether firmware-based or hardware-based, a TPM is used to securely create and store encryption keys, certificates, passwords, and more.

Once Microsoft made TPM a hard requirement for running Windows 11, many AMD users started digging into their motherboard BIOS settings to enable the fTPM module so they could run the new operating system.

However, after enabling the fTPM module, many users with AMD Ryzen processors began asking why their systems kept stuttering intermittently, with audio glitches and frame-rate freezes in games being especially noticeable.

Once user error and Windows 11 bugs had been ruled out, the answer began to emerge: there might be a compatibility problem between AMD's fTPM and Windows. Sure enough, in March 2022 AMD identified the cause of the stuttering and issued an announcement:

"AMD has determined that certain AMD Ryzen™ system configurations may intermittently perform extended fTPM-related memory transactions in the SPI flash memory ("SPIROM") located on the motherboard, which may cause temporary pauses in system interactivity or responsiveness until the transaction is concluded."

Under fire from "grumpy" Linus

Generally speaking, while the system security module is communicating with the TPM, other parts of the system are accessing memory at the same time. To avoid conflicts when reading, writing, or modifying data, and to improve performance, the system uses a mechanism called a memory transaction.

According to AMD's reason for the freeze, its fTPM has a problem with memory transactions: as long as the system security module exchanges data with fTPM, the rest of the hardware needs to wait until the fTPM transaction is completed before continuing to use other memory. This causes the computer to freeze.
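The effect described above can be illustrated with a small, deterministic sketch. It is not AMD's firmware logic: it only models a single shared resource served one request at a time, and the request names and durations (including the 500 ms transaction) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    arrival_ms: float   # when the request wants to start
    duration_ms: float  # how long it occupies the shared resource


def serialize(requests):
    """Serve requests one at a time in arrival order, as if behind one lock.

    Returns {name: wait_ms}, the time each request spent blocked.
    """
    waits = {}
    free_at = 0.0  # time at which the shared resource is next available
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        start = max(req.arrival_ms, free_at)
        waits[req.name] = start - req.arrival_ms
        free_at = start + req.duration_ms
    return waits


# Hypothetical workload: one long fTPM transaction, then two short requests.
requests = [
    Request("ftpm_transaction", arrival_ms=0.0, duration_ms=500.0),
    Request("audio_buffer_refill", arrival_ms=10.0, duration_ms=1.0),
    Request("game_frame", arrival_ms=16.0, duration_ms=2.0),
]

waits = serialize(requests)
for name, wait in waits.items():
    print(f"{name}: waited {wait:.0f} ms")
```

In this toy model the short audio and frame requests each wait almost the full length of the long transaction, which is the kind of pause a user would notice as an audio glitch or a frozen frame.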

After identifying the problem, AMD said it was working on a solution, but that the fix would not be rolled out until "early May" at the earliest. AMD also published a workaround: "As an immediate solution, affected customers who rely on fTPM functionality for Trusted Platform Module support may instead use a hardware TPM ("dTPM") device for trusted computing."

At first, the problem appeared to be limited to Windows: Ryzen systems would stutter intermittently under Windows 10 and Windows 11 once fTPM was enabled. After AMD released its fix, the stuttering on Windows improved significantly.

Later, however, Linux distributions were affected as well, and the situation there was reportedly even worse: not only stuttering, but also more serious problems such as compilation errors. Even with the fix applied, the issue was not completely resolved. It is most evident from the Linux 6.1 kernel onward, where it is mostly triggered after the hardware random number generator (hwrng) kernel thread (kthread) was enabled for untrusted sources as well.

AMD did not offer a more effective solution for this situation, and as time went on, it ended up drawing fire from the famously short-tempered Linus.

"I don't see any harm in disabling fTPM directly"

So far, AMD has not given a clear explanation of the problems fTPM causes on Linux, but Linus has offered a guess of his own: "I can easily imagine the BIOS fTPM code using some horrible global EFI synchronization lock or something like that, which then causes random problems depending on some completely unrelated activity. For example, it might not be the fTPM hwrnd code itself that decides to read some random number from the SPI, but its serialization with other activities that the BIOS is involved in."

In Linus' view, the solution to this problem is simple: since fTPM has caused so many problems, why not disable fTPM hwrnd and use the processor's rdrand instruction to provide random numbers?

Let's disable the stupid fTPM hwrnd! Maybe it can be used at startup to "gather entropy from different sources", but obviously it shouldn't be used at runtime.

Why would anyone use this crap when any machine that supposedly has this problem fixed (which clearly turned out not to be the case) also has the CPU rdrand instruction, which doesn't have the problem? If you don't trust the CPU's rdrand implementation, why would you trust the fTPM, which causes even more problems?

So I don't see any harm in simply "disabling fTPM". Even if it ends up working in the future, there are alternatives that are no worse.

In short, Linus believes that fTPM should at most be used to feed entropy to the kernel's random number generator at boot, and should not serve as a random number source during normal system operation.
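For readers who want to check which hardware random number source their own Linux kernel is currently using, the following is a minimal sketch. It assumes the kernel exposes the hw_random sysfs attributes `rng_current` and `rng_available` under `/sys/class/misc/hw_random`, and it assumes TPM-backed sources are named with a `tpm-rng` prefix; the function names are illustrative and not taken from the article.

```python
from pathlib import Path

# Assumed location of the kernel's hw_random sysfs attributes.
HWRNG_SYSFS = Path("/sys/class/misc/hw_random")


def read_hwrng_state(base=HWRNG_SYSFS):
    """Return (current, available) hwrng names, or (None, []) if unreadable."""
    try:
        current = (base / "rng_current").read_text().strip()
        available = (base / "rng_available").read_text().split()
    except OSError:
        # Not Linux, hw_random not present, or insufficient permissions.
        return None, []
    return current or None, available


def uses_tpm_rng(current):
    """Heuristic: assume TPM-backed sources are named 'tpm-rng-<n>'."""
    return current is not None and current.startswith("tpm-rng")


current, available = read_hwrng_state()
print("current hwrng:", current)
print("available hwrngs:", available)
if uses_tpm_rng(current):
    print("The kernel is currently drawing hardware randomness from a TPM.")
```

The script only reads state and changes nothing. On systems without the sysfs directory it simply reports that no source was found.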

Linus also admits that rdrand can be slow, but argues that it is still a better alternative than the stuttering currently caused by fTPM: "rdrand can be pretty slow, but I think we're talking a few hundred CPU cycles, which is much better than the stutter reports we've seen from fTPM."
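To put "a few hundred CPU cycles" in perspective, here is a rough back-of-the-envelope sketch. The 3.7 GHz clock is an assumption (the base clock of the Threadripper 3970X mentioned earlier), and the 100 ms stutter is a purely hypothetical figure for comparison; neither number comes from Linus's message.

```python
def cycles_to_ns(cycles, clock_hz):
    """Convert a CPU cycle count into nanoseconds at a given clock rate."""
    return cycles / clock_hz * 1e9


CLOCK_HZ = 3.7e9             # assumed clock rate (3970X base clock)
ASSUMED_STUTTER_MS = 100.0   # hypothetical stutter length, for scale only

for cycles in (200, 500, 1000):
    ns = cycles_to_ns(cycles, CLOCK_HZ)
    ratio = ASSUMED_STUTTER_MS * 1e6 / ns
    print(f"{cycles} cycles is about {ns:.0f} ns, "
          f"roughly {ratio:,.0f}x shorter than a {ASSUMED_STUTTER_MS:.0f} ms pause")
```

Under these assumptions, even a thousand cycles is well under a microsecond, several orders of magnitude below anything a user could perceive as a stutter.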

Therefore, according to Linus, disabling fTPM in the BIOS may be the best solution for AMD users who experience lag in Linux distributions. But in practice, this also limits system functionality, especially when it comes to hardware encryption and security.

However, considering Linus' strong influence in the industry, his "bombardment" may also prompt AMD to pay attention to this, so as to come up with a reasonable and effective solution as soon as possible.

Reference links:

https://www.theregister.com/2023/07/31/linus_torvalds_ftpm/

https://lore.kernel.org/lkml/CUGA0YM7BIJN.3RDWZ1WZSWG28@seitikki/T/

https://www.amd.com/en/support/kb/faq/pa-410

Origin blog.csdn.net/FL63Zv9Zou86950w/article/details/132094777