RISC-V Reader notes 5: RV32A and RV32C

Atomic instructions

RV32A is the RISC-V extension for atomic operations. It provides two kinds of instructions: atomic memory operations (AMO) and load reserved / store conditional (lr / sc).


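AMO: a single instruction atomically reads a word from memory into the destination register, applies an operation (swap, add, and, or, xor, min, max) to that word and a source register, and writes the result back to the same address. No other processor can observe or modify the word between the read and the write.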
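As a rough illustration of the read-modify-write behavior described above, here is a minimal C11 sketch. The names `shared_counter` and `counter_add` are made up for this note, and the claim that a compiler targeting RV32A would emit a single `amoadd.w` for the call is an assumption about code generation, not something the code enforces.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter; the name is illustrative only. */
static _Atomic int32_t shared_counter = 0;

/* Atomically add `delta` to the counter and return the value it held
 * before the addition. The read, the add and the write-back happen as
 * one indivisible step. (Assumption: on an RV32A target this typically
 * compiles down to one amoadd.w.) */
int32_t counter_add(int32_t delta) {
    return atomic_fetch_add(&shared_counter, delta);
}
```

Like an AMO, `counter_add` hands back the old value, so a caller can both update the word and learn what it was in one step.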

load reserved / store conditional: the two instructions provide atomicity as a pair. lr.w reads a word from memory into the destination register and records a reservation on that address. sc.w writes to the address only if the reservation is still valid; it writes 0 to its destination register on success and a nonzero value if the store failed.
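The reservation mechanism can be modeled in a few lines of C. This is only a toy, single-threaded model for building intuition: `mem`, `lr_w`, `sc_w` and `other_store` are hypothetical names, and real hardware tracks reservations in the memory system rather than in a global flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 16

/* Toy memory with a single reservation slot (hypothetical model). */
static int32_t mem[MEM_WORDS];
static bool reservation_valid = false;
static uint32_t reserved_index = 0;

/* Models lr.w: load the word and remember that it is reserved. */
int32_t lr_w(uint32_t index) {
    reservation_valid = true;
    reserved_index = index;
    return mem[index];
}

/* Models a store by another processor: it cancels a matching reservation. */
void other_store(uint32_t index, int32_t value) {
    mem[index] = value;
    if (reservation_valid && reserved_index == index) {
        reservation_valid = false;
    }
}

/* Models sc.w: store only if the reservation still holds.
 * Returns 0 on success and a nonzero value on failure. */
int32_t sc_w(uint32_t index, int32_t value) {
    bool ok = reservation_valid && reserved_index == index;
    reservation_valid = false;  /* an sc always uses up the reservation */
    if (ok) {
        mem[index] = value;
    }
    return ok ? 0 : 1;
}
```

In this model an `sc_w` right after an `lr_w` succeeds, while an `sc_w` that follows an intervening `other_store` to the same word fails and leaves memory untouched.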

Use case for lr / sc: many architectures provide an atomic compare-and-swap (CAS) operation. It compares the value in one register with the word in memory addressed by a second register and, if they are equal, swaps the value of a third register with the word in memory; otherwise memory is left unchanged.

As a single instruction this would need 3 source registers and 1 destination register. lr / sc splits the operation into two instructions instead.

[Figure: compare-and-swap on the word at address a0, implemented with lr.w / sc.w]

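In the example above, lr.w loads the word at address a0 into a3. If a3 differs from the expected value in a1, the compare-and-swap fails and the code branches away. If they are equal, sc.w tries to store a2 to the address in a0 and writes its status into a3: 0 means the store succeeded, and a nonzero value sends the code back to the lr.w to retry.

In other words, the store goes through only if the word at a0 was loaded with lr.w and the reservation is still in place. Otherwise the store fails.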
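The same retry structure shows up in portable C11. The sketch below is not the book's code; `cas_word` is a hypothetical helper built on `atomic_compare_exchange_weak`, which, like sc.w, is allowed to fail even when the values match, so the loop separates a real mismatch from a failed store.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare-and-swap on *addr: if it holds `expected`, replace it with
 * `desired` and return true; otherwise leave it alone and return false. */
bool cas_word(_Atomic int32_t *addr, int32_t expected, int32_t desired) {
    for (;;) {
        int32_t old = expected;
        if (atomic_compare_exchange_weak(addr, &old, desired)) {
            return true;            /* store succeeded */
        }
        if (old != expected) {
            return false;           /* genuine mismatch: CAS fails */
        }
        /* Values matched but the store did not go through: retry,
         * like branching back to the lr.w. */
    }
}
```

For example, with a word holding 7, `cas_word(&word, 7, 9)` returns true and leaves 9 in memory, and a second call that still expects 7 returns false.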

[Figure: spinlock around a critical section, implemented with amoswap.w]

amoswap.w loads the word at address a0 into t1 and stores t0 to that address in one atomic step, so t1 holds the previous lock state.

t0 is first set to 1, the value that means "locked". To take the lock, the code swaps t0 (1) into the word at a0 and checks whether the previous value, now in t1, is 1.

If it is 1, another program had already stored 1 there, so the lock is held by someone else. The code waits by repeating the swap.

If it is 0, nobody held the lock, and the critical section can begin.

At the end, the zero register x0 is swapped into the word at a0 to release the lock and leave the critical section.
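The lock protocol above maps fairly directly onto C11 atomics. This is a sketch under assumptions: the type and function names are invented for this note, and the acquire / release orderings stand in for what the .aq and .rl suffixes on amoswap.w would express.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef _Atomic int32_t demo_spinlock_t;

/* One acquisition attempt: swap in 1 and inspect the previous value. */
bool spin_try_lock(demo_spinlock_t *lock) {
    return atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0;
}

/* Keep swapping until the previous value was 0 (nobody held the lock). */
void spin_lock(demo_spinlock_t *lock) {
    while (!spin_try_lock(lock)) {
        /* spin: someone else holds the lock */
    }
}

/* Release by swapping 0 back in; the old value is discarded,
 * much like using x0 as the destination register. */
void spin_unlock(demo_spinlock_t *lock) {
    (void)atomic_exchange_explicit(lock, 0, memory_order_release);
}
```

A caller would wrap the critical section between `spin_lock(&lock)` and `spin_unlock(&lock)`, with `lock` initialized to 0.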

Compared with lr / sc, AMOs scale better on multiprocessor systems, and because the read and the write happen in a single atomic bus transaction they are also useful for mutually exclusive access to I/O devices. Each approach has its advantages.

Compressed instructions

To shrink instructions to 16 bits, compressed instruction sets typically cut most instructions down to two operands, use smaller immediate fields, and so on. Earlier efforts such as ARM Thumb-2 and microMIPS optimized the original instruction set in this way, but they added to the burden on the processor and made things harder for programmers to understand.

To limit that burden, RV32C adds a constraint: every 16-bit compressed instruction must map to a single standard 32-bit RISC-V instruction. With that in mind, the architects based their choice of instructions on the following observations:


  • A small set of registers (a0-a5, s0-s1, sp and ra) accounts for most accesses, so fewer bits are needed to say which registers rs1, rs2 and rd refer to.
  • Immediate values tend to be small, so the immediate field can use fewer bits.
  • Many instructions use rd as one of the source operands, for example add a0, a0, a1, so one register field can be dropped.
  • Loads and stores use only offsets that are a multiple of the operand size (for example, a word).
  • A decoder converts every 16-bit instruction into its 32-bit equivalent. The decoder circuit takes up only about 5% of the whole design, which is a modest cost. Thumb-2 and ARM-32, by contrast, are two separate ISAs and need two decoders.
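The last point, expanding each 16-bit instruction into a 32-bit one, can be sketched in C for two cases. `expand_rvc` is a hypothetical helper that handles only C.ADDI and C.ADD and returns 0 for everything else; it does not check for reserved or hint encodings, so treat it as an illustration of the idea rather than a full decoder.

```c
#include <stdint.h>

/* Expand a 16-bit compressed instruction into its 32-bit equivalent.
 * Only C.ADDI and C.ADD are handled; anything else returns 0. */
uint32_t expand_rvc(uint16_t c) {
    uint32_t op     = c & 0x3;           /* quadrant, bits [1:0]      */
    uint32_t funct3 = (c >> 13) & 0x7;   /* bits [15:13]              */
    uint32_t rd     = (c >> 7) & 0x1F;   /* rd, which is also rs1     */

    if (op == 0x1 && funct3 == 0x0) {
        /* C.ADDI rd, imm  ->  addi rd, rd, imm */
        uint32_t imm6 = (((c >> 12) & 0x1) << 5) | ((c >> 2) & 0x1F);
        int32_t imm = (int32_t)(imm6 ^ 0x20) - 0x20;  /* sign-extend 6 bits */
        return ((uint32_t)imm << 20) | (rd << 15) | (rd << 7) | 0x13;
    }

    if (op == 0x2 && ((c >> 12) & 0xF) == 0x9) {
        /* C.ADD rd, rs2  ->  add rd, rd, rs2 (both registers nonzero) */
        uint32_t rs2 = (c >> 2) & 0x1F;
        if (rd != 0 && rs2 != 0) {
            return (rs2 << 20) | (rd << 15) | (rd << 7) | 0x33;
        }
    }

    return 0;  /* not handled in this sketch */
}
```

Here the shared rd / rs1 field of the compressed form is simply copied into both register fields of the 32-bit form, and the small immediate is sign-extended to 12 bits, which reflects the observations in the list above.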

Because every compressed instruction has to map to a 32-bit one, some space-saving instructions found elsewhere, such as load multiple and store multiple, have no RV32C counterpart, so code size may not be as good as Thumb-2 and similar ISAs.

However, some architects choose to leave out RV32C. In processors that fetch several instructions per cycle, the decode stage can be the main bottleneck, and a mix of 16-bit and 32-bit instructions makes it harder to decode within a clock cycle, which can hurt performance.
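One way to see the decoding difficulty is to find instruction boundaries in a fetched block. In the sketch below (hypothetical helper names, and it assumes only 16-bit and 32-bit encodings are present), the length of an instruction comes from the low two bits of its first halfword, so the start of each instruction is known only after the length of the previous one.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in halfwords, taken from the low two bits of the first halfword:
 * 11 means a 32-bit instruction, anything else a 16-bit one.
 * Assumes no encodings longer than 32 bits. */
static size_t insn_halfwords(uint16_t first) {
    return (first & 0x3) == 0x3 ? 2 : 1;
}

/* Write the halfword index of each instruction start into `starts`
 * (which must have room for n entries) and return how many were found. */
size_t find_boundaries(const uint16_t *hw, size_t n, size_t *starts) {
    size_t count = 0;
    size_t i = 0;
    while (i < n) {
        starts[count++] = i;
        i += insn_halfwords(hw[i]);  /* next start depends on this length */
    }
    return count;
}
```

With only 32-bit instructions every start is at a fixed position; with the mix, the second halfword of a 32-bit instruction can look like a valid 16-bit instruction, so a wide decoder has to resolve this chain or decode speculatively at every halfword.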

Origin blog.csdn.net/jtwqwq/article/details/131496663