Detailed explanation of basic assembly syntax in embedded systems

elementary applications

if statement

if (a < b) {
    x = 5;
    y = c + d;
} else {
    x = c - d;
}
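        ADR r4, a          ; r4 = address of a
        LDR r0, [r4]       ; r0 = a
        ADR r4, b          ; r4 = address of b
        LDR r1, [r4]       ; r1 = b
        CMP r0, r1         ; compare a with b
        BLT fblock         ; a < b: branch to the if block
        ADR r4, c          ; otherwise fall through into the else block
        ; ... remaining statements of the else block (x = c - d)
        B after            ; skip over the if block
fblock                     ; the if block (a < b) starts here
        ; ... statements of the if block (x = 5; y = c + d)
after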

for loop

for (i = 0; i < 20; i++) {
    for (j = 0; j < 10; j++) {
        z[i][j] = a[i][j] * b[i];
    }
}
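        MOV r0, #0            ; r0 = i
        ADR r2, z             ; r2 = base address of z
        ADR r3, a             ; r3 = base address of a
        ADR r4, b             ; r4 = base address of b
        MOV r8, #0            ; r8 = byte offset of a[i][j] and z[i][j]
        MOV r9, #0            ; r9 = byte offset of b[i]

loop    MOV r1, #0            ; r1 = j
loop1   LDR r5, [r3, r8]      ; r5 = a[i][j]
        LDR r6, [r4, r9]      ; r6 = b[i]
        MUL r7, r5, r6        ; r7 = a[i][j] * b[i]
        STR r7, [r2, r8]      ; z[i][j] = r7
        ADD r1, r1, #1        ; j++
        ADD r8, r8, #4        ; advance to the next word of a and z
        CMP r1, #10
        BLT loop1             ; repeat while j < 10
        ADD r9, r9, #4        ; advance to the next word of b
        ADD r0, r0, #1        ; i++
        CMP r0, #20
        BLT loop              ; repeat while i < 20
loopend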

If b is indexed by j instead of i, the loop becomes:

for (i = 0; i < 20; i++) {
    for (j = 0; j < 10; j++) {
        z[i][j] = a[i][j] * b[j];
    }
}
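        MOV r0, #0            ; r0 = i
        ADR r2, z             ; r2 = base address of z
        ADR r3, a             ; r3 = base address of a
        ADR r4, b             ; r4 = base address of b
        MOV r8, #0            ; r8 = byte offset of a[i][j] and z[i][j]

loop    MOV r1, #0            ; r1 = j
        MOV r9, #0            ; r9 = byte offset of b[j], restarted for every row
loop1   LDR r5, [r3, r8]      ; r5 = a[i][j]
        LDR r6, [r4, r9]      ; r6 = b[j]
        MUL r7, r5, r6        ; r7 = a[i][j] * b[j]
        STR r7, [r2, r8]      ; z[i][j] = r7
        ADD r1, r1, #1        ; j++
        ADD r8, r8, #4        ; advance to the next word of a and z
        ADD r9, r9, #4        ; advance to the next word of b
        CMP r1, #10
        BLT loop1             ; repeat while j < 10
        ADD r0, r0, #1        ; i++
        CMP r0, #20
        BLT loop              ; repeat while i < 20
loopend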

State machine applications

Thumb instruction set

The Thumb instruction set is a subset of the ARM instruction set re-encoded as 16-bit instructions. It has better code density than ARM code, and it improves system performance when an ARM processor is attached to a 16-bit bus.

Similarities

Thumb instructions are all 16-bit and have corresponding ARM instructions, so they inherit many features of the ARM instruction set.

  • Load/store architecture
  • Support for 8-bit bytes, 16-bit halfwords, and 32-bit words

Differences

To fit every instruction into 16 bits, Thumb gives up some features of the ARM instruction set:

  • Most Thumb instructions are executed unconditionally, whereas ARM instructions can be executed conditionally.

    Conditional execution in ARM:

    The condition field occupies the upper 4 bits of the 32-bit instruction word. It can take 16 values, and each value determines, from the flag bits in the CPSR, whether the instruction is executed or skipped.

  • Most data processing instructions use a 2-address format, in which the destination register is also one of the source registers.

  • Thumb instruction formats are less regular than ARM instruction formats.
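
The condition check described above can be sketched in C. This is a minimal model, not code from any ARM toolchain: the function name `condition_passed` and the `CPSR_*` macros are made up for illustration, and the value 0xF is simply treated as "never executed", which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The N, Z, C and V flags occupy bits 31..28 of the CPSR. */
#define CPSR_N (1u << 31)
#define CPSR_Z (1u << 30)
#define CPSR_C (1u << 29)
#define CPSR_V (1u << 28)

/* Hypothetical helper: returns true if a 32-bit ARM instruction word
   passes its condition check for the given CPSR value. */
bool condition_passed(uint32_t instruction, uint32_t cpsr)
{
    const bool n = (cpsr & CPSR_N) != 0;
    const bool z = (cpsr & CPSR_Z) != 0;
    const bool c = (cpsr & CPSR_C) != 0;
    const bool v = (cpsr & CPSR_V) != 0;

    const uint32_t cond = instruction >> 28;   /* upper 4 bits */
    assert(cond <= 0xFu);

    switch (cond) {
    case 0x0: return z;               /* EQ: equal */
    case 0x1: return !z;              /* NE: not equal */
    case 0x2: return c;               /* CS: carry set */
    case 0x3: return !c;              /* CC: carry clear */
    case 0x4: return n;               /* MI: negative */
    case 0x5: return !n;              /* PL: positive or zero */
    case 0x6: return v;               /* VS: overflow */
    case 0x7: return !v;              /* VC: no overflow */
    case 0x8: return c && !z;         /* HI: unsigned higher */
    case 0x9: return !c || z;         /* LS: unsigned lower or same */
    case 0xA: return n == v;          /* GE: signed greater or equal */
    case 0xB: return n != v;          /* LT: signed less than */
    case 0xC: return !z && n == v;    /* GT: signed greater than */
    case 0xD: return z || n != v;     /* LE: signed less or equal */
    case 0xE: return true;            /* AL: always */
    default:  return false;           /* 0xF: treated as "never" here (assumption) */
    }
}
```

For example, a BLT instruction carries the condition value 0xB, so it is executed only when N and V differ, which is the state CMP leaves behind when its first operand is smaller than its second in a signed comparison.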

B and BL

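  • B and BL both branch to a target address. B is mainly used for jumps that do not return, and BL (branch with link) is generally used to call subroutines.
  • BL stores the address of the next instruction, which is the return address, in r14.
  • To return from the subroutine, only MOV r15, r14 is needed, which copies the return address into the program counter.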

Computing platform

basic computing platform

Platform hardware components

Platform software components

Hardware and software are inseparable: both must work together to implement the platform's functions. Much of the software in an embedded system comes from outside sources, and some software components come from third parties. A layer diagram is often used to describe the relationship between the software components in a system. The hardware abstraction layer (HAL) provides a basic level of abstraction from the hardware. Device drivers usually use the HAL to simplify their structure. The power management module, in contrast, needs low-level access to the hardware. The operating system and file system provide the basic abstractions required by complex applications.

CPU bus

The bus is the mechanism by which the CPU communicates with memory and devices. At its simplest a bus is a bundle of wires, but it also defines the protocol by which the CPU, memory, and devices communicate. Its main function is to provide an interface to memory. The CPU acts as the bus master and initiates all transfers. Most bus protocols are built on a four-cycle handshake, which ensures that when two devices want to communicate, one is ready to send and the other is ready to receive.

  • Device 1 sets the query signal high to tell device 2 that it should get ready to listen for data.
  • When device 2 is ready to receive, it sets the response signal high. Both devices are now ready, and the data is transferred.
  • Device 2 sets the response signal low to indicate that it has received the data.
  • After the response signal goes low, device 1 sets the query signal low.
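
The four steps above can be sketched as a small C simulation. The names `bus_t`, `enq` (query signal), `ack` (response signal), and `handshake_transfer` are assumptions made for this illustration; a real bus performs these steps in hardware, with each device watching the other's signal.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool enq;   /* query signal, driven by device 1 */
    bool ack;   /* response signal, driven by device 2 */
    int  data;  /* data lines, driven by device 1 */
} bus_t;

/* Runs one four-cycle handshake that moves `value` from device 1 to device 2.
   trace[k] records (enq << 1) | ack after step k + 1.
   Returns the value latched by device 2. */
int handshake_transfer(bus_t *bus, int value, int trace[4])
{
    int received;

    assert(!bus->enq && !bus->ack);   /* the bus must be idle */

    /* Step 1: device 1 raises the query signal. */
    bus->enq = true;
    trace[0] = (bus->enq << 1) | bus->ack;

    /* Step 2: device 2 raises the response signal; the data is transferred. */
    bus->ack = true;
    bus->data = value;
    received = bus->data;
    trace[1] = (bus->enq << 1) | bus->ack;

    /* Step 3: device 2 lowers the response signal: data accepted. */
    bus->ack = false;
    trace[2] = (bus->enq << 1) | bus->ack;

    /* Step 4: device 1 lowers the query signal; the bus is idle again. */
    bus->enq = false;
    trace[3] = (bus->enq << 1) | bus->ack;

    return received;
}
```

Because both signals are low again after step 4, the same function can be called repeatedly on one `bus_t` to model back-to-back transfers.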

Storage devices and systems

Random access memory can be read and written.

Races and hazards in combinational logic circuits

Race

When one signal, or several signals that change at the same time, reaches a given point through different paths, the arrival times differ. This is called a race. Not every race produces a wrong output; a race that does is called a critical race.

Hazard

Static hazard

A static hazard is a momentary glitch on an output whose steady-state value does not change. If the output goes 1-0-1, it is called a static 1 hazard; if the output goes 0-1-0, it is called a static 0 hazard.

Static hazards are further divided into function hazards and logic hazards.

Function hazard
  • k input signals (k > 1) change simultaneously
  • The steady-state output is the same before and after the inputs change

A function hazard results from a race between input signals that change at different speeds. It is inherent in the logic function itself and cannot be eliminated by changing the circuit design; it can only be avoided by controlling the order in which the input signals change.

Logic hazard
  • Only one input signal changes
  • The steady-state output is the same before and after the input changes
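
A static 1 hazard with a single changing input can be reproduced with a small unit-delay simulation. The circuit F = A·B + (NOT A)·C, the one-step delay per gate, and the name `simulate_static1_hazard` are all assumptions chosen for this sketch; they do not come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

#define STEPS 6

/* Simulates F = A*B + (NOT A)*C with a delay of one step per gate.
   B = C = 1 throughout, and A falls from 1 to 0 just before step 0.
   The steady-state value of F is 1 both before and after the change.
   Stores F after each step in f_trace and returns the number of steps
   during which F was 0, i.e. the length of the glitch. */
int simulate_static1_hazard(bool f_trace[STEPS])
{
    const bool b = true, c = true;
    bool a = true;
    /* Gate outputs in the steady state for A = 1. */
    bool not_a = false, t1 = true, t2 = false, f = true;
    int glitch_steps = 0;

    assert(f_trace != 0);
    a = false;   /* the single input change */

    for (int step = 0; step < STEPS; step++) {
        /* Every gate output is computed from the previous step's values. */
        const bool next_not_a = !a;
        const bool next_t1 = a && b;
        const bool next_t2 = not_a && c;
        const bool next_f = t1 || t2;

        not_a = next_not_a;
        t1 = next_t1;
        t2 = next_t2;
        f = next_f;

        f_trace[step] = f;
        if (!f) {
            glitch_steps++;
        }
    }
    return glitch_steps;
}
```

The glitch appears because the path through the inverter is one gate longer than the direct path: the A·B term switches off one step before the (NOT A)·C term switches on.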

Dynamic hazard

If the steady-state output values before and after the input change are different, and the output passes through 1-0-1-0 or 0-1-0-1 before it settles, the glitch is called a dynamic hazard.

CPU

Memory system mechanisms

Caches are widely used to improve memory system performance. A cache can significantly reduce the average memory access time. Cache misses fall into three classes.

Compulsory miss

Occurs the first time a location is accessed.

Capacity miss

Occurs when the working set is too large for the cache.

Conflict miss

Occurs when two addresses map to the same location in the cache.

How to implement a cache

Direct-mapped cache

The cache consists of cache blocks. Each block contains:

  • A tag that indicates which memory location the block represents
  • A data field that holds the contents of that memory location
  • A valid flag that indicates whether the contents of the block are valid

The index field of the address selects which cache block to check.

The tag field of the address is compared with the tag stored in the block selected by the index.

If the data field is longer than the smallest addressable unit, the lowest bits of the address are used as an offset to select the desired value from the data field.
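
The lookup described above can be sketched in C as follows. The geometry (16-byte blocks, 64 blocks) and all names (`cache_t`, `cache_read`, `cache_fill`, and the `addr_*` helpers) are assumptions made for this example, not parameters of any particular processor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: 16-byte blocks and 64 blocks in the cache. */
#define OFFSET_BITS 4u
#define INDEX_BITS  6u
#define BLOCK_SIZE  (1u << OFFSET_BITS)
#define NUM_BLOCKS  (1u << INDEX_BITS)

typedef struct {
    bool     valid;              /* are the contents of this block valid? */
    uint32_t tag;                /* which memory block is stored here */
    uint8_t  data[BLOCK_SIZE];   /* copy of the memory contents */
} cache_block_t;

typedef struct {
    cache_block_t blocks[NUM_BLOCKS];
} cache_t;

/* Split an address into its offset, index, and tag fields. */
static uint32_t addr_offset(uint32_t addr) { return addr & (BLOCK_SIZE - 1u); }
static uint32_t addr_index(uint32_t addr)  { return (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1u); }
static uint32_t addr_tag(uint32_t addr)    { return addr >> (OFFSET_BITS + INDEX_BITS); }

/* Returns true on a hit and stores the addressed byte in *out. */
bool cache_read(const cache_t *cache, uint32_t addr, uint8_t *out)
{
    assert(cache != NULL && out != NULL);

    const cache_block_t *blk = &cache->blocks[addr_index(addr)];
    if (blk->valid && blk->tag == addr_tag(addr)) {
        *out = blk->data[addr_offset(addr)];
        return true;
    }
    return false;   /* miss: the block is invalid or holds another tag */
}

/* Loads one BLOCK_SIZE-byte memory line into the block selected by addr. */
void cache_fill(cache_t *cache, uint32_t addr, const uint8_t line[BLOCK_SIZE])
{
    assert(cache != NULL && line != NULL);

    cache_block_t *blk = &cache->blocks[addr_index(addr)];
    blk->valid = true;
    blk->tag = addr_tag(addr);
    memcpy(blk->data, line, BLOCK_SIZE);
}
```

With this geometry, two addresses that differ by a multiple of 1024 bytes share an index but have different tags, so filling one evicts the other. That is the conflict miss case.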

Write operations

Write-through

Every write changes both the cache and the corresponding main memory location. This policy keeps the cache and main memory consistent, but it generates additional main memory traffic.

Write-back

Main memory is written only when a location is evicted from the cache, so several writes to the same location before it is evicted cost only one main memory write.
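
The difference in main memory traffic between the two policies can be made concrete by counting writes for a single cached word. This is a deliberately tiny model; the type `cached_word_t` and the functions `cached_write` and `cached_evict` are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { WRITE_THROUGH, WRITE_BACK } write_policy_t;

/* One cached word together with its copy in main memory. */
typedef struct {
    write_policy_t policy;
    bool dirty;           /* used only by the write-back policy */
    int  cached_value;
    int  memory_value;
    int  memory_writes;   /* number of main memory writes so far */
} cached_word_t;

/* The CPU writes `value` to the cached location. */
void cached_write(cached_word_t *w, int value)
{
    assert(w != 0);
    w->cached_value = value;
    if (w->policy == WRITE_THROUGH) {
        w->memory_value = value;   /* memory is updated on every write */
        w->memory_writes++;
    } else {
        w->dirty = true;           /* memory is updated later, on eviction */
    }
}

/* The location is evicted from the cache. */
void cached_evict(cached_word_t *w)
{
    assert(w != 0);
    if (w->policy == WRITE_BACK && w->dirty) {
        w->memory_value = w->cached_value;
        w->memory_writes++;
        w->dirty = false;
    }
}
```

After five writes followed by an eviction, the write-through word has written main memory five times and the write-back word once, yet both leave the same final value in memory. The cost of write-back is that main memory is stale until the eviction happens.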

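Set-associative cache

A set-associative cache is characterized by its number of banks, or ways, and is accordingly called an n-way set-associative cache. Each set holds n blocks, so up to n addresses that map to the same set can be cached at the same time.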
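
A minimal sketch of a 2-way lookup is shown below; it tracks tags only and stores no data. The geometry, the round-robin replacement policy, and the names `sa_cache_t`, `sa_init`, and `sa_access` are assumptions made for this illustration; real caches often use other replacement policies such as LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: 2 ways, 64 sets, 16-byte blocks. */
#define WAYS        2u
#define NUM_SETS    64u
#define OFFSET_BITS 4u

typedef struct {
    bool     valid;
    uint32_t tag;
} way_t;

typedef struct {
    way_t    ways[NUM_SETS][WAYS];
    unsigned next_victim[NUM_SETS];   /* round-robin replacement pointer */
} sa_cache_t;

static uint32_t set_of(uint32_t addr) { return (addr >> OFFSET_BITS) % NUM_SETS; }
static uint32_t tag_of(uint32_t addr) { return (addr >> OFFSET_BITS) / NUM_SETS; }

/* Marks every way invalid. */
void sa_init(sa_cache_t *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Returns true on a hit. On a miss, installs the block in the way
   chosen by the set's round-robin pointer and returns false. */
bool sa_access(sa_cache_t *c, uint32_t addr)
{
    assert(c != NULL);

    const uint32_t set = set_of(addr);
    const uint32_t tag = tag_of(addr);

    /* All ways of the selected set are checked for a matching tag. */
    for (unsigned w = 0; w < WAYS; w++) {
        if (c->ways[set][w].valid && c->ways[set][w].tag == tag) {
            return true;
        }
    }

    const unsigned victim = c->next_victim[set];
    c->ways[set][victim].valid = true;
    c->ways[set][victim].tag = tag;
    c->next_victim[set] = (victim + 1u) % WAYS;
    return false;
}
```

Two addresses that map to the same set, for example 0x1234 and 0x1634 with this geometry, can both stay resident here, whereas a cache with a single way per set would evict one to make room for the other.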

CPU technology

Pipelining

ARM7 three-stage pipeline

1. Fetch: the instruction is fetched from memory.

2. Decode: the opcode and operands are decoded to determine what operation to perform.

3. Execute: the decoded instruction is executed.

Pipelined RISC machines are designed to have regular timing characteristics: instructions that cause no pipeline hazards mostly have the same latency.
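
The timing of an ideal three-stage pipeline can be written down in a few lines of C. The sketch assumes that every stage takes exactly one cycle and that no hazards occur; the function names `pipeline_cycles` and `instruction_in_stage` are made up for the example.

```c
#include <assert.h>

#define STAGES 3   /* fetch, decode, execute */

/* Cycles needed to run n instructions when every stage takes one cycle
   and there are no hazards: the first instruction needs STAGES cycles,
   and every later one completes one cycle after its predecessor. */
int pipeline_cycles(int n)
{
    assert(n >= 0);
    return n == 0 ? 0 : n + STAGES - 1;
}

/* Index of the instruction that occupies `stage` (0 = fetch, 1 = decode,
   2 = execute) during `cycle` (counted from 0), or -1 if the stage is empty. */
int instruction_in_stage(int n, int cycle, int stage)
{
    assert(n >= 0 && cycle >= 0);
    assert(stage >= 0 && stage < STAGES);

    const int i = cycle - stage;
    return (i >= 0 && i < n) ? i : -1;
}
```

Under these assumptions each instruction has a latency of three cycles, but once the pipeline is full one instruction completes per cycle, so four instructions need six cycles rather than twelve.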

Non-ideal behavior

Data stall

A load-multiple instruction is an example of an instruction whose execute stage takes several cycles to complete.

Control stall

This is what is called the branch penalty. Whether a conditional branch such as BNE is taken cannot be determined until the third clock cycle of the instruction.

Some processors introduce delayed branches. With this form of branch, the instructions immediately following the branch are executed whether or not the branch is taken.

Memory

Random access memory (RAM) can be both read and written. It is called random access because its locations can be accessed in any order.

Main memory is usually dynamic random access memory (DRAM). DRAM is very dense, but its contents must be refreshed periodically.

While part of the memory is being refreshed, it cannot be accessed until the refresh completes.

ROM

Flash memory, the dominant type of ROM, uses standard system voltages for erasing and programming, which allows the chip to be reprogrammed inside a standard system.

Origin blog.csdn.net/xrk00/article/details/125057194