Advanced software qualification exam (system architect): computer hardware basics and embedded systems

Today is September 7, 2023; only 57 days remain before the advanced software qualification exam.

Computer hardware basics

Storage components in computer systems are usually organized into a hierarchical structure, with storage components closer to the CPU being accessed faster. The storage speeds from fast to slow are: register bank, Cache, memory, and Flash.

When the computer executes a program, in the process of an instruction cycle, in order to read the instruction operation code from the memory, the contents of the program counter (PC) are first sent to the address bus.

Generally speaking, embedded systems usually use shift registers in the interface to implement serial/parallel and parallel/serial conversion operations of data.
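The shift-register behavior described above can be modeled in a few lines; this is a software sketch of the bit-by-bit shifting, not a description of any particular interface chip (function names are illustrative):

```python
# Sketch: parallel<->serial conversion as performed by an interface
# shift register, one bit shifted per "clock".

def parallel_to_serial(byte, msb_first=True):
    """Shift an 8-bit value out one bit at a time."""
    order = range(7, -1, -1) if msb_first else range(8)
    return [(byte >> i) & 1 for i in order]

def serial_to_parallel(bits, msb_first=True):
    """Shift incoming bits back into an 8-bit value."""
    value = 0
    for b in (bits if msb_first else reversed(bits)):
        value = (value << 1) | b
    return value

bits = parallel_to_serial(0xA5)          # 0xA5 = 1010_0101
assert bits == [1, 0, 1, 0, 0, 1, 0, 1]
assert serial_to_parallel(bits) == 0xA5  # round trip restores the byte
```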

The CPU's operating frequency (main frequency) is determined by two factors: the FSB and the frequency multiplier; their product is the main frequency. The FSB, or external frequency, is the system bus frequency. The multiplier (multiplication coefficient) is the ratio of the CPU main frequency to the external frequency. Initially the CPU clock and the system bus ran at the same speed, but as CPUs became faster and faster, frequency multiplication technology was introduced. Its purpose is to let the system bus keep working at a relatively low frequency while the CPU clock is raised through the multiplier.
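As a quick sketch of the relationship above (the figures are illustrative, not those of any specific CPU):

```python
# Main frequency = FSB (external / system bus frequency) x multiplier.
# Hypothetical example values:
fsb_mhz = 100            # system bus keeps working at a low frequency
multiplier = 33          # ratio of CPU clock to external frequency
main_freq_mhz = fsb_mhz * multiplier

assert main_freq_mhz == 3300   # i.e. a 3.3 GHz CPU clock
```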

The CPU usually accesses the memory in a synchronous manner, the CPU and the I/O interface exchange information in a synchronous manner, the CPU and the PCI bus exchange information in a synchronous manner, and the I/O interface and the printer exchange information in an asynchronous manner based on the buffer pool.

The controller directs the operation of the entire CPU, including program control and timing control. Its components include:

  • Instruction register (IR): temporarily stores the instruction being executed.
  • Program counter (PC): stores the address of the next instruction to execute.
  • Address register (AR): stores the memory address currently being accessed by the CPU.
  • Instruction decoder (ID): decodes the instruction opcode.

IR

IR: Instruction Register, used to temporarily store the currently executing instruction. The instruction register is clocked by clk and triggered on its rising edge. It latches an instruction arriving on the data bus into a 16-bit register, but not every value on the data bus should be latched, because the bus sometimes carries instructions and sometimes carries data. Whether a value is latched is controlled by the Ir_ena signal from the CPU state controller. On reset, the instruction register is cleared.

PC

PC: Program Counter, a register in the central processing unit that indicates where the computer is in its program sequence; it stores the address of the unit holding the next instruction. To execute an instruction, the CPU first fetches it from memory into the instruction register using the address stored in the PC (the fetch step). At the same time, the PC is either automatically incremented by 1 or loaded with the target address supplied by a branch instruction. The instruction is then decoded and executed. Once the first instruction completes, the address of the second instruction is taken from the PC, and so on, instruction by instruction.
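The fetch cycle just described can be sketched as a toy loop; the "memory" contents and mnemonics are hypothetical, chosen only to show PC auto-increment versus a branch supplying the next address:

```python
# Toy fetch loop: PC supplies the address, the instruction is latched
# into IR, and PC either auto-increments or is overwritten by a jump.
memory = {0: "LOAD", 1: "ADD", 2: "JMP 0", 3: "HALT"}

pc, trace = 0, []
for _ in range(5):
    ir = memory[pc]            # fetch: instruction at the address in PC
    pc += 1                    # PC auto-increments by 1...
    if ir.startswith("JMP"):   # ...unless a branch gives the next address
        pc = int(ir.split()[1])
    trace.append(ir)

assert trace == ["LOAD", "ADD", "JMP 0", "LOAD", "ADD"]
assert pc == 2
```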

SR

SR: Status Register, also known as the condition code register, is part of the arithmetic unit, a core component of the computer system. The status register stores two types of information:

  • One type is status information (condition codes) reflecting the result of the current instruction, such as whether there is a carry (CF bit), whether there is an overflow (OV bit), whether the result is negative (SF bit), whether the result is zero (ZF bit), the parity flag (P bit), etc.;
  • The other type is control information (the PSW, program status word), such as the interrupt enable flag (IF bit) and the trace flag (TF bit). On some machines the PSW is called the flag register, FR (Flag Register).
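As a minimal sketch of how the condition codes above might be set by an 8-bit addition (the flag-setting rules shown are the conventional ones; the function is illustrative, not tied to any particular CPU):

```python
# Sketch: condition codes produced by an 8-bit add.
def add8_flags(a, b):
    r = a + b
    res = r & 0xFF
    flags = {
        "CF": int(r > 0xFF),                            # carry out of bit 7
        "ZF": int(res == 0),                            # result is zero
        "SF": res >> 7,                                 # sign bit of result
        "OV": ((a ^ res) & (b ^ res)) >> 7 & 1,         # signed overflow
        "P":  int(bin(res).count("1") % 2 == 0),        # even parity
    }
    return res, flags

res, f = add8_flags(0x80, 0x80)   # (-128) + (-128): wraps to 0
assert res == 0
assert f["CF"] == 1 and f["ZF"] == 1 and f["OV"] == 1 and f["SF"] == 0
```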

GR

GR: General-purpose registers, used to transfer and temporarily hold data; they can also participate in arithmetic and logical operations and hold the results. In addition, each has some special uses. Assembly language programmers must be familiar with the general and special uses of each register in order to use them correctly and appropriately in programs.

bus

A bus is a set of information transmission lines shared by multiple components in a time-division manner; it connects those components and provides an information exchange path among them. Shared means that all components attached to the bus can transmit information over it; time-division means that at any given moment only one component is allowed to drive data onto the bus. Sharing is thus achieved through time-sharing.

serial bus

Features:

  • Serial buses may be half-duplex or full-duplex. Full-duplex uses one line to send and another to receive, so both directions can be active at once; half-duplex shares one line for both directions.
  • The serial bus is suitable for transmitting data over long distances.
  • A serial bus transmits and receives data bit by bit. Although slower than byte-wide parallel communication, a serial port can send on one wire while receiving on another; it is simple and works over long distances. For example, IEEE-488 stipulates for its parallel communication mode that the total cable length must not exceed 20 meters and the length between any two devices must not exceed 2 meters, whereas a serial link can reach up to 1200 meters.
  • The most important parameters of serial communication are baud rate, data bits, stop bits, and parity. For two ports to communicate, these parameters must match.
  • Serial bus data can be sent and received in a variety of ways, with interrupts and DMA being the more common ones.
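The parameters listed above determine the link's effective throughput. A quick worked example, assuming the common 9600-8-N-1 configuration (1 start bit, 8 data bits, no parity, 1 stop bit):

```python
# Effective byte rate of an async serial link at 9600-8-N-1.
baud = 9600
bits_per_frame = 1 + 8 + 0 + 1     # start + data + parity + stop
bytes_per_second = baud // bits_per_frame

assert bytes_per_second == 960     # 10 bus bits carry each 8-bit byte
```

So 20% of the raw bit rate is framing overhead, which is the price paid for the self-synchronizing start/stop framing.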

PCI

PCI is a local bus standard: a bus layer inserted between the CPU and the original system bus. A bridge circuit manages this layer and implements the interfaces to the layers above and below it, coordinating data transfers.

JTAG

JTAG is a debugging interface used by developers to debug the working status of the CPU. JTAG software controls the CPU through this interface to debug the CPU and read and write Flash.

VM

Virtual memory: in a computer system with a hierarchical memory, it automatically performs partial loading and partial replacement, logically providing users with an addressable main memory much larger than the physical storage capacity. The capacity of the virtual address space is independent of the size of physical main memory; it is limited by the computer's address structure and the available disk capacity. Pages are replaced according to a page-replacement algorithm; on a page fault, data must be swapped in, which involves translating the logical (virtual) address to a physical address in secondary storage.
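A minimal sketch of one page-replacement algorithm, FIFO, counting page faults on the classic reference string (the algorithm choice and numbers are illustrative; real systems may use LRU, clock, or other policies):

```python
# FIFO page replacement: on a fault with full frames, evict the page
# that has been resident the longest.
from collections import deque

def fifo_faults(refs, frames):
    mem, faults = deque(), 0
    for page in refs:
        if page not in mem:
            faults += 1
            if len(mem) == frames:
                mem.popleft()          # evict the oldest resident page
            mem.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
assert fifo_faults(refs, 3) == 9
assert fifo_faults(refs, 4) == 10      # Belady's anomaly: more frames, more faults
```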

CISC & RISC

The basic idea of CISC (Complex Instruction Set Computer) is to further strengthen the original instructions, replacing functions previously performed by software subroutines with more complex new instructions, implementing software functions in hardware. As a result, the machine's instruction set becomes increasingly large and complex. CISC computers generally contain at least 300 instructions, and some exceed 500.

The disadvantages are as follows:

  • Microprogram technology is an important pillar of CISC. Each complex instruction must be completed by executing an interpretive microprogram, which requires multiple CPU cycles and reduces the processing speed of the machine;
  • The instruction set is too large, giving the high-level-language compiler an enormous range of target instructions to choose from and making the compiler itself long and complex, so it is difficult for optimizing compilation to generate truly efficient target code;
  • CISC emphasizes perfect interrupt control, which will inevitably lead to numerous actions, complex designs, and long development cycles;
  • CISC brings many difficulties to chip design, increasing the types of chips, increasing the probability of errors, increasing costs and reducing yields.

The basic idea of RISC (Reduced Instruction Set Computer) is to reduce the complexity of hardware design by cutting the total number of instructions and simplifying their functions, so that instructions can execute in a single cycle, and to raise execution speed through optimizing compilation. RISC uses hardwired control logic and relies on the optimizing compiler.

The key technologies of RISC are:

  • Overlapping register windows technology was first used in Berkeley's RISC project;
  • Optimizing compilation technology, RISC uses a large number of registers. How to reasonably allocate registers, improve register usage efficiency, reduce the number of memory accesses, etc. should be achieved through the optimization of compilation technology;
  • Superpipeline and superscalar technology: new technologies adopted by RISC to further increase pipeline speed;
  • Microprogramming technology that combines hardwired logic with microprogram control.

DSP

A DSP chip is a microprocessor with a special structure. To achieve fast digital signal processing, DSP chips generally adopt special software and hardware structures:
1) Harvard architecture
DSPs adopt the Harvard architecture, which divides the memory space into two parts that store programs and data separately. Each part has its own bus to the processor core, allowing simultaneous access, and each memory is independently addressed and independently accessed. This arrangement doubles the processor's data throughput and, more importantly, supplies data and instructions to the core at the same time. Under this layout a DSP can implement single-cycle MAC instructions. Because program and data memory occupy two separate spaces, instruction fetch and execution can completely overlap.
2) Pipeline
Related to the Harvard structure, DSP chips widely use 2-6 stage pipelines to reduce instruction execution time and thereby enhance the processor's processing power. This enables complete overlap of instruction execution, with different instructions active during each instruction cycle.
3) Independent hardware multiplier
In systems implementing multimedia functions and digital signal processing, algorithm implementation and digital filtering are computationally intensive, and in these settings multiplication is an important part of the processing and a basic element of many algorithms. The faster the multiplication, the higher the DSP's performance. Compared with the 30-40 instruction cycles a general-purpose processor may need, a DSP chip is characterized by a dedicated hardware multiplier that completes a multiplication in one cycle.
4) Special DSP instructions:
Special instructions are used to optimize some common algorithms in digital signal processing. These special instructions provide acceleration for some typical number processing and can greatly improve the execution efficiency of the processor. Makes real-time data processing of some high-speed systems possible.
5) Independent DMA bus and controller
One or more independent DMA buses work in parallel with the CPU's program and data buses. Without interfering with the CPU, DMA speeds can exceed 800 MB/s. When large amounts of data are exchanged, this reduces CPU overhead, improves data throughput, and increases the system's parallel execution capability.
6) Multi-processor interface
Allows multiple processors to work in parallel or in series to increase processing speed.
7) JTAG, Joint Test Action Group, standard test interface (IEEE 1149 standard interface), facilitates on-chip online simulation of DSP and debugging under multi-DSP conditions.
8) Fast instruction cycle
The Harvard architecture, pipelined operation, the dedicated hardware multiplier, special DSP instructions, and optimized integrated-circuit design can bring a DSP chip's instruction cycle below 10 ns. Fast instruction cycles allow DSP chips to implement many DSP applications in real time.
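The single-cycle MAC mentioned above is the core of filtering workloads: an N-tap FIR filter is essentially N multiply-accumulates per output sample. A minimal sketch (coefficients and samples are illustrative):

```python
# Sketch: the multiply-accumulate (MAC) loop that DSP hardware executes
# one tap per cycle; this is the inner loop of an FIR filter.
def fir_sample(coeffs, history):
    acc = 0
    for c, x in zip(coeffs, history):
        acc += c * x               # one MAC per filter tap
    return acc

# 3-tap filter with unit coefficients = a running sum of the last 3 samples
assert fir_sample([1, 1, 1], [2, 4, 6]) == 12
```

On a DSP with a hardware multiplier this loop costs roughly one cycle per tap; on a processor without one, each iteration would pay the full software-multiply cost.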

MMU

Memory Management Unit (MMU): the hardware that manages the virtual memory system. The MMU is usually part of the CPU and contains a small amount of storage holding the table that maps virtual addresses to physical addresses, called the TLB (Translation Lookaside Buffer). All data requests go through the MMU, which determines whether the data is in RAM or on a mass storage device; if the data is not resident, the MMU raises a page-fault exception. The MMU's two main functions are translating virtual addresses into physical addresses and controlling memory access permissions. When the MMU is disabled, the virtual address is driven directly onto the physical address bus.

The Cortex-M3 processor uses the ARMv7-M architecture, which includes all 16-bit Thumb instructions and the basic 32-bit Thumb-2 instruction set. Cortex-M3 supports thread mode and handler mode. The processor enters thread mode on reset and returns to it on exception return; both privileged and unprivileged (user) code can run in thread mode. When an exception occurs, the processor enters handler mode, in which all code runs privileged. μC/OS-II can run on the Cortex-M3 processor.
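The virtual-to-physical translation performed by the MMU can be sketched with a tiny lookup table; the page size, mappings, and the use of a plain dictionary in place of real TLB hardware are all illustrative assumptions:

```python
# Sketch: virtual -> physical translation with 4 KB pages and a tiny
# mapping table standing in for the TLB / page table.
PAGE_SIZE = 4096
page_table = {0x0: 0x7, 0x1: 0x3}    # virtual page number -> physical frame

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError("page fault")   # MMU raises a fault exception
    return page_table[vpn] * PAGE_SIZE + offset

assert translate(0x0123) == 0x7 * 4096 + 0x123   # offset is preserved
assert translate(0x1000) == 0x3 * 4096           # VPN 1 -> frame 3
```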

MPU

An embedded microprocessor (MPU) is the core of an embedded system's hardware layer. Most work in systems designed for a specific class of users. It integrates on-chip many functions that general-purpose CPUs leave to board-level components, which helps embedded designs stay small while remaining efficient and reliable. The typical representative of the embedded microcontroller unit (EMCU) is the single-chip microcomputer, also called an embedded microcontroller. It is small and compact, installed as a component inside the controlled device, and mainly performs signal control functions.

A digital signal processor (DSP) is built from large-scale or very-large-scale integrated circuit chips and is a processor dedicated to signal-processing tasks. It developed gradually to meet the needs of high-speed real-time signal processing. With advances in integrated-circuit technology and in digital signal processing algorithms, DSP implementations keep changing and their processing power keeps improving.

System-on-a-chip (SoC) refers to integrating a complete system on a single chip, generally including the CPU, memory, and peripheral circuits. SoC developed in parallel with other technologies, such as silicon-on-insulator (SOI), which can provide higher clock frequencies and thus reduce microchip power consumption.

RAID

RAID technology is covered in a separate article on computer basics.

DMA

Direct Memory Access (DMA) is a control method in which data is exchanged directly between peripherals and memory without passing through the CPU. In DMA mode the CPU only issues a command to the DMA controller and lets the controller handle the transfer; when the transfer completes, the controller signals the CPU.

Metrics

Throughput

The throughput rate of the instruction pipeline is defined as: throughput rate TP = number of instructions / execution time.
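Applying this definition to a pipeline: with k stages of cycle time t, n instructions take (k + n - 1) cycles once the pipeline fills, so the throughput approaches 1/t for large n. A worked example with assumed figures (5 stages, 2 ns per stage, 100 instructions):

```python
# Throughput TP = number of instructions / execution time.
def pipeline_tp(n, k, t):
    """n instructions through a k-stage pipeline with cycle time t."""
    time = (k + n - 1) * t        # first instruction fills the pipeline
    return n / time

tp = pipeline_tp(100, 5, 2e-9)    # 100 instructions, 5 stages, 2 ns/stage
assert tp == 100 / (104 * 2e-9)   # ~4.8e8 instructions per second
```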

Embedded Systems

Embedded systems are application-centered, based on computer technology, and able to meet the requirements of different applications in functionality, reliability, cost, size, and power consumption. They are special-purpose computer systems whose software and hardware can be configured and tailored.

Components:

  • Embedded hardware platform
  • Related supporting hardware
  • Embedded operating system
  • Support software
  • Applications

The main reason embedded systems use interrupts for input and output is to respond quickly to urgent events. During an interrupt, the CPU's breakpoint information is generally saved on the stack.

The hardware abstraction layer is the interface layer between the operating system kernel and the hardware circuit. Its purpose is to abstract the hardware. It hides the hardware interface details of a specific platform and provides a virtual hardware platform for the operating system, making it hardware-independent and portable on multiple platforms.

embedded operating system

The embedded operating system is an operating system software used in embedded systems to realize the allocation of software and hardware resources, task scheduling, control and coordination of concurrent activities, etc. In addition to the most basic functions of a general operating system, such as multi-task scheduling, synchronization mechanism, etc., it usually also has the following features suitable for embedded systems:

  • Application-oriented, and can be tailored and ported, supporting an open, scalable architecture;
  • Strong real-time performance to adapt to various control equipment and systems;
  • Hardware adaptability, providing effective support for different hardware platforms and a unified device-driver interface; high reliability, requiring little user intervention during operation while handling various events and faults;
  • Small code footprint, usually burned (solidified) into the limited storage of the embedded system.


In embedded development, since embedded devices lack sufficient processor power and storage space, programs are generally developed on a PC (the host machine) and the executable file is then downloaded to the embedded system (the target machine) to run. When the host's and target's machine instructions differ, a cross toolchain (a complete set of tools for compiling, assembling, linking, etc.) is required, such as a cross compiler, to generate the target machine's executable code on the host.

initialization

The embedded system initialization process can be divided into three main stages, ordered bottom-up from hardware to software: chip-level initialization, board-level initialization, and system-level initialization.

  • Chip-level initialization: completes the initialization of the embedded microprocessor, including setting its core registers and control registers, its core working mode, and its local bus mode. Chip-level initialization takes the microprocessor step by step from its power-on default state to the working state the system requires. This is a purely hardware initialization process.
  • Board-level initialization: completes the initialization of hardware devices other than the embedded microprocessor. In addition, some software data structures and parameters must be set, establishing the hardware and software environment for subsequent system-level initialization and for running applications. This initialization process involves both hardware and software.
  • System-level initialization: mainly a software initialization process, chiefly initializing the operating system. The BSP hands control of the embedded microprocessor to the embedded operating system, which completes the remaining initialization: loading and initializing hardware-independent device drivers, establishing the system memory map, and loading and initializing other system software modules such as the network stack and the file system. Finally, the operating system sets up the application environment and transfers control to the application's entry point.

Embedded Development

Embedded software development

Embedded hardware development

Power consumption control

Power consumption control at the software design level can mainly be carried out from the following aspects:

  1. Software and hardware co-design, that is, the design of software must match the hardware and consider hardware factors.
  2. Compilation optimization, using low-power optimization compilation technology
  3. Reduce the continuous running time of the system and optimize it from an algorithmic perspective
  4. Use interrupts instead of queries
  5. Effective management of power

RTOS

Real-Time Operating System (RTOS).

Tasks are the most important objects managed by an RTOS; each task is executed by the CPU in a time-shared manner under RTOS scheduling. There are currently three main scheduling approaches: time-slice round-robin, cooperative turn-taking, and priority preemption. A given RTOS may support one or several of them; priority preemption provides the best support for real-time behavior.

In non-real-time systems, the main goals of scheduling are shortening the system's average response time, improving resource utilization, or optimizing some other metric; in real-time systems, the goal of scheduling is to ensure, as far as possible, that every task meets its time constraints and responds to external requests in time.

Scheduling

Scheduling: Given a set of real-time tasks and system resources, the entire process of determining when and where each task will be executed.

  • Preemptive scheduling: usually priority-driven, as in uCOS. The advantages are good real-time behavior, fast response, and a relatively simple scheduling algorithm that can guarantee the time constraints of high-priority tasks; the disadvantage is more frequent context switching.

  • Non-preemptive scheduling: usually time-slice based; a task may not be interrupted while executing. Once a task occupies the processor, it runs until it finishes or yields voluntarily, as in WinCE. The advantage is fewer context switches; the disadvantages are lower effective processor utilization and poorer schedulability.

  • Static table-driven strategy: Before running, the system uses a certain search strategy to generate a running schedule based on the time constraints and relationships of each task, indicating the starting running time and running time of each task.

  • Priority-driven strategy: Determine the execution order of tasks according to their priority.
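The priority-driven dispatch decision above can be sketched with a ready queue ordered by priority; the task names and priority values are hypothetical, and a real RTOS would of course also handle blocking, preemption points, and context switches:

```python
# Sketch: a priority-ordered ready queue; the dispatcher always picks
# the highest-priority ready task (larger number = higher priority).
import heapq

ready = []                                  # min-heap on negated priority

def make_ready(priority, name):
    heapq.heappush(ready, (-priority, name))

def dispatch():
    return heapq.heappop(ready)[1]          # highest-priority task runs

make_ready(1, "logger")
make_ready(5, "sensor_isr_bh")
make_ready(3, "control_loop")

order = [dispatch() for _ in range(3)]
assert order == ["sensor_isr_bh", "control_loop", "logger"]
```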

Real-time task classification: periodic tasks, sporadic tasks, and non-periodic tasks.

General structural model of a real-time system: a data-acquisition task collects sensor data; a data-processing task processes the collected data; the processed data is then passed to an actuator-management task, which drives the actuators.

microkernel system

The basic idea is to extract the parts of the operating system that interact directly with the hardware into a common layer, the hardware abstraction layer (HAL). The HAL, closely tied to the hardware, hides hardware diversity from the rest of the operating system; in effect it is a virtual machine that provides a series of standard services through API interfaces to all the layers built on top of it. Only a few components such as processor scheduling, storage management, and message passing remain in the microkernel, while components of a traditional operating system kernel are implemented outside it. For example, kernel functions such as the file system, process management, device management, virtual memory, and networking are implemented outside the kernel as independent subsystems. As a result, most operating system code only needs to be designed against a unified hardware architecture.

main feature:

  • The kernel is very small; many operating system services do not belong to the kernel but run on top of it, so the kernel need not be recompiled when higher-level modules are updated.
  • With a hardware abstraction layer, the kernel is easy to port to other hardware architectures: to move to a new software or hardware environment, only the hardware-related parts of the microkernel need slight modification, and in most cases external servers and client applications need not be ported at all.
  • Flexibility and scalability: to implement another policy, add an external server; to extend functionality, add or extend internal servers.

When designing a microkernel OS, using object-oriented techniques such as encapsulation, inheritance, object classes, and polymorphism, together with message-passing mechanisms between objects, can improve the system's correctness, reliability, modifiability, flexibility, and extensibility, and can also markedly reduce development cost. Compared with traditional operating systems, an operating system with a microkernel structure improves flexibility and scalability, enhances reliability, and provides support for distributed systems, for the following reasons:

  • Flexibility and scalability: since many functions of a microkernel OS are implemented by relatively independent server processes, supporting newly developed hardware or software only requires adding functions to the relevant server or adding a new dedicated server. The system's flexibility improves accordingly: new functions can be added to the operating system, existing functions modified, and obsolete functions removed, yielding a leaner, more effective operating system;
  • Enhanced reliability and portability: because the microkernel is carefully designed and rigorously tested, its correctness is easier to assure; moreover, it offers a standardized, streamlined application program interface (API) that favors high-quality programming. Since all servers run in user mode and communicate via message passing, an error in one server does not affect the kernel or the other servers. In addition, all code tied to a specific CPU and to I/O device hardware is placed in the kernel and the hardware-hiding layer beneath it, while most other parts of the operating system (the various servers) are independent of the hardware platform, so porting the OS to another computer hardware platform requires relatively few changes;
  • Support for distributed systems: because clients and servers, and servers among themselves, communicate by message passing, a microkernel OS can support distributed and networked systems well. As long as every process and server in the distributed system has a unique identifier and the microkernel holds a system mapping table (the correspondence between process/server identifiers and the machines on which they reside), a client need only tag a message with the sender's and receiver's identifiers, and the microkernel can deliver it via the mapping table, regardless of which machine the target resides on.

design

Core Technology of Embedded Systems

  1. Processor technology
  • General-purpose processors
    have strong versatility, fast time to market, and low cost, but performance may be poor for some applications.
  • Single-purpose processors
    perform better for particular applications but are less flexible; design cost is high, though unit cost drops at high volume.
  • Application-specific processors
    are smaller and lower cost.
  2. IC technology
  • Fully custom (VLSI)
    In fully custom IC technology, designers optimize every layer for the digital implementation of the specific embedded system, working from transistor layout size, placement, and wiring, and achieve high chip-area utilization, high speed, and low power. Actual chips are produced in a fab using masks. Fully custom IC design, also commonly called VLSI design, has high cost and long manufacturing time, and suits high-volume or performance-critical applications.
  • Semi-custom ASIC
    Semi-custom ASIC is a constrained design method that includes the gate-array and standard-cell approaches. The chip is prefabricated as a semi-finished part with generic cells and cell groups, so designers only need to consider the circuit's logical function and the reasonable interconnection of the functional modules. This design method is flexible, convenient, and cost-effective; it shortens the design cycle and improves yield.
  • Programmable ASIC
    In a programmable ASIC, all layers already exist; after the design is finished, the chip can be programmed (burned) in the lab without involving an IC manufacturer, so the development cycle shrinks dramatically. Programmable ASICs have lower cost but consume more power and are slower.
  3. Design/verification technology
    Embedded system design technology mainly includes hardware design and software design technology:

Hardware design technology includes:

  • Chip level design technology
  • Board level design technology

Embedded system design model

  • State machine model
  • data flow model
  • Concurrent process model
  • object-oriented model

embedded database

Embedded databases run in the same process as the application, similar to in-memory databases; typical examples include Apache Derby and H2.

Features: embedded, mobility, scalability, real-time.

The database system of an embedded system is called an embedded database system or an embedded real-time database system. Embedded systems must be able to run uninterrupted for long periods of time without human intervention, and therefore require high reliability. At the same time, database operations are required to be predictable, and the size and performance of the system must also be predictable to ensure system performance. Embedded systems need to deal with the underlying hardware, so when managing data, they must also have the ability to control the underlying layer, such as when disk operations will occur, the number of disk operations, how to control them, etc. The ability of underlying control is the key to determining database management operations. Embedded database management systems generally only provide native service interfaces to provide basic data support for front-end applications.

Origin blog.csdn.net/lonelymanontheway/article/details/132544507