Advanced Software Exam (System Architect): Computer Fundamentals

Overview

Today is September 28th, and there are only 37 days left before the advanced software exam. Keep going!

Concepts

Three cycles:

  • Clock cycle: the CPU's main frequency is also called the clock frequency; the clock cycle is the reciprocal of the clock frequency
  • Instruction cycle: the time to fetch and execute one instruction
  • Bus cycle: the time it takes to access memory or an I/O port

The relationship between the three: one instruction cycle consists of several bus cycles, and one bus cycle contains several clock cycles.

A main frequency of 1 GHz means there are 10^9 clock cycles per second, so each clock cycle is 1 s / 10^9 = 1 ns.
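As a quick sanity check, a minimal sketch (the helper name is illustrative, not from the exam syllabus):

```python
def clock_period_ns(freq_hz: float) -> float:
    """Clock period in nanoseconds: the reciprocal of the clock frequency."""
    return 1e9 / freq_hz

# A 1 GHz CPU has a 1 ns clock period.
print(clock_period_ns(1e9))  # 1.0
```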

Throughput rate
Throughput refers to the number of tasks completed per unit time. Common measures include MIPS and MFLOPS:

  • MIPS: Million Instructions Per Second, the number of millions of machine-language instructions executed per second; mainly used to measure the performance of scalar machines
  • MFLOPS: Million Floating-point Operations Per Second, millions of floating-point operations per second; it reflects only floating-point performance, not overall performance, and is mainly used to measure the performance of vector machines.

If an instruction takes 5 machine cycles and one machine cycle is 3 microseconds (µs), then each instruction takes 15 µs, so MIPS = 1/15 ≈ 0.067.
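This calculation can be sketched as follows (a hypothetical helper; times in µs, so the reciprocal of the per-instruction time is directly in MIPS):

```python
def mips(cycles_per_instruction: int, cycle_time_us: float) -> float:
    """MIPS for a machine where every instruction takes a fixed number
    of machine cycles. Instruction time (µs) = cycles * cycle_time_us;
    since 1 µs = 1e-6 s, MIPS = 1 / instruction_time_us."""
    return 1.0 / (cycles_per_instruction * cycle_time_us)

# 5 machine cycles of 3 µs each: 15 µs per instruction, MIPS = 1/15
print(round(mips(5, 3), 3))  # 0.067
```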

1 second = 1,000 milliseconds (ms) = 1,000,000 microseconds (µs) = 1,000,000,000 nanoseconds (ns)

Encoding and floating point numbers

Original code, inverse code, complement code

Original code (sign-magnitude): the sign bit followed by the binary magnitude. For example, the original code of decimal 10 is 0000 1010.

Inverse code (one's complement): for positive numbers it is the same as the original code, e.g. the inverse code of decimal 10 is 0000 1010. For negative numbers, every bit of the original code except the sign bit is flipped (0 to 1, 1 to 0). E.g. for decimal -10: original code 1000 1010, inverse code 1111 0101.

Complement code (two's complement): for positive numbers it is the same as the original code, e.g. the complement code of 10 is 0000 1010. For negative numbers it is the inverse code plus 1. E.g. for -10: inverse code 1111 0101, complement code 1111 0110.

Numbers in computers are stored in two's complement because, unlike the original and inverse codes, it has a single representation of zero and lets the same adder circuit handle addition and subtraction of positive and negative numbers directly.
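The conversion above can be checked in Python (a hypothetical helper; masking with `2**bits - 1` yields the two's-complement bit pattern directly):

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit pattern of a signed integer."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")

print(to_twos_complement(10))   # 00001010
print(to_twos_complement(-10))  # 11110110 (inverse 11110101 plus 1)
```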

Floating-point numbers

Scientific notation: N = mantissa × base^exponent

  • The meaning of each part of a floating-point number N = mantissa × base^exponent:
    • Generally, the mantissa is stored in complement (two's complement) code and the exponent in excess (biased) code;
    • The number of bits in the exponent determines the range of representable numbers: the more bits, the greater the range;
    • The number of bits in the mantissa determines the effective precision: the more bits, the higher the precision.
  • Floating-point operation rules: align exponents > operate on mantissas > normalize the result
    • When aligning exponents, the smaller exponent is adjusted to match the larger one;
    • Alignment is achieved by shifting right the mantissa of the number with the smaller exponent.
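The alignment rule can be illustrated with a small base-10 sketch (hypothetical helper; real hardware works on binary mantissas, but the principle is the same):

```python
def align_and_add(m1, e1, m2, e2, base=10):
    """Add two floating-point numbers given as (mantissa, exponent).

    Alignment: the number with the smaller exponent has its mantissa
    shifted right (divided by the base) until the exponents match."""
    while e1 < e2:
        m1 /= base  # shift the smaller number's mantissa right
        e1 += 1
    while e2 < e1:
        m2 /= base
        e2 += 1
    return m1 + m2, e1

# 1.5e2 + 2.5e1: 2.5e1 is aligned to 0.25e2, giving 1.75e2
print(align_and_add(1.5, 2, 2.5, 1))  # (1.75, 2)
```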

Pipelining

Pipelining is a quasi-parallel implementation technique in which the execution of multiple instructions overlaps during program execution. At any moment the various components work on different parts of different instructions simultaneously, which improves the utilization of each component and the average execution speed of instructions.

The idea of pipelining: convert the serial execution of a single instruction in the CPU into sub-processes of several instructions that execute in the CPU in an overlapped fashion.

Parallelism has two aspects: simultaneity and concurrency. Simultaneity means two or more events occur at the same instant; concurrency means two or more events occur within the same time interval.
Pipeline setup time: the execution time of one instruction.
Pipeline cycle (Δt): the execution time of the slowest stage.

The execution process of an instruction can be divided into three stages:

  • Fetch: read an instruction from main memory at the address in the program counter and place it in the instruction register.
  • Analyze: decode the opcode, form the operand address from the given addressing mode and the address field, and read the operand from that address.
  • Execute: perform the operation and write the result to a general register or main memory.

Sequential execution: fetch -> analyze -> execute
Pipelined execution overlaps the fetch, analyze, and execute stages of consecutive instructions.
Related exam points:
Pipeline execution time (theoretical formula): (t1 + t2 + ... + tk) + (n-1)*∆t
Pipeline execution time (practical formula): k*∆t + (n-1)*∆t
Pipeline throughput: TP = number of instructions / execution time
Pipeline maximum throughput: TPmax = 1/∆t (the limit of TP as n grows)
Pipeline speedup: sequential execution time / pipelined execution time
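The two timing formulas can be checked numerically (illustrative helpers; Δt is taken as the longest stage time):

```python
def pipeline_time(stage_times, n, delta_t=None):
    """Theoretical formula: (t1 + ... + tk) + (n-1)*Δt,
    where Δt (the pipeline cycle) is the longest stage time."""
    if delta_t is None:
        delta_t = max(stage_times)
    return sum(stage_times) + (n - 1) * delta_t

def pipeline_time_practical(k, n, delta_t):
    """Practical formula: k*Δt + (n-1)*Δt."""
    return k * delta_t + (n - 1) * delta_t

# 3 stages of 2, 2, 1 ns, 10 instructions; Δt = 2 ns:
print(pipeline_time([2, 2, 1], 10))       # 5 + 9*2 = 23
print(pipeline_time_practical(3, 10, 2))  # 3*2 + 9*2 = 24
```

The practical formula assumes every stage takes a full Δt, so it never undershoots the theoretical one.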

Boot process

The BIOS boot program is stored in a ROM chip and runs automatically every time the computer is powered on. It mainly performs the following tasks:

  • Identify and configure all Plug and Play devices. If the system has a Plug and Play BIOS, it searches for and tests all installed Plug and Play devices and assigns them the DMA channels, IRQs, and other resources they require
  • Complete the power-on self-test (POST). POST mainly detects and tests basic devices such as memory, ports, the keyboard, the video adapter, and disk drives; some newer systems also test the CD-ROM drive
  • Locate the bootable partition of the boot drive. In CMOS, the user can set the boot order to change which drive's bootable partition is used. For most systems the boot sequence is the floppy drive, then the hard drive, then the CD-ROM drive
  • Load the master boot record (MBR) and the partition table of the boot drive, and execute the MBR. After the MBR finds a bootable partition on the hard disk, it loads that partition's boot record into memory and hands over control to it; the partition boot record locates the root directory and then loads the operating system.

Cache

All Cache functions are implemented in hardware. The Cache raises the rate at which the CPU can read and write data, easing the von Neumann bottleneck, i.e. the limited data-transfer bandwidth between the CPU and the storage system. In the computer storage hierarchy, Cache is the layer with the fastest access speed. The basis for using Cache to improve system performance is the principle of program locality.

  • Temporal locality: if an item is accessed, it is likely to be accessed again in the near future. Program loops, stacks, and the like are causes of temporal locality;
  • Spatial locality: once a program accesses a storage unit, nearby units are likely to be accessed soon; that is, the addresses a program accesses within a period tend to be concentrated in a range. The typical case is sequential execution of a program.

Working set theory: The working set is a collection of pages that are frequently accessed when the process is running.
Working principle: based on the principle of locality, content in main memory with a high probability of access is kept in the Cache. When the CPU needs to read data, it first searches the Cache; if the data is there, it is read directly from the Cache; if not, the data is read from main memory and sent to the CPU and the Cache at the same time.

If most of the content the CPU accesses is found in the Cache (a hit), system performance improves greatly. After the CPU issues a memory access, the memory address is first sent to the Cache controller to determine whether the required data is already in the Cache; on a hit, the Cache is accessed directly. Translating a main-memory address into a Cache location is called cache address mapping. Common mapping methods include direct mapping, fully associative mapping, and set-associative mapping:

  • Direct mapping: the hardware circuit is relatively simple, but the conflict rate is high
  • Fully associative mapping: the circuit is difficult to design and implement, suitable only for small-capacity caches; low conflict rate
  • Set-associative mapping (n-way): a compromise between direct and fully associative mapping

When the Cache is full, existing data must be evicted before new data can be loaded. Replacement algorithms include: random replacement, first-in-first-out (FIFO), and least recently used (LRU). LRU has the highest average hit rate.
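As an illustration of LRU eviction (a minimal sketch using Python's `OrderedDict`, not how cache hardware implements it):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch: on overflow, the least recently used
    entry is the one evicted."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None              # miss: would be fetched from main memory
        self.data.move_to_end(key)   # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")         # "a" becomes most recently used
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```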

Common methods for writing data in Cache include the following categories:

  • Write-through: when data is written to the Cache, it is written to main memory at the same time; also called penetration writing.
  • Write-back: after the CPU modifies a line in the Cache, the data is not written to main memory immediately; it is written back only when the line is evicted from the Cache.
  • Marking method: each datum in the Cache has a valid bit, set to 1 when the data enters the Cache. When the CPU modifies the data, it writes only to main memory and clears the bit to 0. When reading, the valid bit is tested: if it is 1, the data is taken from the Cache; otherwise it is taken from main memory.

Check code

The check code is usually the last digit of a group of numbers, which is derived from the previous numbers through some kind of operation to verify the correctness of the group of numbers.

Common check codes include: the last digit of a People's Republic of China resident ID number, the last digit of an ISBN, the last digit of an organization code, check codes for verifying data-transmission correctness, etc.

When codes are input as data into computers or other devices, input errors are prone to occur. In order to reduce input errors, coding experts have invented various verification and error detection methods, and set check codes based on these methods.

Parity check: detects only an odd number of bit errors and cannot correct errors.
Cyclic redundancy check (CRC): detects but cannot correct errors; the check code is computed using modulo-2 division.
Hamming code: can both detect and correct errors; the number of check bits r for m data bits satisfies 2^r >= m + r + 1.
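The check-bit inequality can be solved for the minimum r by simple search (an illustrative helper):

```python
def hamming_check_bits(m: int) -> int:
    """Smallest r satisfying 2**r >= m + r + 1
    (m data bits, r Hamming check bits)."""
    r = 1
    while 2 ** r < m + r + 1:
        r += 1
    return r

print(hamming_check_bits(8))   # 4, since 2**4 = 16 >= 8 + 4 + 1
print(hamming_check_bits(64))  # 7, since 2**7 = 128 >= 64 + 7 + 1
```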

Performance evaluation

Performance

The performance of computer systems generally includes two major aspects:

  • Availability is the time a computer system can work normally; it can be measured as the length of continuous working time, or as the percentage of time within a period during which the system works normally.
  • Processing capability has three categories of indicators: throughput, response time, and resource utilization, which is the ratio of the time each component is in use to the total time in a given interval.

Generally, the availability of a computer system can be evaluated from three aspects: failure rate, robustness, and recoverability:

  • Failure rate refers to the number of system failures and maintenance events that occur in a given period of time
  • Robustness refers to the system's ability to detect and handle faults, as well as the system's ability to work under various fault conditions.
  • Recoverability refers to the ability of a system to recover from a faulty state to a normal state.

The portability of computer application systems matters for promoting applications, but for most users, who run a single system, the availability indicators that matter are mainly failure rate, robustness, and recoverability.

In addition, the performance indicators of different computers (systems, modules) are also different. The key is to understand:

  • Computer: clock frequency (main frequency), cache, operation speed, operation accuracy, memory storage capacity, memory access cycle, data processing rate, response time, RASIS characteristics, mean failure response time, compatibility
  • Network: device level, network level, application level, user level, throughput
  • Operating system: system reliability, system throughput, system response time, system resource utilization, portability
  • Database: database description function, management function, query and manipulation function, maintenance function
  • Web server: maximum number of concurrent connections, response latency, throughput (requests processed per second), number of successful requests, number of failed requests, clicks per second, successful clicks per second, failed clicks per second, number of connection attempts, and number of user connections

Performance evaluation

Computer performance evaluation methods fall into two basic categories: measurement methods and model methods.

  • Measurement method: using measurement equipment or measurement programs, performance indicators (or quantities closely related to them) are measured directly from the system; the corresponding performance indicators are then obtained through simple calculations.
  • Model method: The basic idea is to first establish an appropriate model for the system to be evaluated, and then find the performance indicators of the model in order to evaluate the performance of the system.

Classic performance evaluation methods among measurement methods:

  • Clock frequency method: a computer's clock frequency reflects its speed to a certain extent. For computers of the same type, a higher clock frequency means a faster machine; however, computers with different architectures can differ significantly in performance even at the same clock frequency.
  • Instruction execution speed method: since the speed of addition generally reflects that of other arithmetic operations such as multiplication and division, and simple instructions such as logical operations and transfers are often designed to take the same time as addition, a computer's speed can be measured by the speed of its addition instruction. MIPS is commonly used to evaluate system performance.
  • Equivalent instruction speed method: also known as the Gibson mix method, it computes the machine's operation speed as a weighted average over the proportions of the various instruction types in the program.
  • Data processing rate method: Processing Data Rate, PDR, uses the method of calculating the PDR value to measure machine performance. The larger the PDR value, the better the machine performance. PDR is related to the average number of bits per instruction and each operand, as well as the average operating speed per instruction. PDR mainly measures the speed of the CPU and main memory. It is not suitable for measuring the overall speed of the machine and cannot fully reflect the performance of the computer because it does not involve the impact of technologies such as Cache and multi-function components on performance.
  • Composite theoretical performance method (CTP): first calculate the effective calculation rate of each computing unit of the processing components, then adjust for word length to obtain each unit's theoretical performance; the sum of the units' theoretical performances is the final computer performance. System performance is expressed in millions of theoretical operations per second (MTOPS).
  • Benchmark program method: the core programs used most heavily and most frequently by applications serve as standard programs, called benchmarks, for evaluating computer system performance. The benchmark method is currently recognized as the better way to test system performance; it takes into account the impact of the I/O structure, operating system, and compiler efficiency on system performance.
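The Gibson-mix weighted average described above can be sketched as follows (the instruction times and proportions are made-up figures for illustration only):

```python
def gibson_mix_mips(instr_times_us, proportions):
    """Equivalent instruction speed: the weighted average instruction
    time, weighted by each instruction type's proportion in the
    program; its reciprocal (times in µs) is the speed in MIPS."""
    avg_time_us = sum(t * p for t, p in zip(instr_times_us, proportions))
    return 1.0 / avg_time_us

# Hypothetical mix: add 1 µs (50%), multiply 4 µs (30%), branch 2 µs (20%)
# Average time = 0.5 + 1.2 + 0.4 = 2.1 µs, about 0.476 MIPS
print(round(gibson_mix_mips([1, 4, 2], [0.5, 0.3, 0.2]), 3))
```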

On the other hand, there are three main methods to evaluate the performance of computer systems:

  • Measurement method: Mainly by using various performance data acquisition methods and running various types of benchmark testing programs or tools to measure the performance of the target system
  • Analysis method: By establishing a mathematical model for the computer system, and then obtaining the performance of the target system through calculation under given input conditions
  • Simulation method: Approximately imitate the target system by constructing a system model and workload model to understand the characteristics of the system

Benchmark program method

In most cases, to test the performance of a new system, users must rely on evaluation programs to evaluate the performance of the machine. The core program that is used the most and most frequently by applications is used as a standard program to evaluate computer performance, which is called a benchmark.

The evaluation accuracy of real programs, kernel programs, small benchmark programs, and synthetic benchmark programs decreases in that order.

The Transaction Processing Performance Council (TPC) is a non-profit organization that develops standard specifications, performance and price measurements for business application benchmark programs, and manages the release of test results. The TPC-C it publishes is a benchmark program for online transaction processing. TPC-D is a benchmark program for decision support.

Web server performance indicators mainly include request response time, transaction response time, number of concurrent users, throughput, resource utilization, and the number of transactions the system can handle per second.
The benchmark program method mainly focuses on the performance of the CPU (sometimes including main memory), and usually also considers the impact of the I/O structure, operating system, compiler efficiency, etc. on system performance.

Computer system performance indicators are high-precision data, whereas user questionnaires or expert panels yield only rough, outline figures; moreover, most users do not use multiple computer systems and cannot compare them. The evaluation of computer system performance indicators is therefore generally not determined through user surveys.

Amdahl's law

Gene Amdahl's law describes how improving the performance of one part of a system affects the performance of the whole system; the system here can be a computer system or any other system.

When improving the performance of one part of the system, the impact on the overall system performance depends on:

  1. How large a share of the whole does this part account for?
  2. By how much is this part's performance improved?

Calculation: overall speedup = 1 / ((1 - f) + f/s), where f is the fraction of execution time the improved part accounts for and s is that part's speedup factor.
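Amdahl's law can be sketched as follows (illustrative helper; `fraction` is the share of execution time the improved part accounts for, `speedup` its local speedup factor):

```python
def amdahl_speedup(fraction: float, speedup: float) -> float:
    """Overall speedup when a fraction f of execution time is
    accelerated by a factor s: 1 / ((1 - f) + f / s)."""
    return 1.0 / ((1.0 - fraction) + fraction / speedup)

# Speeding up 40% of the system by 10x gives only about 1.56x overall:
print(round(amdahl_speedup(0.4, 10), 2))  # 1.56
```

The example shows why the unimproved fraction dominates: even an infinite local speedup of 40% of the work caps the overall gain at 1/0.6 ≈ 1.67x.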

Reference


Origin blog.csdn.net/lonelymanontheway/article/details/120401683