Software Designer Exam: Knowledge Point Review Outline (First Half of 2023)

Foreword: The National Computer Technology and Software Professional Technical Qualification (Level) Examination (hereinafter referred to as the IT Professional Qualification Examination) is a national, authoritative computer science and technology examination and vocational skills proficiency certification exam, administered by the Ministry of Personnel of the People's Republic of China and sponsored by the National Computer Network and Information Security Management Center. It mainly gives enterprises, institutions, and training organizations a way to test and certify professional skills in computing and software.

Within the IT Professional Qualification Examination, the software designer examination (commonly shortened to the "soft exam") is an important category, and its certificate is regarded as an important professional credential for people working in the software industry.

The intermediate software designer qualification is one level of the soft exam, corresponding to the professional level of a software engineer. It certifies competence in software design and development and is one of the core certifications for software professionals. The exam covers software development, requirements analysis, software testing, software project management, software quality assurance, and related topics. Passing it shows that a candidate has the professional knowledge and practical experience in software design and development to play an important role in software projects and to deliver high-quality software solutions for enterprises and organizations.

 


The following knowledge points for the software designer exam were compiled by earlier candidates; we hope they help you achieve a good result~

Chapter 1 Computer System Knowledge

A computer system consists of two parts: hardware and software.

The five components of a computer hardware system

Controller, arithmetic unit, memory, input devices, output devices

Memory is divided into internal memory (main memory: small capacity, fast, holds temporary data, contents lost on power failure) and external memory (hard disk, CD: large capacity, slow, long-term data storage)

Input devices and output devices are collectively referred to as peripherals

Host ( CPU + main memory )

Central Processing Unit (CPU)

CPU composition: arithmetic unit, controller, register group (fastest access speed), and internal bus

CPU functions: program control, operation control, timing control, and data processing

Composition of the arithmetic unit (often tested):

Arithmetic logic unit (ALU): performs arithmetic and logical operations on data

Accumulator (AC): provides a working area for the ALU; holds operation results or a source operand

Data buffer register (DR): temporarily holds an instruction or data word being read from or written to memory

Program status word (PSW), the status condition register: saves the condition flags produced by the result of an instruction, such as the overflow flag

Function of the arithmetic unit: performs arithmetic and logical operations

Controller:

Instruction register (IR): temporarily holds the instruction the CPU is currently executing

Program counter (PC): holds the address of the next instruction to execute

Address register (AR): holds the memory address the CPU is currently accessing

Instruction decoder (ID): analyzes the opcode of the instruction

Function of the controller: controls the work of the entire CPU and is its most important part; includes program control and timing control

Programmers can access the general-purpose registers, the status register, and the program counter, but cannot access the instruction register

Number base conversion

Binary, hexadecimal (0x18F, 18FH)

Base R to decimal, example: base-6 5043 to decimal => 3*6^0 + 4*6^1 + 0*6^2 + 5*6^3 => 1107

Decimal to base R, example: decimal 200 to base 6 => 200/6 = 33 remainder 2 => 33/6 = 5 remainder 3 => 5/6 = 0 remainder 5 => the base-6 result is the remainders read in reverse order: 532

Base m to base n: convert through decimal
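The two conversion procedures above can be sketched in a few lines of Python. This is only an illustration; the function names `to_decimal`, `from_decimal`, and `convert` are made up for this note and are not part of the exam material.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_decimal(text: str, base: int) -> int:
    """Base R -> decimal: positional-weight sum, evaluated in Horner form."""
    value = 0
    for ch in text.upper():
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return value

def from_decimal(number: int, base: int) -> str:
    """Decimal -> base R: divide repeatedly, read the remainders in reverse."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, r = divmod(number, base)
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))

def convert(text: str, src_base: int, dst_base: int) -> str:
    """Base m -> base n, going through decimal as the notes suggest."""
    return from_decimal(to_decimal(text, src_base), dst_base)

print(to_decimal("5043", 6))   # the base-6 example from the notes
print(from_decimal(200, 6))    # the decimal 200 example from the notes
print(convert("18F", 16, 2))   # hexadecimal 18FH written in binary
```

Going through decimal costs one extra step but keeps each direction as simple as the hand method.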

Representation of numbers (rarely tested)

Smallest unit of data: b (bit)

Smallest unit of storage: 1 B (byte) = 8 b

1KB=1024B; 1MB=1024KB; 1GB=1024MB

Machine number: the form in which a value is represented inside the computer. It uses the binary system, the sign is represented by 0 or 1, and the decimal point is implied and occupies no bit position. For example, +0 is (0 0000000) and -0 is (1 0000000), where the first bit is the sign and the remaining seven bits are the value bits

Fixed-point representation: divided into pure fractions and pure integers; the decimal point does not occupy storage space

Pure fraction: by convention the decimal point sits before the highest value bit of the machine number

Pure integer: by convention the decimal point sits after the lowest value bit of the machine number

True value: the actual value corresponding to the machine number

Number encodings (rarely tested)

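Sign-magnitude code: the ordinary binary representation of a number with a sign bit, for example +0 (0 0000000) and -0 (1 0000000)

Ones' complement code: for a positive number it is the same as the sign-magnitude code; for a negative number, take the sign-magnitude code and invert every bit except the sign bit (for the values above: +0 (0 0000000) and -0 (1 1111111))

Two's complement code: for a positive number it is the same as the sign-magnitude code; for a negative number, take the sign-magnitude code, invert every bit except the sign bit, and then add 1 to the lowest bit, carrying as needed: +0 (0 0000000) and -0 (0 0000000); computing the two's complement of -0 overflows, and the carry out of the top bit is discarded

Excess code: used for the exponent in floating-point arithmetic; for positive and negative numbers alike, it is obtained by inverting the first bit (the sign bit) of the two's complement code

Computer systems usually use two's complement to represent and operate on data, because it simplifies the design of the arithmetic hardware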
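As a rough illustration of the four encodings, the sketch below builds 8-bit codes for small integers. The helper names and the fixed width `BITS = 8` are assumptions made for this example. Python integers have no negative zero, so the +0 / -0 cases from the notes are not reproduced here.

```python
BITS = 8  # assumed word length for this sketch; valid inputs are -127..127

def sign_magnitude(x: int) -> str:
    """Sign bit followed by the magnitude in plain binary."""
    sign = "1" if x < 0 else "0"
    return sign + format(abs(x), f"0{BITS - 1}b")

def ones_complement(x: int) -> str:
    """Positive: same as sign-magnitude. Negative: invert all value bits."""
    code = sign_magnitude(x)
    if x >= 0:
        return code
    flipped = "".join("1" if b == "0" else "0" for b in code[1:])
    return "1" + flipped

def twos_complement(x: int) -> str:
    """Masking to BITS bits gives the same result as 'invert and add 1'."""
    return format(x & (2**BITS - 1), f"0{BITS}b")

def excess_code(x: int) -> str:
    """Two's complement with the sign bit inverted."""
    code = twos_complement(x)
    return ("0" if code[0] == "1" else "1") + code[1:]

for value in (5, -5):
    print(value, sign_magnitude(value), ones_complement(value),
          twos_complement(value), excess_code(value))
```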

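Floating-point representation (often tested)

Floating-point number N = F * 2^E, where E is the exponent (a signed pure integer) and F is the mantissa (a signed pure fraction), similar to decimal scientific notation, for example 101.011 = 0.101011 * 2^3

The range of values a floating-point number can represent is determined by the exponent, and its precision is determined by the mantissa

Floating-point arithmetic proceeds in order: 1. exponent alignment, that is, making the exponents equal before computing, with the smaller exponent aligned to the larger one, otherwise more mantissa precision is lost -> 2. mantissa calculation -> 3. normalization of the result

Floating-point storage format: | exponent sign | exponent | mantissa sign | mantissa |; generally the mantissa uses two's complement and the exponent uses excess code

Normalized floating-point number: the absolute value of the mantissa is restricted to [0.5, 1)

Range of floating-point numbers: let M be the number of mantissa bits in two's complement (including the mantissa sign) and R the number of exponent bits in two's complement (including the exponent sign); then the largest positive number is +(1 - 2^-(M-1)) * 2^(2^(R-1) - 1) and the smallest negative number is -1 * 2^(2^(R-1) - 1)

Fixed-point versus floating-point representation: fixed-point representation is divided into fixed-point integers and fixed-point fractions, and its decimal point occupies no storage bit; with the same total number of bits, floating-point representation can represent larger numbers

For a machine word length of n, the range of a fixed-point fraction is the range of a fixed-point integer divided by 2^(n-1)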

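Addressing

Immediate addressing: the operand is contained in the instruction itself

Direct addressing: the operand is in a memory cell, and the instruction gives the address of that cell

Register addressing: the operand is in a register, and the instruction gives the name of that register; this is faster than direct addressing because registers are "closer" to the CPU

Register indirect addressing: the operand is in a memory cell, a register holds the memory address of the operand, and the instruction holds the register name; the addressing path is instruction -> register -> memory

Indirect addressing: the instruction gives the address of the operand's address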

Addressing efficiency: immediate addressing > register addressing > direct addressing > register indirect addressing > indirect addressing

The purpose of using different addressing methods: to expand the addressing space and improve programming flexibility

Check codes

Code distance: the minimum number of bit positions in which any two legal codewords of a coding system differ

Parity check code: adds one check bit to the code so that the number of 1s is odd (or even); the code distance is 2. Parity can only detect errors and cannot correct them

Hamming code: a check method that uses parity to detect and correct errors. The code distance is 3. With n data bits and k check bits, n and k must satisfy: 2^k - 1 >= n + k

Cyclic redundancy check code (CRC): can detect errors but cannot correct them. The code distance is 2. The codeword consists of k data bits + r check bits, and the check bits are generated from the information bits. The more check bits there are, the stronger the checking capability. Modulo-2 arithmetic is used when computing the CRC code
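A small sketch of the two calculations mentioned above: the minimum number of Hamming check bits from 2^k - 1 >= n + k, and CRC check bits by modulo-2 (XOR) long division. The function names are hypothetical, and the data word and generator polynomial 10011 are just a commonly used textbook example, not values from these notes.

```python
def hamming_check_bits(n: int) -> int:
    """Smallest k that satisfies 2**k - 1 >= n + k for n data bits."""
    k = 1
    while 2**k - 1 < n + k:
        k += 1
    return k

def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of modulo-2 long division (subtraction is XOR, no borrows)."""
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - len(gen) + 1):
        if work[i] == 1:  # divide only where the leading bit is 1
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-(len(gen) - 1):])

def crc_encode(data: str, generator: str) -> str:
    """Append r zero bits, divide, and attach the r-bit remainder."""
    r = len(generator) - 1
    return data + mod2_remainder(data + "0" * r, generator)

codeword = crc_encode("1101011011", "10011")
print(codeword)
# The receiver divides the whole codeword; an all-zero remainder means
# no error was detected.
print(mod2_remainder(codeword, "10011"))
print(hamming_check_bits(8))
```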

RISC and CISC

RISC: reduced instruction set computer; few instructions, low complexity, fixed instruction length, few addressing modes, many general-purpose registers, supports pipelining, uses hard-wired control logic (a combinational logic controller)

CISC: complex instruction set computer; many complex instructions, variable instruction length, complex and diverse addressing modes, relatively few general-purpose registers, makes less use of pipelining, uses microprogrammed control

Pipeline technology

Pipeline: a quasi-parallel processing technique in which the execution of multiple instructions overlaps

An instruction is divided into three stages: fetch -> analyze -> execute

Execution time of one complete instruction = fetch time + analysis time + execution time

Pipeline cycle: the duration of the longest stage of the instruction. For example, with fetch 1 ms -> analyze 3 ms -> execute 2 ms, the pipeline cycle is 3 ms, which means that after the first instruction each following instruction completes only 3 ms later

Total pipeline time: theoretical formula: time of one complete instruction + (number of instructions - 1) * pipeline cycle; practical formula: charge every stage of the first instruction one full pipeline cycle, that is, (number of stages + number of instructions - 1) * pipeline cycle

Throughput: the number of tasks the pipeline completes per unit time, calculated as number of instructions / total pipeline time; the maximum throughput = 1 / pipeline cycle
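The pipeline formulas can be checked with a short calculation. The sketch below uses the 1 ms / 3 ms / 2 ms stages from the notes; the instruction count of 100 and the function names are assumptions chosen for illustration.

```python
def pipeline_times(stage_times, instructions):
    """Return (pipeline cycle, theoretical total time, practical total time)."""
    cycle = max(stage_times)  # the longest stage sets the pipeline cycle
    theoretical = sum(stage_times) + (instructions - 1) * cycle
    practical = (len(stage_times) + instructions - 1) * cycle
    return cycle, theoretical, practical

def throughput(instructions, total_time):
    """Tasks completed per unit time."""
    return instructions / total_time

cycle, theoretical, practical = pipeline_times([1, 3, 2], 100)
print(cycle, theoretical, practical)   # times in ms
print(throughput(100, theoretical))    # instructions per ms
print(1 / cycle)                       # maximum throughput
```

The practical formula always gives a total at least as large as the theoretical one, because it rounds every stage of the first instruction up to a full cycle.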

Asynchronous control will prolong the time and reduce performance, and an end signal must be sent after each operation

Memory

Classified by location: internal memory; external memory

Classified by working mode: read/write memory (RAM); read-only memory (ROM)

Classified by access method: content-addressed memory (for example, associative memory) and address-addressed memory

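Classified by addressing method: random access memory, sequential access memory, direct access memory

Flash memory: a kind of read-only memory (ROM) that is erased in units of blocks; think of a USB flash drive

Virtual memory: made up of main memory + auxiliary storage

Storage hierarchy, from the inside out: CPU internal general-purpose registers > Cache (SRAM, static random access memory) > main memory (DRAM, dynamic random access memory, which needs periodic refresh to retain data) > external storage

Spatial locality: if a storage location is accessed, nearby locations are also likely to be accessed in the near future

Temporal locality: if a storage location is accessed, that same location is likely to be accessed again later

Cache

Cache: located between the CPU and main memory, it holds the currently most active programs and data (a copy of part of main memory); it is 5 to 10 times faster than main memory and is transparent to programmers (programmers cannot operate on it)

The larger the Cache capacity, the higher the hit rate, gradually approaching 100%; but as capacity grows, Cache cost and hit time also increase

Replacement algorithms: the goal is to give the Cache a higher hit rate

Random replacement

First in, first out (FIFO)

Least recently used (LRU)

Optimal replacement (OPT)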
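To see how two of these replacement policies differ, here is a minimal simulation that counts hits for FIFO and LRU on the same block reference sequence. The reference sequence, the capacity of 3 blocks, and the function names are all made up for this sketch.

```python
from collections import OrderedDict, deque

def fifo_hits(refs, capacity):
    """Count hits when the oldest loaded block is evicted first."""
    cache, order, hits = set(), deque(), 0
    for block in refs:
        if block in cache:
            hits += 1
            continue
        if len(cache) == capacity:
            cache.discard(order.popleft())  # evict the earliest arrival
        cache.add(block)
        order.append(block)
    return hits

def lru_hits(refs, capacity):
    """Count hits when the least recently used block is evicted first."""
    cache, hits = OrderedDict(), 0
    for block in refs:
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # mark as most recently used
            continue
        if len(cache) == capacity:
            cache.popitem(last=False)       # evict the least recently used
        cache[block] = True
    return hits

refs = [1, 2, 3, 1, 4, 1, 2]
print(fifo_hits(refs, 3), lru_hits(refs, 3))
```

On this sequence LRU keeps block 1 because it was just reused, while FIFO evicts it for being the oldest arrival, which is why LRU scores one more hit.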

Address mapping methods in the Cache

Address mapping: while working, the CPU issues main memory addresses; to read or write information in the Cache, the main memory address must be converted into a Cache address. This conversion is called address mapping

Direct mapping: main memory is divided into regions and the Cache into blocks, with each region of main memory containing the same number of blocks as the Cache; the correspondence between the blocks of each region and the blocks of the Cache is fixed. The hardware circuit is simple, but the conflict rate is high

Fully associative mapping: main memory is not divided into regions; main memory and the Cache are divided into blocks of the same size, and any main memory block can be placed in any Cache block. The circuit is hard to design and only suits small-capacity caches; the conflict rate is low

Set-associative mapping: a compromise between direct mapping and fully associative mapping; blocks are first grouped into sets, with direct mapping between sets and fully associative mapping within a set

Probability of conflicts: fully associative mapping < set-associative mapping < direct mapping

The mapping between Cache and main memory is done automatically by hardware

Interrupts

Interrupt: when an event that needs urgent handling occurs, the currently running program is suspended, the relevant handler is executed, and control returns to the original program after handling. This process is called an interrupt

Interrupt vector: provides the entry address of the interrupt service routine

Interrupt response time: the period from when an interrupt request is issued to when the interrupt service routine is entered

The purpose of saving the context: to return correctly to the interrupted program and continue execution


Input/output (I/O) control modes

Program query mode (direct program control):

The CPU and I/O can only work serially; the CPU must keep polling the device status, stays busy for long periods, and CPU utilization is low

Only one word is read or written at a time

Data is placed into memory by the CPU

Interrupt-driven mode:

The I/O device actively notifies the CPU through an interrupt signal that the I/O operation has completed

The CPU and I/O peripherals can operate in parallel, improving CPU utilization

Only one word is read or written at a time

Data is placed into memory by the CPU

Direct memory access (DMA) mode:

The CPU issues a data read or write command to the I/O module and can then do other work. The I/O module establishes a direct data path with memory and, when its operation completes, notifies the CPU through an interrupt signal

The CPU and I/O peripherals can work in parallel

Data is placed directly into memory by the peripheral

The unit of each read or write is a block rather than a word

CPU intervention is needed only at the start and end of a block transfer

The CPU responds to a DMA request at the end of a bus cycle, and each data transfer occupies one memory cycle

An interrupt request from an I/O device is a maskable interrupt , and a power failure is a non-maskable interrupt

Bus

Bus: a group of signal lines that connects the components of a computer; it is the shared channel the computer uses to transmit information

Bus classification: data bus, address bus, control bus


Advantages of the bus: simplifies the system structure, reduces the number of connecting lines, and makes interface design, fault diagnosis, and maintenance easier, lowering cost

Bus bandwidth calculation: clock frequency * bytes transferred per clock cycle

Address bus width calculation: the width of the address bus indicates the addressing capability of the CPU and is related to the size of memory; the larger the memory, the wider the address bus required. For example: memory capacity 4GB -> 2^32 B -> address bus width 32

Data bus width calculation: The data bus width is the word length of the processor
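Both bus calculations reduce to one-line formulas. The sketch below assumes byte-addressable memory and lets a transfer take more than one clock cycle; the function names and the 200 MHz / 32-bit / 5-cycle figures are illustrative assumptions, not values from these notes.

```python
def address_bus_width(capacity_bytes: int) -> int:
    """Address lines needed to address every byte (byte addressing assumed)."""
    return (capacity_bytes - 1).bit_length()

def bus_bandwidth(clock_hz: float, bus_width_bits: int,
                  cycles_per_transfer: int = 1) -> float:
    """Bytes per second = clock frequency * bytes moved per clock cycle."""
    bytes_per_cycle = (bus_width_bits / 8) / cycles_per_transfer
    return clock_hz * bytes_per_cycle

print(address_bus_width(4 * 2**30))          # 4GB of memory
print(bus_bandwidth(200_000_000, 32, 5))     # bytes per second
```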

PCI: parallel internal bus, system bus ; SCSI: parallel external bus

Encryption technology and authentication technology

Problem solved by encryption: eavesdropping

Problems solved by authentication: tampering, impersonation, repudiation

Encryption technology

Symmetric encryption: the same key is used for encryption and decryption, so there is only one key; encryption and decryption are fast, which suits large amounts of plaintext data, but key distribution is a weakness

Asymmetric encryption: encryption and decryption use different keys, so there are two keys (a public key and a private key); encryption and decryption are slow, key distribution is not a problem, and the private key cannot be derived from the public key

Hybrid encryption: a mixture of symmetric and asymmetric encryption. The sender first encrypts the large amount of plaintext data with symmetric encryption, then encrypts the "symmetric encryption key" with the receiver's asymmetric public key and transmits it together with the encrypted data; the receiver uses the asymmetric private key to decrypt the "symmetric encryption key", and then uses that key to decrypt the data

Authentication technology

Message digest: the sender computes a hash digest of the plaintext and sends it together with the ciphertext; the receiver decrypts the ciphertext, computes the hash digest of the recovered plaintext, and compares the two. If they match, the message has not been tampered with
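The digest comparison can be sketched with Python's standard `hashlib` module. SHA-256 is used here as an assumption of this example (the notes themselves list MD5 and SHA-1), and the message contents and function names are made up. A bare digest only detects changes to the message; by itself it does not stop an attacker who can replace both the message and the digest, which is the gap the digital signature closes.

```python
import hashlib

def digest(message: bytes) -> str:
    """Hex digest of the message (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(message).hexdigest()

def is_untampered(received_message: bytes, received_digest: str) -> bool:
    """Receiver side: recompute the digest and compare with the one sent."""
    return digest(received_message) == received_digest

sent = b"transfer 100 to account 42"
sent_digest = digest(sent)

print(is_untampered(sent, sent_digest))                           # unchanged message
print(is_untampered(b"transfer 900 to account 42", sent_digest))  # altered message
```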

Digital signature: building on the digest, the sender signs the digest with its private key, and the receiver decrypts the digital signature with the sender's public key. This reveals tampering and impersonation and prevents repudiation, and it is used to verify the authenticity of the message source

Digital certificate: the private key of a third-party certificate authority (CA) is used to digitally sign the user's public key so that the public key cannot be tampered with; the receiver verifies it with the CA's public key to obtain the sender's public key

Digital certificates are used to authenticate user identities, and digital signatures are used to guard against tampering, impersonation, and repudiation

Encryption algorithms

Symmetric encryption algorithms (private-key encryption, shared-key encryption)

DES, 3DES, RC-5, IDEA, AES, RC4

Asymmetric encryption algorithms (public-key encryption)

ECC, RSA, DSA

Hash functions: MD5 message digest algorithm, SHA-1 secure hash algorithm

System reliability

Suppose a system is composed of N subsystems whose reliabilities are R1, R2, R3, ...

Series system reliability: R = R1 * R2 * R3

Parallel system reliability: R = 1 - (1-R1)(1-R2)(1-R3)
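A quick numerical check of the two reliability formulas, generalized to any number of subsystems. The function names and the sample reliability of 0.9 are assumptions for illustration; `math.prod` requires Python 3.8 or later.

```python
from math import prod

def series_reliability(reliabilities):
    """All subsystems must work: multiply the reliabilities."""
    return prod(reliabilities)

def parallel_reliability(reliabilities):
    """The system fails only if every subsystem fails."""
    return 1 - prod(1 - r for r in reliabilities)

print(series_reliability([0.9, 0.9, 0.9]))
print(parallel_reliability([0.9, 0.9]))
# A mixed system: two parallel units in series with a third unit.
print(series_reliability([parallel_reliability([0.9, 0.9]), 0.9]))
```

Series connection lowers reliability below that of any single subsystem, while parallel connection raises it above the best one.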

Other

The number of bits in the instruction register depends on the instruction word length

Storage hierarchy ordered by speed, fastest first: CPU internal general-purpose registers > Cache > main memory > external storage

Security requirements:

Physical line security – computer room security

Network Security – Intrusion Detection

System Security – Vulnerability Patch Management

Application Security – Database Security


Chapter 2 Programming Languages

Low-level language vs. high-level language

Low-level languages: machine language and assembly language

High-level languages: application-oriented programming languages such as C, Java, and Python; they are closer to natural language and improve programming efficiency

A program written in a high-level language or assembly language is called a source program, and a source program cannot be executed directly on the computer

Interpreter

When translating the source program, no independent target program is generated

Both the interpreter and the source program take part in the running of the program

Compiler

When translating, it turns the source program into a separately stored target program

What runs on the machine is the target program, which is equivalent to the source program

Neither the compiler nor the source program takes part in the running of the target program

Data components of programming languages

Identifier: a token consisting of digits, letters, and underscores

Constants and variables: distinguished by whether the value of the data can change while the program is running. Constants are stored in a constant pool and do not have their own storage unit

Global and local variables: divided according to the scope of the data in the program code

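Control structures: sequence, loop, selection

The roles of data types in a program: 1. they make it easy to allocate storage for data; 2. they make it easy to check the data objects that take part in expression evaluation; 3. they specify the range of values a data object can take and the operations that can be performed on it

Left associativity of expressions: evaluated from left to right, for example: a && b

Right associativity of expressions: evaluated from right to left, for example: x = y = z

Call by value and call by address

Function definition: function header (return type, function name, (formal parameters)) + function body ({})

Call by value: the value of the actual argument is passed to the formal parameter; the argument may be a constant, a variable, or an expression

It cannot pass data in both directions between the argument and the parameter

Call by address: the address of the actual argument is passed to the formal parameter, so the argument must have an address and cannot be a constant (value) or an expression; it allows data to pass in both directions between argument and parameter, that is, changing the value of the parameter changes the value of the argument as well

Translation phases of compilers and interpreters

Phases of compilation: lexical analysis – syntax analysis – semantic analysis – intermediate code generation (optional) – code optimization (optional) – target code generation

Phases of interpretation: lexical analysis – syntax analysis – semantic analysis

The phases that neither a compiler nor an interpreter can omit or reorder are lexical analysis, syntax analysis, and semantic analysis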

The intermediate code generation and code optimization stages of the compiler are not necessary and can be omitted

The role of the symbol table

Continuously collects, records, and uses the type and attribute information of symbols in the source program, storing it in the symbol table

Records the necessary information about each symbol in the source program to assist semantic correctness checking and code generation

Lexical analysis

Treats the source program as a multi-line string, scans it character by character from top to bottom and left to right, and identifies word symbols, or tokens (such as keywords, identifiers, operators, etc.)

Tokens produced by lexical analysis are often output as two-tuples: the token category and the token value

Lexical analysis is based on the lexical rules of the language

The input of lexical analysis is the source program, and the output is a stream of tokens

The main function of lexical analysis: to check whether the characters that make up the program, and the symbols formed from them according to the construction rules, conform to the rules of the programming language
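The idea of turning a source string into a stream of (category, value) pairs can be sketched with Python's `re` module. The token categories, the tiny keyword list, and the sample statement are all assumptions invented for this example; a real language would have a much larger set of lexical rules.

```python
import re

# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"==|<=|>=|!=|[-+*/=<>]"),
    ("DELIM",   r"[(){};]"),
    ("SKIP",    r"\s+"),
    ("ERROR",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source: str):
    """Scan left to right and return a list of (category, value) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue                      # whitespace separates tokens only
        if kind == "ERROR":
            raise SyntaxError(f"illegal character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens

for token in tokenize("while (x1 <= 10) x1 = x1 + 2;"):
    print(token)
```

Characters that fit no lexical rule are reported here as errors, which mirrors the checking role of lexical analysis described above.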

Syntax analysis

On the basis of lexical analysis, the sequence of tokens is decomposed into syntactic units such as 'expression', 'statement', and 'program' according to the grammar rules of the language

If there are no syntax errors, a syntax tree is constructed; otherwise the errors are reported with diagnostic information

The input of syntax analysis is the token stream produced by lexical analysis, and the output is the syntax tree

The main function of syntax analysis is to check the structural legality of each statement and find all syntax errors in the program

Semantic analysis

Used to check whether the source program contains static semantic errors, mainly through type analysis and checking

The input of semantic analysis is the syntax tree generated in the syntax analysis stage

Not all semantic errors can be found in the semantic analysis stage: only static semantic errors can be detected there, and dynamic semantic errors can only be found at run time

A common dynamic semantic error is an infinite loop


Object code generation

The task of this stage is to turn the intermediate code into absolute instruction code, relocatable instruction code, or assembly instruction code for a specific machine

The work of the target code generation stage is closely tied to the specific machine

Register allocation takes place in the target code generation phase

Intermediate code generation

Generates intermediate code from the output of semantic analysis; intermediate code is a simple notation that can take many forms, and its defining feature is that it is independent of any specific machine

Semantic analysis and intermediate code generation are based on the semantic rules of the language

Common intermediate code forms: postfix notation, three-address code, triples, quadruples, and trees (graphs)

Different high-level languages can be compiled into the same intermediate code

Intermediate code can be cross-platform

Intermediate code facilitates machine-independent optimization and portability of compiled programs

Regular expressions

Regular expressions are a tool for lexical analysis

They are roughly comparable to the basic rules of regular expressions in JavaScript; the main thing to remember is that * stands for a repetition count in [0, infinity). To answer a question, substitute each option and see whether its string conforms to the rule

Finite automata

A finite automaton is a tool for lexical analysis that can correctly recognize regular sets

States are divided into initial states and final states; a state can be both an initial state and a final state

A string is recognized successfully if the automaton can follow a path through all of its characters and the state reached at the end is a final state

Deterministic finite automaton (DFA): for each character, the state reached after reading that character is unique

Nondeterministic finite automaton (NFA): for each character, the state reached after reading that character is not unique

The difference between a DFA and an NFA: if, given a digit or letter, there is only one way to move, the automaton is deterministic; otherwise it is nondeterministic

empty string
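The recognition rule above (follow the transitions, then check that the last state is final) can be sketched as a table-driven DFA. The particular automaton below, which accepts strings over {a, b} that end in "abb", is a standard textbook example chosen for this note; the state numbers and the function name are assumptions.

```python
# Transition table of a DFA for (a|b)*abb: (state, character) -> next state.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0
FINALS = {3}

def dfa_accepts(text: str) -> bool:
    """Accept if every character has a transition and we end in a final state."""
    state = START
    for ch in text:
        key = (state, ch)
        if key not in TRANSITIONS:
            return False          # no path for this character
        state = TRANSITIONS[key]
    return state in FINALS

for candidate in ("abb", "ababb", "abba", ""):
    print(repr(candidate), dfa_accepts(candidate))
```

Because each (state, character) pair has exactly one entry in the table, the automaton is deterministic; an NFA would need a set of next states per entry.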

Context-free grammar

Context-free grammars are widely used to express the grammar rules of programming languages

Answering questions: start from the start symbol and derive toward the terminal strings given in the options, trying one option at a time

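Postfix and infix notation

Infix notation is the ordinary form of an expression: 1*2

Postfix notation places the operator after its operands: 12*

Infix to postfix: following the precedence order (), then * /, then + -, convert one subexpression at a time into postfix form; subexpressions of equal precedence are converted from left to right

Postfix to infix: use a stack (first in, last out; last in, first out)

In-order traversal of the syntax tree -> produces the infix form: left, root, right

Post-order traversal of the syntax tree -> produces the postfix form: left, right, root

Postfix notation is also called reverse Polish notation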
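One common way to mechanize the infix-to-postfix conversion is the operator-stack method sketched below, followed by stack-based evaluation of the postfix form. This is an illustration under assumptions: only the four binary operators and parentheses are handled, operators of equal precedence are treated as left-associative, input is assumed to be well formed, and the function names are invented for this note.

```python
import operator
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}

def to_postfix(expression: str):
    """Convert an infix expression to a list of postfix tokens."""
    output, stack = [], []
    for tok in re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", expression):
        if tok in PRECEDENCE:
            # Pop operators of higher or equal precedence (left to right).
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()               # discard the matching "("
        else:
            output.append(tok)        # operand goes straight to the output
    while stack:
        output.append(stack.pop())
    return output

def eval_postfix(tokens):
    """Evaluate numeric postfix tokens with a stack (last in, first out)."""
    stack = []
    for tok in tokens:
        if tok in OPERATIONS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()

print(" ".join(to_postfix("(a + b) * (c - d)")))
print(eval_postfix(to_postfix("2 + 3 * (4 - 1)")))
```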

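Other

Decompilation usually cannot restore an executable file to high-level language source code; it can only convert it into a functionally equivalent assembly program

A dynamic language is one in which a program can change its own structure at run time

Pointer variable: a variable is an abstraction of a memory cell and is used to hold data in a program; when what a variable stores is the address of a memory cell, it is called a pointer variable

The node space of a linked list must be requested and released by the programmer, so the data space should use a heap allocation strategy

Characteristics of visual programming:

Based on object-oriented ideas, it introduces the concepts of controls and event-driven execution

Development follows the order: draw the interface first, then write program code based on events

Designers need to write little or no code

During compilation, storage units are allocated to variables using logical addresses, which are mapped to physical addresses when the program runs

The storage space of global variables in C is in the static data area

Syntax-directed translation is a static semantic analysis method

Syntax analysis methods: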

Both recursive descent analysis and predictive analysis are top-down analysis methods

Shift-reduce analysis is a bottom-up analysis


Chapter 3 Intellectual Property

Copyright

Copyright is divided into personal rights and property rights

Personal rights comprise: the right of publication, the right of authorship, the right of modification, and the right to protect the integrity of the work

Except for the right of publication, these rights last permanently; the right of publication lasts for the author's life + 50 years after death

Territoriality of intellectual property: intellectual property is valid only in the country that grants it and is not protected in other countries

Computer software copyright

Computer software copyright is protected by the "Copyright Law of the People's Republic of China" and "Computer Software Protection Regulations"

The subjects of computer software copyright are citizens

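The object of computer software copyright is the protected computer programs (source programs and object programs) and the related documentation (flowcharts, manuals, user guides)

Personal rights in software copyright: the right of publication and the developer identity right (the right of authorship, which is permanent)

Term of protection for software copyright: 50 years from the date development of the software is completed; when the term expires, all rights other than the developer identity right terminate

The "Computer Software Protection Regulations" were promulgated by the State Council

Identifying infringement: publishing, registering, signing, modifying, translating, copying, selling, or renting without the consent of the copyright owner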

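Works created in the course of employment

An employment software work is computer software that a citizen develops, while employed by an organization, in order to carry out that organization's work

For software a citizen develops while employed by an organization, the copyright belongs to the organization. If developing the software is not part of the job duties, the copyright does not belong to the organization; but if the organization's equipment was used, the copyright cannot belong to the individual

For an employment software work, the developer has only the right of authorship

Commissioned development

For a commissioned work, copyright ownership is determined by the contract between the commissioning party and the commissioned party; if the contract does not cover it, the copyright belongs to the commissioned party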

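Trade secrets

Basic content of trade secrets: business secrets and technical secrets

Conditions for something to be a trade secret:

It is undisclosed, that is, not known to the public

It is practical, that is, it can bring benefits to the rights holder

It is confidential, that is, confidentiality measures have been taken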

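Patent rights

Applications are made in writing, with one invention per application

The term of patent protection is 20 years; for utility model patents it is 10 years

The patent term is counted from the filing date; when two or more parties apply, the first to file obtains the patent, and for applications filed on the same day the applicants decide by negotiation

Trademark rights

Valid for 10 years from the date of approval; it can be renewed before expiry, and each renewal is for 10 years

Whoever registers first holds the trademark; if registered at the same time, whoever used it first holds the trademark; if registered at the same time and neither has used it, the matter is decided by negotiation or by drawing lots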


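Software licensing

Exclusive license: the software copyright owner may not license anyone else, and the copyright owner may not use the software either

Sole license: the software copyright owner may not license anyone else, but the copyright owner may still use the software

Ordinary license: the software copyright owner may license others, and the copyright owner may also use the software

The translation right in software copyright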

Convert software from one programming language to another


Chapter 4 Database Knowledge

Taxonomy of data models

Conceptual data model: a data model abstracted from the information world, independent of any computer system, generally expressed with the entity-relationship method (ER method) and used to model the information world

Structural data model (the data model of a DBMS): directly oriented to the logical structure of the database; exams generally focus on the relational model and the corresponding relation schemas

Conceptual data model common terminology

Entity: a thing that exists objectively and can be distinguished from others, such as a person, an organization, or an external system

Attribute: used to describe a characteristic of an entity, such as the name of a person or the address of an organization

Key (code): an attribute or set of attributes that uniquely identifies an entity

Domain: the value range of the attribute

Relationship: a correspondence between entities

There are three types of relationships between entities

One-to-one (one class corresponds to one monitor)

One-to-many (one class corresponds to many students)

Many-to-many (a teacher can correspond to multiple classes, and a class can also have multiple teachers)

ER diagram (entity-relationship)

Entity – drawn as a rectangle

Attribute – drawn as an ellipse

Relationship – drawn as a diamond


Entities and relationships are connected by undirected edges, labeled 1:n, n:1, or n:m to show the type of relationship

Structural data models are mainly divided into: hierarchical model, network model, relational model and object-oriented model

Three-level schema and two-level mapping

External schema (user schema or subschema): corresponds to external views and interacts with users

Conceptual schema: corresponds to the base tables of the database

Internal schema (storage schema): the actual database storage files

Conceptual schema / internal schema mapping: converts between the conceptual schema and the internal schema and maintains the physical independence of data

External schema / conceptual schema mapping: converts between the external schema and the conceptual schema and maintains the logical independence of data

Basic terms in the relational model

Relation: a relation is a two-dimensional table

Tuple: a row in the table is a tuple, corresponding to one record

Attribute: a column in the table is an attribute; the first row of the column is the attribute name, and the others are attribute values

Domain: the value range of the attribute

Relation schema: the description of a relation, in the format: relation name (attribute 1, attribute 2, ... attribute n)

Candidate key: an attribute or combination of attributes that can uniquely identify a tuple

Primary key: a relation may have several candidate keys, and one of them is chosen as the primary key

Foreign key: an attribute or attribute group that is not a key of this relation but is the primary key of another relation is called a foreign key of this relation

All-key: a candidate key that contains all the attributes of the relation

Superkey: a set of attributes that contains a candidate key

Prime attributes: attributes that appear in any candidate key are prime attributes, and all others are non-prime attributes

Relational Model Integrity Constraints

Entity integrity: the value of the primary key cannot be null

Referential integrity: the value of a foreign key can be null, but if it has a value, that value must exist in the referenced table

User-defined integrity: constraints the user specifies for particular data

Relational algebra operations

∪ (union): R ∪ S, all tuples of R and S merged, with duplicate tuples removed

− (difference): R − S, the tuples of R that do not also exist in S

∩ (intersection): R ∩ S, the tuples that exist in both R and S

× (Cartesian product): R × S, each tuple in R is concatenated with each tuple in S to form a new tuple

π (projection): π1,3(R) extracts columns 1 and 3 of relation R to form a new relation (the column numbers 1, 3 can also be replaced by attribute names)

σ (selection): σ1=5(R) selects from relation R all tuples whose first-column value equals their fifth-column value, forming a new relation

Join: a join selects the rows that satisfy a condition from the Cartesian product of two relations

Equijoin: the join condition is that the values of two columns are equal

Natural join ( |><| ): no join condition needs to be written; the natural join selects the tuples whose same-named attributes have equal values in the two tables, and the duplicated attribute columns are removed from the resulting relation

Left outer join: based on the natural join of R and S; the tuples of the left relation R that the natural join dropped are added to the result, with the attributes from the right relation filled with NULL

Right outer join: the mirror image of the left outer join

Full outer join: the results of the left outer join and the right outer join combined
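The operators above can be tried out on small in-memory relations. In this sketch a relation is a list of Python dictionaries (attribute name -> value); the sample `students` and `scores` data and all function names are invented for illustration, and the join helpers assume non-empty relations whose tuples all share the same attributes.

```python
def project(relation, attrs):
    """π: keep only the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def natural_join(r, s):
    """Match tuples on all same-named attributes; shared columns appear once."""
    common = [a for a in r[0] if a in s[0]]
    return [{**x, **y} for x in r for y in s
            if all(x[a] == y[a] for a in common)]

def left_outer_join(r, s):
    """Natural join plus unmatched tuples of r, padded with None (NULL)."""
    common = [a for a in r[0] if a in s[0]]
    s_only = [a for a in s[0] if a not in r[0]]
    result = []
    for x in r:
        matches = [{**x, **y} for y in s if all(x[a] == y[a] for a in common)]
        result.extend(matches or [{**x, **{a: None for a in s_only}}])
    return result

students = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"},
            {"sid": 3, "name": "Cy"}]
scores = [{"sid": 1, "course": "DB", "grade": 90},
          {"sid": 2, "course": "OS", "grade": 75}]

print(project(scores, ["course"]))
print(select(scores, lambda row: row["grade"] >= 80))
print(natural_join(students, scores))
print(left_outer_join(students, scores))
```

Student 3 has no score, so it disappears from the natural join but reappears in the left outer join with NULL (here `None`) in the right-hand attributes.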

Relational schema

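A relation schema is written R<U, D, dom, F>: R is the relation name, U the attribute set, D the attribute domains, dom the attribute-to-domain mapping, and F the data dependencies among the attributes. D and dom can usually be omitted, for example: R<{A, B, C, D}, {A->B, A->C, C->D}>, where "->" can be read as "derives" or "determines"

Full functional dependency: for example (student number, course) -> grade; neither the student number alone nor the course alone determines the grade, so this is a full functional dependency

Partial functional dependency: for example (student number, course number) -> name; the student number alone already determines the name, so this is a partial functional dependency

Transitive dependency: if A->B and B->C, then C is transitively dependent on A

Attribute closure computation (choosing the primary key): from the dependencies, pick the attribute or combination of attributes from which the whole attribute set can be derived. For example, in R<{A, B, C, D}, {A->B, A->C, C->D}> the attribute A derives {A, B, C, D}, so A is the primary key; if several attributes or combinations qualify, there are several candidate keys

Redundant functional dependency: for example, in {A->B, B->C, A->C, C->D} the dependency A->C is redundant, because it already follows from A->B and B->C by transitivity (the note-taker's own reading: the transitive chain takes priority)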
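
The closure and redundancy checks above can be sketched in a few lines. This is an illustrative sketch, assuming each functional dependency is stored as a (left side, right side) pair of Python sets; `closure` and `is_redundant` are names chosen here, not a standard API.

```python
def closure(attrs, fds):
    """Return the set of attributes derivable from `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_redundant(fd, fds):
    """A dependency is redundant if it still follows from the others."""
    lhs, rhs = fd
    rest = [f for f in fds if f != fd]
    return rhs <= closure(lhs, rest)


U = {"A", "B", "C", "D"}
F = [({"A"}, {"B"}), ({"A"}, {"C"}), ({"C"}, {"D"})]
G = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"}), ({"C"}, {"D"})]

a_is_key = closure({"A"}, F) == U                      # A derives every attribute
a_to_c_redundant = is_redundant(({"A"}, {"C"}), G)     # follows from A->B, B->C
```

An attribute set is a candidate key when its closure equals U and no proper subset has the same property.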

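Normal forms – exam tips

1NF -> 2NF: eliminate partial functional dependencies of non-prime attributes on the key

2NF -> 3NF: eliminate transitive functional dependencies of non-prime attributes on the key

3NF -> BCNF: eliminate partial and transitive functional dependencies of prime attributes on the key

BCNF -> 4NF: eliminate non-trivial multivalued dependencies that are not functional dependencies

Problem-solving tips:

1NF is almost always satisfied, unless there is an attribute that can be subdivided further, such as salary (base salary, overtime pay, net pay)

Use the functional dependency set to find the "key", and from it the prime and non-prime attributes (an attribute that only appears on the right-hand side of dependencies cannot be a prime attribute or part of a key)

Check whether any non-prime attribute is partially dependent on the key, that is, whether a non-prime attribute can be derived from only part of the key (a dependency of the form "member of the key + non-prime attribute A -> non-prime attribute B" does not count as a partial functional dependency); if there is none, the relation is at least 2NF

Check whether there is a transitive functional dependency (such as A->B, B->C; or the pseudo-transitive case: from A->B and BC->D it follows that AC->D); if there is none, the relation is at least 3NF

Check whether any prime attribute is partially or transitively dependent on a candidate key; if there is none, the relation is at least BCNF

Check whether there are multivalued dependencies; if the left side of every multivalued dependency is a key, for example A->->B, A->->C where A is a key, the relation is in fourth normal form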

Determining whether a decomposition is a lossless join

Join the decomposed relations directly with the natural join and check whether the result is identical to the original relation; if it is not, the decomposition is lossy

Determining whether a decomposition preserves functional dependencies

Check whether every dependency of the original functional dependency set can still be found in the decomposed relations; if not, the dependencies are not preserved
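
The "rejoin and compare" test for a lossless decomposition can be demonstrated on a sample instance. The sketch below assumes a row is a tuple of (attribute, value) pairs sorted by attribute name; the relation R(A, B, C) and its data are made up for illustration. Passing on one instance does not prove losslessness in general, but a failure does prove that the decomposition is lossy.

```python
def project(rows, attrs):
    """Project a set of rows onto the attribute names in `attrs`."""
    return {tuple((a, v) for a, v in row if a in attrs) for row in rows}


def natural_join(r, s):
    """Natural join of two sets of rows."""
    out = set()
    for x in r:
        for y in s:
            dx, dy = dict(x), dict(y)
            if all(dx[k] == dy[k] for k in dx.keys() & dy.keys()):
                out.add(tuple(sorted({**dx, **dy}.items())))
    return out


def rejoin_matches(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly `rows`."""
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == rows


# Sample instance of R(A, B, C) in which A determines B and C.
rows = {
    (("A", 1), ("B", "x"), ("C", 10)),
    (("A", 2), ("B", "x"), ("C", 20)),
}

split_on_a = rejoin_matches(rows, {"A", "B"}, {"A", "C"})  # shared attribute A is a key
split_on_b = rejoin_matches(rows, {"A", "B"}, {"B", "C"})  # shared attribute B is not
```

In the second decomposition the join on B pairs every A value with every C value, so spurious records appear and the original relation cannot be recovered.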

Database design steps

Requirements analysis – data flow diagram;

Conceptual design – ER diagram;

Logical design – relational schema

• ER model

Entity: drawn as a rectangle;

Weak entity: drawn as a double-bordered rectangle; an entity whose existence depends on another entity is a weak entity

Relationship: drawn as a diamond, with the relationship type (1:n, n:1, n:m) marked on the undirected edges; a relationship involving a weak entity is drawn as a double-bordered diamond

Attribute: drawn as an ellipse;

Simple attribute: an attribute that cannot be subdivided; composite attribute: an attribute that can be subdivided into smaller parts

Multi-valued attribute: drawn as a double-bordered ellipse; an attribute that can hold a set of values

Derived attribute: drawn as a dashed ellipse; an attribute that can be calculated from other attributes

• Types of conflicts between ER diagrams

Attribute conflict: attribute type, value range, data unit conflict

Naming conflicts: attributes with the same meaning are named differently in different ER diagrams

Structural conflict: the same object is modeled as an entity in one ER diagram and as an attribute in another, or the same entity has different attributes in different ER diagrams

Relational model conversion

Converting a 1:1 relationship into a relation schema: put the attributes of the relationship into the relation of either entity, and add the primary key of the other entity to it

Converting a 1:n relationship into a relation schema: put the attributes of the relationship into the relation of the entity on the n side, and add the primary key of the other entity to it

Converting an n:m relationship into a relation schema: turn the relationship into a separate relation whose primary key is the combination of the primary keys of the participating entities

Requirement for normalizing relation schemas: at least 3NF
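
As a rough illustration of the 1:n and n:m conversion rules, the sketch below builds a small schema with Python's built-in sqlite3 module. The table and column names (department, employee, project, works_on) are hypothetical, and the last lines also show referential integrity: a null foreign key is accepted, a dangling one is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

-- 1:n relationship: the relation on the n side carries the foreign key.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);

CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);

-- n:m relationship: a separate relation keyed by both primary keys,
-- holding the attributes of the relationship itself (hours).
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    hours   INTEGER,
    PRIMARY KEY (emp_id, proj_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'R&D')")
conn.execute("INSERT INTO employee VALUES (10, 'Ann', 1)")
conn.execute("INSERT INTO employee VALUES (11, 'Bob', NULL)")  # null foreign key is allowed

try:
    conn.execute("INSERT INTO employee VALUES (12, 'Eve', 99)")  # no department 99
    violation = False
except sqlite3.IntegrityError:
    violation = True
```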

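Properties of transaction management

Atomicity: a transaction is atomic; either all of its operations are performed or none are

Consistency: the result of executing a transaction must take the database from one consistent state to another consistent state

Isolation: transactions are isolated from one another; when several transactions run concurrently, the changes made by any transaction remain invisible to the others until it has committed successfully

Durability: once a transaction has committed successfully, its updates to the database remain in effect permanently, even if the database crashes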
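
Atomicity and consistency can be observed with a small transfer example. This is a minimal sketch using Python's built-in sqlite3 module; the account table, the names, and the amounts are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no overdraft
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src; both updates take effect or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the failed transfer the credit to bob has already been executed when the debit violates the CHECK constraint; the rollback undoes the credit as well, so the database stays in its previous consistent state.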

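Database backup methods

Static dump and dynamic dump: a dynamic dump allows the database to be accessed and modified while the dump is in progress; a static dump does not

Full dump and incremental dump: an incremental dump copies only the data updated since the last dump

Log file: every operation on the database is written to the log file; when a failure occurs, the log file is used to undo transactions and roll back to the data state before the transaction

Locking

Exclusive lock: once this lock is placed on a data item, no other exclusive lock or shared lock can be added until it is released

Shared lock: once this lock is placed on a data item, other shared locks can still be added but an exclusive lock cannot, until it is released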
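
The two locking rules amount to a small compatibility table. The sketch below encodes it directly; "S" and "X" are the usual abbreviations for shared and exclusive locks, and `can_grant` is a name chosen for this illustration.

```python
# (lock already held by another transaction, requested lock) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held):
    """Return True if `requested` is compatible with every lock in `held`."""
    return all(COMPATIBLE[(h, requested)] for h in held)
```

For example, `can_grant("S", ["S", "S"])` is true because readers can share a data item, while `can_grant("X", ["S"])` is false because a writer must wait for all readers to release their locks.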

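Distributed databases

Fragmentation transparency: users do not need to know how a table is split into fragments for storage

Replication transparency: when replication is used, users do not need to know which nodes the data is replicated to or how it is replicated

Location transparency: users do not need to know the physical location where the data is stored

Logical transparency: users do not need to know which data model is used locally

Sharing: data stored on different nodes is shared

Autonomy: each node manages its local data independently

Availability: if one site fails, the system can use replicas at other sites, so the whole system does not go down

Distribution: data is stored at different sites

Stored procedures

A stored procedure is a set of SQL statements, stored in a large database system, that carries out a specific function

By offering stored procedures for third parties to call and passing the data to be updated into them, the system avoids exposing its table structure to third parties, which protects the system's data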

Chapter 5 Object-Oriented Fundamentals

Basic object-oriented concepts

How to recognize object orientation: objects + classification + inheritance + communication through messages

Class: defines a group of broadly similar objects; a class is an abstraction of a group of objects

Classes fall into three kinds

Entity class: represents a real entity in the real world, such as a person

Interface class (boundary class): provides a way for users to interact with the system, such as QR codes, bar codes, etc.

Control class: controls the flow of activities and acts as a coordinator

Object: the basic run-time entity. An object has both attributes and behavior (operations on its data), and usually consists of an object name, attributes, and methods

Message: the construct through which objects communicate

Overloading: a set of methods with the same name whose parameters differ in type or number

Encapsulation: an information-hiding technique whose purpose is to separate the user of an object from its producer, and the definition of an object from its implementation

Inheritance: a mechanism for sharing data and methods between a parent class and its subclasses. A parent class can have several subclasses; the subclasses are special cases of the parent class, and the parent class describes the properties and methods common to its subclasses

A subclass with one parent class uses single inheritance; a subclass with several parent classes uses multiple inheritance, which may lead to ambiguous members in the subclass

Overriding: in an inheritance relationship, the subclass implements a method inherited from the parent class in a more specific way

Polymorphism: an object responds when it receives a message, and different objects can produce completely different results on receiving the same message; this phenomenon is called polymorphism

Polymorphism is supported by inheritance. Using the inheritance hierarchy, messages with general, shared functionality are placed at a high level and their different implementations at lower levels, so that objects created from the lower levels can respond to the same message differently.

There are four types of polymorphism:

Universal polymorphism: parametric polymorphism (the most widely used) and inclusion polymorphism (typical example: subtyping)

Ad hoc polymorphism: overloading polymorphism (the same name has different meanings in different contexts) and coercion polymorphism

Static binding and dynamic binding

Binding is the process of associating a method call with the method body that it invokes

Static binding: which method is called is determined at compile time, before the program runs

Dynamic binding: binding is done at run time according to the actual type of the object; dynamic binding supports inheritance and polymorphism
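
Overriding, inclusion polymorphism, and dynamic binding can be seen together in a short example. This is an illustrative sketch with made-up classes; note that Python binds all method calls at run time, so it shows dynamic binding only.

```python
class Shape:
    def area(self):
        raise NotImplementedError

    def describe(self):
        # self.area() is resolved at run time from the object's actual class.
        return f"{type(self).__name__} with area {self.area()}"


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):  # overrides Shape.area
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)  # reuses the parent's data and methods


# The same message (describe) is sent to objects of different classes.
shapes = [Rectangle(2, 3), Square(4)]
descriptions = [shape.describe() for shape in shapes]
```

The caller only relies on the general interface defined high in the hierarchy, while each object answers according to the class it was created from.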

Object-oriented design principles

Single Responsibility Principle: a class should have only one reason to change, that is, a class has only one kind of responsibility

Open-Closed Principle: software entities should be open for extension but closed for modification

Liskov Substitution Principle: a subclass must be able to fully substitute for its parent class

Dependency Inversion Principle: abstractions should not depend on details, details should depend on abstractions; high-level modules should not depend on low-level modules, both should depend on abstractions

Interface Segregation Principle: an interface belongs to the client, not to the class hierarchy in which it resides; clients should not be forced to depend on methods they do not use, and the abstraction level should not depend on details

Common closure principle: If a change affects a package, it will affect all classes in the package , but will not affect other packages

Common Reuse Principle: If you reuse one class in a package, you must reuse all classes in the package

Object-oriented analysis

The purpose of object-oriented analysis: to gain understanding of the application problem

The five activities of object-oriented analysis: identify objects - organize objects - describe the interaction between objects - determine the operation of objects - define the internal information of objects

Identifying objects: defines the problem domain, taking naturally occurring nouns as objects

Defining the domain model is one of the key steps in object-oriented analysis. It creates a description of the object domain from the perspective of object classification, including defining concepts, attributes, and important associations . The results are organized with class diagrams

Object Oriented Design

Based on the results of object-oriented analysis, transform the analysis results into design models and define the blueprint of system construction

Object-oriented design: obtain the solution to the corresponding problem, realize the system, pay attention to the details of technology and implementation level

Five activities of object-oriented design: identify classes and objects - define attributes - define services - identify relationships - identify packages

Object-oriented programming

The essence is to choose an object-oriented language and use the programming concepts of objects , classes and so on.

Object Oriented Testing

Four levels of object-oriented testing: algorithm layer - class layer - template layer - system layer

Chapter 6 UML

• UML concepts

The UML vocabulary consists of three kinds of building blocks: things (abstractions of the most representative components of a model), relationships (which tie things together), and diagrams (which group related things)

• UML things

Structural things: the static parts of a model, the nouns, describing conceptual or physical elements; they include classes, interfaces, use cases, components, etc.

Behavioral things: the dynamic parts of a model, the verbs, describing behavior over time and space; they include interactions, state machines, activities, etc.

Grouping things: the organizational parts of a model; the most important grouping thing is the package, which can contain structural things, behavioral things, or other grouping things

Annotational things: the explanatory parts of a model, used to describe, explain, and annotate; the note is the main annotational thing

• UML relationships

Dependency: a semantic relationship between two things, in which a change to one thing (the independent thing) affects the semantics of the other (the dependent thing);

- Drawn as a dashed line + arrow

- A (dependent thing) ·······> B (independent thing): A depends on B, and a change to B causes a semantic change in A

Association: a structural relationship that describes a set of links, where a link is a connection between objects. Aggregation and composition are special kinds of association that describe the relationship between a whole and its parts.

- Unidirectional association: solid line + arrow, A --------> B. It lets one class know the properties and methods of another: if class A relies on an object of class B and B is a member variable of A, there is an association between A and B. An association can be one-way or two-way (a two-way association is drawn without arrows)

- Bidirectional association: solid line, A -------- B. The multiplicity of the association is written above the solid line and indicates how many instances of one class can be associated with one instance of the other; it is usually written in forms such as 1, * or 1..*. For a many-to-many bidirectional association, the association can generally be extracted into an "association class"

- Aggregation: solid line + hollow diamond, A (part) --------<> B (whole). The life cycle of the whole is not tied to that of the part: when the whole disappears, the part can still exist

- Composition: solid line + solid diamond, A (part) --------<+> B (whole) (the solid diamond is hard to draw, so <+> is used instead). The life cycle of the whole is tied to that of the part: when the whole disappears, the part disappears too

Generalization: a relationship between the "special" and the "general". An object of the specialized element (child) can substitute for an object of the general element (parent), and the child shares the structure and behavior of the parent

- Drawn as a solid line + hollow triangle arrow

- A (special, child) --------|> B (general, parent)

Realization: a semantic relationship between classifiers, in which one classifier specifies a contract that another classifier guarantees to carry out

- Typical uses: 1. between interfaces and the classes or components that realize them; 2. between use cases and the collaborations that realize them

- Drawn as a dashed line + hollow triangle arrow

- A (class) ········|> B (interface)

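Class diagram (static)

Class diagram: shows a set of classes, interfaces, collaborations, and the relationships among them; a class diagram can contain notes and constraints, and can also contain packages or subsystems

Three ways a class diagram is used to model the static design view

Modeling the vocabulary of the system

Modeling simple collaborations

Modeling a logical database schema

Composition of a class in a class diagram

First compartment: class name

Second compartment: attributes (attribute name 1: attribute type)

Third compartment: methods (method name 1(): return type); attribute and method names can be preceded by a visibility modifier: + public; - private; # protected

Object diagram (static)

Object diagram: shows a set of objects and the relationships among them at a particular moment

Difference between an object in an object diagram and a class in a class diagram: an object has two compartments; the first is the object name (object name : class name) and the second is the attributes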

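Use case diagram (static)

Use case diagram: shows a set of use cases, actors, and the relationships among them

Actor: an external entity that interacts with the system; it can be a user, or an external system or underlying device that takes part in the interaction

Use case: system behavior described from the user's point of view; a use case is a class that stands for a kind of functionality, not one specific realization of that functionality

Include relationship: a relationship between use cases, drawn as a dashed line + arrow + «include»: A (base use case) ···«include»···> B (included use case). The arrow points to the included use case; if A includes B, then whenever A is executed, B is executed as well

Extend relationship: a relationship between use cases, drawn as a dashed line + arrow + «extend»: A (extending use case) ···«extend»···> B (extended use case). The arrow points to the extended use case; when B runs into a special or optional situation during execution, the extending use case A can be used

Telling include and extend apart:

Include: using one use case necessarily uses the other

Extend: executing one use case does not necessarily execute the other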

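Sequence diagram (dynamic)

A sequence diagram describes the interactions among objects organized in time order; it breaks a use case down into its detailed steps

Graphically, the objects taking part in the interaction are placed at the top of the diagram and arranged horizontally, with the object that initiates the interaction on the left; the messages sent and received between objects are then arranged from top to bottom in time order

Object lifeline: a dashed line running vertically down from an object, indicating that the object exists over a period of time

Focus of control: a tall, thin rectangle on the object lifeline, indicating the period during which the object performs an action

A call message is drawn as a solid line + arrow, and a return message as a dashed line + arrow

The method for a call message is executed by the object the arrow points to

Communication diagram (dynamic)

A communication diagram, also called a collaboration diagram, emphasizes the structural organization of the objects that send and receive messages

Unlike a sequence diagram, a communication diagram has paths and sequence numbers; many messages can be shown along the same link, each with a unique sequence number

Communication diagrams and sequence diagrams are isomorphic and can be converted into each other

A communication diagram shows the message flow between objects and its order

State diagram (dynamic)

A state diagram shows a state machine, which consists of states, transitions, events, and activities

State diagrams model the dynamic aspects of a system

They are used when modeling the dynamic aspects of a system, class, or use case, usually to model reactive objects

State: any observable mode of system behavior; a state represents one mode of behavior of the system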

States are divided into the initial state (a solid circle), the final state (a solid circle with an outer ring), and intermediate states

An intermediate state is drawn as a rounded rectangle: the top compartment is the state name, the middle compartment holds the state variables (optional), and the bottom compartment is the activity table (also optional)

A line with an arrow between states represents a transition; when the event on the arrow occurs, the transition starts

A state diagram can have only one initial state, and can have several final states or none

Activity: written as "event name/action expression" in the activity table of a state. There are three standard events:

entry: entry action, executed immediately on entering the state

do: internal activity, takes a finite amount of time and can be interrupted

exit: exit action, executed immediately on leaving the state

Event: something that happens at a specific moment; an abstraction of the occurrences that cause the system to act and move from one state to another

A transition is labeled "event (guard condition) / action"

A transition connects two states

A transition is triggered by an event

Activities (actions) can be executed within a state or during a state transition

A guard condition is a boolean expression

A state transition takes place only when the "event" occurs and the "guard condition" is true; the "action" is executed only when the transition starts

Composite state: a group of states and transitions enclosed in a rectangle that appears as a single state in another state diagram

All sub-states in a composite state are completed before moving to a state outside the composite state
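
The "event (guard condition) / action" rule for transitions can be sketched as a small table-driven state machine. The class, the order workflow, and the event names below are hypothetical and only illustrate the rule; entry/do/exit activities and composite states are left out.

```python
class StateMachine:
    def __init__(self, initial):
        self.state = initial
        self.transitions = {}  # (source state, event) -> (guard, action, target state)
        self.log = []          # actions executed so far

    def add(self, source, event, target, guard=lambda ctx: True, action=None):
        self.transitions[(source, event)] = (guard, action, target)

    def fire(self, event, ctx=None):
        """Take a transition only if the event matches and the guard is true."""
        key = (self.state, event)
        if key not in self.transitions:
            return False
        guard, action, target = self.transitions[key]
        if not guard(ctx):
            return False
        if action is not None:
            self.log.append(action)  # the action runs as the transition starts
        self.state = target
        return True


order = StateMachine("Idle")
order.add("Idle", "pay", "Paid", guard=lambda amount: amount > 0, action="record payment")
order.add("Paid", "ship", "Shipped")
```

Calling `order.fire("pay", 0)` leaves the machine in "Idle" because the guard is false, whereas `order.fire("pay", 10)` records the action and moves it to "Paid".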

Activity diagram (dynamic)

An activity diagram shows the flow from one activity to another within a system. It is important for modeling the functions of a system and emphasizes the flow of control between objects

An activity diagram consists of a start node, an end node, activities (rounded rectangles), flows (solid line + arrow), concurrent forks, concurrent joins, branches, and guard expressions on the branches

Use activity diagrams to model workflows and operations

Activities that are directly connected after a concurrent fork can be executed concurrently

Component diagram (static)

A component diagram shows the organization of and dependencies among a set of components

Component diagrams are related to class diagrams: a component is typically mapped to one or more classes, interfaces, or collaborations

Components are drawn with a special icon and are connected through provided interfaces (hollow circles) and required interfaces (arcs); a provided interface supplies the implementation of a method, and a required interface calls that method

Deployment diagram

Deployment diagrams are used to model the physical aspects of the system

Nodes in a deployment diagram are drawn as three-dimensional boxes; "artifact" marks an artifact (product)

Deployment diagrams show the relationship between the software and hardware of a system and are used in the implementation phase

The relationships between deployed components are similar to package dependencies

Summary

Static modeling: class diagram, object diagram, use case diagram

Dynamic modeling: sequence diagrams, communication diagrams, state diagrams, activity diagrams

Physical Modeling: Component Diagrams, Deployment Diagrams

Interaction diagrams: sequence diagrams, communication diagrams (collaboration diagrams)

Chapter 7 Design Patterns

Creational Design Patterns

Concept: creational design patterns abstract the instantiation process and help make a system independent of how its objects are created, composed, and represented

Simple factory pattern

Define a factory class that can return instances of different classes according to the parameters passed in . These "different classes" inherit from the same parent class

The method of creating an instance in a factory class is usually a static method , so it is also called a static factory

Users do not need to know how product objects are created
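
A simple factory might look like the sketch below. The product classes and the string keys are invented for the example; the point is only that the caller passes a parameter and never names a concrete class.

```python
class Shape:
    """Common parent class of all products."""

    def draw(self):
        raise NotImplementedError


class Circle(Shape):
    def draw(self):
        return "circle"


class Square(Shape):
    def draw(self):
        return "square"


class ShapeFactory:
    _registry = {"circle": Circle, "square": Square}

    @staticmethod
    def create(kind):
        """Return an instance of the product class selected by `kind`."""
        try:
            return ShapeFactory._registry[kind]()
        except KeyError:
            raise ValueError(f"unknown shape: {kind}") from None


shape = ShapeFactory.create("circle")
```

Adding a new product means editing the factory itself, which is the limitation that the factory method pattern addresses by moving the decision into subclasses.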

Factory method pattern ( Factory Method )

On the basis of a simple factory, define a factory interface for creating objects , and let the subclasses that implement this interface decide which class to instantiate

The factory method pattern defers the instantiation of a class to subclasses

Applicability:

When a class does not know the class of the object it must create

When a class wants its subclasses to specify the objects it creates

Abstract Factory pattern (Abstract Factory)

Intent: To provide an interface for creating a series of related or interdependent objects without specifying their concrete classes

It can be understood as: "factory method" creates an object , and "abstract factory" creates a series of objects

Applicability: (memory aid: independent of products, one of several product families, used jointly, show interfaces not implementations)

When a system should be independent of how its products are created, composed, and represented

When a system is to be configured with one of several product families

When a family of related product objects is designed to be used together and this needs to be enforced

When you provide a product class library and want to reveal only the interfaces, not the implementations

Builder pattern (Builder)

Intent: separate the construction of a complex object from its representation, so that the same construction process can create different representations

Understanding: define an abstract builder; then define several concrete builder classes (each encapsulating the complex algorithm for building the object) that inherit from it; finally define a director (responsible for assembling the object) that drives the builder, so that combining the director with different builder classes creates different product representations

Applicability: (memory aid: complex algorithm, different representations of the constructed object)

When the algorithm for creating a complex object should be independent of the parts that make up the object and of how they are assembled

When the construction process must allow different representations of the object being constructed

Prototype pattern (Prototype)

Intent: use a prototype instance to specify the kind of object to create, and create new objects by copying this prototype

Applicability: (memory aid: independent of products, specified at run time, one of a few combinations of state)

When a system should be independent of how its products are created, composed, and represented

When the classes to instantiate are specified at run time, for example by dynamic loading

When instances of a class can have only one of a few different combinations of state

Singleton pattern (Singleton)

Intent: ensure that a class has only one instance, and provide a global access point to it

Applicability: (memory aid: a singleton has exactly one instance)

When a class must have exactly one instance and clients access it from a well-known access point
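
One possible way to get the single-instance guarantee in Python is to override `__new__`, as sketched below. The `Config` class is hypothetical, and this version is not thread-safe; it only illustrates the idea of one instance behind a well-known access point.

```python
class Config:
    _instance = None

    def __new__(cls):
        # Create the instance on first use, then always return the same one.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance

    @classmethod
    def instance(cls):
        """Well-known global access point."""
        return cls()


first = Config.instance()
first.settings["debug"] = True
second = Config()
```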

Structural Design Patterns

Concept: deals with how to combine classes and objects to obtain larger structures

Adapter pattern (Adapter)

Intent: convert the interface of a class into another interface that clients expect, so that classes that could not work together because of incompatible interfaces can work together

Applicability: (memory aid: adapt an interface that does not match)

When you want to use an existing class but its interface does not meet your needs
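
A minimal object-adapter sketch follows. `LegacyPrinter` stands for the existing class with the unsuitable interface and `Renderer` for the interface the client expects; both names are made up for the illustration.

```python
class LegacyPrinter:
    """Existing class whose interface does not match what the client expects."""

    def print_text(self, text):
        return f"[legacy] {text}"


class Renderer:
    """Target interface used by the client."""

    def render(self, message):
        raise NotImplementedError


class PrinterAdapter(Renderer):
    """Wraps a LegacyPrinter and exposes it through the Renderer interface."""

    def __init__(self, printer):
        self._printer = printer

    def render(self, message):
        return self._printer.print_text(message)


def client(renderer):
    # The client only knows the Renderer interface.
    return renderer.render("hello")


output = client(PrinterAdapter(LegacyPrinter()))
```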

Bridge pattern (Bridge)

Intent: decouple an abstraction from its implementation so that the two can vary independently

Understanding: split a class that varies along several dimensions into several related classes connected by aggregation or composition, each of which can change independently

Applicability: (memory aid: no fixed binding, extensible by subclassing, no impact on clients, no recompilation, many classes)

When you do not want a fixed binding between an abstraction and its implementation

When both the abstraction and its implementation should be extensible by subclassing

When changes to the implementation of an abstraction should have no impact on clients and should not require recompilation

When there is a class hierarchy with many classes to generate

Composite pattern (Composite)

Intent: compose objects into tree structures to represent part-whole hierarchies, so that clients can treat individual objects and compositions of objects uniformly

Understanding: think of the relationship between folders and files

Applicability: (memory aid: part-whole, uniform use of composite objects)

When you want to represent part-whole hierarchies of objects

When you want clients to ignore the difference between compositions of objects and individual objects, and to treat all objects in the composite structure uniformly
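
The folder and file analogy translates directly into code. In this sketch, with invented class names and sizes, a client calls `total_size()` without caring whether it holds a single file or a whole folder tree.

```python
class File:
    """Leaf: a single object."""

    def __init__(self, name, size):
        self.name = name
        self.size = size

    def total_size(self):
        return self.size


class Folder:
    """Composite: holds files and other folders."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, node):
        self.children.append(node)
        return self  # allows chained calls

    def total_size(self):
        # Same operation as the leaf, delegated to every child.
        return sum(child.total_size() for child in self.children)


root = Folder("root").add(File("a.txt", 10)).add(
    Folder("sub").add(File("b.txt", 5))
)
```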

Decorator pattern (Decorator)

Intent: dynamically attach additional responsibilities to an object

Applicability: (memory aid: add responsibilities, withdraw responsibilities, subclassing impractical)

To add responsibilities to individual objects dynamically and transparently, without affecting other objects

For responsibilities that can be withdrawn

When extension by subclassing is impractical
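
A small sketch of the decorator idea, using an invented coffee-ordering example: each decorator wraps another object with the same interface and adds its own responsibility on top.

```python
class Coffee:
    def cost(self):
        return 10

    def description(self):
        return "coffee"


class CoffeeDecorator:
    """Base decorator: same interface, forwards to the wrapped object."""

    def __init__(self, inner):
        self._inner = inner

    def cost(self):
        return self._inner.cost()

    def description(self):
        return self._inner.description()


class Milk(CoffeeDecorator):
    def cost(self):
        return self._inner.cost() + 2

    def description(self):
        return self._inner.description() + " + milk"


class Sugar(CoffeeDecorator):
    def cost(self):
        return self._inner.cost() + 1

    def description(self):
        return self._inner.description() + " + sugar"


# Responsibilities are stacked at run time, for this one object only.
order = Sugar(Milk(Coffee()))
```

Dropping a wrapper withdraws its responsibility again, and no subclass per combination (milk, sugar, milk and sugar, ...) is needed.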

Facade pattern (Facade)

Intent: provide a unified interface to a set of interfaces in a subsystem; the facade defines a higher-level interface that makes the subsystem easier to use

Applicability: (memory aid: simple interface to a subsystem, many dependencies, entry point for each layer)

When you want to provide a simple interface to a complex subsystem

When there are many dependencies between client programs and the implementation classes of an abstraction

When you need to build a layered subsystem, use the facade pattern to define the entry point to each layer

Flyweight pattern (Flyweight)

Intent: use sharing to support large numbers of fine-grained objects efficiently

Understanding: a flyweight factory creates and maintains the instances; only one instance is created per type, and later requests return the instance that already exists

Applicability: (memory aid: large number of objects, high storage cost, extrinsic state, few shared objects)

An application uses a large number of objects

Storage costs are high because of the large number of objects

Most of an object's state can be made extrinsic

Many groups of objects can be replaced by relatively few shared objects once the extrinsic state is removed
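
The flyweight factory described above can be sketched as a cache keyed by the intrinsic state. The glyph example is an assumption chosen for illustration: the character is the shared (intrinsic) state, and the position is external state passed in on each call.

```python
class Glyph:
    def __init__(self, char):
        self.char = char  # intrinsic state, shared by all uses

    def render(self, x, y):
        # Extrinsic state (the position) is supplied by the caller.
        return f"{self.char}@({x},{y})"


class GlyphFactory:
    def __init__(self):
        self._pool = {}

    def get(self, char):
        """Create a glyph on first request, then return the shared instance."""
        if char not in self._pool:
            self._pool[char] = Glyph(char)
        return self._pool[char]

    def __len__(self):
        return len(self._pool)


factory = GlyphFactory()
glyphs = [factory.get(c) for c in "banana"]
```

Six characters are represented by only three shared objects here, because the position is no longer stored inside each glyph.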

Proxy pattern (Proxy)

Understanding: provide a proxy for another object in order to control access to that object

Applicability: (memory aid: proxy instead of a simple pointer)

When a more versatile or sophisticated reference to an object is needed than a simple pointer

Behavioral Design Patterns

Concept: behavioral patterns deal with algorithms and the assignment of responsibilities between objects and describe the patterns of communication between them; behavioral class patterns use inheritance to distribute behavior among classes

Chain of Responsibility pattern (Chain of Responsibility)

Intent: give more than one object a chance to handle a request, thereby avoiding coupling between the sender of the request and its receiver; chain the receiving objects and pass the request along the chain until an object handles it

Applicability: (memory aid: handler determined automatically, receiver not specified, handlers specified dynamically)

When more than one object may handle a request, and the handler is determined automatically at run time

When you want to issue a request to one of several objects without specifying the receiver explicitly

When the set of objects that can handle a request should be specified dynamically

Command pattern (Command)

Intent: encapsulate a request as an object, so that clients can be parameterized with different requests, requests can be queued or logged, and undoable operations are supported

Applicability: (memory aid: parameterize, specify and queue, undo, log changes)

To parameterize objects by the action to be performed

To specify, queue, and execute requests at different times

To support undo and logging of changes
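
A request encapsulated as an object, with undo support, can be sketched as follows. The classes and the list used as a "document" are hypothetical; the history kept by the invoker doubles as a simple log of executed requests.

```python
class AppendCommand:
    """A request encapsulated as an object, with enough state to undo it."""

    def __init__(self, buffer, text):
        self.buffer = buffer
        self.text = text

    def execute(self):
        self.buffer.append(self.text)

    def undo(self):
        self.buffer.pop()


class Invoker:
    def __init__(self):
        self.history = []  # executed commands, newest last

    def run(self, command):
        command.execute()
        self.history.append(command)

    def undo_last(self):
        if self.history:
            self.history.pop().undo()


document = []
invoker = Invoker()
invoker.run(AppendCommand(document, "hello"))
invoker.run(AppendCommand(document, "world"))
invoker.undo_last()  # removes "world" again
```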

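Interpreter pattern (Interpreter)

Intent: given a language, define a representation for its grammar along with an interpreter that interprets sentences in the language

Applicability: (memory aid: interpret, abstract syntax tree)

When a language needs to be interpreted and its sentences can be represented as abstract syntax trees

Iterator pattern (Iterator)

Intent: provide a way to access the elements of an aggregate object sequentially without exposing its internal representation

Applicability: (memory aid: iterate over an aggregate without exposing it, multiple traversals, uniform interface)

To access the contents of an aggregate object without exposing its internal representation

To support multiple traversals of aggregate objects

To provide a uniform interface for traversing different aggregate structures

Mediator pattern (Mediator)

Intent: use a mediator object to encapsulate how a set of objects interact; the mediator keeps the objects from referring to each other explicitly, which loosens their coupling and lets their interaction be changed independently

Applicability: (memory aid: complex communication, dependencies hard to understand)

When a set of objects is well defined but communicates in complex ways, and the resulting interdependencies are unstructured and difficult to understand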

Memento pattern (Memento)

Intent: without violating encapsulation, capture an object's internal state and store it outside the object, so that the object can be restored to this saved state later

Applicability: (memory aid: state at a certain moment, would break encapsulation)

When (part of) an object's state at a certain moment must be saved so that it can be restored later when needed

When an interface that lets other objects obtain this state directly would expose the object's implementation details and break its encapsulation

Observer pattern (Observer)

Intent: define a one-to-many dependency between objects, so that when one object changes, all objects that depend on it are notified and updated automatically

Applicability: (memory aid: a change requires changing others, no tight coupling)

When a change to one object requires changing others, and you do not know how many objects need to be changed

When an object must notify other objects without making assumptions about who they are, that is, you do not want these objects to be tightly coupled
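
The one-to-many notification can be sketched with a subject that keeps a list of observers. `Subject` and `Recorder` are illustrative names; the subject only assumes that each observer has an `update` method.

```python
class Subject:
    def __init__(self):
        self._observers = []
        self._state = None

    def attach(self, observer):
        self._observers.append(observer)

    def detach(self, observer):
        self._observers.remove(observer)

    def set_state(self, value):
        self._state = value
        # Notify every dependent; the subject does not know what they do.
        for observer in list(self._observers):
            observer.update(value)


class Recorder:
    """A sample observer that just remembers what it was told."""

    def __init__(self):
        self.seen = []

    def update(self, value):
        self.seen.append(value)


subject = Subject()
first, second = Recorder(), Recorder()
subject.attach(first)
subject.attach(second)
subject.set_state(1)
subject.detach(second)
subject.set_state(2)
```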

State pattern (State)

Intent: allow an object to alter its behavior when its internal state changes; the object will appear to change its class

Applicability: (memory aid: behavior depends on state, large multi-branch conditionals)

When an object's behavior depends on its state, and it must change its behavior at run time according to that state

When an operation contains a large multi-branch conditional statement whose branches depend on the object's state

Strategy pattern (Strategy)

Intent: define a family of algorithms, encapsulate each one, and make them interchangeable; the pattern lets the algorithm vary independently of the clients that use it

Applicability: (memory aid: many behaviors, algorithm variants, avoid exposing data, multiple conditional statements)

When many related classes differ only in their behavior; strategies provide a way to configure a class with one of many behaviors

When different variants of an algorithm are needed

When an algorithm uses data that clients should not know about; the pattern avoids exposing algorithm-specific data structures

When a class defines many behaviors that appear as multiple conditional statements in its operations; move the related conditional branches into their own strategy classes to replace the conditionals
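
The sketch below replaces what could have been an if/elif chain over discount types with interchangeable strategy objects. The discount classes and amounts are invented for the example.

```python
class NoDiscount:
    def apply(self, amount):
        return amount


class PercentOff:
    def __init__(self, percent):
        self.percent = percent

    def apply(self, amount):
        return amount * (100 - self.percent) // 100


class Order:
    """Context: configured with one of several pricing behaviors."""

    def __init__(self, amount, discount):
        self.amount = amount
        self.discount = discount

    def total(self):
        # No conditional on the discount type; the strategy decides.
        return self.discount.apply(self.amount)


regular = Order(200, NoDiscount())
member = Order(200, PercentOff(10))
```

A new pricing rule is added by writing another class with an `apply` method, without touching `Order`.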

Template Method pattern (Template Method)

Intent: define the skeleton of an algorithm in an operation and defer some steps to subclasses, so that subclasses can redefine certain steps of the algorithm without changing its structure

Applicability: (memory aid: invariant parts, avoid duplicated code, control subclass extensions)

To implement the invariant parts of an algorithm once and leave the variable behavior to subclasses

When common behavior among subclasses should be extracted and centralized in a common parent class to avoid code duplication

To control subclass extensions: a template method permits extensions only at specific points
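
In the following sketch the parent class fixes the order of the steps and the subclass fills in only the variable step. The report classes are hypothetical.

```python
class Report:
    def render(self):
        """Template method: the skeleton is fixed here."""
        return "\n".join([self.header(), self.body(), self.footer()])

    def header(self):
        return "== Report =="

    def body(self):
        # The step that subclasses must supply.
        raise NotImplementedError

    def footer(self):
        return "== End =="


class SalesReport(Report):
    def body(self):
        return "sales: 42"


text = SalesReport().render()
```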

Visitor pattern (Visitor)

Intent: represent an operation to be performed on the elements of an object structure; the pattern lets you define a new operation without changing the classes of the elements on which it operates

Applicability: (memory aid: depends on concrete classes, unrelated operations, new operations)

When an object structure contains many classes of objects with differing interfaces, and you want to perform operations on these objects that depend on their concrete classes

When many distinct and unrelated operations need to be performed on the objects in an object structure, and you want to avoid polluting their classes with these operations

When the classes defining the object structure rarely change, but new operations over the structure often need to be defined

Chapter 8 Information Security

Firewall

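A firewall is a filtering and blocking mechanism built on the boundary between the internal and external networks; it treats the internal network as secure and trusted, and the external network as insecure and untrusted

A firewall applies security processing to all traffic passing through the controlled trunk, for example: control, auditing, alarms, response

DMZ (screened-subnet firewall): located between the internal and external networks, usually serving as an isolation zone where public servers can be placed, for example web, email, and FTP servers

Packet-filtering firewall: uses a packet filter to control access between sites and networks based on the fields in each packet's header

A packet-filtering firewall is completely transparent to users, fast, and provides low-level control

A packet-filtering firewall sits between the network layer and the data link layer

Every IP field is checked: source address, destination address, protocol, port

Disadvantages: cannot prevent hacker attacks, does not support application layer protocols, coarse access-control granularity, cannot handle new security threats

Application proxy gateway firewall: completely blocks direct communication between the internal and external networks; all access between them must be relayed by application layer proxy software

Advantages: can inspect protocol features at the application, transport, and network layers, and has strong packet inspection capability

Disadvantages: difficult to configure, slower processing speed

Stateful inspection firewall: combines the advantages of the packet-filtering firewall and the application proxy gateway firewall, namely high speed and security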

Viruses

Computer virus characteristics: transmissibility, concealment, infectivity, latency, triggerability, destructiveness

Virus name prefixes: Worm (worm virus); Trojan (Trojan horse); Backdoor (backdoor virus); Macro (macro virus)

Objects of macro virus infection: text documents, spreadsheets

Trojan software: Glacier

Trojan horse infection process: through software downloads, bundling, and similar means, the Trojan's server program is installed on the user's host; it then establishes a network connection with the Trojan's client program on the attacker's host, so that the attacker can steal or destroy data on the user's host

Worms: Happy Time, Panda Burning Incense, Code Red, Love Bug, Stuxnet

Cyber attacks

Denial-of-service attack (DoS attack): by continuously sending requests to the target, the attacker exhausts the target server's resources so that it cannot accept other normal requests, thereby "making the computer or network unable to provide normal services"

Replay attack: the attacker resends a message that the target host has already received in order to deceive it; mainly used against the identity authentication process to undermine the correctness of authentication; it can be prevented by adding a timestamp to the message

Password intrusion attack: use some legitimate user accounts and passwords to log in to the target host , and then carry out attack activities

Trojan horse: After the user downloads the software containing the Trojan horse , the Trojan horse program will initiate a connection request to the hacker. After the connection is established, the hacker can carry out the attack

Port spoofing attack: Use port scanning to find system vulnerabilities and implement attacks

Network monitoring (sniffing): the attacker can capture all information transmitted on the same physical channel within a network segment and intercept account names and passwords

IP spoofing attack: Forge the source IP address , pretending to be another system or the identity of the sender

SQL injection attack: the attacker injects SQL query code into input to obtain database privileges and then steal or modify information
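To make the SQL injection item concrete, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table, the account, and the two login functions are hypothetical; the parameterized query is shown as one widely used countermeasure, which the outline itself does not discuss.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_unsafe(name: str, password: str) -> bool:
    # Vulnerable: user input is concatenated into the SQL text itself.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_safe(name: str, password: str) -> bool:
    # Parameterized query: input is passed as data, never parsed as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

payload = "' OR '1'='1"                # injected "password"
print(login_unsafe("alice", payload))  # True: the injected OR clause always matches
print(login_safe("alice", payload))    # False: the payload is just a wrong password
```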

Intrusion detection technology: expert system, model detection, simple matching

Internet Security

SSL (Secure Sockets Layer): a transport layer security protocol; port number 443 (as used by HTTPS)

TLS (Transport Layer Security): also a transport layer security protocol, the successor to SSL 3.0

SSH: a security protocol for establishing secure connections between terminal devices and remote sites, built on the application layer and transport layer

HTTPS: HTTP encrypted with SSL

MIME: an email extension protocol; it provides no security

PGP: secure email using RSA asymmetric encryption

IPSec: encrypts IP datagrams

ARP: Address Resolution Protocol, which resolves IP addresses into physical addresses

Telnet: an insecure remote login protocol

WEP: Wired Equivalent Privacy

TFTP: Trivial File Transfer Protocol

PPTP: link encryption

RFB: Remote Frame Buffer, a protocol for remote access to graphical user interfaces

IGMP: Internet Group Management Protocol

Five basic elements of information security: confidentiality, integrity, availability, controllability, and auditability

Chapter 9 Computer Networks

Network equipment

Interconnection devices at the physical layer: repeaters (Repeaters) and hubs (Hubs), where the hub can be regarded as a multi-port repeater

Data link layer interconnection equipment: bridge (Bridge) and switch (Switch), where the switch is a multi-port bridge

Network Layer Interconnection Devices: Routers

Application Layer Interconnect Devices: Gateways

Physical layer devices can isolate neither broadcast domains nor collision domains; data link layer devices can isolate collision domains but not broadcast domains; network layer devices can isolate both broadcast domains and collision domains

Classification of TCP/IP protocol suite

Network layer protocols

IP: a best-effort communication protocol; transmitted data may be lost

ICMP: Internet Control Message Protocol, which uses IP to transmit its messages

Transport layer protocols: TCP and UDP, both built on IP

Application layer protocols

FTP: File Transfer Protocol; the data port is 20 and the control port is 21

SNMP: Simple Network Management Protocol

Easy to remember: 1. Protocols with "IP" or "AP" in their name belong to the network layer; 2. Among application layer protocols with "T" in their name, all except TFTP are based on TCP (TFTP is based on UDP); among those without "T", only POP3 is based on TCP and the others are based on UDP

IP, TCP, and UDP

IP provides a connectionless service (data is sent without first confirming that the target system is ready to receive it) that is also unreliable (the target system does not acknowledge packets it receives successfully)

TCP: a connection-oriented, reliable transmission control protocol that uses a three-way handshake to establish connections

TCP features: reliable transmission, connection management, error checking, retransmission, flow control (variable-size sliding window protocol), port addressing

UDP: a connectionless, unreliable transport protocol; it still delivers data between application processes and helps improve transmission efficiency

UDP features: port addressing

Email Service Protocols

SMTP: Simple Mail Transfer Protocol, used to send mail, port 25, based on TCP, can only transfer text and ASCII code files

SMTP communicates based on the C/S mode, that is, the client/server mode

MIME: extension type for mail attachments

PEM: Privacy Enhanced Mail

POP3: the protocol used to receive mail, based on TCP, port 110; it also communicates in C/S mode

Address resolution: ARP and RARP

ARP: Address Resolution Protocol, which converts IP addresses into physical addresses (MAC addresses, unique to each network card)

RARP: Reverse Address Resolution Protocol, which converts physical addresses into IP addresses

How a computer communicates using ARP (PC1 communicating with PC2)

PC1 queries its ARP cache

If the cache holds an entry for PC2's IP address, PC1 uses the corresponding physical address to send the data directly

If there is no cached entry, PC1 broadcasts an ARP request packet on the LAN

If a computer on the LAN has the requested IP address, that computer responds with an ARP reply containing its physical address

PC1 stores the IP address and the physical address from the reply in its ARP cache
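The ARP lookup just described (check the cache, otherwise broadcast and cache the reply) can be sketched in Python as a small simulation. The `LAN_HOSTS` table, the addresses, and the `resolve` function are hypothetical stand-ins; no real network traffic is involved.

```python
from typing import Dict, Optional

# Hypothetical LAN: IP address -> MAC address of the host that owns it.
LAN_HOSTS: Dict[str, str] = {
    "192.168.1.2": "aa:bb:cc:00:00:02",
    "192.168.1.3": "aa:bb:cc:00:00:03",
}

def resolve(ip: str, cache: Dict[str, str]) -> Optional[str]:
    """Return the MAC address for ip, consulting the ARP cache first."""
    if ip in cache:
        return cache[ip]              # cache hit: send directly
    # Cache miss: simulate the broadcast; only the owner of the IP replies.
    mac = LAN_HOSTS.get(ip)
    if mac is not None:
        cache[ip] = mac               # remember the reply for next time
    return mac

arp_cache: Dict[str, str] = {}
print(resolve("192.168.1.2", arp_cache))  # aa:bb:cc:00:00:02 (via broadcast)
print(resolve("192.168.1.2", arp_cache))  # aa:bb:cc:00:00:02 (from cache)
print(resolve("192.168.1.9", arp_cache))  # None: no host owns this address
```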

• DHCP

Dynamic host configuration protocol, centralized management and allocation of IP addresses, so that hosts in the network can obtain IP addresses, gateway addresses, DNS addresses, DHCP server addresses, etc.

• URL

Protocol name://hostname.domain name.domain name suffix.domain name category/directory/webpage file

• DNS domain name query order

Local hosts file -> local DNS cache -> local DNS server -> root domain name server

The query order of the primary domain name server after receiving a request

Local cache -> local hosts file -> local database -> forwarding domain name server

• IP addresses and subnetting

Networks on the Internet are divided into 5 classes: A, B, C, D, E

Class A network: the network address is 8 bits (the first bit is 0), and the rest is the host address

Class B network: the network address is 16 bits (the first two bits are 10), and the rest is the host address

Class C network: the network address is 24 bits (the first three bits are 110), and the rest is the host address

Subnet mask: used to determine whether a packet stays within the local network or must be forwarded elsewhere by a router. 1 bits mark the network address and 0 bits mark the host address. For example, the Class C mask is 11111111.11111111.11111111.00000000, which is 255.255.255.0

Subnetting:

Dividing a network into multiple subnets: borrow part of the host number as the subnet number; borrowing k bits yields 2^k subnets

Merging multiple networks into one large network: take part of the network number and use it as host number

In 222.125.80.128/26, /26 means a 26-bit network address and a 32 - 26 = 6-bit host address

A host part of all 1s is the broadcast address; a host part of all 0s is the network address
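The /26 example can be checked with Python's standard `ipaddress` module. This is only a worked illustration; borrowing k = 2 further host bits at the end is an arbitrary choice made to show the 2^k rule.

```python
import ipaddress

net = ipaddress.ip_network("222.125.80.128/26")

print(net.netmask)             # 255.255.255.192 (26 one-bits)
print(net.network_address)     # 222.125.80.128  (host bits all 0)
print(net.broadcast_address)   # 222.125.80.191  (host bits all 1)
print(net.num_addresses - 2)   # 62 usable hosts out of 2**6 addresses

# Borrow k = 2 host bits as a subnet number: 2**2 = 4 subnets, each /28.
subnets = list(net.subnets(prefixlen_diff=2))
print([str(s) for s in subnets])
```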

• IPv6

IPv4 addresses are 32 bits and IPv6 addresses are 128 bits, so the IPv6 address space is practically inexhaustible

Wireless networks

Bluetooth has the smallest coverage and the shortest communication distance among wireless networks

• Windows commands

ipconfig: Display IP addresses, subnet masks, and default gateway values for all network adapters

ipconfig/release: release IP address

ipconfig/flushdns: clear dns cache, or flush dns

ipconfig/displaydns: display dns

ipconfig/registerdns: DNS client registers with the server manually

ipconfig/all: Display the complete TCP/IP configuration information of all network adapters, including whether the DHCP service is started

ipconfig/renew: DHCP client refresh, re-apply for IP

Routing

Five route types

Host route: subnet mask 255.255.255.255

Directly connected network route

Remote network route

Default route: destination network and netmask are both 0.0.0.0

Persistent route

When the server receives an IP packet, it first looks for the host route, then looks for the network route (directly connected network, remote network), and finally looks for the default route

If a router receives multiple routes for a destination, it compares the administrative distances of the routes and uses the one with the smallest administrative distance.
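The lookup order above (host route, then network route, then default route) is equivalent to preferring the longest matching prefix, with the administrative distance breaking ties between routes to the same destination. A minimal Python sketch, using a hypothetical routing table and next-hop addresses:

```python
import ipaddress

# (destination network, next hop, administrative distance); all values hypothetical.
ROUTES = [
    ("0.0.0.0/0", "10.0.0.1", 1),         # default route
    ("192.168.1.0/24", "10.0.0.2", 1),    # network route
    ("192.168.1.0/24", "10.0.0.3", 110),  # same destination, larger distance
    ("192.168.1.7/32", "10.0.0.4", 1),    # host route
]

def next_hop(dest: str) -> str:
    addr = ipaddress.ip_address(dest)
    candidates = []
    for network, hop, distance in ROUTES:
        net = ipaddress.ip_network(network)
        if addr in net:
            candidates.append((net.prefixlen, distance, hop))
    # Longest prefix first (/32 beats /24 beats /0), then smallest distance.
    best = min(candidates, key=lambda c: (-c[0], c[1]))
    return best[2]

print(next_hop("192.168.1.7"))  # 10.0.0.4 (host route)
print(next_hop("192.168.1.9"))  # 10.0.0.2 (network route, smaller distance)
print(next_hop("8.8.8.8"))      # 10.0.0.1 (default route)
```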

Other

Network Availability: Percentage of Network Time Available to Users

Campus Subsystem: Communication system linking each building

DNS for load balancing: set up multiple host records for the same domain name, enable round robin, add host records for each web server

To enable two IPv6s to communicate through the existing IPv4 network, "tunneling technology" is required

To enable IPv6 to communicate with IPv4, a "translation technique" is required

The DNS server and the computer are not in the same subnet, which will not cause the computer network to be inaccessible, as long as the route can reach DNS, it can work normally

The main function of the core layer of the hierarchical LAN model: forwarding packets from one area to another at high speed

The default gateway must be in the same subnet as the current IP address

Chapter 10 Operating Systems

Operating system status

A computer system consists of software and hardware; a computer without any software is called bare metal

Operating system status: computer hardware -> operating system -> system software -> application software -> user

All other software, such as compilers, assemblers, and database management systems, as well as a large number of application programs, is built on top of the operating system

The operating system can be regarded as the interface between the user and the computer

Process management

A process is the basic unit of resource allocation and independent operation

The focus of process management is to study the concurrency characteristics between processes, as well as the problems arising from mutual cooperation and resource competition between processes

Precedence graph

A directed acyclic graph composed of nodes and directed edges; nodes represent the execution of program segments, and directed edges represent predecessor-successor relationships: Pi (predecessor) ---> Pj (successor) means Pj can execute only after Pi has finished

For a precedence graph with n arrows, n semaphores are needed, numbered on the graph in ascending order. The head of each arrow corresponds to a P operation and the tail to a V operation
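As a sketch of the one-semaphore-per-arrow rule, the Python code below enforces a hypothetical graph P1 -> P2, P1 -> P3, P2 -> P4, P3 -> P4 with `threading.Semaphore`. `acquire` plays the role of P and `release` the role of V; the graph and all names are assumptions made for illustration.

```python
import threading

# One semaphore per arrow, all initialised to 0.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")]
sems = {edge: threading.Semaphore(0) for edge in edges}
order = []
order_lock = threading.Lock()

def run(name: str) -> None:
    for edge in edges:
        if edge[1] == name:
            sems[edge].acquire()      # P at the head of each incoming arrow
    with order_lock:
        order.append(name)            # the "work" of this program segment
    for edge in edges:
        if edge[0] == name:
            sems[edge].release()      # V at the tail of each outgoing arrow

# Start the threads in reverse order to show that the semaphores,
# not the start order, decide who runs first.
threads = [threading.Thread(target=run, args=(n,)) for n in ("P4", "P3", "P2", "P1")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(order)  # P1 is always first and P4 always last
```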

The main characteristics of program sequential execution: sequentiality, closure (exclusive resources), reproducibility

The main features of concurrent execution of programs: loss of program closure, no one-to-one correspondence between programs and machine execution activities, mutual constraints between concurrent programs

Three-state model of process

In a multiprogramming system, processes take turns running on the processor and are usually in one of three states: running, ready, or blocked

Running: a process that is currently executing on the processor (CPU) is in the running state

Ready: a process has obtained every resource it needs except the processor (CPU) and can run as soon as it gets the processor

Blocked (also called waiting or sleeping): a process has stopped running because it is waiting for some event to occur (such as I/O completion)

Synchronization and mutual exclusion

Synchronization is a direct constraint between cooperating processes

Mutual exclusion is an indirect constraint between processes competing for a critical resource

Critical Section Management Principles

Enter when free: if no process is in the critical section, a process is allowed to enter, and it may stay in the critical section only for a limited time

Wait when busy: if a process is in the critical section, other processes must wait

Bounded waiting: a process waiting outside must be guaranteed entry within a limited time

Yield while waiting: a process that cannot enter its critical section must release the CPU immediately to avoid busy waiting

Semaphore mechanism

Physical meaning of the semaphore S: when S >= 0, S is the number of available resources; when S < 0, its absolute value is the number of processes waiting for the resource

• PV operation is a common way to achieve synchronization and mutual exclusion

P means requesting a resource: S = S - 1 (take one unit from the semaphore S). If S < 0 afterwards, the process becomes blocked and is inserted into the blocking queue

V means releasing a resource: S = S + 1 (return one unit to the semaphore S). If S <= 0 afterwards, a process is woken from the blocking queue and inserted into the ready queue

• PV operation realizes process mutual exclusion

Set the initial value of the semaphore mutex to 1; execute P(mutex) when entering the critical section and V(mutex) when leaving it. This P/V pair makes the code between them mutually exclusive

• PV operation realizes process synchronization

Single-buffer synchronization (the buffer holds only one product): there is a producer and a consumer, and two semaphores are needed. S1, with initial value 1, is the number of products that can still be placed in the buffer; S2, with initial value 0, is the number of products that can be taken out of the buffer. Producing a product requires P(S1) before and V(S2) after; consuming a product requires P(S2) before and V(S1) after

Multi-buffer synchronization (the buffer can hold multiple products): in addition to the single-buffer semaphores, add a mutex semaphore S with initial value 1 that guards access to the buffer (the buffer is a mutually exclusive resource). The producer's sequence becomes P(S1) P(S) ... V(S) V(S2) and the consumer's becomes P(S2) P(S) ... V(S) V(S1), with the P/V operations on S placed in the middle
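The multi-buffer scheme can be sketched with Python's `threading.Semaphore`, where `acquire` stands for P and `release` for V. The capacity of 3, the item count, and the use of a `deque` as the buffer are assumptions for this example; S1 starts at the buffer capacity here, which generalizes the initial value 1 of the single-buffer case.

```python
import threading
from collections import deque

CAPACITY = 3
ITEMS = 10
buffer = deque()
S1 = threading.Semaphore(CAPACITY)  # free slots in the buffer
S2 = threading.Semaphore(0)         # products available to take out
S = threading.Semaphore(1)          # mutex guarding the buffer
consumed = []

def producer() -> None:
    for item in range(ITEMS):
        S1.acquire()                # P(S1): wait for a free slot
        S.acquire()                 # P(S):  enter the critical section
        buffer.append(item)
        S.release()                 # V(S)
        S2.release()                # V(S2): one more product available

def consumer() -> None:
    for _ in range(ITEMS):
        S2.acquire()                # P(S2): wait for a product
        S.acquire()                 # P(S)
        item = buffer.popleft()
        S.release()                 # V(S)
        S1.release()                # V(S1): one more free slot
        consumed.append(item)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed)  # items arrive in production order: 0 .. 9
```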

Deadlock

Deadlock caused by improper allocation of resources of the same type: if resources are allocated to the processes in turn, then after several rounds it may happen that no process has reached the number of resources it needs. At that point every process is waiting for further allocation, and a deadlock forms

Remedy: let m be the total number of resources, n the number of processes, and k the number of resources each process needs. If m >= n * (k - 1) + 1, deadlock cannot occur
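The inequality can be read as follows: in the worst case every one of the n processes holds k - 1 resources, and one extra resource lets some process finish and release what it holds. A small Python check, with illustrative numbers and hypothetical function names:

```python
def min_resources_without_deadlock(n: int, k: int) -> int:
    """Smallest m satisfying m >= n * (k - 1) + 1."""
    return n * (k - 1) + 1

def is_deadlock_free(m: int, n: int, k: int) -> bool:
    return m >= min_resources_without_deadlock(n, k)

# Example: 3 processes, each needing 4 resources of the same type.
print(min_resources_without_deadlock(3, 4))  # 10
print(is_deadlock_free(9, 3, 4))             # False: each may hold 3 and wait forever
print(is_deadlock_free(10, 3, 4))            # True
```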

Process-resource graph

Pi represents a process and Ri a resource type; each Ri may contain multiple resource instances; an arrow pointing to a process indicates an allocated resource; an arrow pointing to a resource indicates a resource request

Consider allocations first and requests second: a process whose request cannot be satisfied from what remains after allocation is "blocked"

Whether the graph can be reduced depends on whether some process can finish and release its resources so that the remaining processes can finish in turn

A reducible graph means there is no deadlock

Deadlock avoidance

Deadlock handling strategies: ostrich strategy (ignore), prevention, avoidance, detection and recovery

Deadlock avoidance algorithm: the banker's algorithm. Before each resource allocation, it checks whether the system would still be safe afterwards (the system is safe if, after the allocation, there is some sequence in which all processes can complete). Resource utilization is high, but the checking adds overhead

Banker's algorithm calculation: 1. first calculate the resources each process still needs; 2. then calculate the resources that remain available
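A minimal Python sketch of the safety check at the heart of the banker's algorithm follows. The function name and the sample matrices (a commonly used textbook-style data set with 5 processes and 3 resource types) are assumptions for illustration.

```python
from typing import List, Optional

def find_safe_sequence(
    available: List[int],
    allocation: List[List[int]],
    maximum: List[List[int]],
) -> Optional[List[int]]:
    """Return a safe completion order of process indices, or None if unsafe."""
    # Step 1: resources each process still needs = maximum - allocation.
    need = [[m - a for m, a in zip(mx, al)] for mx, al in zip(maximum, allocation)]
    work = list(available)           # Step 2: resources currently remaining
    finished = [False] * len(allocation)
    sequence: List[int] = []
    progressed = True
    while progressed:
        progressed = False
        for i in range(len(allocation)):
            if not finished[i] and all(n <= w for n, w in zip(need[i], work)):
                # Process i can finish; it then returns everything it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                sequence.append(i)
                progressed = True
    return sequence if all(finished) else None

allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
print(find_safe_sequence([3, 3, 2], allocation, maximum))  # [1, 3, 4, 0, 2]
print(find_safe_sequence([0, 0, 0], allocation, maximum))  # None (unsafe)
```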

Threads

Creating, destroying, and switching processes costs the system considerable time and space, so the number of processes and the frequency of process switching must be kept low; threads were introduced for this reason

A thread is the basic unit of scheduling and dispatching, a process is the unit of independent resource allocation, and a thread is an entity within a process

Threads are not visible to one another, but threads can share the resources of their process

Principle of locality

Temporal locality: once an instruction of the program has been executed, it may be executed again in the near future; once a storage unit has been accessed, it may be accessed again in the near future

Spatial locality: once the program has accessed a storage unit, nearby storage units may also be accessed in the near future

Related question type: page "eviction" questions

The page to be evicted must be in memory

Evict pages that have not been accessed first

Among those, evict pages that have not been modified first

Paging storage management

The address structure of pure paging storage management: n-bit page number + m-bit in-page address

If the page size is 4K, the in-page address needs m bits with 2^m = 4 * 1024 => m = 12

Page table (logical address to physical address): the logical address has the pure paging structure, n-bit page number + m-bit in-page address. The in-page address stays unchanged; replace the page number with the corresponding physical block number from the page table to obtain the physical address

Segmented-paging storage management

Address structure of segmented-paging storage management: n-bit segment number + k-bit page number within the segment + m-bit in-page address
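The translation rule (keep the in-page address, swap the page number for the physical block number) can be written out in Python for a 4K page size. The page table contents and the sample address are hypothetical.

```python
PAGE_SIZE = 4 * 1024
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 bits of in-page address

# Hypothetical page table: page number -> physical block number.
PAGE_TABLE = {0: 5, 1: 9, 2: 3}

def translate(logical: int) -> int:
    page = logical >> OFFSET_BITS          # high bits: page number
    offset = logical & (PAGE_SIZE - 1)     # low 12 bits: unchanged
    block = PAGE_TABLE[page]               # look up the physical block
    return (block << OFFSET_BITS) | offset

# Logical address 0x1A3C = page 1, in-page address 0xA3C -> block 9.
print(hex(translate(0x1A3C)))  # 0x9a3c
```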

Single buffer

The buffer holds only one "job": data can be input when the buffer is empty and transferred out when the buffer holds a job

I/O device --input (T)--> buffer --transfer (M)--> work area (processing, C)

Time for n jobs with a single buffer: (T+M)*n + C

Double buffer

There are two buffers, each of which can store one "job"

Time for n jobs with a double buffer: T*n + M + C
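The two formulas can be compared with a short Python calculation. The formulas are taken as stated in these notes, which appears to assume that the input time T is the dominant cost; the sample values are made up.

```python
def single_buffer_time(n: int, T: int, M: int, C: int) -> int:
    # Input and transfer cannot overlap, so each job costs T + M;
    # only the processing of the last job is left over at the end.
    return (T + M) * n + C

def double_buffer_time(n: int, T: int, M: int, C: int) -> int:
    # Input of the next job overlaps transfer and processing of the current one.
    return T * n + M + C

# Hypothetical values: 10 jobs, T = 10, M = 5, C = 2 (same time unit).
print(single_buffer_time(10, 10, 5, 2))  # 152
print(double_buffer_time(10, 10, 5, 2))  # 107
```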

Disk scheduling algorithm

First come, first served (FCFS): service requests in the order in which they arrive; the average seek length is large

Shortest seek time first (SSTF): service the request closest to the current track position first, regardless of arrival order

Scanning algorithm or elevator scheduling algorithm (SCAN): starting from the current head position, select the nearest cylinder in the direction the head is moving; if no requested cylinder remains in that direction, reverse the direction and select the nearest cylinder

Cyclic scanning algorithm (CSCAN): based on the scanning algorithm, but requests are serviced in one direction only; when no requested cylinder remains in that direction, the head does not reverse and pick the nearest cylinder but returns to the far end and scans again in the same direction
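The first three algorithms can be compared by total head movement in Python. The request queue and the starting cylinder below are a commonly used illustrative data set, not taken from these notes, and `scan` follows the description above (it reverses when no requests remain in the current direction).

```python
from typing import List

def total_movement(start: int, order: List[int]) -> int:
    total, pos = 0, start
    for cylinder in order:
        total += abs(cylinder - pos)
        pos = cylinder
    return total

def fcfs(start: int, requests: List[int]) -> List[int]:
    return list(requests)                      # arrival order

def sstf(start: int, requests: List[int]) -> List[int]:
    pending, pos, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order

def scan(start: int, requests: List[int], upward: bool = True) -> List[int]:
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down if upward else down + up

requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53
print(total_movement(start, fcfs(start, requests)))  # 640
print(total_movement(start, sstf(start, requests)))  # 236
print(total_movement(start, scan(start, requests)))  # 299
```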

Rotation scheduling algorithm

The disk never stops rotating. Each time it rotates past one sector, the record in that sector has been read; the disk keeps rotating while a record is being processed after it is read

If n records are processed in sequence, total time = (read time + time for one full rotation) * (n-1) + read time of one record + processing time of one record

Optimized processing: rearrange the sectors so that the position where the head is after processing the first record is exactly the start of the sector holding the second record; time spent = (read time + processing time) * n
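A worked example of the two formulas in Python, using made-up values: 10 records on one track, one full rotation takes 20 ms (so reading one record takes 2 ms), and processing a record takes 4 ms.

```python
def sequential_time(n: int, read: int, process: int, rotation: int) -> int:
    # After processing a record the next sector has already passed the head,
    # so each of the remaining n - 1 records costs a full rotation plus a read.
    return (read + rotation) * (n - 1) + read + process

def optimized_time(n: int, read: int, process: int) -> int:
    # Sectors are rearranged so the next record starts right under the head.
    return (read + process) * n

print(sequential_time(10, 2, 4, 20))  # 204 ms
print(optimized_time(10, 2, 4))       # 60 ms
```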

Multi-level index structure

Direct address index: the index starts from 0, and an address entry points to a disk data block

First-level indirect address index: an address item points to a disk index block (also called a first-level index block); a disk index block contains many address items, each of which points to a disk data block

Second-level indirect address index: compared with the first-level indirect address index, there is one more level of disk index blocks

File directory

To support access by name, the system keeps a data structure that describes and controls each file, containing at least the file name and the physical address where the file is stored. This structure is called the file control block (FCB)

The file directory is composed of file control blocks for file retrieval

The file control block contains three types of information

Basic information: file name, file physical address, file length, number of file blocks, etc.

Access Control Information: File Access Permissions

Usage Information: Created Date, Last Modified Date, Current Usage Information

A system crash that occurs while a directory file is being modified has a great impact on the system

Directory structure

Multi-level directory structure: an inverted rooted tree, also known as a tree directory structure

Full path name: from the root directory to the file name D:\

Absolute path: starting from the root directory and ending with /

Relative paths: start with the current directory and end with /

Bitmap

A bitmap uses one binary bit to represent the usage of each physical block: 0 means free and 1 means used

The size of the bitmap is determined by the size of the disk space (number of physical blocks). A bitmap has strong descriptive power and suits various physical structures

Assuming the word length of the computer system is n bits, word 0 of the bitmap corresponds to physical blocks 0 to n-1 on the storage device, word 1 corresponds to physical blocks n to 2n-1, and so on
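The word-to-block correspondence amounts to an integer division. A small Python sketch, assuming a 32-bit word length and hypothetical block numbers:

```python
from typing import Tuple

def locate(block: int, word_bits: int) -> Tuple[int, int]:
    """Return (word index, bit index within the word) for a physical block."""
    return divmod(block, word_bits)

def words_needed(total_blocks: int, word_bits: int) -> int:
    """Number of words required to describe all physical blocks."""
    return -(-total_blocks // word_bits)   # ceiling division

print(locate(4195, 32))        # (131, 3): word 131, bit 3
print(words_needed(1000, 32))  # 32 words cover blocks 0 .. 1023
```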

Other

Variable partition allocation: if process P has an adjacent free area above or below it, then after P is released the adjacent free areas are merged into one

When the user provides input to an application through the mouse or keyboard, the interrupt handler is the first to obtain the keyboard or mouse input

The "real time" in a real-time operating system means that the computer can process external information fast enough to respond within the time allowed by the controlled object

Hierarchy of the I/O system: hardware -> interrupt handler -> device driver -> device-independent software -> user process

The I/O software hides the implementation details of I/O operations, which makes them convenient for users

In disk scheduling, arm-movement scheduling is performed first and rotation scheduling second. Accessing information on different cylinders requires arm-movement scheduling first; accessing the same track requires only rotation scheduling

Chapter 11 Structured Development

Modularity

Modularization means decomposing the software to be developed into several small, simple parts (modules), each of which is developed and tested independently

Module independence

Module independence means that each module completes a relatively independent specific sub-function, and the connection with other modules is simple. There are two criteria for measuring module independence: cohesion and coupling

Coupling

Coupling measures how closely modules are interconnected; the higher the coupling, the weaker the independence of the modules

The seven types of coupling, ordered from low to high:

No direct coupling: no direct relationship between two modules (no calls, no information passed)

Data coupling: there is a call relationship between two modules, and simple data values are passed (pass by value)

Stamp coupling: there is a call relationship, and a data structure is passed

Control coupling: there is a call relationship, and a control variable is passed that lets the callee selectively perform a function

External coupling: modules are connected through an environment outside the software

Common coupling: modules interact through a common data environment

Content coupling: one module directly uses the internal data of another module, or transfers into another module through an abnormal entry

Cohesion

Cohesion measures how closely the elements within a module are bound together. The higher the cohesion, the stronger the independence of the module

The seven types of cohesion, ordered from low to high:

Coincidental (accidental) cohesion: there is no connection between the elements within the module

Logical cohesion: the module performs several logically similar functions, and a parameter determines which one is executed

Temporal cohesion: actions that must be executed at the same time are grouped into one module

Procedural cohesion: a module completes multiple tasks, and these tasks must be performed in a specified order

Communicational cohesion: all processing elements within the module operate on the same data structure

Sequential cohesion: the processing elements in the module are closely related to a single function and must be executed in sequence

Functional cohesion: all elements in the module work together to complete a single function, and none can be removed

System structure design principles

Decomposition-coordinating principle: breaking down large problems into smaller ones

Top-down principle: grasp the main function of the system and decompose it hierarchically from top to bottom

Information hiding-abstract principle: the upper layer specifies what the lower layer does and coordinates between modules, but does not specify how to do it

Consistency principle: unified specification, unified standard, unified file mode

The principle of clarity: each module has clear functions and interfaces, eliminates multiple functions and useless interfaces, avoids pathological links, and eliminates interface complexity

High cohesion and low coupling

Moderate fan-in and fan-out: a module calling other modules is called fan-out, and being called by other modules is called fan-in

Moderate module size: if a module is too large, the decomposition is insufficient; if it is too small, the independence of the module may be reduced

The scope of effect of a module should lie within its scope of control

System Documentation

System documentation is the "trace" of the system construction process, a guide for system maintainers, and a communication tool between developers and users

The role of system documentation between system developers, project managers, system maintainers, system evaluators, and users:

User - system analyst: feasibility study report, overall planning report, system development contract, system solution specification

System developers - project managers: system development plan, system development monthly report, system development summary report, project management documents

System testers - system developers: system solution specification, system development contract, system design specification, test plan document

System developers - users: user manuals, operating guides

System developers - system maintainers: system design specification, system development summary report, technical manual

User - maintenance personnel: system operation report, maintenance modification suggestions

Data Flow Diagram

Basic graphic elements

External entity: rectangle, generally denoted Ei

Data store: two horizontal lines or a rectangle with a missing side, generally denoted Di

Data flow: directed edge, start point ---> end point

Process: rounded rectangle or circle, generally denoted Pi

The top-level data flow diagram describes the input and output of the system, and the 0-level data flow diagram is a subdivision of the top-level data flow diagram

External entities: people, things, external systems outside the current system

People: students, teachers, staff, supervisors

Things: sensors, controllers, vehicles, procurement departments

External systems: payment system, vehicle transaction system, inventory management system

Data storage: store data, provide data

Store processed output data and provide processed input data

For example: xxx table, xxx file

Processing: process the input data to get the output data

A process has at least one input data stream and one output data stream

A process with only inputs and no output is called a "black hole"

A process with only outputs and no input is called a "white hole"

A process whose input data is insufficient to produce its output is called a "grey hole"

Data flow: either the start point or the end point of a data flow must be a process

Balance between parent graph and child graph

Every data flow in the parent diagram must also appear in the child diagram. In practice, go through the parent diagram's data flows one by one and check whether any of them is missing from the child diagram

Tips for Finding Lost Data Streams

Parent-child diagram balance

Every process needs both input and output data flows

Data conservation (compare the problem description against the diagram to find what is missing)

Data Modeling – ER Diagram; Behavioral Modeling – UML; Functional Modeling – Data Flow Diagram

Data dictionary

The data dictionary describes each data flow, file, process, and data item that makes up the data flow diagram

The data dictionary has four categories of entries: data flow, data storage, basic processing, data item

Data items are the smallest elements that make up data flows and data stores; external entities are not described in the data dictionary

Common processing logic description methods: structured language, decision table, decision tree

Structured development methodology

The general guiding ideology is top-down, layer-by-layer decomposition (from abstraction to concrete)

The basic principle is the decomposition and abstraction of functions

The earliest method proposed in software engineering, especially suitable for problems in the field of data processing

It is not suitable for solving large-scale and particularly complex projects, and it is difficult to adapt to changes in requirements

Structured design

Architecture Design: Define the main structural elements and relationships of the software

Data design: determine the file system structure and database table structure

Interface design: describe the external interface used by the software, and the internal interface between various components

Process Design: Define the algorithms and internal data structures within each component

Interface Design Golden Principles

Place the user in control

Reduce user memory burden

Keep the interface consistent

Problems needing attention when constructing hierarchical data flow diagrams

Name elements appropriately

Draw data flows, not control flows

A single process should not have too many data flows

Decompose as evenly as possible

Chapter 12 Software Engineering

• CMM (Capability Maturity Model)

The CMM divides software process improvement into the following five levels

Initial level: disorganized, with no clearly defined steps

Repeatable level: Basic project management processes and practices are established with the necessary process discipline to repeat previous success on similar projects

Defined level: software process documented, standardized

Managed level: Detailed metrics for software process and product quality are developed

Optimizing level: quality analysis is strengthened, and the process is continuously improved through feedback on process quality and through new concepts and technologies

• CMMI (Capability Maturity Model Integration)

Staged model: similar in structure to CMM, focusing on the maturity of the organization

Initial: the process is unpredictable and lacks control

Managed: the process serves the project

Defined: the process serves the organization

Quantitatively managed: the process is measured and controlled

Optimizing: the focus is on process improvement

Continuous model: focuses on the capability of each process area

CL0 (Incomplete): one or more specific goals of the process area have not been met

CL1 (Performed): the specific goals of the process area are met; the process transforms identifiable input work products into identifiable output work products

CL2 (Managed): the process is institutionalized as a managed process, focusing on the capability of individual process instances

CL3 (Defined): the process is institutionalized as a defined process, focusing on organizational standardization and deployment of the process

CL4 (Quantitatively managed): the process is institutionalized as a quantitatively managed process

CL5 (Optimizing): the process is institutionalized as an optimizing process that is continuously improved

Waterfall model

The waterfall model is a model that defines each activity in the software life cycle as a number of stages linked in a linear order

Requirements analysis -> design -> coding -> testing -> operation and maintenance, carried out from front to back in a fixed, linked sequence

The waterfall model assumes that a system requirement to be developed is complete, concise, consistent, and can be completed prior to design and implementation

Guide the development process with project phase review and document control

Advantages:

Easy to understand and low management cost

Emphasis on early development planning, requirements research, and product testing

Suitable for systems whose requirements are clear, largely fixed, and not changed at will

Disadvantages:

Customers must express their needs completely, correctly, and clearly

Difficult to assess progress in the first two to three stages

Towards the end of the project, a lot of integration and testing work occurs

System capabilities cannot be demonstrated until the end of the project

Errors in requirements or design can only be found at the later stage of the project

Project risk control ability is weak

The V-model is a variation of the waterfall model that shows how quality assurance activities relate to communication, modeling, and early construction activities: the problem is refined step by step on the way down, and a series of tests is performed on the way up

Incremental model

Incorporating the basic components of the waterfall model and the iterative nature of prototype implementation, it assumes that requirements can be segmented into a series of increments, each of which can be developed separately

The first increment is often the core product; customer use and evaluation of each increment provide the new features and functionality planned for the next increment

Each increment releases an operational version

Advantages:

Has all the benefits of the waterfall model

The first deliverable version costs little and takes little time

Developing the small system represented by an increment carries little risk

Disadvantages:

If changes to user requirements are not planned for, the initial increments may make subsequent increments unstable

Managing cost, schedule, and complexity becomes harder

Evolutionary models

The evolutionary model is an iterative process model that enables software developers to gradually develop more complete software versions

Evolutionary models are particularly suitable when accurate knowledge of the software requirements is lacking

Typical evolutionary models are the prototype model and the spiral model

Difference between the evolutionary model and the incremental model: the incremental model develops a small functional module each time, while the evolutionary model develops the entire product each time

Prototype model

The prototype model is more suitable for the situation where the user's needs are unclear and the needs change frequently. It is more appropriate when the system scale is not too large and not too complicated.

A prototype does not have to meet all the constraints of the target software; the aim is to build it quickly and at low cost, producing a tangible system framework early

Steps: communication -> quick planning -> quick design modeling -> prototype construction -> deployment, delivery, and feedback; the cycle then repeats

Prototyping begins with communication and its purpose is to define the overall software and effectively capture user needs

The prototype model is not suitable for large-scale system development

Spiral model

The spiral model combines the waterfall model and the evolutionary model, adding a risk analysis that both models ignore

Each spiral cycle roughly corresponds to the waterfall model

Each spiral cycle is divided into four steps

Make a plan: determine the software goals, develop an implementation plan, and clarify the constraints of project development

Risk analysis: identify risks and eliminate them

Implementation engineering: develop the software and verify the products of each stage

User evaluation: Put forward correction suggestions and formulate the development plan for the next cycle

The spiral model is characterized by the addition of risk analysis, which makes it suitable for large-scale, high-risk systems with changing requirements

Disadvantages: Too many iterations will increase development costs and delay submission time

Fountain model

The fountain model is a model driven by user needs and driven by objects, suitable for object-oriented development methods

The fountain model overcomes the waterfall model's lack of support for software reuse and for integrating multiple development activities

Development under the fountain model is iterative and seamless

Seamless means that there are no clear boundaries between development activities (analysis, design, coding), so activities may overlap and iterate

Advantages:

High development efficiency, saving development time

Disadvantages:

Development phases overlap, which requires many developers and raises management costs

Strict management documents are required, and auditing is difficult

Unified Process Model (UP)

UP is a use-case-driven and risk-driven, architecture-centric, iterative and incremental development process

5 core workflows per iteration

Four technical phases of the unified process and their milestones

Inception phase: focuses on project start-up; milestone: lifecycle objectives

Elaboration phase: focuses on requirements analysis and architecture evolution; milestone: lifecycle architecture

Construction phase: focuses on building the system and produces the implementation model; milestone: initial operational capability

Transition phase: focuses on delivering the software and produces the software increment; milestone: product release

Agile development

The overall goal is to deliver valuable software as early as possible and continuously

Agile development enables users to add or change requirements later in the development cycle

Extreme Programming (XP)

XP is a lightweight (agile), efficient, low-risk, flexible, predictable, and scientific software development method

The four values of XP: communication, simplicity, feedback, courage

5 principles: rapid feedback, assume simplicity, incremental change, embracing change, and quality work

12 best practices: planning game, small releases, metaphor, simple design, test first, refactoring, pair programming, collective code ownership, continuous integration, 40-hour week, on-site customer, and coding standards

Crystal: holds that every project needs its own set of strategies, conventions, and methodologies

Scrum: uses an iterative approach in which each 30-day iteration is called a sprint, and the product is implemented in order of requirement priority

Adaptive Software Development (ASD): 6 basic principles

A mission serves as the guide

Features are seen as the key points of customer value

"Redoing" is as important as "doing"

Change is not treated as a correction but as an "adjustment"

Fixed delivery times force careful consideration of the critical requirements for each released version

Risk is also embraced

Agile Unified Process (AUP): builds software on the principle of "serial in the large, iterative in the small"

Each AUP iteration performs the following activities: modeling, implementation, testing, deployment, configuration and project management, environment management

Requirements analysis

Software requirements: Refers to the user's expectations of the target software in terms of functionality, behavior, performance, design constraints, etc.

Functional requirements: Consider what the system does and when

Performance requirements

User or human factors: how hard users find the system to understand and use, and how likely they are to misoperate it

Environmental requirements: hardware or software environment, model, operating system, platform

Interface requirements: input from and output to other systems, data formats, and storage media

Documentation requirements: which documents are needed and who they are aimed at

Data requirements: receiving and sending data format

Resource usage requirements: computer resources required for operation, manpower required for maintenance

Security and confidentiality requirements: data isolation, system backup

Reliability requirements: isolate errors, restart and wait for errors

Software cost consumption and development progress

Other non-functional requirements

Outline design

Design the overall structure of the software system: Divide the complex system into modules by function, and determine the function, calling relationship, interface, structure and quality of the module

Data structure and database design: database design (conceptual design ER, logical design, physical design)

Write high-level design documents: high-level design specification, database design specification, user manual, revised test plan

Review

Detailed design

Detailed algorithm design for each module

Design the data structure in the module

Physically design the database

Other designs

Write a detailed design specification

Review

System testing

The meaning of system testing: the process of executing a program in order to find errors. A successful test is the discovery of errors that have not been found so far.

The purpose of testing: to find potential errors and defects with the minimum manpower and time

Basic principles of testing:

Early and continuous testing throughout the entire development phase

Avoid testing by the original developer

When testing, there must be expected output results and compare them with the test results

Pay attention to the consequences of unreasonable inputs or operations

Check not only whether the program does what it should do, but also whether it does what it should not do

Test strictly according to the plan to avoid randomness

Keep test cases and related documents properly

Subsequent tests are modified on the basis of previous tests

The test objective of system testing comes from the requirement analysis stage

Unit testing

Unit test, also known as module test, starts to execute after the module is written and there are no compilation errors

Unit testing focuses on the processing logic and data structures inside the module

Unit tests check 5 characteristics of modules

Module interface: ensure that the data flow of the test module can be input and output normally; test: whether the formal parameters match the actual parameters, the use of global variables, I/O format, file processing, etc.

Local data structures: test the definition and use of variables

Important execution paths

Error handling

Boundary conditions

Unit testing process

Modules do not run independently; each module calls or is called by other modules, so two kinds of auxiliary modules must be developed for testing

Driver module: acts as a main program that receives the test case data, passes it to the module under test, and outputs the test results

Stub module: replaces a submodule called by the module under test and checks the input passed from the module under test

High cohesion simplifies unit testing
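The roles of the driver module and the stub module can be illustrated with a small sketch. Everything here is hypothetical: `order_total_cents` stands for the module under test, `PriceStub` replaces the submodule it would normally call, and `run_driver` plays the part of the driver.

```python
def order_total_cents(order_id, fetch_price_cents):
    """Module under test (hypothetical): adds a 10% surcharge to a looked-up price."""
    price = fetch_price_cents(order_id)
    return price + price // 10


class PriceStub:
    """Stub module: stands in for the real price lookup and records its inputs."""

    def __init__(self, fixed_price_cents):
        self.fixed_price_cents = fixed_price_cents
        self.received = []

    def __call__(self, order_id):
        self.received.append(order_id)   # lets the test check what was passed in
        return self.fixed_price_cents


def run_driver(test_cases):
    """Driver module: feeds each test case to the module under test and reports results."""
    results = []
    for order_id, stub_price, expected in test_cases:
        stub = PriceStub(stub_price)
        actual = order_total_cents(order_id, stub)
        passed = actual == expected and stub.received == [order_id]
        results.append((order_id, passed))
    return results


print(run_driver([("A1", 10000, 11000), ("B2", 0, 0)]))
```

Passing the submodule in as a parameter is one design choice that makes the stub easy to substitute; other approaches (for example, patching at import time) serve the same purpose.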

Integration testing

Integration testing combines modules according to the system specification and tests them, aiming to discover interface-related errors

Possible problems after integration:

Data lost across modules

The functionality of one module has harmful effects on other modules

After the module is integrated, the expected function is not achieved

Problem with global data structure

Error accumulation after module combination

Top-down integration testing: an incremental method; modules are integrated starting from the main control module and moving down the control hierarchy; driver modules are not needed, but stub modules must be written

Bottom-up integration testing: modules are integrated starting from the atomic modules at the bottom and moving up the control hierarchy; stub modules are not needed, but driver modules must be written

Regression testing

Software changes may break functions that previously worked. Regression testing re-executes a subset of the tests already run to ensure that the changes have not propagated undesired side effects.

Regression testing helps ensure that unintentional behavior or additional bugs are not introduced

Smoke test

Integrate the software components that have been translated into code into a build

Design a series of tests to expose bugs that prevent the build from performing its function correctly

Test method

Static testing: The program under test does not run on the machine, and the program is tested by manual detection and computer-aided static analysis

Dynamic testing: find errors by running the program, black box testing and white box testing can be used

Test case: consists of test input data and expected output results. Test case design should include both reasonable and unreasonable input conditions

Black box testing

Regardless of the internal structure and characteristics of the software, treat the software as a black box and test the external characteristics of the software

Common black box testing techniques

Equivalence class division: Divide the program input and output into several equivalence classes, and then take a representative data from each equivalence class as a test case, and test valid equivalence classes and invalid equivalence classes at the same time

Boundary value analysis: inputs are more error-prone at the boundaries than in the middle, so test the values on the boundary and the values just beyond it

Error guessing: guess likely errors based on experience

Cause-effect graph: derive a decision table from the cause-effect graph
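Equivalence classes and boundary values can be sketched for a simple case. The example assumes a hypothetical rule that a score is valid when it is an integer from 0 to 100; the function names and the chosen representative values are illustrative only.

```python
def is_valid_score(score):
    """Hypothetical function under test: a score is valid in the range 0..100."""
    return 0 <= score <= 100


def boundary_values(low, high):
    """Values on each boundary and just beyond it for an integer range [low, high]."""
    return [low - 1, low, high, high + 1]


# One representative per equivalence class: below range, in range, above range.
equivalence_representatives = {"invalid_low": -10, "valid": 50, "invalid_high": 200}

for value in boundary_values(0, 100):
    print(value, is_valid_score(value))

for name, value in equivalence_representatives.items():
    print(name, value, is_valid_score(value))
```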

• McCabe measure

Establishes a measure of program complexity by defining cyclomatic complexity, which is based on the number of cycles in the program graph of a program module

The formula is V(G) = m - n + 2, where G is the program graph, V(G) is its cyclomatic complexity, m is the number of directed arcs, and n is the number of nodes

It can also be found by counting the closed regions k in the graph: V(G) = k + 1
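As a worked check of the formula, the sketch below computes V(G) = m - n + 2 from a list of directed arcs. The sample graph is hypothetical: an if/else (nodes 1 to 4) followed by a loop (nodes 4 and 5) and an exit node 6. It has 7 arcs and 6 nodes, and two closed regions, so both methods give 3.

```python
def cyclomatic_complexity(arcs):
    """V(G) = m - n + 2 for a connected program graph given as (from, to) arcs."""
    nodes = {node for arc in arcs for node in arc}
    return len(arcs) - len(nodes) + 2


# Hypothetical program graph: if/else joining at node 4, then a loop 4 -> 5 -> 4.
sample_arcs = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 4), (4, 6)]

print(cyclomatic_complexity(sample_arcs))  # 7 arcs - 6 nodes + 2 = 3
```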

White box testing

White box testing is also called structural testing, and test cases are designed according to the internal structure of the program

Common techniques for white box testing are: logic coverage, loop coverage, basic path testing

White box testing principles:

All independent paths in a program module are executed at least once

In each logical decision, both the "true" and "false" outcomes are executed at least once

Each loop is executed at its boundary conditions and under general conditions

Test the validity of program internal data structures

Logic coverage

Statement coverage: select enough test data that each statement in the program under test is executed at least once. Because it only guarantees that each statement runs, some branches of the decision logic may be missed; it covers little of the program's execution logic and is a very weak form of logic coverage

Decision coverage: design enough test cases that each decision expression in the program evaluates to "true" and "false" at least once, so that every "true" branch and every "false" branch runs at least once; also known as branch coverage

Condition coverage: build a set of test cases such that each possible value of every logical condition in every decision statement occurs at least once

Decision/condition coverage: satisfies the requirements of both decision coverage and condition coverage

Condition combination coverage: on top of decision/condition coverage, every combination of true and false values of the conditions in each decision expression must be tested; for example, a>0 && b>0 requires the four cases T&&T, F&&F, T&&F, F&&T

Path coverage: the test cases cover all possible paths in the program

How to cover all paths: design test cases so that every path is traversed at least once

If there is pseudocode, convert it into a flowchart first, and then calculate
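The difference between the coverage levels can be seen on the single decision a>0 && b>0. The sketch below is illustrative only: `coverage_report` is a hypothetical helper that evaluates both conditions for every test case (ignoring short-circuit evaluation for simplicity) and reports which criteria a set of test cases meets.

```python
def decision(a, b):
    """The decision used as the example: a > 0 && b > 0."""
    return a > 0 and b > 0


def coverage_report(test_cases):
    """Report which coverage criteria a list of (a, b) test cases satisfies."""
    outcomes = {decision(a, b) for a, b in test_cases}
    conditions = {(a > 0, b > 0) for a, b in test_cases}
    a_values = {first for first, _ in conditions}
    b_values = {second for _, second in conditions}
    decision_cov = outcomes == {True, False}
    condition_cov = a_values == {True, False} and b_values == {True, False}
    return {
        "decision": decision_cov,
        "condition": condition_cov,
        "decision/condition": decision_cov and condition_cov,
        "condition combination": len(conditions) == 4,
    }


print(coverage_report([(1, 1), (-1, -1)]))   # T&&T and F&&F
print(coverage_report([(1, -1), (-1, 1)]))   # T&&F and F&&T
print(coverage_report([(1, 1), (1, -1), (-1, 1), (-1, -1)]))
```

The second set of test cases shows that condition coverage does not imply decision coverage: each condition takes both values, yet the decision is never true.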

Operation and maintenance

Software maintenance is the last stage of the software life cycle, not part of the system development process

System maintainability evaluation indicators: understandability, testability, modifiability

Documentation is a determinant of software maintainability

Maintainability needs to be considered at the development stage, and at every stage thereafter

Software Documentation

Writing high-quality documentation improves software development quality

Documentation is also a part of software products, and software without documentation cannot be called software

The preparation of software documentation occupies a prominent position in software development and accounts for a considerable share of the workload

High-quality documentation is of great significance to the benefits of software products

Overall the software documentation has to be fair

Software maintenance content

Corrective maintenance: corrects problems that were not found during system development and testing

Adaptive maintenance: modifications made to adapt to changes in the environment and in management needs

Perfective maintenance: changes made to extend functionality and improve performance

Preventive maintenance: proactive changes made to prepare for future changes in the software and hardware environment

Software reliability, availability, maintainability formulas

Reliability: the probability of failure-free operation under given conditions for a given time interval; reliability = MTTF/(1+MTTF), where MTTF is the mean time to failure

Availability: the probability that the system is operating correctly at a given point in time; availability = MTBF/(1+MTBF), where MTBF is the mean time between failures

Maintainability: the probability that a maintenance activity is completed within a given time using the specified processes and resources; maintainability = 1/(1+MTTR), where MTTR is the mean time to repair
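The three formulas can be evaluated directly. The sketch below simply restates them as functions; the function names and the input values are hypothetical and carry no particular time unit.

```python
def reliability(mttf):
    """Reliability = MTTF / (1 + MTTF)."""
    return mttf / (1 + mttf)


def availability(mtbf):
    """Availability = MTBF / (1 + MTBF)."""
    return mtbf / (1 + mtbf)


def maintainability(mttr):
    """Maintainability = 1 / (1 + MTTR)."""
    return 1 / (1 + mttr)


# Hypothetical inputs: MTTF = 99, MTBF = 999, MTTR = 1.
print(reliability(99), availability(999), maintainability(1))
```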

Communication path calculation

n programmers, no chief programmer: n*(n-1)/2 communication paths

n programmers, one chief programmer: n-1 communication paths
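A quick numeric check of the two formulas, as a minimal sketch with hypothetical function names: with 8 programmers there are 28 paths when everyone communicates with everyone, and 7 when all communication goes through one lead programmer.

```python
def paths_without_chief(n):
    """Every pair of programmers communicates directly: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def paths_with_chief(n):
    """Each of the other n - 1 programmers communicates only with the lead."""
    return n - 1


print(paths_without_chief(8), paths_with_chief(8))  # 28 7
```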

Software project estimation

COCOMO estimation model: accurate and easy-to-use cost estimation model, divided into basic COCOMO, intermediate COCOMO and detailed COCOMO according to the degree of refinement

Basic COCOMO model: static univariate model

Intermediate COCOMO model: Static multivariate model, which divides the system model into two levels: system and component

Detailed COCOMO model: divide the software system model into three parts: system, subsystem and module

COCOMO II estimation model: a hierarchical estimation model divided into three stages

Application composition model: used in the early stages of software development, estimating with object points

Early design model: used once the requirements have stabilized and the basic software architecture has been established, estimating with function points

Post-architecture model: used during software construction, estimating with lines of code

• Gantt chart

A simple horizontal bar chart that describes project tasks against the calendar. The horizontal axis is the calendar timeline and each horizontal bar represents a task. The start and end of a bar correspond to the calendar times at which the task starts and ends, its length represents the task's duration, and bars that overlap in the same time period represent concurrent tasks

Advantages: clearly shows the start and end time of each task, task progress, and the parallel relationships between tasks

Disadvantages: cannot clearly show the dependencies between tasks, the critical parts of the project, or the parts of the plan that have slack

• PERT chart

A PERT graph is a directed graph

The arrows in the figure indicate "tasks", and the time on the arrows indicates the time required to complete the "tasks"

The nodes in the graph represent "events": an event marks the end of the tasks pointing into the node and the start of the tasks leading out of it

The task pointed to by the current node will start when all tasks flowing into this node have finished

The event itself does not consume time and resources, it only represents a point in time

The "event" node consists of three parts: event number, the earliest time when the event occurs, and the latest time

Start node: a node with no incoming tasks; there can be several, and the earliest time of a start node is 0

End node: the node with no outgoing tasks; there can be only one, and its latest time equals its own earliest time

The earliest time is calculated from the start node to the end node, and the latest time is calculated from the end node to the start node

Earliest time: the earliest time at which the tasks leaving this event node can start. When there are multiple incoming tasks, take the largest calculated value

Earliest time of a node: earliest time of the preceding node + duration of the incoming task

Latest time: the tasks leaving this node must start no later than this time, otherwise the project cannot be completed on schedule. When there are multiple outgoing tasks, take the smallest calculated value

Latest time of a node: latest time of the node the outgoing task points to - duration of the outgoing task

Slack time: how much leeway a task has without affecting the overall schedule; it is written beneath the task

Slack time of a task: latest time of the node at the arrow's head - duration of the task - earliest time of the node at the arrow's tail

Critical path: the path from the start node to the end node on which every task has a slack time of 0

Advantages of the PERT chart: it gives the start and end times of tasks, the dependencies between tasks, and the critical path

Disadvantages of the PERT chart: it cannot show the parallel relationships between tasks
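The earliest time, latest time, and slack time calculations can be traced on a small example. This is a minimal sketch under one stated assumption: events are numbered so that every task runs from a lower-numbered event to a higher-numbered one, which lets the forward and backward passes simply follow the numbering. The function name `pert` and the sample network are hypothetical.

```python
def pert(tasks):
    """Compute earliest times, latest times, and slack for a PERT chart.

    tasks: (from_event, to_event, duration) triples. Assumes each task goes
    from a lower-numbered event to a higher-numbered event.
    """
    events = sorted({event for u, v, _ in tasks for event in (u, v)})

    # Forward pass: earliest time is the largest value over incoming tasks.
    earliest = {event: 0 for event in events}
    for event in events:
        for u, v, duration in tasks:
            if v == event:
                earliest[event] = max(earliest[event], earliest[u] + duration)

    # Backward pass: the end node's latest time equals its earliest time;
    # other nodes take the smallest value over outgoing tasks.
    project_length = earliest[events[-1]]
    latest = {event: project_length for event in events}
    for event in reversed(events):
        for u, v, duration in tasks:
            if u == event:
                latest[event] = min(latest[event], latest[v] - duration)

    # Slack = latest time at arrow head - duration - earliest time at arrow tail.
    slack = {(u, v): latest[v] - duration - earliest[u] for u, v, duration in tasks}
    return earliest, latest, slack


sample_tasks = [(1, 2, 3), (1, 3, 2), (2, 4, 4), (3, 4, 6), (4, 5, 1)]
earliest, latest, slack = pert(sample_tasks)
critical_tasks = [task for task, value in slack.items() if value == 0]
print(earliest)
print(latest)
print(critical_tasks)
```

In this sample network the tasks with zero slack form the path 1 -> 3 -> 4 -> 5, which is the critical path with a total length of 9.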

Project Activity Diagram

Similar to a PERT chart: the vertices represent milestones, the edges linking vertices represent activities, and the numbers on the edges represent the time the activities require

Apart from the notation and naming, the calculations are essentially the same as for a PERT chart

Software configuration management

The main goals of software configuration management: change identification, change control, version control, ensuring correct implementation of changes, change reporting

Software configuration management content: version management, configuration support, change support, process support, team support, change reporting, audit support

Software Configuration Management Contents (Second Edition): Software Configuration Identification, Change Management, Version Control, System Establishment, Configuration Review, Configuration Status Reporting

The configuration database is divided into three categories: development library, controlled library, product library

Risk management

Characteristics of software risk: uncertainty (the risk may or may not occur) and loss (if the risk occurs, it brings undesirable consequences or losses)

Classification of Software Risks

Project risk: threatens the project plan, delaying the schedule and increasing cost. Examples: uncertainty about budget, schedule, personnel, resources, project complexity, size, and structure

Technical risk: threatens the quality and delivery time of the software. Examples: design, implementation, interface, and maintenance problems

Business risk: threatens the viability of the software

Risk identification

Systematically identify the threats to the project plan (estimates, schedule, resource allocation, etc.). Once known and predictable risks have been identified, the project manager should avoid them where possible and control them when necessary

One way to identify risks is to create a "risk item checklist"

Risk item checklist format: list the relevant characteristics of each risk type, and finally give a set of risk components and drivers with their probabilities of occurrence; the risk components are performance, cost, support, and schedule

Risk prediction

Also known as risk estimation; each risk is evaluated in two respects: (1) the probability that it occurs and (2) the consequences if it occurs

Risk Prediction Activities:

Establish a scale or standard to reflect the possibility of risk occurrence

Describe the consequences of the risk

Estimate the impact of risks on projects and products

Label the accuracy of risk predictions to avoid misinterpretation

Risk Prediction Techniques: Building a Risk Table

Three factors affect the consequences of risk: the nature, scope, and timing of the risk

Overall risk exposure = probability of risk occurrence * cost brought to the project by risk occurrence
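The risk exposure formula can be applied to a risk table to rank risks. The sketch below is illustrative only; the risk names, probabilities, and costs are invented for the example.

```python
def risk_exposure(probability, cost):
    """Risk exposure = probability of occurrence * cost to the project if it occurs."""
    return probability * cost


# Hypothetical risk table: (description, probability, cost if the risk occurs).
risk_table = [
    ("Key staff leave", 0.3, 50000),
    ("Requirements change late", 0.6, 20000),
]

# Rank risks from highest to lowest exposure.
ranked = sorted(risk_table, key=lambda risk: risk_exposure(risk[1], risk[2]), reverse=True)
for description, probability, cost in ranked:
    print(description, risk_exposure(probability, cost))
```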

Risk assessment: A useful technique for risk assessment is to define risk reference levels

Risk control

The purpose of risk control: to assist the project team to establish strategies to deal with risks

Risk avoidance: the best way to deal with risks is to actively avoid risks

Risk Monitoring: Project managers should monitor certain factors that can provide an indication of whether risks are becoming lower or higher

RMMM plan (risk mitigation, monitoring, and management): documents all risk analysis work and is used by the project manager as part of the overall project plan

Risk mitigation is a problem-avoidance activity, and risk monitoring is a project-tracking activity

Another task of risk monitoring is to trace the "origin" of problems, that is, which risks caused which problems during the project

Software quality

ISO/IEC 9126 software quality model: consists of three layers: quality characteristics -> quality sub-characteristics -> metrics

Functionality: those functions that satisfy stated or implied needs

Suitability: whether the software provides an appropriate set of functions for the specified tasks

Accuracy: being able to get correct results

Interoperability: the ability to interact with other systems

Compliance: Comply with relevant standards and regulations

Security: prevents unauthorized access, whether deliberate or accidental

Reliability: the ability of software to maintain a level of performance over time

Maturity: the frequency with which software faults cause failures

Fault Tolerance: Maintaining a specified level of performance in the event of software errors or violations of specified interfaces

Recoverability: the ability to recover after a failure

Usability: how easy the software is to use, and how much effort it takes to learn and use it

Understandability

Learnability

Operability

Efficiency: the relationship between the software's level of performance and the resources it uses

Time behaviour: response and processing time

Resource utilization: the amount of resources used

Maintainability

Analyzability: the effort needed to diagnose defects

Changeability: the effort needed to modify the software or correct defects

Stability: the ability to avoid unexpected effects of modifications

Testability

Portability

Adaptability: the time and cost needed to move the software to different environments

Ease of installation: the cost of installing software in a specified environment

Consistency: The software capabilities are consistent after installation in different environments

Replaceability

Software review (rarely tested in the exam)

Design quality: the design specifications meet the user's requirements

Program Quality: The program is correctly executed according to the conditions stipulated in the specification

Design quality review content:

Evaluate whether the software specifications meet user requirements

Review reliability, that is, whether input exceptions can be avoided

Review the implementation of confidentiality measures

Review performance implementation

Review software for modifiability, extensibility, interchangeability, and portability

Review software testability

Review software reusability

Program quality review content: Review from the perspective of developers, directly related to development technology, focusing on the structure of the software itself

Functional structure

versatility of function

Module Hierarchy

Module structure: the correspondence between control flow structure, data flow structure, module structure and functional structure

Structure of the process

Interface with the operating environment: interface with hardware, interface with users

The central activity in completing a quality assessment is the technical review, which aims to uncover quality issues

Software fault tolerance technology

Minimize the impact of unavoidable errors

Definition of Fault Tolerant Software

Software that has the ability to shield itself from errors

Software that recovers from an error state to a normal state to some extent

software that fails and still does what it is supposed to do

Software that is fault-tolerant to some degree

The general method of fault tolerance: the main means of achieving fault tolerance is redundancy, which refers to resources beyond those strictly needed to realize the system's functions

Four types of redundancy technology

Structural redundancy: static redundancy, dynamic redundancy, hybrid redundancy

Information redundancy: a part of information added to detect or correct errors in information operation or transmission

Temporal Redundancy: Repeated execution of a program or instruction to eliminate the effects of transient errors

Redundant additional technology:

Additional technologies for shielding hardware errors: 1. Redundant storage of key programs and data. 2. Detection, voting, switching, reconstruction, error correction, recalculation

Additional technologies for shielding software errors: 1. Storage and recall of redundant backup programs. 2. Implement error checking and error recovery. 3. Firmware required to implement fault-tolerant software

Software tools

software development tools

Requirements Analysis Tool

Design Tools

Coding and Debugging Tools

test tools

Software Maintenance Tool

version control tool

Document Analysis Tool

Development Repository Tool

reverse engineering tools

reengineering tool

Other

High-quality documentation characteristics: pertinence, precision, clarity, completeness, flexibility, traceability

Fundamentals of Software Engineering: Methods, Tools, and Processes

Less complex software in the field of data processing is suitable for a structured development approach

Software debugging method:

Heuristics: Guess where the problem is, and get error clues through the output statement

Backtracking method: starting from the location where the problem was found, trace the code back along the control flow of the program

Binary search method: find the problem by narrowing the scope of the error

Induction method: collect correct and incorrect data, analyze the relationship between them, and propose hypothetical causes of errors

Deductive method: list all possible causes of the error, eliminate those that the evidence rules out, and test the remaining ones

Chapter 13 Data Structures and Algorithms

Complexity

Big O notation: the number of times the basic operations of the algorithm are executed (their frequency) is used as the measure of the algorithm's running time. Generally, it is only necessary to estimate the order of magnitude

O(1) < O(log2 n) < O(n) < O(nlog2 n) < O(n^2) < O(n^3) < O(n!) < O(n^n)

Constant order < logarithmic order < linear order < linear logarithmic order < square order < cubic order < factorial order < n-th power order

Complexity calculation rules: for a sum of terms, keep only the highest-order term; for a product of terms, keep all the factors; for a mixture of sums and products, apply the two rules in the normal order of operations; replace every constant coefficient with 1

Time complexity, related to the loop

Space complexity, see if there is a new space opened up, such as an array

Asymptotic notation

O(g(n)): the asymptotic upper bound. 10n^2+4n+2 = O(n^2) holds, because the function in the parentheses of an upper bound must grow at least as fast as the expression on the left side of the equation

Ω(g(n)): the asymptotic lower bound. 10n^2+4n+2 = Ω(n^3) does not hold, because the function in the parentheses of a lower bound must grow no faster than the expression on the left side of the equation

Θ(g(n)): the asymptotically tight bound. 10n^2+4n+2 = Θ(n^3) does not hold, because the function in the parentheses of a tight bound must grow at the same rate as the expression on the left side of the equation

Time and space complexity of recursion

Time complexity of recursion = number of recursions * time complexity of each recursion

Recursive space complexity: If there is a variable declaration assignment in the recursion, it is equivalent to an array whose length is the number of recursions

Master method for recurrences:

If the question gives a recurrence of the form T(n) = aT(n/b) + f(n), the following method can be tried

For example, find the complexity of T(n) = 2T(n/2) + nlgn

Matching against the form gives a=2; b=2; f(n) = nlgn;

If f(n) contains a lg factor, use the rule f(n) = Θ(n^(logb a) * lg^k n). Substituting the values gives nlgn = Θ(n^(log2 2) * lg^k n), so k = 1; then T(n) = Θ(n^(logb a) * lg^(k+1) n), so the complexity of T(n) is n * lg^2 n

If f(n) contains no lg factor and grows polynomially more slowly than n^(logb a), then T(n) = Θ(n^(logb a)). If f(n) = Θ(n^(logb a)) exactly, this is the k = 0 case of the rule above and T(n) = Θ(n^(logb a) * lg n). If f(n) grows polynomially faster than n^(logb a), T(n) is dominated by f(n) instead
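As a rough numeric check of the example T(n) = 2T(n/2) + nlgn, the sketch below evaluates the recurrence directly for powers of two and compares the result with n * lg^2 n. The base case T(1) = 1 and the function names are assumptions made only for this illustration.

```python
def lg(n):
    """Base-2 logarithm of n, valid when n is a power of two."""
    return n.bit_length() - 1


def t(n):
    """Evaluate T(n) = 2*T(n/2) + n*lg(n), assuming T(1) = 1."""
    if n == 1:
        return 1
    return 2 * t(n // 2) + n * lg(n)


def ratio(k):
    """T(n) divided by n * lg(n)^2 for n = 2^k."""
    n = 2 ** k
    return t(n) / (n * lg(n) ** 2)


if __name__ == "__main__":
    for k in (5, 10, 20):
        print(k, ratio(k))
```

The ratio stays bounded and settles toward a constant as k grows, which is what Θ(n * lg^2 n) predicts.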

Linear table

Linear relationship: a data relationship with a single predecessor and successor, the elements are arranged one after the other

Linear table: the simplest and most common linear data structure, usually expressed as (a1,a2,…an)

Linear table features: there is exactly one first element and exactly one last element; the first element has only a successor, the last element has only a predecessor, and every other element has both a predecessor and a successor

Linear table sequential storage: a group of continuous storage units stores the data of the linear table in order, that is to say, logically adjacent elements are also physically adjacent

Advantages: Random access to elements in the table, high query efficiency

Disadvantages: insertion and deletion need to move elements, so they are inefficient. For a table of length n, inserting a new value moves n/2 elements on average; deleting a value moves (n-1)/2 elements on average

Time complexity of inserting elements in the sequence table: O(1) for insertion at the end of the sequence table; O(n) for insertion at the front of the sequence table; average complexity O(n)

Time complexity of deleting elements in the sequence table: O(1) for deleting the last element of the sequence table; O(n) for deleting the first element of the sequence table; average complexity O(n)

Time complexity of finding elements: an element is accessed directly by its array subscript, so it is O(1)
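The average move counts quoted above can be checked with a short sketch. It assumes that every insertion position (n+1 of them) and every deletion position (n of them) is equally likely; the function names are illustrative.

```python
def insert_moves(n, pos):
    """Elements shifted when inserting at index pos (0..n) of a length-n table."""
    return n - pos


def delete_moves(n, pos):
    """Elements shifted when deleting index pos (0..n-1) of a length-n table."""
    return n - pos - 1


def average_insert_moves(n):
    """Average over the n+1 possible insertion positions."""
    return sum(insert_moves(n, p) for p in range(n + 1)) / (n + 1)


def average_delete_moves(n):
    """Average over the n possible deletion positions."""
    return sum(delete_moves(n, p) for p in range(n)) / n


if __name__ == "__main__":
    print(average_insert_moves(10), average_delete_moves(10))
```

For n = 10 the two averages are n/2 = 5.0 and (n-1)/2 = 4.5.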

Linear table chain storage

Link nodes through pointers to store data elements, which are divided into data fields + pointer fields; the node addresses of data elements are not continuous, and the node space is only applied when needed

If the node has only one pointer field, it is called a linear linked list or a singly linked list

Head node: stores no data element (it may store the length of the linked list); it only stores the address of the first node of the linked list

Head pointer, tail pointer: a tail pointer gives direct access to the last node, so operations at the tail no longer require a traversal from the head, and their time complexity changes

Time complexity of inserting elements in chain storage: O(1) for insertion at the first position; O(n) for insertion at the last position; average complexity O(n)

Time complexity of deleting elements in chain storage: O(1) for deleting the first node; O(n) for deleting the last node; average complexity O(n)

Time complexity of finding elements in chain storage: O(1) for finding the first node; O(n) for finding the last node; average complexity O(n)

Circular singly linked list: based on the singly linked list, the pointer of the tail node points to the head node, and the time complexity is consistent with that of the singly linked list

Double-linked list: each node pointer not only points to the subsequent node, but also points to the predecessor node, that is, a node knows the address of the previous node and the address of the next node

stack

Definition of stack: A linear data structure that can only store and retrieve data by accessing one end of it

The stack is modified according to the principle of first in, last out (equivalently, last in, first out). The end where insertion and deletion take place is called the top of the stack, the other end is called the bottom of the stack, and a stack without elements is called an empty stack

Understanding: the stack can be imagined as a cup (first in, last out), similar to the recursive execution process

Chained storage of the stack: a stack that uses a linked list as its storage structure, also known as a linked stack. No separate head pointer needs to be set, because the head pointer of the linked list is the top pointer of the stack
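A minimal sketch of a linked stack, in which the head pointer of the list serves as the top-of-stack pointer. The class and method names are illustrative assumptions, not taken from the outline.

```python
class Node:
    """One node of the list: a data field plus a pointer field."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedStack:
    def __init__(self):
        self.top = None  # head pointer of the list doubles as the stack top

    def is_empty(self):
        return self.top is None

    def push(self, value):
        # The new node becomes the head, so push is O(1).
        self.top = Node(value, self.top)

    def pop(self):
        if self.top is None:
            raise IndexError("pop from empty stack")
        value = self.top.value
        self.top = self.top.next  # unlink the old head, also O(1)
        return value
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1, which is the last in, first out order.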

queue

Definition of queue: a first-in-first-out linear table that only allows insertion of values at one end of the table and deletion of elements at the other end

Sequential queue: a queue that uses sequential storage; it needs a queue head pointer and a queue tail pointer

Circular queue: handles the overflow and out-of-bounds insertion that occur in a sequential queue. Only the queue head and queue tail pointers need to change, which avoids the element moves that insertion into a linear table would cause

Queue chain storage

Double-ended queue: entry and exit can be performed at both ends

Two stacks can simulate a queue; two queues can also simulate a stack, although each pop must then move the remaining elements from one queue to the other
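How two stacks can simulate a queue can be sketched as follows. Python lists stand in for the two stacks (only `append` and `pop` at the end are used); the class name and method names are illustrative assumptions.

```python
class TwoStackQueue:
    def __init__(self):
        self._inbox = []   # stack that receives newly enqueued values
        self._outbox = []  # stack that holds values in dequeue order

    def enqueue(self, value):
        self._inbox.append(value)

    def dequeue(self):
        if not self._outbox:
            # Pouring one stack into the other reverses the order,
            # so the oldest value ends up on top of the outbox.
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("dequeue from empty queue")
        return self._outbox.pop()
```

Each value is moved between the stacks at most once, so a sequence of operations costs O(1) per operation on average.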

String

A string is a special linear table whose data elements are characters, which is a limited sequence of characters, for example: 'abc'

Empty string: has length zero and contains no characters

Substring: a sequence of consecutive characters of any length within the string. 'ab' is a substring of 'abc', but 'ac' is not

String comparison: when comparing two strings, the ASCII code value of the character is used as the basis

String pattern matching: can be understood as the desired effect in JS a.indexOf(b)

Complexity: main string length n, substring length m

Best case (the match succeeds at the first position): complexity O(m) | O(1)

Worst case (the match succeeds only at the last m characters): (n-m+1) * m comparisons, complexity O(n*m)

Average complexity: O(n+m)

String pattern matching KMP algorithm

String prefix: a substring containing the first character but not the last character

String suffix: a substring containing the last character but not the first character

KMP: It can improve the pattern matching efficiency of strings, and the time complexity is: O(n+m)

Calculating the KMP next values: the next value of the i-th character = (the length of the longest string prefix that equals a string suffix, within the string before the i-th character) + 1; where next[1] = 0
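The rule for next can be sketched by applying the definition literally. This is a brute-force illustration for checking values by hand, not the efficient construction used inside KMP; the function name is an assumption, and positions are numbered from 1 as in the rule.

```python
def kmp_next(pattern):
    """Return [next[1], ..., next[m]] computed straight from the definition."""
    result = []
    for i in range(1, len(pattern) + 1):
        if i == 1:
            result.append(0)  # next[1] = 0 by convention
            continue
        before = pattern[: i - 1]  # the characters before the i-th one
        longest = 0
        # Only proper prefixes/suffixes count: they may not be the whole string.
        for k in range(1, len(before)):
            if before[:k] == before[-k:]:
                longest = k
        result.append(longest + 1)
    return result


if __name__ == "__main__":
    print(kmp_next("abaabcac"))
```

For the pattern "abaabcac" this gives 0 1 1 2 2 3 1 2.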

One-dimensional array

LOC: indicates the first address of the first element; L: indicates the size of each element

Calculate the address of element i in the array (subscripts starting from 0): ai = LOC + i*L

Two-dimensional array

With row-first storage, the second row is stored consecutively after the first row (with column-first storage, the second column is stored consecutively after the first column)

LOC: indicates the first address of the first element; L: indicates the size of each element; N: the number of rows; M: the number of columns;

Calculate the address of element (i, j) of the two-dimensional array = LOC + (number of elements stored before it) * L

Row-first storage: LOC + (i*M + j) * L

Column-first storage: LOC + (j*N + i) * L

When N == M and i == j, the address by row or by column is the same, and the offset is also the same
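The two address formulas can be tried out with a small sketch. Subscripts start from 0, and the base address 1000, the element size 2 and the function names are assumptions chosen for illustration.

```python
def row_major_address(loc, i, j, n_rows, n_cols, size):
    """Address of element (i, j) when rows are stored one after another."""
    return loc + (i * n_cols + j) * size


def column_major_address(loc, i, j, n_rows, n_cols, size):
    """Address of element (i, j) when columns are stored one after another."""
    return loc + (j * n_rows + i) * size


if __name__ == "__main__":
    # A 3 x 4 array of 2-byte elements starting at address 1000.
    print(row_major_address(1000, 1, 2, 3, 4, 2))
    print(column_major_address(1000, 1, 2, 3, 4, 2))
```

For element (1, 2) the row-first address is 1012 and the column-first address is 1014; on the diagonal of a square array the two agree.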

Symmetric matrix

Any element in the matrix has the characteristics of Ai,j = Aj,i

According to the symmetry of the main diagonal, it is divided into an upper triangular area and a lower triangular area

When storing, you only need to store the lower triangle + the main diagonal, and generally use a one-dimensional array to store

Stored by row: when i >= j, the position of Ai,j is (i+1)i/2 + j + 1; when i < j, because the matrix is symmetric about the main diagonal, calculate the position of Aj,i instead
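A small sketch of this compressed storage, assuming row and column subscripts start from 0 and positions in the one-dimensional array start from 1, which is the convention under which the formula (i+1)i/2 + j + 1 works out. The function names are illustrative.

```python
def packed_position(i, j):
    """1-based position of A[i][j] in the packed one-dimensional array."""
    if i < j:
        i, j = j, i  # A[i][j] equals A[j][i], so look up the mirrored element
    return (i + 1) * i // 2 + j + 1


def pack_lower_triangle(matrix):
    """Store the lower triangle and main diagonal row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


if __name__ == "__main__":
    a = [[1, 2, 3],
         [2, 4, 5],
         [3, 5, 6]]
    packed = pack_lower_triangle(a)
    print(packed, packed[packed_position(0, 2) - 1])
```

A 3 x 3 symmetric matrix is reduced from 9 stored values to 6, and every element can still be located.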

Tridiagonal matrix

Only the area immediately on both sides of the main diagonal has a value, and the other areas are 0

When storing, only the value of the middle area is stored, and the position of 0 is not stored, and it is stored in a one-dimensional array

Store by row: Ai,j = 2i+j+1

sparse matrix

The matrix is very large, but it stores very few non-zero elements

Compressed storage method: use triple sequence table to store [i, j, data] [row, column, value]

Another Compression Method: Cross Linked List

tree

A very important non-linear structure, an element can have zero or more successor elements

A tree is a finite set of n nodes. When n=0, it is called an empty tree; a non-empty tree has exactly one root node

Sibling nodes: nodes with the same parent

Degree of node: the number of subtrees of a node is called the degree of the node

Leaf nodes: terminal nodes, nodes with no children, nodes with degree 0

Internal nodes: branch nodes, nodes whose degree is not 0

Hierarchy of nodes: the root is the first level, the child is the second level, and so on

Tree height: the maximum number of layers of a tree is called the height of the tree or the depth of the tree

The degree of the tree: the maximum value of the degrees of all nodes in the tree

Properties of the tree

Total number of nodes in the tree = sum of degrees of all nodes + 1

In a tree with degree m, there are at most m^(i-1) nodes on the i-th layer; the maximum is reached when every node above that layer has m children

A tree with a height of h and a degree of m has at most (m^h - 1)/(m-1) nodes; the maximum is reached when every node in the first h-1 layers has m children

The minimum height of a tree with n nodes and degree m is logm (n(m-1) + 1), rounded up. To achieve the minimum height, every node must have m children wherever possible

Binary tree

A finite set of n nodes. When n=0, it is an empty tree; otherwise it is composed of a root node and two disjoint binary trees, called the left subtree and the right subtree

Difference Between Tree and Binary Tree

The subtree in the binary tree is divided into the left subtree and the right subtree, even if there is only one subtree, the left and right must be distinguished

The maximum degree of a node in a binary tree is 2

Properties of binary trees

There are at most 2^(i-1) nodes in the i-th layer of the binary tree, which is the tree formula above with degree 2 substituted in

A binary tree with a height of h has at most 2^h - 1 nodes; in the maximal case, the number of nodes in each layer is the number of nodes in all previous layers + 1

For any binary tree, the number of leaf nodes with a degree of 0 = the number of nodes with a degree of 2 + 1; it is inferred from "the total number of nodes in the tree = the sum of the degrees of all nodes + 1"

The height of a complete binary tree with n nodes is (log2 n + 1) rounded down or (log2 (n+1)) rounded up

full binary tree

A binary tree with a height of k, if there are 2^k -1 nodes, it is a full binary tree, which can be numbered from top to bottom and from left to right

complete binary tree

Except for the last layer, all other layers are "full", and the nodes of the last layer are also placed in order from left to right; in this case, each node in the complete binary tree can correspond to the full binary tree of the same depth

Catalan number

How many different binary trees with n nodes there are: C(2n, n)/(n+1); where C(n, m) = n!/(m! * (n-m)!)
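The count can be computed directly from the formula. This sketch uses `math.comb` from the Python standard library for the binomial coefficient; the function name is an illustrative assumption.

```python
from math import comb


def count_binary_trees(n):
    """Number of distinct binary trees with n nodes: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    print([count_binary_trees(n) for n in range(1, 6)])
```

For n = 1 to 5 the counts are 1, 2, 5, 14 and 42.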

Sequential storage of binary trees

Use a set of consecutive storage units to store the nodes in the binary tree

Relationship between tree node and number i

Find the parent node: if i=1, it is the root node, and the root node has no parent node; if i>1, the parent node of the node is the rounded down integer of i/2

Find the left child node: 2i<=n, then the number of the left child node of the node is 2i, otherwise there is no left child node

Find the right child node: 2i+1<=n, then the number of the right child node of the node is 2i+1, otherwise there is no right child node

Sequential storage is more suitable for a complete binary tree, but for an ordinary binary tree, in order to maintain the relationship, there will be many "virtual nodes"

Single-branched tree, except for leaf nodes, the degree of other nodes is 1

Binary tree chain storage

Binary linked list storage, each binary linked list node stores [the data element of the current node, the left child node pointer, the right child node pointer], if there is no corresponding child node, it will store NULL

The number of effective pointer fields in the binary linked list storage: that is, the number of effective associations in the tree structure, each child node has only one parent node (except the root node), so the effective number = total number of nodes - 1

Triple linked list: adds a pointer field pointing to the parent node to each node of the binary linked list

Binary tree traversal

Preorder traversal: traverse in the order root, left, right

Inorder traversal: traverse in the order left, root, right

Post-order traversal: traverse in the order left, right, root

Hierarchical traversal: starting from the root node, each layer is accessed from left to right

Restore the binary tree

A single traversal result cannot restore the tree

The first position of pre-order traversal and hierarchical traversal is the root node, and the last position of post-order traversal is also the root node, so the combination of in-order traversal and any other traversal can restore the binary tree
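Restoring a tree from its preorder and inorder sequences can be sketched as below. The sketch assumes all node values are distinct and represents a tree as nested tuples `(value, left, right)`; both the representation and the function names are assumptions made for illustration.

```python
def build_tree(preorder, inorder):
    """Rebuild the binary tree; returns None for an empty tree."""
    if not preorder:
        return None
    root = preorder[0]            # the first preorder element is the root
    split = inorder.index(root)   # everything left of it is the left subtree
    left = build_tree(preorder[1:split + 1], inorder[:split])
    right = build_tree(preorder[split + 1:], inorder[split + 1:])
    return (root, left, right)


def postorder(tree):
    """Left, right, root traversal of a nested-tuple tree."""
    if tree is None:
        return []
    value, left, right = tree
    return postorder(left) + postorder(right) + [value]


if __name__ == "__main__":
    tree = build_tree("ABDECF", "DBEAFC")
    print("".join(postorder(tree)))
```

For preorder ABDECF and inorder DBEAFC the restored tree has the post-order sequence DEBFCA.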

Balanced binary tree

The difference between the left and right subtree heights of any node in a binary tree is no more than 1, and a complete binary tree must be a balanced binary tree

Binary sort tree

Also called a binary search tree

Root node keyword: is the value of the root node

The key of the root node is greater than the keys of all nodes in the left subtree

The key of the root node is less than the keys of all nodes in the right subtree

The left and right subtrees are also binary sorted trees, recursively

The inorder traversal of a binary sort tree is an ordered sequence

Calculation problem: a keyword sequence is given. The first element of the keyword sequence is the root node. Each following element is compared with the root: if it is larger than the root it goes to the right, and if it is smaller it goes to the left. If that position is empty, the element is inserted there directly; if it is not empty, the element is compared with the node there and moves down to the next layer. The remaining elements are judged and inserted into the binary sort tree in turn in the same way

The efficiency of binary sorting tree search is related to the number of search layers. The higher the number of layers, the worse the efficiency
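A minimal sketch of building a binary sort tree by inserting keys one at a time. Because the keys are inserted in the given order, the first key inserted becomes the root. The class name, function names and the sample key sequence are illustrative assumptions; equal keys are simply ignored here.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key below root and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)  # empty position: insert directly
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # equal keys are ignored in this sketch


def build_bst(keys):
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root


def inorder(root):
    """Left, root, right traversal; yields the keys in ascending order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


if __name__ == "__main__":
    tree = build_bst([46, 25, 54, 13, 29, 91])
    print(tree.key, inorder(tree))
```

The inorder traversal of the result is the sorted key sequence, as stated above.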

Optimal Binary Tree

Also known as the Huffman tree, it is a tree with the shortest weighted path length

Path: A path from one node of the tree to another

path length: number of branches on the path (several lines)

The path length of the tree: the sum of the path lengths from the root node to each leaf node; when each path length is multiplied by the weight of its leaf, the sum is the weighted path length of the tree

Optimal binary tree construction

Question: Construct a set of weights (for example: {1,3,3,4} ) into a binary tree

Construction method:

Scanning from front to back, find the two smallest weights

The smaller of the two is used as the left subtree, and the larger one is used as the right subtree to construct a new binary tree. The weight of the root of this binary tree is equal to the addition of the two

Add the calculated root to the end of the weight set

Continue the above steps until there is only one left in the set

The ones with larger weights are closer to the root node, and the ones with smaller weights are farther away from the root node

The optimal binary tree only has nodes with degree 0 and nodes with degree 2

Total number of nodes = (number of weights * 2) - 1
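The construction can be sketched with a priority queue. Note that this sketch swaps in Python's `heapq` min-heap to pick the two smallest weights instead of scanning the set from front to back, so the shape of the tree may differ from a hand construction when weights tie, although the weighted path length is the same. The function names are illustrative assumptions.

```python
import heapq


def huffman_wpl(weights):
    """Weighted path length of an optimal binary tree for the given weights."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # smallest remaining weight
        b = heapq.heappop(heap)  # second smallest
        # The sum of all merged root weights equals the weighted path length.
        total += a + b
        heapq.heappush(heap, a + b)
    return total


def huffman_node_count(weights):
    """Total nodes = (number of weights * 2) - 1."""
    return 2 * len(weights) - 1


if __name__ == "__main__":
    print(huffman_wpl([1, 3, 3, 4]), huffman_node_count([1, 3, 3, 4]))
```

For the weights {1,3,3,4} the weighted path length is 22 and the tree has 7 nodes.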

Huffman coding

Equal-length encoding: Compile a binary code of the same length for each character, for example, 26 characters in English, which requires 2^5, that is, a 5-digit binary string representation

Huffman coding is not equal-length coding

With equal-length encoding, the receiver divides the message into groups of 5 digits and decodes each group by looking up the corresponding character

Question: Generally, a string of characters will be given and the weight of the characters will be explained

We draw the Huffman tree according to the weights, and replace the nodes with the corresponding characters

Mark the connection between each pair of nodes: the connection from a node to its left child is 0, and the connection from a node to its right child is 1

The encoding of a character is composed of 0 and 1 on the path from the root node to the current character node

Huffman encoding compression ratio: that is, the compression of each character from equal-length encoding to Huffman encoding

Huffman coding is based on a greedy strategy

Threaded binary tree

Ordinary binary tree, using the binary linked list as the storage structure, there will be a null pointer field in the linked list, use this null pointer field to store the predecessor and successor information of the node

Graph

In the graph, there may be a relationship between any two nodes, and a node may have multiple predecessors or multiple successors

Graph, denoted G(V,E) V represents a non-empty finite set of vertices; E is a finite set of edges in the graph

Directed graph: each edge has a direction, and the vertex relationship is written <v1,v2>, with v1 as the starting point and v2 as the end point

Undirected graph: each edge is undirected, then the vertex relationship is (v1,v2)

Complete Graph: Every vertex has an edge with every other vertex, then it is called a complete graph

Assuming that the undirected complete graph has n vertices, then the complete graph has a total of n(n-1)/2 edges

The total number of edges in a directed complete graph is n(n-1), because there are two edges between every two vertices

Degree of an undirected graph vertex: the number of edges associated with the vertex

Out-degree and in-degree of a directed graph: Out-degree – the number of edges pointing out from the vertex; in-degree – the number of edges pointing to the vertex; total degree = out-degree + in-degree

Total degree of graph = number of edges * 2

Path: the sequence of edges that leads from one vertex to another vertex; the path length is the number of edges or arcs on the path

A path whose first vertex is the same as its last vertex is called a circuit or cycle

Simple path: a path on which no vertex is repeated, except that the starting point and the ending point may be the same

connected graph

Connected graph: In an undirected graph, if there is at least one path between any two vertices, it is called a connected graph

An undirected graph with n vertices needs at least n-1 edges to be connected, and has at most n(n-1)/2 edges

Strongly connected graph: in a directed graph, if for any two vertices there is a path in each direction between them, it is called a strongly connected graph

A directed graph with n vertices needs at least n arcs to be strongly connected, and has at most n(n-1) arcs

Graph storage structure

Adjacency matrix notation: use a matrix to represent the relationship between the vertices in the graph. For a graph with n vertices, its adjacency matrix is of order n. The value in the matrix is 1 where there is an edge, and 0 where there is no edge.

The adjacency matrix of an undirected graph is symmetric, but not necessarily for a directed graph

Calculating the degree of a vertex from the adjacency matrix of an undirected graph: the degree of vertex vi is the number of non-zero elements in the i-th row

Calculating the degree of a vertex from the adjacency matrix of a directed graph: the out-degree of vertex vi is the number of non-zero elements in the i-th row; the in-degree is the number of non-zero elements in the i-th column
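The row and column counting can be sketched as follows, with the matrix held as a list of lists and vertices numbered from 0. The function names and the sample graph are illustrative assumptions.

```python
def out_degree(matrix, i):
    """Number of non-zero elements in row i."""
    return sum(1 for value in matrix[i] if value != 0)


def in_degree(matrix, i):
    """Number of non-zero elements in column i."""
    return sum(1 for row in matrix if row[i] != 0)


if __name__ == "__main__":
    # Directed graph with arcs 0->1, 0->2 and 1->2.
    directed = [[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]]
    print(out_degree(directed, 0), in_degree(directed, 2))
```

For an undirected graph the matrix is symmetric, so the row count alone gives the degree of the vertex.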

Adjacency linked list representation: build a singly linked list for each node in the graph, the specifics depend on the graph, so I won’t explain it in detail here

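In the adjacency linked list of a directed graph, the number of edges equals the number of list nodes pointed to; in the adjacency linked list of an undirected graph, n list nodes pointed to correspond to n/2 edges

Dense graph and sparse graph: a graph with many edges is a dense graph, and a graph with few edges is a sparse graph

The adjacency matrix representation suits dense graphs, and the adjacency linked list suits sparse graphs

Network: a graph whose edges or arcs carry weights is called a network; in the adjacency matrix of a network, a position with an edge holds the weight, and a position without an edge holds ∞ (infinity)

Graph traversal

Depth-first search: from a vertex A, follow an outgoing edge to another vertex B, then from B follow an outgoing edge to a vertex C, and so on, traversing from the start of a path to its end; then backtrack and traverse along a different path

Time complexity of depth-first search: with n the number of vertices and e the number of edges, the complexity is O(n^2) with adjacency matrix storage and O(n+e) with an adjacency linked list; it is implemented with a stack

Breadth-first search: first visit all the vertices reached by the outgoing edges of a vertex A, then all the vertices reached by the outgoing edges of those vertices, and so on, which amounts to traversing layer by layer

Time complexity of breadth-first search: with n the number of vertices and e the number of edges, the complexity is O(n^2) with adjacency matrix storage and O(n+e) with an adjacency linked list; it is implemented with a queue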
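The two traversals can be sketched side by side, one driven by a stack and the other by a queue. The graph is assumed to be a dictionary mapping each vertex to the list of vertices its outgoing edges lead to; this representation, the function names and the sample graph are illustrative assumptions.

```python
from collections import deque


def dfs(graph, start):
    """Depth-first order, using an explicit stack."""
    visited, order, stack = set(), [], [start]
    while stack:
        vertex = stack.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        order.append(vertex)
        # Push in reverse so the first listed neighbour is explored first.
        for neighbour in reversed(graph[vertex]):
            if neighbour not in visited:
                stack.append(neighbour)
    return order


def bfs(graph, start):
    """Breadth-first (layer by layer) order, using a queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbour in graph[vertex]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


if __name__ == "__main__":
    g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
    print(dfs(g, "A"), bfs(g, "A"))
```

On the sample graph the depth-first order is A, B, D, C and the breadth-first order is A, B, C, D.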

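Topological sort

AOV network: a kind of directed acyclic graph

In an AOV network, the tail of an arc is the predecessor and the vertex the arc points to is the successor; the predecessor constrains the successor

Topological sort: a linear sequence of all the vertices in the AOV network such that, for any path in the network from vi to vj, vi comes before vj in the sequence

If the AOV network is the plan of a project, a topological order of the network is a feasible scheme for completing the project successfully

Topological sort procedure:

Select a vertex with in-degree 0 in the AOV network and output it

Delete that vertex and all arcs associated with it from the network

Repeat the two steps above until no vertex with in-degree 0 remains

For example, given the topological sequence 614325, for vertices 6 and 4: an arc 6->4 may exist; an arc 4->6 cannot exist; a path from 6 to 4 may exist; a path from 4 to 6 definitely does not exist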
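The three steps above can be sketched as follows. Instead of physically deleting arcs, the sketch decrements the in-degree of each successor, which has the same effect. The function name and the sample arcs are hypothetical; the arcs were chosen only so that the result matches the sequence 614325 used in the example.

```python
def topological_sort(vertices, arcs):
    """Return one topological order; arcs are (tail, head) pairs."""
    in_degree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for tail, head in arcs:
        successors[tail].append(head)
        in_degree[head] += 1

    ready = [v for v in vertices if in_degree[v] == 0]
    order = []
    while ready:
        vertex = ready.pop(0)          # select a vertex with in-degree 0
        order.append(vertex)           # output it
        for head in successors[vertex]:
            in_degree[head] -= 1       # "delete" the arc vertex -> head
            if in_degree[head] == 0:
                ready.append(head)

    if len(order) != len(vertices):
        raise ValueError("graph has a cycle, so no topological order exists")
    return order


if __name__ == "__main__":
    arcs = [(6, 1), (6, 4), (1, 4), (4, 3), (3, 2), (2, 5)]
    print(topological_sort([1, 2, 3, 4, 5, 6], arcs))
```

If some vertices are never output, the network contains a cycle and is not an AOV network.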

Search

Lookup table: a collection of elements of the same type; the relationship between the elements of the collection is completely loose

The static lookup table only performs the following two operations

Query whether a specific element is in the lookup table

Retrieve various properties of a specific element

The dynamic lookup table, in addition to the function of the static lookup table, also performs the following operations

Insert a data into the lookup table

Delete a data from the lookup table

The keyword is the value of a data item of the data element, which can be used to mark the data element

Static lookup tables include: sequential search, binary search, block search

Dynamic lookup tables include: binary sort tree, balanced binary tree, B-tree, hash table

The basic operation of lookup: compare the key of the record with the given value

Sequential search: search from left to right, does not need to be ordered, suitable for sequential storage and chain storage, the average search length is (n+1)/2

binary search

Also known as half-interval search: compare the given value with the middle element of the lookup table. The position of the middle element is calculated from the subscripts, rounding down any fraction; after each comparison the middle element is excluded, and the next round compares within the remaining half

For example, if there are 10 values and the target is the last one, the middle positions taken in sequence are (1+10)/2 => 5; (6+10)/2 => 8; (9+10)/2 => 9; (10+10)/2 => 10;

Sequential storage is required, and the elements must be stored in order

When a binary search succeeds, the number of comparisons with the given value is at most (log2 n) rounded down, plus 1

The average search length of a binary search is approximately (log2 (n+1)) - 1
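A short sketch of binary search that also counts comparisons, so the bound above can be observed. Subscripts start from 0 here, so the middle positions are one less than in the 1-based example; the function name is an illustrative assumption.

```python
def binary_search(values, target):
    """Search a sorted list; return (index, comparisons), index -1 if absent."""
    low, high = 0, len(values) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2  # middle position, rounded down
        comparisons += 1
        if values[mid] == target:
            return mid, comparisons
        if values[mid] < target:
            low = mid + 1        # discard the middle and the lower half
        else:
            high = mid - 1       # discard the middle and the upper half
    return -1, comparisons


if __name__ == "__main__":
    print(binary_search(list(range(1, 11)), 10))
```

With 10 values, finding the last one takes 4 comparisons, which equals (log2 10) rounded down, plus 1.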

hash table

Hash table: get the storage address of the record by calculating a function (hash function) with the recorded keyword as an argument

According to the set hash function H(key) and the method of dealing with conflicts, a set of keywords is mapped to a limited set of continuous addresses

For a hash function, when two different keywords are given the same address by the hash function, it is called a conflict, and keywords with the same hash function value are called synonyms

In general, conflicts can be reduced as much as possible, but cannot be avoided. To reduce conflicts, it is necessary to map keywords to each storage unit in the storage area as evenly as possible.

For the hash table, there are two main considerations, one is how to construct a hash function, and the other is how to resolve conflicts

Hash function construction method:

When constructing a hash function, the address is generally calculated from the keyword, so that all components of the keyword contribute as much as possible

Remainder method after division: H(key) = key % m = address; Find the remainder of the keyword key as the address, where m is a prime number close to n but not greater than n, n is the length of the hash table, and the address generally starts from 0

method of conflict resolution

To resolve the conflict is to find another address to store the conflicting keyword when there is a conflict

Linear detection method: if H(key) conflicts, calculate another address with Hi = (H(key) + i) % m, where i=1,2,3,...; if the new address still conflicts, increase i and calculate again until there is no conflict

Secondary detection method: if H(key) conflicts, calculate another address with Hi = (H(key) + di) % m, where di=1^2,-1^2,2^2,-2^2,...; if the new address still conflicts, try the next di in that order until there is no conflict. Compared with the linear detection method, it probes back and forth on both sides of the H(key) address

Filling factor: a = number of records loaded in the table / length of the hash table. a represents the fullness of the hash table; the larger a is, the greater the probability of conflict
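A minimal sketch of the remainder method combined with the linear detection method. It assumes that probing wraps around at the length of the table, and the sample keys, table length 11 and modulus 11 are hypothetical values chosen for illustration; the function names are also assumptions.

```python
def build_hash_table(keys, table_length, modulus):
    """Insert keys with H(key) = key % modulus, resolving conflicts linearly."""
    table = [None] * table_length
    for key in keys:
        if None not in table:
            raise ValueError("hash table is full")
        address = key % modulus
        while table[address] is not None:
            # Conflict: step to the next address, wrapping at the table end.
            address = (address + 1) % table_length
        table[address] = key
    return table


def filling_factor(table):
    """Number of records loaded in the table / length of the table."""
    return sum(slot is not None for slot in table) / len(table)


if __name__ == "__main__":
    table = build_hash_table([47, 7, 29, 11, 16, 92, 22, 8, 3], 11, 11)
    print(table, filling_factor(table))
```

In this run the key 3 conflicts at addresses 3, 4 and 5 before settling at address 6, which shows how synonyms pile up as the filling factor grows.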

heap

A sequence {k1,k2,...kn} composed of n key codes, which satisfies the following relationship, is called a heap

Small top heap: ki <= k2i && ki <= k(2i+1); read as the hierarchical traversal of a complete binary tree, every root node is no larger than its child nodes

Large top heap: ki >= k2i && ki >= k(2i+1); read as the hierarchical traversal of a complete binary tree, every root node is no smaller than its child nodes

Adjust a sequence into a large top heap or a small top heap

First restore the sequence to a binary tree, treating it as the result of a hierarchical traversal

Working upwards from the leaf nodes, check each local root. For example, to build a small top heap, the value of each local root node must be no larger than its child nodes; exchange them if this does not hold
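The adjustment can be sketched for a small top heap as follows. The sketch uses 0-based subscripts, so the children of position i are 2i+1 and 2i+2 rather than 2i and 2i+1; the function names and the sample sequence are illustrative assumptions.

```python
def sift_down(heap, root, size):
    """Restore the small-top property for the subtree rooted at index root."""
    while True:
        smallest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < size and heap[left] < heap[smallest]:
            smallest = left
        if right < size and heap[right] < heap[smallest]:
            smallest = right
        if smallest == root:
            return
        heap[root], heap[smallest] = heap[smallest], heap[root]
        root = smallest  # continue with the child that received the old root


def build_min_heap(sequence):
    """Adjust a copy of the sequence into a small top heap."""
    heap = list(sequence)
    # Start at the last node that has a child and work upwards to the root.
    for root in range(len(heap) // 2 - 1, -1, -1):
        sift_down(heap, root, len(heap))
    return heap


def is_min_heap(heap):
    """Check that every parent is no larger than its children."""
    return all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))


if __name__ == "__main__":
    print(build_min_heap([5, 3, 8, 1, 9, 2]))
```

A large top heap is obtained the same way with the comparisons reversed.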

Sort

By sorting, the keywords satisfy the ascending or descending relationship

Stable: in the original sequence, Ri and Rj have the same keyword and Ri is before Rj. If the sorting algorithm keeps Ri before Rj, it is stable; otherwise it is unstable

Homing: whether each pass of the sort can fix an element in its final sorted position. For example, if Ri belongs at position 3 after sorting but a pass cannot guarantee that it has been placed at position 3, the algorithm is non-homing

Direct insertion sort

Start the new sequence with R1, then traverse the original sequence R and compare each Ri in turn with the keywords of the new sequence, starting from its end. If Ri is larger, it is inserted directly behind the keyword it was compared with; if it is smaller, the comparison continues forward until its position is found

Direct insertion sort, stable, non-homing, average complexity O(n^2), maximum complexity O(n^2), minimum complexity O(n), space complexity O(1)

Applicable to the case of basic order
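A minimal sketch of direct insertion sort on a copy of the input, in ascending order. The function name is an illustrative assumption.

```python
def insertion_sort(values):
    """Return a sorted copy using direct insertion."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Compare from the end of the sorted part, shifting larger keys back.
        # Using > rather than >= keeps equal keys in order, so it is stable.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


if __name__ == "__main__":
    print(insertion_sort([5, 2, 4, 6, 1, 3]))
```

When the input is already almost ordered the inner loop exits immediately, which is where the O(n) minimum complexity comes from.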

Shell sort

It is an improvement of the direct insertion sort algorithm

method:

Set a sequence of increments, eg: 5 , 3 , 1

Cut the original sequence into multiple segments in sequence according to the incremental sequence. For example, when the increment is 5, the elements at positions 0, 5, and 10 are grouped into one group, elements at positions 1, 6, and 11 are grouped, and so on

The elements between each group are directly inserted and sorted, and the sorted values ​​are swapped and inserted back into the original sequence

Follow the sequence of increments until the increment is 1

Hill sort: unstable, non-homing, average complexity O(n^1.3), maximum complexity O(n^2), minimum complexity O(n), space complexity O(1)
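The increment-based method above can be sketched in Python like this. The function name `shell_sort` and the default increments (5, 3, 1), taken from the example in these notes, are assumptions for illustration; other increment sequences work as long as the last one is 1.

```python
def shell_sort(seq, gaps=(5, 3, 1)):
    """Return a sorted copy of seq; gaps must end with 1."""
    a = list(seq)
    for gap in gaps:
        # One insertion-sort pass over every group of elements that are `gap` apart.
        for i in range(gap, len(a)):
            key = a[i]
            j = i - gap
            while j >= 0 and a[j] > key:
                a[j + gap] = a[j]
                j -= gap
            a[j + gap] = key
    return a
```

With `gap == 1` the loop is plain direct insertion sort, but by then the sequence is already close to ordered.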

Counting sort

Suitable when the data to be sorted takes only a small range of values

Count how many times each value occurs among the numbers to be sorted, then write the values into the result sequence in order, each as many times as it was counted
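A minimal Python sketch of the counting idea, assuming the keys are integers; the name `counting_sort` is illustrative. The count table has one slot per possible value between the minimum and the maximum, so the method only pays off when that range is small.

```python
def counting_sort(seq):
    """Return a sorted copy of a list of integers by counting occurrences."""
    if not seq:
        return []
    lo, hi = min(seq), max(seq)
    counts = [0] * (hi - lo + 1)
    for x in seq:
        counts[x - lo] += 1
    result = []
    for offset, count in enumerate(counts):
        # Emit each value as many times as it was counted.
        result.extend([lo + offset] * count)
    return result
```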

Simple selection sort

Starting from the first position, scan the keywords from the current position to the end, select the smallest one, and swap it into the current position; then move to the next position and repeat until the last position

Simple selection sort: unstable, homing, average complexity O(n^2), maximum complexity O(n^2), minimum complexity O(n^2), space complexity O(1)
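A minimal Python sketch of simple selection sort in ascending order; the function name `selection_sort` is chosen for illustration.

```python
def selection_sort(seq):
    """Return a sorted copy of seq using simple selection sort."""
    a = list(seq)
    n = len(a)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            # This long-distance swap is what can reorder equal keys (unstable).
            a[i], a[smallest] = a[smallest], a[i]
    return a
```

Each pass fixes position i for good, which is why the method counts as homing, and the inner scan always runs in full, which is why even the minimum complexity is O(n^2).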

Heap sort

First treat the original sequence as the level-order traversal of a complete binary tree, then adjust it into a large root heap (for ascending order) or a small root heap (for descending order). Exchange the root of the heap with the last element of the heap, shrink the heap by one element, and adjust the rest into a heap again. Repeat until the heap is used up; the whole sequence is then in order

Heap sort: unstable, homing, average complexity O(nlog2 n), maximum complexity O(nlog2 n), minimum complexity O(nlog2 n), space complexity O(1)
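A minimal Python sketch of heap sort in ascending order using a large root heap. The names `heap_sort` and `sift_down` are illustrative, and the list is indexed from 0, so the children of position i are 2i+1 and 2i+2.

```python
def heap_sort(seq):
    """Return a sorted copy of seq using a large root heap (max-heap)."""
    a = list(seq)

    def sift_down(i, n):
        # Push a[i] down until the subtree rooted at i is a max-heap of size n.
        while True:
            largest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the heap from the last non-leaf node
        sift_down(i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]      # the current maximum goes to its final position
        sift_down(0, end)                # re-adjust the remaining heap
    return a
```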

Bubble sort

Starting from the first element of the original sequence, compare each ki with k(i+1) and exchange them if ki is larger. When the pass reaches the end, the last element is guaranteed to be the largest. Repeat on the remaining unsorted part until a pass makes no exchange

Bubble sort: stable, homing, average complexity O(n^2), maximum complexity O(n^2), minimum complexity O(n), space complexity O(1)
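A minimal Python sketch of bubble sort in ascending order; the name `bubble_sort` and the `swapped` flag are illustrative choices. The flag stops the sort as soon as a pass makes no exchange, which is what gives the O(n) minimum complexity on ordered input.

```python
def bubble_sort(seq):
    """Return a sorted copy of seq using bubble sort with early exit."""
    a = list(seq)
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        # After this pass a[end] holds its final value (homing).
        if not swapped:
            break
    return a
```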

Quick Sort

Based on the idea of divide and conquer. One pass of sorting (a partition) divides the records to be sorted into two parts (the front area and the back area) so that no keyword in the front area is greater than any keyword in the back area. The two parts are then quick-sorted in the same way, recursively

Quick sort is least efficient on a basically ordered sequence (when the first record is used as the pivot): the time complexity reaches the maximum O(n^2)

Quick sort: unstable, homing, average complexity O(nlog2 n), maximum complexity O(n^2), minimum complexity O(nlog2 n), space complexity O(log2 n)
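A minimal Python sketch of quick sort, assuming the first record of each range is used as the pivot (the common textbook choice). The names `quick_sort` and `partition` are illustrative.

```python
def partition(a, low, high):
    """Place a[low] at its final position; smaller keys end up before it, larger after."""
    pivot = a[low]
    while low < high:
        while low < high and a[high] >= pivot:
            high -= 1
        a[low] = a[high]
        while low < high and a[low] <= pivot:
            low += 1
        a[high] = a[low]
    a[low] = pivot
    return low


def quick_sort(a, low=0, high=None):
    """Sort the list a in place between low and high (inclusive) and return it."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)   # the pivot is now in its final position
        quick_sort(a, low, p - 1)
        quick_sort(a, p + 1, high)
    return a
```

With this pivot choice an already ordered input gives one empty part on every partition, which is the O(n^2) case mentioned above.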

Merge sort

Based on the idea of divide and conquer. Split the sequence in two, split each half in two again, and so on recursively until every piece has length 1. Then work from the bottom up: merge neighbouring pieces pairwise into ordered pieces of length 2, merge those into ordered pieces of length 4, and so on until the outermost layer is merged

Merge sort: stable, non-homing, average complexity O(nlog2 n), maximum complexity O(nlog2 n), minimum complexity O(nlog2 n), space complexity O(n)
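A minimal Python sketch of recursive merge sort in ascending order; the function name `merge_sort` is illustrative. The auxiliary list built during merging is the source of the O(n) space complexity.

```python
def merge_sort(seq):
    """Return a sorted copy of seq using top-down merge sort."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    left = merge_sort(seq[:mid])
    right = merge_sort(seq[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left half on ties keeps equal keys in order (stable).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```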

Backtracking

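N Queens Problem: place N queens on an N * N chessboard so that no two queens are in the same row, column, or diagonal

Same column check: column of Qi == column of Qj; same diagonal check: |row of Qi - row of Qj| == |column of Qi - column of Qj|

Solving the N Queens problem in code

Non-recursive (loop, iteration)

Recursive

Depth first

This is mainly tested in the afternoon C language calculation questions, so I gave up on it...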
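For readers who do want to see it, here is a minimal recursive (depth-first) backtracking sketch in Python. It is an illustration under my own naming (`solve_n_queens`, `safe`, `place`), not the C code used in the exam. One queen is placed per row, so only the column and diagonal checks given above are needed.

```python
def solve_n_queens(n):
    """Return all solutions; each solution lists the queen's column for rows 0..n-1."""
    solutions = []
    cols = []  # cols[r] is the column of the queen already placed in row r

    def safe(row, col):
        for r, c in enumerate(cols):
            # Same column, or same diagonal: |row difference| == |column difference|.
            if c == col or abs(row - r) == abs(col - c):
                return False
        return True

    def place(row):
        if row == n:                 # boundary condition: every row has a queen
            solutions.append(list(cols))
            return
        for col in range(n):
            if safe(row, col):
                cols.append(col)     # try this position
                place(row + 1)       # go one level deeper
                cols.pop()           # backtrack and try the next column

    place(0)
    return solutions
```

For example, `len(solve_n_queens(8))` gives the well-known count of 92 solutions for the 8 queens problem.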

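Divide and conquer

Implemented with recursion

Recursion means a function calling itself, directly or indirectly. It has two basic elements: 1. a boundary condition (the exit of the recursion), 2. a recursive pattern (the body of the recursion)

Basic idea of divide and conquer:

The smaller the problem, the less time it takes to solve and the easier it is to handle

Break a large problem that is hard to solve into smaller problems of the same kind, so that they can be defeated one by one: divide and conquer

Three steps of a divide and conquer algorithm:

1. Divide: decompose the original problem into sub-problems

2. Solve: solve each sub-problem recursively

3. Merge: combine the solutions of the sub-problems into the solution of the original problem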
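The three steps can be seen in a very small Python example: finding the largest element of a list. This is only a sketch to make the pattern visible (a plain loop would do the same job), and the name `dc_max` is made up here.

```python
def dc_max(a, low=0, high=None):
    """Return the largest element of the non-empty list a by divide and conquer."""
    if not a:
        raise ValueError("dc_max needs a non-empty list")
    if high is None:
        high = len(a) - 1
    if low == high:                       # boundary condition: one element left
        return a[low]
    mid = (low + high) // 2               # divide into two halves
    left_best = dc_max(a, low, mid)       # solve each half recursively
    right_best = dc_max(a, mid + 1, high)
    return left_best if left_best >= right_best else right_best  # merge
```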

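Dynamic programming method

Similar to divide and conquer: the basic idea of both is to decompose the problem into several sub-problems, solve the sub-problems, and then obtain the solution of the original problem from the solutions of the sub-problems

Difference: for problems suited to dynamic programming, the sub-problems obtained by decomposition are usually not independent (the same sub-problems appear more than once)

In practice, dynamic programming uses a table to record the answers to all solved sub-problems. If the same sub-problem comes up later in the computation, the stored answer is looked up directly, which avoids repeated computation

Dynamic programming algorithms are usually used to solve problems with some optimality property (the global optimal solution)

Two characteristics of problems suitable for dynamic programming:

Optimal substructure: the optimal solution of a problem contains the optimal solutions of its sub-problems (note that greedy algorithms also have this feature)

Overlapping sub-problems: a recursive algorithm for the original problem would solve the same sub-problems again and again; dynamic programming solves each sub-problem only once, saves the answer in a table, and looks it up when needed

• 0-1 knapsack problem

Problem statement: there are n items; the i-th item has value vi and weight wi; the knapsack capacity is W. How should the items be packed so that the total value in the knapsack is as large as possible?

0-1 means each item is either put in whole or not put in at all

Code implementation: giving up~~~

Time complexity of the knapsack problem: O(n * W); space complexity: O(n * W)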
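Although the notes skip the implementation, a minimal table-based sketch in Python is short. The function name `knapsack_01` and the table name `best` are illustrative, and integer weights and capacity are assumed. `best[i][c]` holds the largest value reachable with the first i items and capacity c, which is the (n + 1) by (W + 1) table behind the O(n * W) figures.

```python
def knapsack_01(values, weights, capacity):
    """Return the maximum total value for the 0-1 knapsack (integer weights)."""
    n = len(values)
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        value, weight = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]               # item i is not packed
            if weight <= c:                           # item i is packed whole
                best[i][c] = max(best[i][c], best[i - 1][c - weight] + value)
    return best[n][capacity]
```

For values (60, 100, 120), weights (10, 20, 30) and capacity 50, the best choice is the second and third items, for a total value of 220.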

Matrix chain multiplication

Finding the best multiplication order is done with the dynamic programming method

Time complexity: O(n^3); Space complexity: O(n^2)

Calculation method:

Multiplying matrices A(m*n) and B(n*p) requires m * n * p multiplications

The product can be written as AB(m*p); multiplying it by C(p*k) requires a further m * p * k multiplications

Therefore, computing (A * B) * C for A(m*n), B(n*p), C(p*k) takes m * n * p + m * p * k multiplications

When multiplying several matrices, a quick rule of thumb is to first perform the multiplication that eliminates the largest shared dimension among m, n, p, k. This is only a heuristic and can miss the best order; the dynamic programming method gives the guaranteed optimal order
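A minimal Python sketch of the dynamic programming calculation. The function name `matrix_chain_cost` and the `dims` convention are assumptions for illustration: matrix number i (counting from 0) has size dims[i] by dims[i+1], so A(m*n), B(n*p), C(p*k) is written as `[m, n, p, k]`.

```python
def matrix_chain_cost(dims):
    """Return the minimum number of multiplications needed to multiply the chain."""
    n = len(dims) - 1                     # number of matrices in the chain
    if n <= 0:
        return 0
    cost = [[0] * n for _ in range(n)]    # cost[i][j]: best cost for matrices i..j
    for length in range(2, n + 1):        # solve short chains first
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)      # k is the position of the last multiplication
            )
    return cost[0][n - 1]
```

For `[10, 30, 5, 60]`, computing (A * B) * C costs 10 * 30 * 5 + 10 * 5 * 60 = 4500 multiplications, while A * (B * C) costs 27000, so the function returns 4500.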

Greedy method

The greedy method is similar to the dynamic programming method and is also used to solve optimization problems. In terms of strategy, however, the greedy method does not consider the overall optimum; at each step it simply makes the locally optimal choice

Two characteristics of problems suitable for greedy methods:

Optimal substructure: the optimal solution of a problem contains the optimal solutions of its sub-problems (note that the dynamic programming method also has this feature)

Greedy choice property: the overall optimal solution to the problem can be achieved through a series of locally optimal choices, namely greedy choice

Partial knapsack problem

Based on the 0-1 knapsack problem, except that an item may be loaded into the knapsack partially
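A minimal Python sketch of the usual greedy approach to this problem: take items in descending order of value per unit weight, and take only a fraction of the last item if it does not fit. The function name `fractional_knapsack` is illustrative, and positive weights are assumed.

```python
def fractional_knapsack(values, weights, capacity):
    """Return the maximum total value when items may be packed partially."""
    # Greedy choice: highest value per unit weight first (weights assumed > 0).
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    remaining = capacity
    for value, weight in items:
        if remaining <= 0:
            break
        take = min(weight, remaining)     # the whole item, or whatever still fits
        total += value * take / weight
        remaining -= take
    return total
```

For values (60, 100, 120), weights (10, 20, 30) and capacity 50, the first two items are taken whole and two thirds of the third, for a total value of 240, compared with 220 when items cannot be split.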

Branch and Bound

Similar to the backtracking method, it also searches for the solution of the problem on the solution space tree T of the problem.

Used to find one solution that satisfies the constraints, or the best solution among those that satisfy them

The search proceeds breadth first or least cost first

Origin blog.csdn.net/m0_63722685/article/details/130668993