Soft exam database system engineer review materials (full version)

Table of contents

Chapter 1 Computer System Knowledge

Chapter 2 Data Structure and Algorithm

Chapter 3 Operating System Knowledge

Chapter 4 Basics of Programming

Chapter 5 Network Basics

Chapter 6 Multimedia Basics

Chapter 7 Database Technology Fundamentals

Chapter 8 Relational Databases

Chapter 9 SQL Language

Chapter 10 System Development and Operation

Chapter 11 Database Design

Chapter 12 Networks and Databases

Chapter 13 Database Development Trends and New Technologies

Chapter 14 Basic Knowledge of Intellectual Property

Chapter 15 Basic Knowledge of Standardization

Chapter 1 Computer System Knowledge

1. Computer software = program + data + related documents.

2. If an instruction contains the operand itself, the addressing mode is immediate addressing; if it contains the address of the operand, it is direct addressing.

3. Typical structure of computer hardware: single-bus structure, dual-bus structure, large-scale system structure using channels.

4. The CPU is composed of the arithmetic/logic unit and the controller; the controller consists of the program counter (PC), instruction register (IR), instruction decoder (ID), status/condition register, timing generator and micro-operation signal generator.

a) PC: automatically increments to point to the next instruction to be executed; when the program branches, the branch target address is loaded into the PC.

b) IR: used to store the current instruction to be executed.

c) ID: Analyze the current instruction to determine the instruction type, the operation to be completed by the instruction and the addressing mode.

5. The process of command execution:

a) Instruction fetch: The controller first fetches an instruction from the memory according to the instruction address indicated by the program counter.

b) Instruction decoding: Send the operation code part of the instruction into the instruction decoder for analysis, and then issue control commands according to the function of the instruction.

c) Execute the operation specified by the instruction opcode.

d) Form the next instruction address.

6. The basic functions of the CPU:

a) Program control

b) Operational control

c) Time control

d) Data processing - the fundamental task of the CPU

7. The difference between computer architecture and computer organization: architecture addresses what the computer system must provide overall and functionally, while organization addresses how to implement it logically.

8. Classification of computer architecture (by instruction stream, data stream and their multiplicity):

a) Flynn classification:

i. SISD (Single Instruction Single Data): the traditional sequentially executed computer, which at any moment executes only one instruction (a single control flow) and processes only one data item (a single data flow).

ii. MIMD (Multiple Instruction Multiple Data): most parallel computers, in which multiple processing units perform different operations on different data under different control flows.

iii. SIMD (Single Instruction Multiple Data): one instruction operates on multiple data items (forming a vector) at the same time. Vector computers, long the mainstream of highly parallel supercomputers, are SIMD machines: besides a scalar processing unit, their most important hardware unit performs vector calculations.

iv. MISD (Multiple Instruction Single Data): the processing units form a linear array, each executing a different instruction stream, while the same data stream passes through every unit in the array in sequence; this structure suits only certain specific algorithms.

SIMD and MISD are more suitable for special-purpose computation. Among commercial parallel computers the MIMD model is the most common, followed by SIMD, with MISD the least used.

9. Classification of memory:

a) According to the location of the memory: internal memory (main memory) and external memory (secondary memory).

b) By storage material: magnetic memory, semiconductor memory (static and dynamic) and optical memory.

c) By working mode: read-write memory and read-only memory. Read Only Memory (ROM/PROM/EPROM/EEPROM/Flash)

d) By access mode: memory accessed by address and memory accessed by content (associative memory).

e) By addressing mode: random access memory (RAM), sequential access memory (SAM, e.g., tape) and direct access memory (DAM, e.g., disk).

10. Input/Output: Direct Program Control, Interrupt Mode, Direct Memory Access (DMA).

11. Pipeline technology

a) Throughput and setup time are the two key indicators of pipelining. Throughput is the number of results flowing out of the pipeline processor per unit time; the maximum throughput is reached only after the pipeline has been running for a while (the setup time). If each of the m sub-processes takes time t0, the setup time is m*t0 (when the sub-process times differ, t0 is taken as the longest of them). The time to execute n instructions is the time for the first instruction to complete plus the time for the remaining n-1 instructions: T = m*t0 + (n-1)*t0, as the sketch below illustrates.
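A minimal sketch of this timing formula in Python (the stage count, stage time and instruction count are assumed values):

def pipeline_time(n, m, t0):
    # Fill the pipeline once (m*t0), then one result flows out every t0
    # for each of the remaining n-1 instructions.
    return m * t0 + (n - 1) * t0

# Example: 5-stage pipeline, 2 ns per stage, 100 instructions.
total = pipeline_time(100, 5, 2)
print(total)          # 5*2 + 99*2 = 208 ns
print(100 / total)    # throughput: instructions completed per ns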

12. Virtual memory:

a) Paged: little page-table hardware, fast table lookup, little main-memory fragmentation; but a page has no logical meaning, which is not conducive to storage protection.

b) Segmented: segments correspond to logical units of the program, which is convenient for sharing and protection.

c) Segment page type: the address conversion speed is relatively slow.

13. Only about 20% of instructions are frequently used, accounting for about 80% of executions → RISC (Reduced Instruction Set Computer) simplifies the CPU controller and raises processing speed. Its characteristics: few instruction types and fixed-length instructions, few addressing modes, memory accessed only through load/store instructions, many general-purpose registers, mostly hardwired control, and pipelining so that most instructions complete in one cycle.

14. Basic elements of information security: confidentiality, integrity, availability, controllability and non-repudiation.

15. Computer security level (technical security, management security, policy and legal security): divided into four groups and seven levels.

Group 1: A1

Group 2: B3, B2, B1

Group 3: C2, C1

Group 4: D (lowest level)

16. Characteristics of computer viruses:

a) parasitic

b) Concealment

c) illegality

d) infectious

e) destructive

17. Types of computer viruses:

a) System boot virus———BOOT virus

b) File (shell) virus: infects executable files such as command.com

c) Hybrid virus———Flip virus, One Half virus (ghost)

d) Directory virus: changes directory entries without changing the files themselves

e) Macro virus: infects Word or Excel files through their macros

18. Computer Reliability:

a) Mean time between failures: MTBF = 1/λ.

b) Availability, the probability that the computer works properly: A = MTBF/(MTBF + MTTR), where MTTR is the mean time to repair.

c) Failure rate: the ratio of the number of failed components per unit time to the total number of components, denoted λ. The relationship between reliability and failure rate is R(t) = e^(-λt).

19. Computer reliable model:

a) Series system: reliability is equal to R=R1R2…RN; failure rate λ=λ1+λ2+…+λN

b) Parallel system: reliability R = 1-(1-R1)(1-R2)…(1-RN); for N identical components the failure rate is μ = 1/((1/λ)(1 + 1/2 + … + 1/N))

c) m-module redundant system (m = 2n+1 modules with majority voting): the system works when at least n+1 modules work, so R = Σ (j = n+1 … m) C(m, j) R0^j (1-R0)^(m-j), where R0 is the reliability of a single module. A quick check of these formulas follows.
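A numeric check of the three reliability models in Python (the component reliabilities are assumed values):

from functools import reduce
from math import comb

def series(rs):
    # R = R1*R2*...*RN
    return reduce(lambda acc, r: acc * r, rs, 1.0)

def parallel(rs):
    # R = 1 - (1-R1)(1-R2)...(1-RN)
    return 1.0 - reduce(lambda acc, r: acc * (1.0 - r), rs, 1.0)

def m_module_redundant(r0, n):
    # m = 2n+1 identical modules; the system works when at least n+1 work.
    m = 2 * n + 1
    return sum(comb(m, j) * r0**j * (1 - r0)**(m - j) for j in range(n + 1, m + 1))

print(series([0.9, 0.9, 0.9]))        # 0.729
print(parallel([0.9, 0.9, 0.9]))      # 0.999
print(m_module_redundant(0.9, 1))     # 3-module majority voting: 0.972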

20. Symmetric encryption technology: the encryption key and the decryption key are the same.

a) DES (Data Encryption Standard): encrypts by substitution and permutation, processing 64-bit data blocks with a 64-bit key of which only 56 bits are effective, over 16 rounds. It encrypts quickly and keys are easy to generate, but because the key is short, DES cannot resist exhaustive key-search attacks.

b) RC-5 algorithm.

c) IDEA algorithm: the length of plaintext and ciphertext is 64 bits, and the key is 128 bits.

21. Asymmetric encryption technology: use public key encryption and private key decryption.

a) RSA algorithm. (Do not confuse with RAS, which refers to Reliability, Availability and Serviceability.)

b) A message digest is produced by a one-way hash function, which maps any input to a fixed-length hash value. Common digest algorithms are MD5 and SHA, whose hash values are 128 and 160 bits respectively (see the sketch after this list).

c) Digital signature: encrypt with private key and decrypt with public key.

d) Digital timestamp technology: one of the e-commerce security services, it protects the date and time information of electronic files. It adds time to data encryption; a timestamp consists of the file's digest, the date and time the file was received, and a digital signature.
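A minimal sketch showing the fixed digest lengths with Python's standard hashlib (the message is hypothetical):

import hashlib

msg = b"order #42, amount 100.00"     # hypothetical message
md5 = hashlib.md5(msg).hexdigest()
sha1 = hashlib.sha1(msg).hexdigest()
print(md5, len(md5) * 4, "bits")      # always 128 bits, regardless of input length
print(sha1, len(sha1) * 4, "bits")    # always 160 bits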

22. Information transmission encryption:

a) Link encryption: encrypt the transmission path;

b) Node encryption:

c) End-to-end encryption:

23. SSL security protocol: mainly used to secure data exchanged between applications. The services it provides are:

a) Legitimacy authentication of users and servers.

b) Encrypt data to hide the data being transmitted.

c) Protect data integrity.

24. Comparison of DES and RSA:

25. Computer fault diagnosis technology

a) Computer failure:

i. Permanent failure

ii. Intermittent failure

iii. Transient failure

26. Memory capacity = last address - first address + 1.

27. Storage-related computing problems:

a) Number of tracks = (outer radius - inner radius) × track density × number of recording surfaces. Note: the top and bottom surfaces of a disk pack are protective and not used for recording, so a pack of n double-sided platters has n×2-2 recording surfaces.

b) Unformatted disk capacity = bit density × π × diameter of the innermost circle × total number of tracks. Note: the bit density differs from track to track, but every track holds the same capacity; track 0 is the outermost track and has the smallest bit density.

c) Formatted disk capacity: capacity = number of sectors per track x sector capacity x total number of tracks.

d) (Formatted) average data transmission rate = number of sectors per track × sector capacity × disk rotation speed (revolutions per second).

e) Access time = seek time + latency. The seek time is the time for the head to move to the target track; the latency is the time for the target sector to rotate under the head.

f) (Unformatted) average data transmission rate = innermost diameter × π × bit density × disk rotation speed. Note: the unformatted rate is the one usually asked for. A worked example of these formulas follows this list.
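A worked example of the formatted-capacity and transfer-rate formulas in Python (all disk parameters are assumed values):

tracks_per_surface = 200
surfaces = 4                     # recording surfaces (protective surfaces already excluded)
sectors_per_track = 32
sector_bytes = 512
rpm = 7200                       # disk rotation speed

total_tracks = tracks_per_surface * surfaces
capacity = sectors_per_track * sector_bytes * total_tracks    # formatted capacity
rate = sectors_per_track * sector_bytes * (rpm / 60)          # one track of data per revolution

print(capacity / 2**20, "MiB")   # 32*512*800 = 12.5 MiB
print(rate / 2**10, "KiB/s")     # 1920 KiB/s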

28. Number system operation

29. Code system

a) Inverse code: The inverse code of a positive number is the same as the original code, and the inverse code of a negative number is the bitwise inversion of the original code (the sign bit remains unchanged).

b) Complement code: the complement of a positive number is the same as its original code; the complement of a negative number is its inverse code plus 1 (that is, keep the sign bit, invert the value bits, then add 1).

c) Shift code (excess code): invert the sign bit of the complement code.

d) [X+Y]complement = [X]complement + [Y]complement

e) [X-Y]complement = [X]complement + [-Y]complement

f) [-Y]complement: invert all bits of [Y]complement, including the sign bit, and add 1 (see the sketch below).
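A minimal sketch of these rules for 8-bit numbers in Python (the sample values are arbitrary):

def ones_complement(x, bits=8):
    # Inverse code of a negative number: keep the sign bit, invert the value bits.
    mag = -x
    return (1 << (bits - 1)) | (~mag & ((1 << (bits - 1)) - 1))

def twos_complement(x, bits=8):
    # Complement code; for a negative number this equals its inverse code plus 1.
    return x & ((1 << bits) - 1)

x = -5
print(format(ones_complement(x), '08b'))   # 11111010
print(format(twos_complement(x), '08b'))   # 11111011 = inverse code + 1
# [X+Y]complement = [X]complement + [Y]complement, modulo 2^8:
print(format((twos_complement(-5) + twos_complement(3)) & 0xFF, '08b'))  # -2 -> 11111110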

30. Check code:

a) Cyclic Check Code (CRC):

i. Modulo-two division: division in which no borrows or carries propagate; each step is a bitwise XOR.

b) Hamming check code:

i. Determine the number of check bits from the number of information bits: 2^r ≥ k + r + 1, where k is the number of information bits and r the number of check bits; the minimum r satisfying the inequality is the number of check bits, as the sketch below computes.
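A minimal sketch computing the smallest r in Python:

def hamming_check_bits(k):
    # Smallest r with 2^r >= k + r + 1.
    r = 1
    while 2**r < k + r + 1:
        r += 1
    return r

for k in (4, 8, 16, 32):
    print(k, hamming_check_bits(k))   # 4->3, 8->4, 16->5, 32->6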

Chapter 2 Data Structures and Algorithms

1. Data structure refers to the organization of data elements.

2. The sequential storage structure of the linear table:

a) Physical adjacency represents the logical relationship between nodes; any node in the table can be accessed randomly, but insertion and deletion are inconvenient.

b) Address of the i-th element: LOC(ai) = LOC(a1) + (i-1)×L, where L is the element size.

3. Linked storage structure of linear table:

a) Use a group of arbitrary storage units to store the data elements of the linear list, and the logical order and physical order of the nodes in the linked list are not necessarily the same.

Node layout: data field | pointer field

4. Insertion and deletion of linear tables

a) Sequential storage: Einsert = n/2 Edelete = (n-1)/2

b) Chain storage:

5. Sequential storage of stacks: two stacks can share one data space (last in, first out). The two stack bottoms sit at opposite ends of the space and the tops grow toward each other:

bottom of stack 1 → top of stack 1 … top of stack 2 ← bottom of stack 2

6. Queue: It is only allowed to insert elements (tail) at one end of the table and delete elements (head) at the other end. (FIFO)

7. A substring is included in its main string at the position where the first character of the substring first occurs.

8. Generalized lists

9. Properties of binary trees:

a) The i-th level of a binary tree has at most 2^(i-1) nodes (i≥1).

b) A binary tree of depth k has at most 2^k - 1 nodes (k≥1).

c) In any binary tree, if the number of terminal nodes is n0 and the number of nodes with degree 2 is n2, then n0=n2+1.

d) The depth of a complete binary tree with n nodes is ⌊log2(n)⌋ + 1.

10. Tree-to-binary-tree conversion: a node's first child stays as its left child, and each remaining sibling becomes the right child of the sibling before it; the result is then tilted into a binary tree.

11. The pre-order traversal of the tree is the same as the pre-order traversal of the binary tree; the post-order traversal of the tree is the same as the in-order traversal of the binary tree.

12. Hashing is to convert an input of any length into a fixed-length output through a hash algorithm. The output is a hash value. The table created in this way is a hash table, which can be dynamically created.

13. Binary search (half-interval search): requires that the keys be in a sequential storage structure and sorted by key value, as in the sketch below.
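A minimal binary search sketch in Python (the sample list is arbitrary):

def binary_search(keys, target):
    # keys must be a sorted sequential list.
    low, high = 0, len(keys) - 1
    while low <= high:
        mid = (low + high) // 2      # compare with the middle element
        if keys[mid] == target:
            return mid
        elif keys[mid] < target:
            low = mid + 1            # continue in the right half
        else:
            high = mid - 1           # continue in the left half
    return -1                        # not found

print(binary_search([3, 7, 12, 25, 40, 58], 25))   # 3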

14. Binary search tree (binary sort tree), a dynamic lookup table: either an empty tree, or one satisfying:

a) The left and right subtrees of the search tree are each a search tree.

b) If the left subtree of the search tree is not empty, the value of each node on the left subtree is less than the value of the root node.

c) If the right subtree of the search tree is not empty, the value of each node on the right subtree is greater than the value of the root node.

d) Balanced binary tree: either an empty tree, or satisfying: the depth difference between the left and right subtrees of any node in the tree does not exceed 1. The balance degree of a node: the depth of its right subtree minus the depth of its left subtree (so the balance degree can only be 1, 0, -1).

15. The sum of out-degrees of all vertices in a directed graph is equal to the sum of in-degrees.

16. In a graph, the number of edges is equal to half the sum of the degrees of all vertices.

17. A directed graph with n vertices has at most n(n-1) edges; an undirected graph has at most n(n-1)/2.

18. In C language, each member in struct occupies its own memory space, and the total length is the sum of the lengths of all members, while the length in union is equal to the length of the longest member.

Chapter 3 Operating System Knowledge

1. Type of operating system:

a) Batch operating system (single and multi-process)

b) Time-sharing system (multiplexing, independence, interactivity, timeliness). Note: UNIX is a multi-user, multi-tasking time-sharing system.

c) Real-time system - high reliability

d) Network Operating System

e) Distributed Operating System

f) Microcomputer operating system

g) Embedded operating system

2. Use PV operation to realize mutual exclusion and synchronization of processes.

3. Network Operating System

a) Centralized mode

b) Client/server mode

c) peer-to-peer mode

4. Interrupt response time: the time taken from sending an interrupt request to entering the interrupt processing.

5. Interrupt response time = the longest interrupt-disabled time + the time to save the CPU's internal registers + the time to enter the interrupt service function + the time until the first instruction of the interrupt service routine (ISR) starts executing.

6. When the disk drive writes data to the magnetic coating of the disk, it is recorded on the magnetic tracks of the disk in a serial manner one bit after another.

7. The composition of the cache: Cache consists of two parts: the control part and the Cache memory part.          

8. The address mapping between the Cache and the main memory is to convert the main memory address sent by the CPU into a Cache address. There are three ways:

a) Direct mapping: main memory is divided into areas the size of the Cache, and each block of an area can only map to the Cache block in the corresponding position.

Main memory address: main memory area number + block number B + block address W

 Cache address: block number b + block address w

     Correspondence: block number B = block number b, block address W = block address w

b) Fully associative mapping: any block of main memory can map to any block of the Cache.

Main memory address: block number B + block address W

Cache address: block number b + block address w

    Correspondence: block number B corresponds to block number b through the address conversion table, block address W = block address w

c) Group-associative (set-associative) mapping: a compromise between direct and fully associative mapping, that is, direct mapping between groups and fully associative mapping within a group.

Main memory address: area code E + group number G + block number B in the group + address W in the block

Cache address: group number g + block number b in the group + address w in the block

Between groups the mapping is direct; within a group it is fully associative.

Correspondence: group number G=group number g, block number B in the group corresponds to block number b in the group through the address conversion table, address W in the block = address in the block w

9. Cache memory:

a) Average access time: t3 = μ×t1 + (1-μ)×t2, where μ is the Cache hit rate (1-μ is the miss rate), t1 the Cache cycle time, t2 the main-memory cycle time, and t3 the average cycle time of "Cache + main memory".

b) Speedup after introducing the Cache: r = t2/t3, as the sketch below shows.
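A quick check of both formulas in Python (hit rate and cycle times are assumed values):

def cache_stats(hit_rate, t1, t2):
    t3 = hit_rate * t1 + (1 - hit_rate) * t2   # average "Cache + main memory" cycle
    return t3, t2 / t3                         # average cycle, speedup r

t3, r = cache_stats(hit_rate=0.95, t1=10, t2=100)   # times in ns
print(t3, r)   # 14.5 ns, speedup about 6.9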

10. Replacement Algorithm: The goal is to make the Cache get the highest hit rate. Commonly used algorithms are as follows:

a) Random replacement algorithm. It is to use a random number generator to generate a block number to be replaced, and replace the block;

b) First-in-first-out algorithm: replace the block that entered the Cache earliest. Simple, but the block that entered first may still be in frequent use;

c) Least recently used (LRU) algorithm: replace the Cache block that has gone unused for the longest time. Better than FIFO, but past disuse does not guarantee the block will not be used in the future.

d) Optimal replacement algorithm: requires first executing the program once to collect its access sequence, then choosing replacements to maximize the hit rate. Note: http://apps.hi.baidu.com/share/detail/30866296

11. Locality theory and Denning's working set theory:

a) The basis of virtual-memory management is the locality of programs, in time and in space. Temporal locality: a recently accessed storage unit is likely to be accessed again soon. Spatial locality: once a storage unit is accessed, its adjacent or nearby units are likely to be accessed soon.

b) Based on program locality, Denning proposed the working-set theory: if a process's working-set pages are all kept in main memory while it runs, page faults are greatly reduced and the process runs efficiently; otherwise pages are swapped in and out frequently because some working pages are missing, system performance drops sharply, and in severe cases "thrashing" appears.

12. Process Status

13. Condition guaranteeing no process deadlock: number of system resources ≥ number of processes × (resources each process needs - 1) + 1, as checked below.
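A minimal sketch of this bound in Python (process and resource counts are assumed values):

def min_resources_no_deadlock(processes, need_each):
    # With this many resources, at least one process can always obtain
    # its full need, finish, and release everything it holds.
    return processes * (need_each - 1) + 1

print(min_resources_no_deadlock(5, 3))   # 11 resources avoid deadlock for 5 processes needing 3 each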

14. The predecessor graph is a directed acyclic graph.

15. PV Operations: Producer and Consumer Problems.

a) Critical resources: resources that need to be shared among processes in a mutually exclusive manner, such as printers.

b) Critical section: the section of program code that accesses critical resources in each process.

c) S: semaphore. P operation: S = S - 1; if S < 0, the calling process is suspended and placed in the semaphore's waiting queue. V operation: S = S + 1; if S ≤ 0, one process from the waiting queue is woken up. (A sketch follows this list.)

d) P operation is performed when entering the critical area, and V operation is performed when exiting the critical area.
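A minimal producer/consumer sketch using Python's threading.Semaphore, whose acquire and release correspond to P and V (the buffer size and item count are assumed values):

import threading, queue

buffer = queue.Queue()
empty = threading.Semaphore(4)   # free buffer slots
full = threading.Semaphore(0)    # filled buffer slots
mutex = threading.Semaphore(1)   # mutual exclusion on the buffer (critical resource)

def producer():
    for item in range(8):
        empty.acquire()          # P(empty)
        mutex.acquire()          # P(mutex): enter the critical section
        buffer.put(item)
        mutex.release()          # V(mutex): leave the critical section
        full.release()           # V(full)

def consumer():
    for _ in range(8):
        full.acquire()           # P(full)
        mutex.acquire()
        print("consumed", buffer.get())
        mutex.release()
        empty.release()

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()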

16. Process communication (indirect communication)

a) Sending: if the designated mailbox is not full, put the message at the position indicated by the mailbox pointer and wake up any process waiting for a message in this mailbox; otherwise the sender is set to the waiting-for-mailbox state.

b) Receiving: if the designated mailbox contains a message, take one out and wake up any process waiting for mailbox space; otherwise the receiver is set to the waiting-for-message state.

17. Storage management:

a) Paged storage management: the logical address is split into page number + in-page address; a page-table entry maps a page number to a memory block number; physical address = block number concatenated with the in-page address. The in-page address width is determined by the page size: for example, with a 16K = 2^14 logical address space and 2K = 2^11 pages, the in-page address takes 11 bits and the page number 3 bits. That is: P = INT(A/L); d = A MOD L, where A is the logical address, L the page size, P the page number and d the in-page address.

b) Segment storage management method: the logical address is divided into segment number + intra-segment address, and the segment table is divided into segment number + segment length + base address. The base address corresponds to the memory address. Physical address = base address + segment address.

c) Segment-page storage management: the logical address is split into segment number (s) + page number within the segment (P) + in-page address (w); the system keeps one segment table and a set of page tables. Physical address = block number concatenated with the in-page address. In a multiprogramming environment each program also needs a base number as its identifier, so physical address = (block number found via base number, segment number and page number) × 2^n + in-page address, where multiplying by 2^n splices the n-bit in-page address onto the end. A sketch of paged translation follows.
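A minimal sketch of paged address translation in Python, using the 2K-page example above (the page-table contents are hypothetical):

PAGE_SIZE = 2**11                           # 2K pages: 11-bit in-page address

def translate(logical_addr, page_table):
    page_no = logical_addr // PAGE_SIZE     # P = INT(A / L)
    offset = logical_addr % PAGE_SIZE       # d = A MOD L
    block_no = page_table[page_no]          # page table maps page number -> block number
    return block_no * PAGE_SIZE + offset    # block number spliced with the offset

page_table = {0: 5, 1: 2, 2: 7, 3: 4}       # hypothetical page -> block mapping
print(hex(translate(0x1A3F, page_table)))   # page 3, offset 0x23F -> block 4 -> 0x223f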

18. The main functions of the file system: provide access to files by name, and, via the open operation, read a file's control information from secondary storage into main memory.

19. Disk partition capacity in the FAT16 file system = cluster size × 2^16.

20. Spooling technology is a technology that simulates another type of physical equipment with one type of physical equipment, and the functional module that realizes this technology is called a Spooling system. Features of the Spooling system:

a) Improved I/O speed.

b) Transform exclusive devices into shared devices.

c) The function of the virtual device is realized.


Chapter 4 Basics of Programming

1. Types of programming languages:

a) Imperative programming languages: Action-based languages ​​such as fortran, pascal, and c.

b) Object-oriented programming languages: java, C++.

c) Functional programming language: mainly used for symbolic data processing, such as integral calculus, mathematical logic, game deduction and artificial intelligence.

d) Logic programming languages: the solution process need not be described; only the necessary facts and rules are given. Used as a development tool for expert systems.

2. The basic components of a programming language:

a) Data components: constants and variables, global and local quantities, data types.

b) Operational components:

c) Control components: sequence structure, selection structure and loop structure.

d) Function: function definition, function declaration, function call.

3. The basic characteristics of object-oriented programming language:

a) abstract data objects;

b) Support template operations, specifically function templates and class templates, that is, generic programming.

c) Support dynamic binding;

d) Support inheritance - the main difference with other languages.

e) The class library is a sign of maturity.

4. C is a procedural and static language: all of its components can be determined at compile time.

5. Scripting languages are dynamic languages: a script can be changed at run time, and it does not produce an independent object program.

6. Errors in writing programs include:

a) Dynamic errors: logic errors in the source program that appear at run time, such as division by zero or an array subscript out of bounds.

b) Static errors: Divided into syntax errors and semantic errors.

Chapter 5 Network Basics

1. TCP is the layer-4 (transport layer) transmission control protocol; IPSec is a layer-3 (network layer) VPN protocol; PPPoE works at layer 2 (data link layer); SSL is a security protocol that runs on top of TCP.

2. FTP transmission needs to be established:

a) Control connection: file transfer command, requested by the client to the server.

b) Data connection: file transfer, the active mode is actively connected by the server, and the passive mode server waits for the client to connect.

3. Port number:

Port | Service | Description

20 FTP File Transfer Protocol (data connection)

21 FTP File Transfer Protocol (Control Connection)

23 TELNET virtual terminal network

25 SMTP Simple Mail Transfer Protocol

53 DNS domain name server

80 HTTP Hypertext Transfer Protocol

110 POP3 Post Office Protocol (simple mail reading)

111 RPC Remote Procedure Call

143 IMAP Internet Message Access Protocol

4. E-commerce transactions: identity authentication establishes an entity's identity and prevents one entity from impersonating another; authentication combined with authorization prevents unauthorized modification or destruction of data; protecting confidentiality prevents information from leaking through monitored communications; non-repudiation prevents a party to a transaction from denying that the transaction occurred.

5. Network security technology: The protection of information access includes user identification and verification, user access control, system security monitoring, computer virus prevention and control, and data encryption.

a) VPN technology: connects two internal networks through a tunnel across the public network so that they behave as one network.

b) Firewall technology: types are

i. Packet-filtering firewall (screening router): a router placed between the internal and external networks; provides network-layer security.

ii. Application proxy firewall: that is, dual-homed host firewall, application layer security.

iii. State inspection technology firewall: the combination of the above two technologies, the shielding router is placed on the external network, and the dual-homed host is placed on the internal network.

iv. Shielded subnet firewall: Set up a DMZ (Demilitarized Zone) consisting of shielded routers and dual-homed machines.

6. Multimode fiber: low cost, wide core, easy light coupling, high dispersion, low efficiency; used for low-speed, short-distance communication. Single-mode fiber: high cost, narrow core, requires a laser source, low dispersion, high efficiency; used for high-speed, long-distance communication.

7. Network test commands:

a) ping: tests connectivity between the user and an external site. First ping 127.0.0.1 (the local loopback address); if this fails, the local TCP/IP stack is not working. Second, ping the local IP; if this fails, the network adapter (network card/modem) is faulty. Third, ping the IP of a computer on the same segment; if this fails, the network cabling is faulty.

b) netstat: displays statistics for the TCP, UDP, IP and ICMP protocols; generally used to check the connection state of the local machine's network ports.

c) arp: views and modifies the local computer's ARP entries; very useful for inspecting the ARP cache and solving address resolution problems.

d) tracert: a route-tracing program used to determine the path an IP datagram takes to reach the target, showing which hop has connection problems.

8. DHCP (Dynamic Host Configuration Protocol): It is used to dynamically assign IP addresses to hosts in the network. By default, the client uses the IP address assigned by the first DHCP server it reaches.

9. Internet protocol:

a) TCP/IP protocol: It is the core protocol of the Internet protocol, basic features (logical addressing, routing selection, domain name resolution protocol, error detection and flow control)

b) ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol). ARP translates IP addresses into physical addresses (MAC addresses).

10. Network design principles:

a) Advancement: adopt advanced technology;

b) Practicality: use mature and reliable technology and equipment to achieve the purpose of effective use;

c) Openness: the network system adopts open standards and technologies;

d) Economy: save costs as much as possible on the basis of meeting demand;

e) High availability/reliability: the system has a long mean time between failures; required in fields such as finance, railways and securities.

Chapter 6 Multimedia Basics 

1. Attributes that measure the characteristics of the sound (three elements):

a) Volume: also called sound intensity, which measures the intensity of sound.

b) Pitch: Sound frequency.

c) Timbre: Determined by overtones mixed with the fundamental tone.

2. Bandwidth of the sound: the frequency range of the sound signal.

a) Range audible to the human ear: 20 Hz~20 kHz

b) Range of the human voice: 300 Hz~3400 Hz

c) Range of musical instruments: 20 Hz~20 kHz

3. Digitization of sound signals: —— sampling-quantization method

a) Sampling: taking measurements of the signal at regular intervals. Note: voice is generally sampled at 8 kHz; music should be sampled above 40 kHz.

b) Digital signals are discrete, while analog signals are continuous.

c) Quantization: converting each sample to a discrete value; part of analog-to-digital (A/D) conversion

4. The difference between graphics and images: the graphics will not be distorted when enlarged, but the image will be distorted when enlarged.

5. Three elements of color:

a) Brightness: the feeling of brightness.

b) Hue: It reflects the type of color.

c) Saturation: the purity of the color, that is, the degree of mixing white light and the vividness of the color.

6. Color space:

a) RGB color space: computers. Red, green, blue.

b) CMY color space: print. Cyan, Magenta, Yellow

c) YUV color space: TV.

7. Image file size calculation:

a) Given pixels and bits per pixel: capacity = number of pixels × bits per pixel / 8 (bytes)

b) Given pixels and number of colors: n bits represent 2^n colors, so bits per pixel = log2(number of colors), and capacity = number of pixels × bits per pixel / 8 (bytes)

8. Audio file size calculation:

a) Uncompressed :

Data transmission rate (b/s) = sampling frequency (Hz) × quantization bits (bits per sample) × number of channels (divide by 8 for bytes)

b) Storage space (capacity) required after digitization:

Sound data volume (B) = data transmission rate (b/s) × duration (s) / 8

9. Calculate the size of the video file:

a) Storage capacity (number of bytes) = capacity of each frame image (B) * frames per second * time

Note: The capacity (B) of each frame image is calculated in the same way as the image file capacity.

b) Playback transmission rate = capacity of each frame × number of frames transmitted per second. Worked examples of these media-size formulas follow.
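Worked examples of the image, audio and video size formulas in Python (all parameters are assumed values):

# Image: 640x480 pixels, 24 bits per pixel (2^24 colors).
image_bytes = 640 * 480 * 24 / 8
print(image_bytes / 2**10, "KiB")            # 900 KiB

# Audio: 44.1 kHz sampling, 16-bit quantization, 2 channels, 60 s.
rate_bps = 44100 * 16 * 2                    # data transmission rate, b/s
print(rate_bps * 60 / 8 / 2**20, "MiB")      # about 10.1 MiB

# Video: 352x288 frames, 24 bits per pixel, 25 frames/s, 10 s, uncompressed.
frame_bytes = 352 * 288 * 24 / 8
print(frame_bytes * 25 * 10 / 2**20, "MiB")  # about 72.5 MiB of storage
print(frame_bytes * 25 / 2**20, "MiB/s")     # playback rate, about 7.3 MiB/s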

10. Common video standards:

a) MPEG-1: its audio coding has three layers: Layer I, used in the digital compact cassette; Layer II, used in DAB and VCD; Layer III, used for MP3 music on the Internet. Part 4 of the standard covers conformance testing.

b) MPEG-2: used for interactive multimedia; the DVD and digital-television standard.

c) MPEG-4: Many different video formats, applications such as virtual reality, distance education and interactive video. Standard for multimedia applications.

d) MPEG-7: not a compression coding method; its official name is the Multimedia Content Description Interface. Its purpose is a standard for describing multimedia content that leaves some freedom of interpretation, so that descriptions can be transmitted to, or retrieved by, devices and computer programs.

e) MPEG-21: the "multimedia framework" or "digital audiovisual framework", which aims to integrate standards so that harmonized technologies can manage multimedia commerce, examining how different technologies and standards can be combined and what new standards are needed.

f) Image resolutions: CIF is 352×288 (the common intermediate format); QCIF is 176×144; DCIF is 528×384.

g) The output data rate of an MPEG-1 encoder is about 1.5 Mbps; under the PAL system the image resolution is 352×288 at 25 frames per second.

11. Image file format

a) Static formats: GIF/BMP/TIF/PCX/JPG/PSD

b) Dynamic formats: AVI/MPG/AVS

c) Current image encoding and compression standards: JPEG/MPEG/H.261.

12. Audio format

a) WAVE/MOD/MP3 (the third layer of MPEG-1)/REAL AUDIO/MIDI/CD AUDIO

b) Audio files are usually divided into sound files and MIDI files. A sound file is the original sound recorded by a sound recording device; MIDI is a sequence of music performance instructions, which is equivalent to a musical score, performed by an electronic musical instrument, does not contain sound data, and has a small file size.

13. Compression technology

a) Redundancy in multimedia data: temporal redundancy, spatial redundancy, visual redundancy, information entropy redundancy, structural redundancy, and knowledge redundancy.

b) Basic ideas and methods of video image compression technology: In terms of space, image data compression adopts JPEG compression method to remove redundant information, and the main methods include intra-frame prediction coding and transform coding; in terms of time, image data compression uses inter-frame prediction Encoding and motion compensation algorithms to remove redundant information.

c) Lossless compression is also called redundant compression or entropy coding; lossy compression is also called entropy compression. The difference is that lossless compression can be restored. Huffman coding and run-length coding methods belong to lossless compression, while predictive coding, transform coding and motion compensation belong to lossy compression.

d) Entropy coding: coding that, by the entropy principle, loses no information. Common entropy codings: LZW coding, Shannon coding, Huffman coding and arithmetic coding.

Chapter 7 Database Technology Basics

1. Database (DB) refers to a collection of organized and shareable data stored in a computer for a long time.

2. A database system (DBS) consists of a database, hardware, software and personnel.

3. Development of database technology:

a) Manual management stage

b) Document management stage

c) Database system stage (with high data independence)

4. Three elements of the data model:

a) Data structure

b) Data manipulation

c) Data constraints

5. For data operations: DDL language (CREATE/ALTER/DROP/integrity constraints), DML language (SELECT/INSERT/DELETE/UPDATE); for permission operations, there is DCL language.

6. The data model is divided into: conceptual data model (ER model) and basic data model (hierarchical, network, relational model) and the currently proposed object model.

7. Entity Properties

a) Simple attribute (not subdivided) and compound attribute (dividable such as address (province, city...))

b) Single-valued attributes (only one value) and multi-valued attributes (such as multiple phone numbers)

c) NULL attribute (none or unknown)

d) Derived attributes (derivable from other attributes)

8. Components of the ER method:

9. Extended ER model

a) Weak entity (depends on another entity to exist)

b) Specialization———P375

10. Architecture of database system

a) Three-level schema structure (three layers and two images)

i. Data Physical Independence

ii. Data logic independence

b) Centralized database system; two-phase locking protocol: a locking (growing) phase and an unlocking (shrinking) phase

c) Client/server database architecture

d) Parallel database system (multiple CPUs)———P387

i. Shared memory multiprocessor

ii. Shared nothing parallel architecture

e) Distributed database system: Two-phase submission protocol: voting phase and execution phase

f) Web database

11. All-key (full key): the candidate key of the relational schema consists of all of its attributes together.

12. Database control function

a) Transaction management (indivisible logical unit of work)

i. Atomicity: all of a transaction's operations are performed, or none are

ii. Consistency: the database contains only the results of successfully committed transactions

iii. Isolation: concurrently executing transactions are isolated from one another

iv. Durability: once a transaction commits, its effects are permanently reflected in the database

b) Failure recovery

i. Transaction-internal failure

ii. System failure

iii. Media failure

iv. Computer virus

v. Recovery methods: static dump and dynamic dump, mass dump and incremental dump, log files

vi. Transaction recovery steps: scan the log file backwards, performing the inverse of each of the transaction's update operations, until the transaction's begin mark is reached

vii. Database Mirroring

c) Concurrency control

i. Problems caused by concurrent operation: data inconsistency (lost updates, non-repeatable reads and dirty reads); transaction isolation is destroyed.

ii. Concurrency control technique: locking, with exclusive locks (X locks) and shared locks (S locks)

iii. Three-level locking protocol: level 1 solves lost updates; level 2 also solves dirty reads; level 3 also solves non-repeatable reads

iv. Serializability of concurrent schedules: serializability is the correctness criterion for concurrent transactions; a concurrent schedule is correct if and only if it is serializable

v. Lock granularity: the scope of the object being locked

vi. Transactions cannot be nested, since nesting would violate atomicity; a transaction can start only when no other transaction is currently executing.

d) Security and Authorization

i. Security breach (unauthorized reading, modification, destruction of data)

ii. Authorization

1) read: Allow reading, not modify

2) insert: Insertion is allowed, modification is not allowed

3) update: Allow modification, not delete

4) delete: Allow deletion

5) index: Allows to create or delete indexes

6) resource: Allows creation of new relationships

7) alteration: Allows adding or removing attributes in a relationship

8) drop: allows to delete the relationship

13. Execution states of a transaction:

a) Active state: the initial state of a transaction.

b) Partially committed state: all operations have completed.

c) Failed state: a hardware or logic error prevents the transaction from continuing; a failed transaction must be rolled back, after which it enters the aborted state.

d) Aborted state: the transaction has been rolled back and the database restored to its state before the transaction began.

e) Committed state: the transaction has completed successfully; only in this state can the transaction be said to be committed.

14. Transaction isolation levels (high to low):

a) Serializable (prevents phantom reads): SERIALIZABLE

b) Repeatable read: REPEATABLE READ

c) Read submitted data: READ COMMITTED

d) Can read uncommitted data: READ UNCOMMITTED

e) SQL statement: SET TRANSACTION ISOLATION LEVEL a)/b)/c)/d)

f) Phantom phenomenon: two reads of the same data object within one transaction return different sets of records; related to the non-repeatable-read problem

15. Data warehouse

a) Basic characteristics of a DW: subject-oriented, integrated, relatively stable (non-volatile), and reflecting historical changes (the time horizon is generally 5 to 10 years).

b) Data schema: fact tables, with multidimensional schemas including the star schema, snowflake schema and fact constellation schema

c) Data Warehouse Architecture

i. Usually used: data warehouse server, OLAP (Online Analytical Processing), front-end server

ii. From the perspective of structure: enterprise warehouse, data mart, virtual warehouse

16. Design of data warehouse:

a) Differences between the data warehouse's data model and an operational database's: (1) purely operational data is excluded; (2) the key structure is extended with a time attribute as part of the key; (3) some derived data is added.

b) The physical design of the data warehouse: mainly to improve the I/O performance, and improve the performance of the system through granularity division and data segmentation.

17. Data mining technology: massive data collection, powerful multi-processing computer and data mining algorithms.

18. Commonly used data mining techniques: artificial neural networks, decision trees, genetic algorithms, nearest-neighbor algorithms and rule induction.

19. The application process of data mining

a) Determine the mining object

b) Data preparation (60% of the data mining workload), including (1) data selection; (2) data preprocessing (cleaning); (3) data conversion.

c) Build a model

d) data mining

e) Analysis of results

f) knowledge application

20. Data dump: The process in which the DBA periodically copies the entire database to tape or another disk to save it.

a) Dynamic dump: It means that the database is allowed to be accessed or modified during the dump. That is, dumps and user transactions can execute concurrently.

b) Static dump: A dump operation performed when there are no running transactions in the system.

c) Incremental dump: refers to dumping only the updated data after the last dump each time.

d) Mass dump: refers to dumping the entire database each time.

e) From a recovery point of view, it is generally more convenient to use the backup copy obtained from the mass dump for recovery. However, if the database is large and the transaction processing is very frequent, the incremental dump method is more practical and effective.

21. OLAP (online analytical processing) is typically used for data mining over a data warehouse, providing decision support; OLTP (online transaction processing) executes transaction-oriented programs, typically update-intensive, for routine database operations. OLAP has no strict response-time requirements, while OLTP is business-oriented with high timeliness requirements.

Chapter 8 Relational Databases

1. The relational model is the foundation of a relational database, consisting of a relational data structure, a set of relational operations, and relational integrity rules.

2. The degree of a relation refers to the number of attributes in the relation, and the potential of a relation refers to the number of tuples in the relation.

3. All fields in the relational model should be atomic data (1NF).

4. Three types of relationships: basic table, query table, view table

5. Integrity constraints: entity integrity, referential integrity, user-defined integrity.

6. Traditional set operations in relational algebra require that the relations involved in the operation have the same degree and the corresponding attributes are taken from the same domain.

7. Relational operations:

a) Relational Algebra Language

b) Relational Calculus Language

c) A language (SQL) with the above two dual characteristics

8. Query optimization guidelines in relational algebra:

a) Execute the selection operation as early as possible

b) Perform the projection operation as early as possible

c) Avoid directly doing the Cartesian product, and combine the operation before the Cartesian product with a series of selections and projections after it.

9. Design issues of relational schema:

a) Data redundancy: the same data is repeated multiple times.

b) Operation exceptions (update exceptions): modification exceptions, insertion exceptions, and deletion exceptions.

c) A principle of normalization: "If there is redundancy in the relational schema, decompose it".

10. Informal design guidelines for relational schemas:

a) The design of the relational schema should contain only directly related attributes as far as possible, and not include indirectly related attributes.

b) Insertion, deletion, and operation exceptions should not occur as much as possible.

c) Avoid placing attributes that are often null values ​​as much as possible.

d) Make the equivalence join on the primary key and foreign key as much as possible, and ensure that no extra tuples will be generated.

11. Functional dependencies:

a) A functional dependency X→A holds on R if, in every relation instance of R, any two tuples that agree on X also agree on A.

b) Two sets of functional dependencies are equivalent if and only if their closures are equal.

c) For an FD W→A, if no proper subset X of W satisfies X→A, then W→A is a full functional dependency; otherwise it is a partial functional dependency.

d) Transitive functional dependency: if X→Y, Y→A, Y does not determine X, and A∉Y, then X→A is a transitive functional dependency.

e) FDs and keys: let U be the attribute set of schema R and X a subset of U. If X→U holds on R, X is a superkey of R. If X→U holds but X1→U fails for every proper subset X1 of X (i.e., X has no redundant attributes), then X is a candidate key of R.

f) If A is an attribute in the candidate key in the relational schema R, then A is said to be the primary attribute of R, otherwise it is a non-primary attribute.

g) Minimal functional dependency: (excluding redundant functional dependencies) the following three conditions are met (minimum functional dependency set G):

i. The right side of each FD in G is a single attribute.

ii. There are no redundant FDs in G.

iii. The left side in G has no redundant attributes.

12. Paradigm of relational schema - normalization

a) 1NF: every attribute value in every relation instance r of R is an indivisible atomic value (a normalized relation).

i. Problems with 1NF: large redundancy and abnormal update.

b) 2NF: If every non-key attribute is completely functionally dependent on the candidate key.

c) 3NF: no non-primary attribute transitively depends on a candidate key of R.

d) BCNF: no attribute, primary or not, transitively depends on a candidate key of R.

e) 4NF: let R be a relational schema and D the set of multivalued dependencies on R. If for every nontrivial multivalued dependency X→→Y in D, X is a superkey, then R is in 4NF.

13. Decomposing schema R into 2NF: if R has FDs W→Z and X→Z with X⊂W, where W is the primary key and Z a non-primary attribute, then W→Z is a partial functional dependency. Decompose into R1(XZ) with primary key X, and R2(Y) with Y = U-Z, primary key W, and foreign key X.

14. Decomposing schema R into 3NF: if R has FDs W→X and X→Z, where X is not a candidate key, W is the primary key, Z is a non-primary attribute and Z⊄X, then Z transitively depends on W. Decompose into R1(XZ) with primary key X, and R2(Y) with Y = U-Z, primary key W, and foreign key X.

15. There are three equivalent cases of schema decomposition:

a) Decomposition has lossless connectivity

b) Decomposition should maintain functional dependencies

c) Decomposition requires both lossless connections and functional dependencies

16. The necessary and sufficient condition for lossless decomposition: if p(R1, R2) is a decomposition of R, it is lossless iff (R1∩R2)→(R1-R2) or (R1∩R2)→(R2-R1), i.e., R1∩R2 is a superkey of R1 or of R2.

17. Preserving functional dependencies: let p(R1, R2, …, Rk) be a decomposition of R and F the set of FDs on R; if the union of the projections of F onto each Ri logically implies F, the decomposition preserves functional dependencies.

18. Test of lossless connection:

Let the relational schema R=A1,...,An, FD set F established on R, a decomposition of R p={R1,...,Rk}. The judgment steps of lossless connection decomposition are as follows:

(1) Construct a table with k rows and n columns, each column corresponds to an attribute Aj (1≤j≤n), and each row corresponds to a pattern Ri (1≤i≤k). If Aj is in Ri, then fill in the symbol aj in row i and column j of the table, otherwise fill in the symbol bij.

(2) Treat the table as a relation of schema R and repeatedly check whether every FD in F holds in the table; if not, modify the table as follows. For an FD X→Y in F, if two rows are equal on the X components but unequal on the Y components, make them equal on the Y components: if one of the Y components is aj, change the other to aj; if neither is an a, replace one bij with the other (prefer the one with the smaller i).

(3) If at any point a row becomes all a's, i.e., a1, a2, …, an, conclude immediately that p is a lossless-join decomposition with respect to F; no further modification is needed. If no all-a row exists once the table can no longer be modified, the decomposition is lossy. Note that modification is cyclic and repeated: one change may enable further changes. A sketch of this test follows.
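A compact Python sketch of this tableau test; the example is the classic R(A,B,C) split into R1(A,B), R2(B,C) with B→C:

def lossless_join(attrs, subschemas, fds):
    # table[i][j] = ('a', j) if attribute j is in Ri, else ('b', i, j);
    # 'a' tuples sort before 'b' tuples, implementing "prefer a, then smaller i".
    col = {a: j for j, a in enumerate(attrs)}
    table = [[('a', j) if a in ri else ('b', i, j)
              for j, a in enumerate(attrs)] for i, ri in enumerate(subschemas)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in table:
                for r2 in table:
                    if r1 is not r2 and all(r1[col[a]] == r2[col[a]] for a in lhs):
                        for a in rhs:                 # equate the Y components
                            j = col[a]
                            best = min(r1[j], r2[j])
                            if r1[j] != best or r2[j] != best:
                                r1[j] = r2[j] = best
                                changed = True
        if any(all(cell[0] == 'a' for cell in row) for row in table):
            return True                               # a row of all a's: lossless
    return False

print(lossless_join("ABC", ["AB", "BC"], [("B", "C")]))   # True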

19. Judgment of candidate keywords:

a) L-type attributes appear only on the left side of functional dependencies; R-type attributes appear only on the right side; LR-type attributes appear on both sides; N-type attributes appear on neither side.

b) (1) Divide the attributes of schema R into the four categories above; let X stand for the L-type and N-type attributes and Y for the LR-type attributes. (2) Compute X+; if X+ contains all attributes of R, X is the unique candidate key of R; otherwise continue. (3) Take an attribute A from Y and compute (XA)+; if it contains all attributes of R, XA is a candidate key; otherwise try another attribute. (4) Repeat with combinations of two, three, … attributes from Y, computing their closures, until all candidate keys are found. A sketch follows.
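A minimal Python sketch of attribute closure and this candidate-key search (the example schema and FDs are hypothetical):

from itertools import combinations

def closure(attrs, fds):
    # X+: repeatedly add the right side of every FD whose left side is covered.
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    lhs = {a for l, _ in fds for a in l}
    rhs = {a for _, r in fds for a in r}
    x = set(all_attrs) - rhs                # L-type and N-type attributes
    y = lhs & rhs                           # LR-type attributes
    if closure(x, fds) == set(all_attrs):
        return [x]                          # X+ covers R: unique candidate key
    keys = []
    for k in range(1, len(y) + 1):
        for extra in combinations(sorted(y), k):
            cand = x | set(extra)
            if closure(cand, fds) == set(all_attrs) and \
               not any(key < cand for key in keys):   # keep only minimal keys
                keys.append(cand)
    return keys

# R(A,B,C,D) with A->B and B->C: A and D never appear on a right side.
print(candidate_keys("ABCD", [("A", "B"), ("B", "C")]))   # one candidate key: {'A', 'D'}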

Chapter 9 SQL Language

1. Create a basic table:

a) CREATE TABLE C

(C# CHAR(4) NOT NULL UNIQUE,   -- or: NOT NULL PRIMARY KEY, or: PRIMARY KEY

CNAME CHAR(10) NOT NULL)

b) CREATE TABLE C

(C# CHAR(4),

CNAME CHAR(10) NOT NULL,

PRIMARY KEY(C#)) Note: the column-level constraint on C# can be omitted here because the key is declared at table level

c) A foreign key can be defined with the column, T# CHAR(4) REFERENCES T(T#), or on a separate line: T# CHAR(4), followed by FOREIGN KEY (T#) REFERENCES T(T#)

2. To define cascading deletion, add ON DELETE CASCADE to the foreign-key definition in table B (which references table A's primary key). Then, when a row of table A is deleted, the rows of table B whose foreign key references it are deleted as well. Triggers can also be used for this.

3. Modification of the basic table:

a) Add a new column: ALTER TABLE <table name> ADD <column name> <type> [DEFAULT 0 to set a default value of 0]

b) Delete column: ALTER TABLE<basic table name>DROP COLUMN<column name>[integrity constraint CASCADE|RESTRICT]

c) Modify a data type: (1) ALTER TABLE <table name> ALTER COLUMN <column name> <type>, or (2) ALTER TABLE <table name> MODIFY <column name> <type>

4. Revocation of the basic table: DROP TABLE<basic table name>[CASCADE|RESTRICT]

5. Data deletion: DELETE FROM <basic table name> WHERE <conditional expression>

6. Note: CASCADE automatically drops all referencing constraints and views; RESTRICT refuses the drop if any views or constraints reference the table.

7. Data modification: UPDATE<basic table name> SET<column name> = <value expression> WHERE<conditional expression>

8. Create the index:

a) Role of indexes: a unique index guarantees the uniqueness of data; indexes speed up retrieval and joins between tables, which matters for referential integrity; used with ORDER BY and GROUP BY, they reduce sorting and grouping time in queries.

b) A clustered index sorts the table's data pages by the indexed column and stores them back to disk in that order; index and data are intermingled, and the index's leaf nodes hold the actual data rows.

c) A non-clustered index has a structure completely independent of the data rows, which need not be sorted by the indexed column; its nodes store the key value and the row's location.

d) Create an index: CREATE [UNIQUE][CLUSTERED] INDEX <index name> ON <table name> (<column name [DESC|ASC]>, <column name [DESC|ASC]>, …)

e) Delete an index: DROP INDEX<index name>,<index name>,…

9. View operations:

a) A view is defined by a query and is a virtual table: its data is not stored separately according to a view storage structure, but resides in the tables the view references.

b) Advantages and disadvantages of views: a view reflects base-table data in real time, improves security, and occupies only the space of its definition; but query execution through a view can be slower.

c) View creation: CREATE VIEW <view name> (<column name sequence>) AS <SELECT query statement>[WITH CHECK OPTION]

Note: the ORDER BY clause and DISTINCT are generally not allowed in the defining SELECT statement. WITH CHECK OPTION makes inserts and updates through the view satisfy the view's WHERE condition. The column name list must be either omitted entirely or specified in full.

d) View deletion: DROP VIEW<view name>

e) View update (only row and column subset view (view is derived from a single base table using only select, projection operations))

10. Data definition language (DDL): CREATE, ALTER, DROP; data manipulation language (DML): SELECT, INSERT, DELETE, UPDATE; data control language (DCL): GRANT, REVOKE (permission control)

11. Query statement:

12. The UNION operator combines the result sets of two or more SELECT statements. By default UNION keeps only distinct values; to allow duplicates, use UNION ALL. For example:

SELECT column_name(s) FROM table_name1

UNION

SELECT column_name(s) FROM table_name2

13. SQL left join, etc.: http://www.cnblogs.com/afirefly/archive/2010/10/08/1845906.html

14. Pattern matching: sname LIKE '王%' matches any string beginning with '王'; sname LIKE '王_' matches '王' followed by exactly one character. If the pattern contains special characters, define an escape character with the keyword ESCAPE, e.g., LIKE 'ab\_cd%' ESCAPE '\' treats the underscore literally.

15. Integrity constraints in SQL:

a) Domain constraints: define a new domain COLOR

CREATE DOMAIN COLOR CHAR(6) DEFAULT '???'    -- the default value of the domain is '???'

CONSTRAINT COLORS    -- the domain constraint is named COLORS

CHECK (VALUE IN ('Red', 'Yellow', 'Blue', 'Green', '???'))

b) Constraints of the basic table: primary key (PRIMARY KEY), foreign key (FOREIGN KEY), and check (CHECK) constraints; see the sketch after item c).

c) Assertions (ASSERTIONS):

CREATE ASSERTION <assertion name> CHECK (<condition>)

DROP ASSERTION <assertion name>
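A minimal sketch of the basic-table constraints in item b), using hypothetical tables S and SC:

CREATE TABLE S (
    Sno   CHAR(6)  PRIMARY KEY,                     -- primary key constraint
    Sname CHAR(8)  NOT NULL,
    Sage  SMALLINT CHECK (Sage BETWEEN 14 AND 60)   -- check constraint
)
CREATE TABLE SC (
    Sno   CHAR(6),
    Cno   CHAR(4),
    Grade SMALLINT,
    PRIMARY KEY (Sno, Cno),
    FOREIGN KEY (Sno) REFERENCES S(Sno)             -- foreign key constraint
)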

16. Security mechanisms in SQL: views, permissions, roles, auditing.

17. Integrity constraints in SQL: domain constraints, basic table constraints, assertions, triggers.

18. Permissions

a) User privileges (6 types): SELECT, INSERT, DELETE, UPDATE, REFERENCES, USAGE. REFERENCES allows a user to define new relations that reference the primary keys of other relations as foreign keys; USAGE allows a user to use defined domains.

b) Authorization statement: GRANT <privilege list> ON <database element> TO <user list> [WITH GRANT OPTION]. WITH GRANT OPTION means the privileges obtained can be passed on, i.e. the recipient may grant them to other users. For example:
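A minimal sketch (table S and users U1, U2 are hypothetical):

GRANT ALL PRIVILEGES ON S TO U1, U2 WITH GRANT OPTION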

Here ALL PRIVILEGES stands for all of the permissions (the 6 types above).

c) Revocation statement: REVOKE <privilege list> ON <database element> FROM <user list> [RESTRICT|CASCADE]. CASCADE revokes in a chain (privileges the user has granted onward are revoked too); with RESTRICT the revocation succeeds only if no such onward grants exist. For example:
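Continuing the sketch above (table S is hypothetical):

REVOKE ALL PRIVILEGES ON S FROM PUBLIC CASCADE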

PUBLIC stands for all current and possible future users.

19. Triggers: a trigger is a statement that the system executes automatically when the database is modified. A trigger consists of three parts: event, condition, and action (a sketch follows this list).

a) Create a trigger: CREATE TRIGGER <name>

b) Drop a trigger: DROP TRIGGER <name>
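A rough SQL:1999-style sketch (trigger syntax varies considerably between DBMSs; table SC and its Grade column are hypothetical):

CREATE TRIGGER grade_floor
AFTER UPDATE OF Grade ON SC                  -- event
REFERENCING NEW ROW AS nrow
FOR EACH ROW
WHEN (nrow.Grade < 0)                        -- condition
    UPDATE SC SET Grade = 0
    WHERE Sno = nrow.Sno AND Cno = nrow.Cno  -- action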

20. Embedded SQL

a) SQL statements are distinguished from host-language statements by format: EXEC SQL <SQL statement> END-EXEC (in C, the terminator is a semicolon rather than END-EXEC).

b) Communication between the main language work unit and the database work unit:

i. SQL communication area (SQLCA): passes the status of SQL statement execution to the host language, so the host program can control its flow according to this information.

ii. Shared variables (host variables): the host language supplies parameters to SQL statements through host variables, which are defined in the host language and declared to SQL in a DECLARE section.

c) Cursor (CURSOR): the host language is record-oriented while SQL is set-oriented; a cursor lets the program retrieve multiple rows one at a time, or specific rows (see the sketch after item iv).

i. Define the cursor: EXEC SQL DECLARE <cursor name> CURSOR FOR <SELECT statement>

ii. Open the cursor: EXEC SQL OPEN <cursor name>

iii. Advance the cursor: EXEC SQL FETCH <cursor name> INTO <host variable list> (the cursor moves forward one row and delivers the current row into the host variables)

iv. Close the cursor: EXEC SQL CLOSE <cursor name>
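A minimal embedded-SQL/C sketch of the whole cycle (table S and the host variables dept, sno, sname are hypothetical):

EXEC SQL DECLARE cx CURSOR FOR
    SELECT Sno, Sname FROM S WHERE Sdept = :dept;
EXEC SQL OPEN cx;
for (;;) {
    EXEC SQL FETCH cx INTO :sno, :sname;
    if (sqlca.sqlcode != 0) break;   /* non-zero status: no more rows (or an error) */
    /* process the current row here */
}
EXEC SQL CLOSE cx;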

d) Dynamic SQL statement:

21. Stored procedure: a module written in SQL and flow-control statements, compiled and optimized once and stored on the database server side. It has the following advantages (a sketch follows the list):

a) Improves running speed.

b) Enhances the functionality and flexibility of SQL.

c) Reduces network traffic.

d) Reduces the programming workload.

e) Indirectly provides security control.

f) Masks the details of the tables and simplifies user operations.
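A minimal SQL/PSM-style sketch (stored-procedure syntax differs noticeably between DBMSs; table SC and the parameter names are hypothetical):

CREATE PROCEDURE add_grade(IN p_sno CHAR(6), IN p_cno CHAR(4), IN p_delta SMALLINT)
BEGIN
    -- one compiled, server-side unit: callers send only a single CALL over the network
    UPDATE SC SET Grade = Grade + p_delta
    WHERE Sno = p_sno AND Cno = p_cno;
END

It would be invoked with something like CALL add_grade('S001', 'C01', 5).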

Chapter 10 System Development and Operation

1. The six phases of the software life cycle: project planning, requirements analysis, design, coding, testing, operation and maintenance.

2. Software development model:

a) Waterfall model: the earliest, using a structured analysis and design method.

b) Evolutionary model: Global development model, also called rapid prototyping model.

c) Spiral model: combines the waterfall model and the rapid prototyping model and adds risk analysis; suitable for large-scale systems.

d) Fountain model: driven by user requirements and by objects; suited to object-oriented development.

3. The requirements analysis phase is an important phase of software engineering; it defines the business requirements for a new system. The key of this phase is to describe what the system is, or what the system must do, not how it should be implemented. Specifically, the requirements analysis phase needs to complete the following tasks:

• Determine the functional and non-functional requirements of the software system;

• Analyze the data requirements of the software system;

• Derive the logical model of the system;

• Revise the project development plan;

• If necessary, develop a prototype system.

4. Software design can usually be divided into outline design and detailed design. The task of the outline design is to determine the structure of the software system, divide the modules, and determine the function, interface and calling relationship between modules. The main task of designing the structure of a software system is to determine the compositional relationship between modules.

5. System testing combines the software system with other elements such as hardware, peripherals, and the network, and performs assembly and acceptance testing of the information system as a whole; the purpose is to compare the developed system against the system requirements and find where it fails to match or contradicts user needs. Common system tests include recovery testing, security testing, stress testing, performance testing, reliability testing, and installation testing.

6. Software project estimation:

a) Lines of code, function points and workload estimation are the most basic project estimation content.

b) IBM Estimation Model: Static univariate model based on lines of code.

c) COCOMO (COnstructive COst MOdel): has three levels, basic, intermediate, and detailed, and divides software projects into organic (organizational), semidetached (semi-independent), and embedded types (see the effort equation after this list).

d) Putnam model: Dynamic multivariate model.
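For reference, the basic COCOMO effort equation in its commonly cited form is E = a × (KLOC)^b, where E is effort in person-months and KLOC is thousands of delivered lines of code; a ≈ 2.4, b ≈ 1.05 for organic projects, a ≈ 3.0, b ≈ 1.12 for semidetached projects, and a ≈ 3.6, b ≈ 1.20 for embedded projects.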

7. Risk analysis:

a) Risk identification: performance risk, cost risk, support risk, schedule risk. Create a risk entry checklist.

b) Risk prediction: establish a risk table to estimate the impact of risks on the project.

c) Risk assessment: further review of the accuracy of the estimates made during the risk prediction phase, attempting to prioritize the risks identified and begin to consider how to control and/or avoid possible risks.

d) Risk control: risk avoidance, risk monitoring, risk management and monitoring plan.

8. Schedule management usually uses Gantt charts and PERT charts; they are two commonly used project management tools. A PERT (Program Evaluation and Review Technique) chart is a graphical network model that describes the tasks in a project and the relationships between them. A Gantt chart is a simple horizontal bar chart that plots project tasks against a calendar: the abscissa represents time (hours, days, weeks, months, years, etc.), the ordinate represents tasks, and each horizontal line segment shows one task's schedule; the segment's start and end points on the time axis give the task's start and finish times, and the segment's length gives the time needed to complete the task.

9. A Gantt chart cannot show the dependencies between tasks.

10. A PERT chart cannot clearly show which tasks can proceed in parallel.

11. The CMM describes the evolutionary stages through which a software organization's capability progresses step by step as the organization defines, implements, measures, controls, and improves its software processes. The CMM divides software process maturity into five levels:

• Initial level. The software process is disorganized and sometimes chaotic, with few clearly defined steps; success depends entirely on individual effort and heroics.

• Repeatable level. Basic project management processes are established to track cost, schedule, and functionality, and the necessary process discipline is in place to repeat earlier successes on similar projects.

• Defined level. The software processes for both management and engineering are documented, standardized, and integrated into a standard software process for the whole organization. All projects develop and maintain software using an approved, tailored version of this standard process.

• Managed level. Detailed measurements of the software process and product quality are collected; both process and product quality are quantitatively understood and controlled.

• Optimizing level. Quantitative analysis is strengthened; the process is continuously improved through feedback from process quality data and from new ideas and new technologies.

12. Software development methods: the structured method, the data-structure-oriented method, the prototyping method, and the object-oriented method.

13. Software quality characteristics form three layers:

a) Layer 1: quality characteristics

b) Layer 2: quality sub-characteristics

c) Layer 3: metrics

14. Main tasks in the system analysis stage:

a) Conduct a detailed survey of the current system and collect data.

b) Create a logical model of the current system

c) Analyze the current situation, put forward suggestions for improvement and the goals that the new system should achieve

d) Create a logical model of the new system

e) Compile the specification of the system scheme

15. System analysis method:

a) Structural analysis method

b) Object-oriented analysis method

16. Data-structure-oriented analysis and design (Jackson): the design principle is to make the program structure correspond to the data structure (the problem structure); the data structure serves as the basis of design, and the program structure is derived from the input/output data structures. It is aimed at data processing systems.

17. UML:

a) Use case diagrams; static diagrams (class diagrams, object diagrams, package diagrams); behavior diagrams (state diagrams, activity diagrams); interaction diagrams (sequence diagrams, collaboration diagrams); implementation diagrams (component diagrams, deployment diagrams).

18. Aggregation relationship: a whole-part relationship; generalization relationship: a general-specific relationship.

19. Software testing:

a) White box testing (structural testing): design test cases based on the internal structure and logical structure of the program and related information, and check whether all logical paths in the program meet the requirements.

b) Black-box testing (behavioral testing): It is not necessary to consider the logical structure and internal characteristics of the program, but only to check whether the requirements are met according to the requirements specification of the program.

20. CVS is a version control tool.

Chapter 11 Database Design

1. Database system life cycle: database planning, requirements analysis and collection, database design, database system implementation, testing phase, operation and maintenance

2. Data dictionary: It is the sorting and description of user information requirements (needs analysis stage). Includes data items, data structures, data flow, data storage and processing.

3. Tasks in the requirements analysis stage: ① analyze user activities and produce business flow diagrams; ② determine the system scope and produce the system context diagram; ③ analyze the data involved in user activities and produce data flow diagrams; ④ analyze the system's data and produce the data dictionary.

4. The result of the requirements analysis stage is the system specification, including data flow diagram, data dictionary and various explanatory documents.

5. Data Flow Diagram (DFD): the top-level DFD fixes the system boundary; the system to be developed is treated as a single process, so there is only one process plus some external entities and the input/output data flows between them. The level-0 DFD identifies the data stores.

6. Data structure-oriented method (Jackson method)

a) Design idea: Based on the data structure, it derives the program structure according to the input/output data structure, which is suitable for small-scale data processing systems.

b) Basic idea: derive its program structure from the data structure of the problem. As an independent system design method, it is mainly used for the development of small-scale data processing.

c) The starting point for considering the problem is: data structure.

d) Final goal: to obtain a procedural description of the program.

e) Best scope of application: In the detailed design, determine the logical process of some or all modules.

f) Obeys the structured-programming principle of top-down stepwise refinement and uses it as a common foundation;

g) Follows a set of mapping rules for deriving the program structure from the data structure.

7. Notes for drawing DFD:

1) The data flow, processing, data storage and external entities should be appropriately named, and the name should reflect the actual meaning of the component, and avoid using empty names.

2) Draw a data flow diagram, not a control flow diagram.

3) A process's output data flow should not have the same name as its input data flow, even if their composition is exactly the same.

4) One process may have multiple data flows to another process, and a process may send two identical output data flows to two different processes.

5) Keep the parent diagram and child diagram balanced: the input and output flows of a process in the parent diagram must match, in number and name, the input and output data flows of its child diagram. Note that if one input (output) data flow in the parent diagram corresponds to several input (output) data flows in the child diagram, and the data items composing those child-diagram flows are exactly the data items of that one parent flow, they are still considered balanced.

6) In the top-down decomposition process, if a data store first appears in connection with only one process, that data store should be treated as an internal file of the process and need not be drawn.

7) Maintain data conservation: the data in all output data flows of a process must be obtainable directly from the process's input data flows, or derivable from data the process can produce.

8) Each process must have both an input data flow and an output data flow.

9) In the entire set of data flow diagrams, each data store must have both a read data flow and a write data flow. But in a certain subgraph, there may be only reading but no writing, or only writing but no reading.

10) Every data flow must pass through a process (that is, there cannot be a data flow directly between two external entities, or between an external entity and a data store).

8. Conceptual design stage - ER diagram

a) Three ways of abstracting real-world things: classification (grouping by inherent common features and behavior, e.g. students and teachers are different classes), aggregation (defining the attributes of a type, e.g. student number, name), and generalization (defining a new type from a known type, i.e. obtaining a subclass; e.g. graduate student is a subclass of student, extending the student type).

b) Create a conceptual model with ER diagrams:

i. Data abstraction: According to the data flow diagram, use the above three abstraction methods to abstract, from high-level (general reference to data) to low-level (more detailed).

ii. Design the partial conceptual model: determine the entities in the partial application, the attributes of the entities, the identifiers of the entities and the links between the entities. Note: 1) Attributes cannot be subdivided; 2) Attributes cannot be directly related to other entities.

iii. Synthesize the local model into a global model: among them, conflicts should be eliminated, such as attribute conflicts (types, etc.), structural conflicts (different abstractions, different attribute components, etc.) and naming conflicts (entity names, attribute names, and contact names, etc.).

iv. Optimization of the global ER model: ① merge entity types; ② eliminate redundant attributes; ③ eliminate redundant relationships.

9. Logical design stage - conversion of ER diagram to relational mode

a) The main tasks in the logic design stage: determine the data model, convert the ER model into the formulated data model, determine the integrity constraints, and determine the user view.

b) Conversion of ER diagrams to relational schemas (into a form the computer can process; a sketch in SQL follows item iii):

i. Conversion of entity types: convert each entity type into a relational schema, the entity name corresponds to the schema name, the attribute corresponds to the attribute of the schema, and the entity identifier corresponds to the key of the schema.

ii. Conversion of relationship types (binary relationships): ① If the relationship between two entity types is 1:1, add to the schema converted from either entity the other entity's primary key (as a foreign key) and the attributes of the relationship. ② If the relationship is 1:N, add to the schema converted from the N-end entity the primary key of the 1-end entity (as a foreign key of this schema) and the attributes of the relationship. ③ If the relationship is M:N, convert the relationship type itself into a relational schema: its attributes are the primary keys of the entity types at both ends plus the attributes of the relationship type, and its primary key is the combination of the two ends' primary keys (each of which is also a foreign key).

iii. Conversion of ternary relationships: ① If the relationship is 1:1:1, add to any one of the three converted schemas the primary keys of the other two entities (as foreign keys) plus the attributes of the relationship type. ② If the relationship is 1:1:N, add to the N-end schema the primary keys of the two 1-end entities (as foreign keys) plus the attributes of the relationship type. ③ If the relationship is 1:M:N, convert the relationship type into a relational schema: its attributes are the primary keys of the M-end and N-end entity types (as foreign keys) plus the attributes of the relationship type, and its primary key is the combination of the M-end and N-end primary keys. ④ If the relationship is M:N:P, convert the relationship type into a relational schema: its attributes are the primary keys of the three entity types (as foreign keys) plus the attributes of the relationship type, and its primary key is the combination of the three ends' primary keys.
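A minimal SQL sketch of the 1:N and M:N cases, using hypothetical schemas Dept, S (student), and SC (the M:N enrollment relationship between students and courses):

CREATE TABLE Dept (Dno CHAR(2) PRIMARY KEY, Dname VARCHAR(20))
CREATE TABLE S (
    Sno   CHAR(6) PRIMARY KEY,
    Sname CHAR(8),
    Dno   CHAR(2) REFERENCES Dept(Dno)   -- 1:N: the 1-end's key migrates into the N-end schema
)
CREATE TABLE SC (
    Sno   CHAR(6) REFERENCES S(Sno),
    Cno   CHAR(4),
    Grade SMALLINT,
    PRIMARY KEY (Sno, Cno)               -- M:N: the relationship becomes its own schema with a composite key
)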

c) Normalization of relational schema

i. Determine the data dependencies of relational schemas based on semantics.

ii. Determine the paradigm of a relational schema based on data dependencies.

iii. If the requirements are not met, decompose according to the decomposition algorithm of the pattern to reach 3NF, BCNF or 4NF.

iv. Evaluate and revise the relational schemas: eliminate redundancy, update anomalies, and so on.

d) Identify integrity constraints.

e) Determine the user view (design subschema). Improve data security and independence.

10. Physical design stage - database storage structure and access method (determine data distribution, determine storage structure, determine access method)

a) Structure design of storage records

b) Determine where the data is stored

c) Design of access method

d) Integrity and security considerations

e) Program design

11. Implementation of the database:

a) Define the structure of the database with DDL

b) Organize data into database

c) Compiling and debugging applications

d) Database test run

12. Database security measures:

a) Authority mechanism

b) view mechanism

c) Data encryption

13. When drawing a data flow diagram, a process may exhibit the following input/output errors:

a) Only input and no output, a "black hole";

b) Only output and no input, a "miracle";

c) The input data flows are insufficient to produce the output flows, a "gray hole";

d) An input data flow has the same name as an output data flow.

14. Concurrency control of the database:

a) Problems caused by concurrent operations: data inconsistency (lost updates, dirty reads, and non-repeatable reads).

b) Direction of the solution: start by guaranteeing the isolation of transactions.

c) Root of the problem: transactions read and write data without any control and so interfere with each other.

d) Locking protocols: the two-phase locking (2PL) protocol; keeping lock-holding times short improves concurrency while still preventing data inconsistency. To guarantee correct (serializable) scheduling of concurrent transactions, the two-phase locking protocol is used.

e) Serializability is the correctness criterion for the concurrent execution of transactions.

15. A class diagram is a diagram that shows a set of classes, interfaces, collaborations, and the relationships between them. Class diagrams are used to model the static design view of the system. When modeling a static view of a system, class diagrams are typically used in one of three ways.

1) Model the vocabulary of the system.

2) Model simple collaborations.

3) Model the logical database schema. Think of a schema as a blueprint for the conceptual design of a database. In many fields, to store persistent information in relational or object-oriented databases, you can use class diagrams to model the schema of these databases.

16. A state diagram shows a state machine consisting of states, transitions, events, and activities. A dynamic view of the system is illustrated with a state diagram. Statecharts are very important for modeling the behavior of an interface, class or collaboration. A state diagram emphasizes the behavior of an object in a sequence of events.

17. Activity diagrams show the flow from activity to activity. An activity diagram shows a set of activities, a sequential or branching flow from activity to activity, and the objects on which actions occur or upon which actions are imposed. Illustrate the dynamic view of the system with an activity diagram. Activity diagrams are very important for modeling the functionality of the system. Activity diagrams emphasize the flow of control between objects.

Chapter 12 Networks and Databases

1. Distributed databases should have the characteristics of site transparency and decentralized storage.

2. A fully distributed database should satisfy:

a) Distribution

b) Logical correlation

c) Site Transparency

d) Site autonomy

3. Features of distributed databases:

a) Centralized control of data

b) Data independence

c) Data redundancy (for reliability)

d) Site autonomy

e) Effectiveness (efficiency) of access

4. Architecture of distributed database: four-layer model structure - global outer layer, global concept layer, local concept layer, and local inner layer.

5. There are two-phase commit protocol (2PC) and three-phase commit protocol (3PC) for distributed transactions.

a) 2PC: there is a coordinator and participants; only the coordinator has the right to decide whether to commit or abort the transaction, and the protocol proceeds by voting first and then executing.

b) 3PC: on top of 2PC, adds global pre-commit and prepare messages to confirm the status of all participants.

c) Compared with centralized transactions, distributed transactions must additionally handle communication failures, on top of media failures, system failures, and transaction failures.

6. Transparency of distributed databases:

a) Distribution transparency: users do not need to care about the details of the logical partitioning of data and the physical location distribution of data, as well as the issues of data consistency and the data model supported by local databases.

b) Sharding transparency: users don't have to care about the logical sharding of data.

c) Location transparency: users do not need to care about the details of data physical location allocation.

d) Replication transparency: users do not need to care about the replication of the database at each node in the network, and the updated data to be replicated is automatically completed by the system.

e) Local data model transparency: users need not care which data model each local database uses.

7. Transfer data between XML and database: template-driven and model-driven.

Chapter 13 Database Development Trends and New Technologies

1. Data transfer technology:

a) The purposes of data transfer in a data warehouse are to improve the quality of the data in the warehouse and to improve its availability.

b) Type of data transfer:

1) Simple transfer: Simple transfer is the basic unit of all data transfer.

2) Cleaning: To ensure consistent formatting and use of a field or group of fields.

3) Integration: Take business data from one or several sources, and map the data field by field to the new data structure of the data warehouse.

4) Aggregation and generalization: Find sporadic data from the business environment and compress it into fewer data blocks in the data warehouse.

2. The object-oriented database introduces two structural data types: array (collection) type and structure type. Convert composite attributes (such as dates) to struct types, and multi-valued attributes to collection types.

3. The goal of parallel databases: to achieve high performance, high availability, and scalability.

4. Parallel database architecture:

a) Shared memory structure

b) Shared disk structure

c) Shared-nothing structure (each node has its own processor, memory, and disk).

5. Object-relational database system (a sketch follows this list):

a) Nesting relationship:

b) Composite type: setof

c) Inheritance type: under

d) Reference type: ref
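A rough SQL:1999-flavored sketch of UNDER and REF (product support and exact syntax vary widely; all type and table names are hypothetical):

CREATE TYPE person_t AS (name VARCHAR(20), birthday DATE) NOT FINAL
CREATE TYPE student_t UNDER person_t AS (sno CHAR(6)) NOT FINAL        -- inheritance via UNDER
CREATE TABLE person OF person_t (REF IS oid SYSTEM GENERATED)          -- typed table whose rows can be referenced
CREATE TABLE likes (who REF(person_t) SCOPE person, cno CHAR(4))       -- a REF-typed column pointing into person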

6. Enterprise resource planning evolved through basic MRP, closed-loop MRP, MRP-II, and ERP. Basic MRP focuses on managing the enterprise's material requirements planning; closed-loop MRP adds the impact of production capacity on the plan; MRP-II revolves around the enterprise's basic business objectives with production planning as the main line; ERP takes one center (profit), two types of business (planning and execution), and three main lines (supply chain management, production management, and financial management), and ties the three main lines tightly together.

7. A decision support system (DSS) is composed of a database subsystem, a model base subsystem, a human-computer interaction subsystem (which organically combines the model base and the database), and the users; what distinguishes a DSS is the addition of the model base and the model base management system. Adding a knowledge base subsystem (knowledge base plus inference engine) on top of a DSS yields an intelligent DSS.

Chapter 14 Basic Knowledge of Intellectual Property

1. Term of Protection:

2. Timeliness of intellectual property rights: intellectual property rights have a statutory protection period; once it expires, the rights terminate automatically and the subject matter becomes knowledge the public may use freely. The length of the period is set by each country's laws. In China, the protection period of invention patents is 20 years, and that of utility model and design patents is 10 years, both counted from the patent application date. The right of publication of a Chinese citizen's work is protected for the author's lifetime plus 50 years after death. The term of trademark protection is 10 years from the date the registration is approved, but it can be renewed indefinitely at the owner's request; renewal must be applied for within the 6 months before the term expires, each renewal is valid for 10 years, and the number of renewals is unlimited. If the owner fails to renew in time, the trademark right also terminates. The period of legal protection for trade secrets is indefinite: once the secret becomes known to the public, it becomes knowledge the public may use freely.

3. Determination of intellectual property rights owner:

4. Judgment of infringement:

5. Conditions for granting patent rights: novelty, inventiveness, and practicality.

6. The basic features of the patent system are: legal protection, scientific examination, public notification and international exchange.

Chapter 15 Basic Knowledge of Standardization

1. Basic concepts of standardization:

2. The essence of standardization: to achieve unification through the formulation, release and implementation of standards.

3. The purpose of standardization: to obtain the best order and social benefits.

4. Principles for setting standards:

a) Proceed from the interests of the overall situation and conscientiously implement the national technical and economic policies.

b) Fully meet the requirements for use.

c) It is conducive to promoting the development of economy and technology.

5. Stages of standard development: application, preparation, committee, review, release stages.

6. Standard update: standard review (the review cycle generally does not exceed five years), standard confirmation, and standard revision.

7. Classification of standards:

a) By scope of application: international standards (e.g. ISO, IEC), national, regional, industry, local, enterprise standards, and project specifications.

b) By nature: technical standards, management standards and work standards.

c) By function: basic standards, product standards, method standards, safety standards, hygienic standards...

d) Legally binding: mandatory standards, recommended standards.

8. Number of the standard:

a) International and foreign standards: standard code + professional class number + sequence number + year number

b) China: mandatory national standards use the code GB; recommended national standards use GB/T

9. Common standardization organizations:

10. Classification of China's standards:

a) National standards, with numbering examples: (cardboard) QB 1457-1992; QB/T 1315-1991; GSB (national physical standard samples).

b) Industry standards: education (JY), finance (JR), some telecommunications standards (YD), pharmaceuticals (YY), aerospace (QJ), electronics (SJ), machinery (JB).

c) Local standards: DB plus the first two digits of the province-level code, e.g. Shanghai DB31, Chongqing DB50, Beijing DB11.

d) Enterprise standards: Q/ followed by the enterprise code.

11. Standardization of software engineering:

12. Under the national standard "Computer Software Documentation Specification", 14 documents are generally produced. Managers mainly use: the project development plan, feasibility study report, module development folder, monthly development progress report, and project development summary report. Developers use: the software requirements specification, project development plan, module development folder, data requirements specification, outline design specification, detailed design specification, database design specification, test plan, and test analysis report. Maintainers use: the design specifications, module development folder, and test analysis report.

13. Standard review: ISO standards are reviewed every 5 years (their average age is 4.92 years); the validity period of Chinese standards is generally 5 years.
