2023 System Analyst---Project Management

1. Project management

1. Basic concepts

  1. Scope Management: Determine the boundaries of the project, that is, what work should be done in the project and what work should not be included in the project
  2. Inputs to scope definition include: project charter, project scope management plan, approved change requests, organizational process assets
  3. The role of the WBS includes: making estimation easier, clarifying the scope, and preventing scope creep; its bottom level consists of work packages

1. Time management

  1. The time management processes include: activity definition, activity sequencing, activity resource estimation, activity duration estimation, schedule development, and schedule control
  2. Three-point estimation formula: expected duration = (optimistic time + 4 × most likely time + pessimistic time) / 6; for example, with 4, 7 and 16 days the estimate is (4 + 4×7 + 16) / 6 = 8 days
  3. Progress Control
    1. Key points for deciding whether to adjust the schedule:
      1. Step 1: Determine whether the delayed activity is a critical activity; if it is, the schedule must be accelerated
      2. Step 2: If the delayed activity is not a critical activity, judge whether the deviation is greater than the total float; if it is, accelerate the schedule
      3. Step 3: If the deviation is not greater than the total float, further judge whether it is greater than the free float; if it is, do not adjust the plan yet, but strengthen supervision
    1. Means of accelerating progress:
      1. Crashing: add resources, work overtime, add people
      2. Fast tracking: execute activities in parallel
  1. Schedule network diagram---Critical Path Method (CPM / PERT)
    1. The critical path method is a schedule network analysis technique used when developing the schedule plan. It performs a forward pass and a backward pass along the project schedule network paths to calculate the theoretical earliest start and finish dates and the latest start and finish dates of all planned activities, without regard to any resource constraints.
    2. Total float (slack time): the amount of time an activity can be delayed without delaying the project duration. The total float of an activity equals its latest finish time minus its earliest finish time, or equivalently its latest start time minus its earliest start time
    3. Free float
      1. The amount of time an activity can be delayed without delaying the earliest start time of any immediately succeeding activity
      2. For an activity with successors, the free float equals the minimum, over all successors, of the successor's earliest start time minus this activity's earliest finish time
      3. For an activity without successors, i.e. an activity whose finish node is the end node of the network plan, the free float equals the planned project duration minus the activity's earliest finish time
      4. For activities whose finish node is the end node of the network plan, the free float equals the total float. In addition, since an activity's free float is a component of its total float, when the total float of an activity is zero its free float must also be zero and needs no separate calculation. A forward/backward-pass sketch that computes both floats follows this list.
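
A minimal Python sketch of the forward/backward pass, assuming a tiny hypothetical activity-on-node network (the activity names, durations and dependencies are made up); it prints ES, EF, total float and free float for each activity and flags the critical ones:

```python
# Critical-path sketch: forward/backward pass, total float and free float.
# Hypothetical network: {activity: (duration, [predecessors])}, listed in topological order.
activities = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (2, ["B", "C"]),
}

order = list(activities)

es, ef = {}, {}
for a in order:                              # forward pass: earliest start / finish
    dur, preds = activities[a]
    es[a] = max((ef[p] for p in preds), default=0)
    ef[a] = es[a] + dur

project_end = max(ef.values())

ls, lf = {}, {}
for a in reversed(order):                    # backward pass: latest start / finish
    dur, _ = activities[a]
    succs = [s for s, (_, ps) in activities.items() if a in ps]
    lf[a] = min((ls[s] for s in succs), default=project_end)
    ls[a] = lf[a] - dur

for a in order:
    total_float = lf[a] - ef[a]              # = ls[a] - es[a]
    succs = [s for s, (_, ps) in activities.items() if a in ps]
    free_float = (min(es[s] for s in succs) - ef[a]) if succs else project_end - ef[a]
    tag = "critical" if total_float == 0 else ""
    print(a, "ES", es[a], "EF", ef[a], "TF", total_float, "FF", free_float, tag)
```
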
  1. Gantt chart
    1. Advantages: a Gantt chart is intuitive, simple, easy to draw and easy to understand, and clearly marks the start and finish time of each task. It is generally suitable for relatively simple, small projects, and can be used at any level of the WBS and for schedule control, resource optimization, and resource and cost planning
    2. Disadvantages: it cannot systematically express the complex relationships between the tasks of a project, and it is difficult to perform quantitative calculation, analysis and plan optimization on it
    3. Comparison with PERT
      1. A PERT chart is based on the network diagram and can express the complex logical relationships between activities
      2. A Gantt chart is simple and intuitive, but cannot express the complex logical relationships between activities
      3. A PERT chart mainly describes the dependencies between tasks, while a Gantt chart mainly describes how tasks overlap in time.

3. Software Process Improvement CMMI

  1. Optimizing level (Level 5): organizational innovation and deployment, causal analysis and resolution; key feature: continuous optimization
  2. Quantitatively managed level (Level 4): organizational process performance, quantitative project management; key feature: quantitative management
  3. Defined level (Level 3):
    1. Requirements development, technical solution, product integration, verification, validation, organizational process focus, organizational process definition, organizational training, integrated project management, risk management, integrated teaming, decision analysis and resolution, organizational environment for integration; key features: organization-level, documented standardization
  1. Managed level (Level 2): requirements management, project planning, configuration management, project monitoring and control, supplier agreement management, measurement and analysis, process and product quality assurance; key feature: repeatable at the project level
  2. Initial level (Level 1): chaotic, ad hoc processes

2. Computer Composition and Architecture

1. Classification of computer architecture (Flynn taxonomy)

  1. Single instruction stream, single data stream (SISD)
  2. Single instruction stream, multiple data streams (SIMD)
  3. Multiple instruction streams, single data stream (MISD)
  4. Multiple instruction streams, multiple data streams (MIMD)

2. CISC and RISC

  1. Hardwired logic level: This is the core of the computer, consisting of logic circuits such as gates and flip-flops
  2. Microprogram level: the machine language at this level is the microinstruction set; microprograms are generally executed directly by the hardware
  3. Traditional machine level: the machine language at this level is the machine's instruction set; programs written with machine instructions are interpreted by the microprograms

3. Pipelining

Concept: pipeline execution time calculation, pipeline throughput rate, pipeline speedup ratio, pipeline efficiency

  1. Pipeline set-up time: the execution time of one complete instruction through all stages, denoted t
  2. Pipeline cycle Δt: the execution time of the longest (slowest) stage
  3. Pipeline execution time (theoretical formula): (t1 + t2 + ... + tk) + (n - 1) × Δt, where k is the number of stages and n the number of instructions
  4. Pipeline execution time (practical formula): k × Δt + (n - 1) × Δt
  5. Pipeline throughput: TP = number of instructions / pipeline execution time
  6. Maximum pipeline throughput: TPmax = 1 / Δt
  7. Pipeline speedup: sequential execution time / pipeline execution time
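
As a quick check of the formulas above, here is a minimal Python sketch; the stage times and instruction count are hypothetical:

```python
# Pipeline metrics from the formulas above (hypothetical stage times, in ns).
stage_times = [2, 2, 1, 3]        # k = 4 stages, one instruction passes through all of them
n = 100                           # number of instructions

k = len(stage_times)
dt = max(stage_times)             # pipeline cycle = longest stage time

t_theoretical = sum(stage_times) + (n - 1) * dt   # (t1 + ... + tk) + (n - 1) * dt
t_practical = (k + n - 1) * dt                    # k*dt + (n - 1)*dt

throughput = n / t_practical                      # TP = instructions / execution time
max_throughput = 1 / dt                           # TPmax = 1 / dt
sequential = n * sum(stage_times)                 # execution time without pipelining
speedup = sequential / t_practical
efficiency = sequential / (k * t_practical)       # busy time-space area / total area

print(t_theoretical, t_practical, throughput, max_throughput, speedup, efficiency)
```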

4. Storage system

Concepts: Cache; associative memory is accessed by content and is generally used for the Cache

  1. Related concepts of Cache:
    1. Function: to increase the rate at which the CPU reads and writes data, overcoming the von Neumann bottleneck, i.e. the limited data-transfer bandwidth between the CPU and the storage system
    2. In the computer storage system architecture, Cache is the fastest access level except for registers.
    3. The basis for using Cache to improve system performance is the principle of program locality
      1. Temporal locality: Once an instruction in the program is executed, it may be executed again in the near future
      2. Spatial locality: Once a program accesses a certain storage unit, its nearby storage units will also be accessed soon, that is, the addresses accessed by the program within a period of time may be concentrated within a certain range
  1. The Cache is transparent to programmers (address mapping is done entirely by hardware)
    1. Direct mapping: the hardware circuit is simple, but the conflict rate is high
    2. Fully associative mapping: the circuit is difficult to design and implement and is only suitable for small caches, but the conflict rate is low
    3. Set-associative mapping: a compromise between direct mapping and fully associative mapping
  1. Average access time: if h is the Cache hit rate, (1 - h) is the miss rate, t1 is the Cache cycle time and t2 is the main-memory cycle time, then, taking read operations as an example, the average access cycle of the Cache + main-memory system is t3 = h × t1 + (1 - h) × t2 (see the sketch after this list)
  2. Cache replacement algorithms: random replacement, first in first out (FIFO), least recently used (LRU), least frequently used (LFU)
  3. Cache write policies:
    1. Write-through: write to the Cache and to main memory at the same time
    2. Write-back: write only to the Cache; the block is written back to main memory when it is evicted
    3. Flag method: write only to main memory and clear the corresponding Cache flag bit to 0; if the data is needed again, it must be loaded from memory again
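
A one-screen sketch of the average-access-time formula, with hypothetical hit rate and cycle times:

```python
# Average access time of a Cache + main-memory system: t3 = h*t1 + (1 - h)*t2
h = 0.95      # Cache hit rate (hypothetical)
t1 = 2        # Cache cycle time in ns (hypothetical)
t2 = 50       # main-memory cycle time in ns (hypothetical)

t3 = h * t1 + (1 - h) * t2
print(f"average access time: {t3:.1f} ns")   # 0.95*2 + 0.05*50 = 4.4 ns
```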

Disk structure and parameters

  1. Access time = seek time + rotational latency; the seek time is the time required for the head to move to the target track, and the rotational latency is the time for the target sector to rotate under the head; sometimes the data-transfer time must be added as well
  2. When a buffer is used during processing, note that a single buffer can be used by only one process at a time, i.e. data cannot be read out of the buffer while data is being transferred into it, and vice versa
  3. Disk-storage optimization is needed because the disk keeps rotating: while the data just read is being transferred or processed, the head moves past the following sectors, and the disk has to rotate almost a full turn before the logically next block comes around again. Optimizing the layout means adjusting the positions of the disk blocks so that the logically next block arrives under the head exactly when the system is ready to read it
  4. Disk scheduling algorithms: first come first served (FCFS); shortest seek time first (SSTF); the scanning (elevator) algorithm SCAN; circular scan C-SCAN
  5. Single and double buffering
    1. Concept of the single buffer: whenever a user process issues an I/O request, the OS allocates a buffer for it in main memory. For input from a block device, assume the time to input one block of data from the disk into the buffer is T, the time for the OS to transfer the data from the buffer to the user area is M, and the CPU time to process the block is C; T and C can be overlapped
  1. Double buffer:
    1. Since buffers are shared resources, producers and consumers must use them in a mutually exclusive way
    2. If the consumer has not yet taken the data out of the buffer, the producer cannot put newly produced data into it; therefore two buffers are set up, as in the sketch below.
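
A small sketch of the usual textbook timing model for the two schemes; the values of T, M and C are hypothetical, and the totals ignore start-up and wind-down effects:

```python
# Buffer timing under the common textbook model.
# T: disk -> buffer, M: buffer -> user area, C: CPU processing time (hypothetical, in microseconds)
T, M, C = 100, 30, 60
n = 10                                   # number of blocks

# Single buffer: the buffer can be refilled only after it has been emptied,
# so each block costs roughly max(C, T) + M.
single = n * (max(C, T) + M)

# Double buffer: input of the next block overlaps with the transfer and
# processing of the current one, so each block costs roughly max(C + M, T).
double = n * max(C + M, T)

print("single buffer:", single, "double buffer:", double)
```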

5. System configuration and performance evaluation

1. Performance indicators

  1. Main frequency and CPU clock cycle (Clock Cycle): The main frequency is also called clock frequency, and the clock cycle is the reciprocal of the clock frequency; main frequency = FSB * multiplier
  2. Instruction cycle: the time to fetch and execute one instruction
  3. Bus cycle: the time taken by one memory-access or I/O-port operation
  4. Relationship between instruction cycle, bus cycle and clock cycle: an instruction cycle consists of several bus cycles, and a bus cycle contains several clock cycles (that is, one instruction cycle contains several clock cycles)
  5. MIPS: the number of millions of machine-language instructions processed per second, mainly used to measure the performance of scalar machines
  6. MFLOPS: millions of floating-point operations per second; it reflects only floating-point performance rather than the overall picture, and is mainly used to measure the performance of vector machines
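
A tiny sketch relating clock frequency, CPI and MIPS; the frequency and CPI values are hypothetical:

```python
# Clock cycle and MIPS from the clock frequency (hypothetical figures).
f = 2.0e9            # clock frequency: 2 GHz
cpi = 4              # average clock cycles per instruction

clock_cycle = 1 / f                # the clock cycle is the reciprocal of the frequency
mips = f / (cpi * 1e6)             # millions of instructions executed per second

print(clock_cycle, mips)           # 5e-10 s, 500 MIPS
```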

2. Performance evaluation method

  1. Clock frequency method: measure speed by clock frequency
  2. Instruction execution speed method: the unit that expresses the machine operation speed is MIPS
  3. Equivalent instruction speed method: the speed is computed from the proportions of the various instruction types in a program; it addresses the fact that different programs have different instruction mixes
  4. Data processing rate (PDR) method: uses the PDR value to measure machine performance; the larger the PDR value, the better the machine performance; it considers only the CPU and main memory
  5. Composite theoretical performance (CTP) method
  6. Benchmark method: the most frequently used core parts of application programs are made into standard programs for evaluating computer-system performance, called benchmark programs; the benchmark method is the better way to test system performance

6. Operating system

1. Process management

  1. Processes and threads (**)
    1. The process is the basic unit of resource allocation (and, traditionally, of scheduling); the thread is the basic unit of scheduling but not of resource allocation, and some resources can be shared among threads
    2. The threads of the same process can share the process's resources, such as the memory address space, code, data and files, so communication and data exchange between threads is very convenient; however, each thread has its own independent CPU context and stack, which cannot be shared
  1. Semaphores and PV operations
    1. Related concepts: mutual exclusion, synchronization, critical resources, critical sections, semaphores
      1. Mutual exclusion: like a huge crowd competing to cross a single-plank bridge, it is the competitive relationship over the same kind of resource (a resource constraint, an indirect constraint)
      2. Synchronization: the parties run at different speeds and must stop and wait at certain points; it is a cooperative relationship between processes (a process constraint, a direct constraint)
      3. Critical resources: resources that processes must read and share in a mutually exclusive way, such as printers and tape drives
      4. Semaphore: a special variable; when the semaphore is negative, its absolute value also indicates the number of processes waiting in the queue
    1. PV operations: P(S) decrements S by 1, and if S becomes negative the calling process blocks and joins the waiting queue; V(S) increments S by 1, and if S is still not positive a waiting process is woken up
  1. Precedence graph:
    1. A precedence graph is written as: start point ---> end point; every arrow in the precedence graph can be expressed in this form, and one arrow represents one precedence relation.
    2. Mark a semaphore on each arrow: the tail of the arrow corresponds to a V operation (the V operation is how the predecessor activity notifies its successor after it finishes), and the head of the arrow corresponds to a P operation (the P operation is how the successor checks, before it starts, whether the predecessor has finished). Combine the precedence graph with PV operations by marking a semaphore for each arrow of the graph, then fill in the blanks according to the process diagram; a runnable sketch follows this list.
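
A minimal runnable sketch of this idea using Python threads; the precedence graph (A -> B, A -> C, B -> D, C -> D) and the task bodies are hypothetical, and acquire/release play the roles of P/V:

```python
# One semaphore per arrow of the precedence graph; V at the tail, P at the head.
import threading

s_ab = threading.Semaphore(0)   # A -> B
s_ac = threading.Semaphore(0)   # A -> C
s_bd = threading.Semaphore(0)   # B -> D
s_cd = threading.Semaphore(0)   # C -> D

def A():
    print("A done"); s_ab.release(); s_ac.release()        # V on every outgoing arrow

def B():
    s_ab.acquire(); print("B done"); s_bd.release()        # P on incoming, V on outgoing

def C():
    s_ac.acquire(); print("C done"); s_cd.release()

def D():
    s_bd.acquire(); s_cd.acquire(); print("D done")        # P on every incoming arrow

threads = [threading.Thread(target=t) for t in (D, C, B, A)]
for t in threads: t.start()
for t in threads: t.join()
```
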
  1. Deadlock and the banker's algorithm
    1. A typical question gives the processes and the resource allocation and asks for the minimum number of resources (or another parameter) at which deadlock is formed or avoided. In the worst case, each process has been allocated one unit less than it needs to finish, and a deadlock is formed; in that situation one extra resource unit is enough to break the deadlock, i.e. deadlock becomes impossible. Assuming m processes each need w units of resource R and the system has n units of R in total, the condition under which deadlock is impossible is: m × (w - 1) + 1 <= n
    2. Banker's algorithm: determine the resources the system currently has left; determine the number of resources each process still needs; if the resources still needed by the process chosen to run exceed the remaining system resources, there is a risk of deadlock; if not, the process is executed, and when it finishes it releases all of its resources (the remaining resources of the system then become: the previous remaining resources + the resources previously allocated to that process).
    3. Using the banker's algorithm, judge whether a given execution sequence of processes leads to deadlock: if it does, it is an unsafe sequence; if all processes can run to completion, it is a safe sequence. A safety-check sketch follows this list.
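
A minimal safety-check sketch in the spirit of the banker's algorithm, for a single resource type; the allocation and need figures are hypothetical:

```python
# Banker's-algorithm style safety check (single resource type, hypothetical data).
available = 3                                   # resources the system still has
allocation = {"P1": 2, "P2": 3, "P3": 2}        # resources already held
need       = {"P1": 2, "P2": 1, "P3": 4}        # resources still needed to finish

finished, sequence = set(), []
progress = True
while progress:
    progress = False
    for p in need:
        if p not in finished and need[p] <= available:
            available += allocation[p]          # the process finishes and releases everything
            finished.add(p)
            sequence.append(p)
            progress = True

if len(finished) == len(need):
    print("safe sequence:", sequence)
else:
    print("unsafe; stuck processes:", set(need) - finished)
```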

2. Storage management

  1. Paged, segmented and segment-page storage:
    1. When the page size is known, the length of the in-page offset can be determined, and from it the page number of an address
    2. The conversion between page number and page-frame number is done by looking up the page table
    3. In the segment address format, the offset that follows the segment number must not exceed the segment length
    4. Paged storage: the program and the memory are divided into blocks of the same size, and the program is loaded into memory in units of pages
    5. Segmented storage: the logical space is divided according to the natural segments of the user's job and then loaded into memory; the segments can have different lengths
    6. Segment-page storage: a combination of segmentation and paging, first segmented and then paged. A program has several segments, each segment can have several pages, all pages are the same size, but the segments have different sizes. An address-translation sketch for paging follows this list.
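
A small sketch of logical-to-physical address translation under paging; the page size, page table and logical address are hypothetical:

```python
# Paged address translation: page number -> frame number, offset unchanged.
page_size = 4096                    # 4 KB page, so the in-page offset is 12 bits
page_table = {0: 5, 1: 9, 2: 3}     # page number -> frame number (hypothetical)

logical = 0x1A3C
page = logical // page_size         # high bits: page number (here 1)
offset = logical % page_size        # low bits: offset inside the page (0xA3C)
frame = page_table[page]
physical = frame * page_size + offset
print(hex(physical))                # frame 9, offset 0xA3C -> 0x9A3C
```
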
  1. Page replacement algorithms:
    1. When a page must be evicted, the main rule is: first evict a page that has not been accessed recently (access bit 0), and among those prefer one that has not been modified (modify bit 0), because a modified page must be written back to disk before it can be replaced; a small sketch of this choice follows
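
A tiny sketch of that eviction preference; the page-frame state below is hypothetical:

```python
# Pick the victim page: prefer access bit 0, and among those prefer modify bit 0,
# because a clean page needs no write-back before being replaced.
frames = {
    # page number: (access_bit, modified_bit)   (hypothetical state)
    3:  (1, 1),
    7:  (0, 1),
    9:  (0, 0),
    12: (1, 0),
}

victim = min(frames, key=lambda p: frames[p])   # (0,0) < (0,1) < (1,0) < (1,1)
print("evict page", victim)                     # page 9: not recently used and clean
```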

7. Embedded Technology

1. Classification of embedded microprocessors

  1. Embedded microcontroller (MCU, MicroController Unit): also known as a single-chip microcomputer; its on-chip peripheral resources are generally rich, which makes it suitable for control applications
  2. Embedded microprocessor (EMPU, Embedded MicroProcessor Unit): also known as a single-board computer; it evolved from the CPU of a general-purpose computer and retains only the functional hardware closely related to embedded applications
  3. Embedded DSP processor (DSP, Digital Signal Processor): a processor dedicated to signal processing
  4. Embedded system-on-chip (SoC): an integrated chip that pursues the maximum inclusiveness of the product system
  5. It achieves a seamless combination of software and hardware by embedding the code modules of the operating system directly in the microprocessor chip
  6. The volume and power consumption of the system are reduced, and the reliability and design productivity are improved.

2. Embedded microprocessor architecture

  1. Von Neumann architecture
    1. The von Neumann architecture, also called the Princeton architecture, is a memory organization in which program instructions and data share the same memory. It is generally used in PC processors; instructions and data are stored in the same memory; instructions and data are transferred over the same data bus
  1. The Harvard architecture is a memory organization that separates program-instruction storage from data storage. It is a parallel architecture: programs and data are kept in different storage spaces, i.e. the program memory and the data memory are two independent memories, each addressed and accessed independently.
    1. It is generally used in embedded-system processors (DSPs)
    2. Instructions and data are stored separately and can be read in parallel, giving a higher data throughput
    3. There are separate data and address buses for instructions and for data, i.e. four buses in total

3. Embedded system software

  1. Basic concepts
    1. An embedded system is an application-centric, computer-based, dedicated computer system whose software and hardware can be configured and tailored so that it meets the requirements of different applications in function, reliability, cost, volume and power consumption.
    2. Characteristics of embedded systems: small scale, difficult development, high real-time and reliability requirements, code usually stored in solid-state (firmware) memory, etc.
  1. Embedded system software classification:
    1. According to the system's sensitivity to events (its real-time requirements), embedded systems can be divided into:
      1. Embedded non-real-time systems
      2. Embedded real-time systems: hard (strong) real-time systems and soft (weak) real-time systems
    1. From the perspective of safety requirements, embedded systems can also be divided into:
      1. Safety-critical systems
      2. Non-safety-critical systems

8. Database system:

1. Database schema

  1. Architecture:
    1. Three-level schema: the external schema corresponds to views, the schema (also called the conceptual schema) corresponds to the database tables, and the internal schema corresponds to the physical files
    2. Two levels of mappings: the external-schema/schema mapping and the schema/internal-schema mapping; these two mappings give the data in the database a high degree of logical independence and physical independence
    3. Logical independence: when the logical structure changes, the user programs that use the external schema do not need to be modified; this is guaranteed by the external-schema/schema mapping.
    4. Physical independence: when the internal schema of the database changes, the logical structure of the data stays unchanged; this is guaranteed by the schema/internal-schema mapping
  1. view:
    1. Database View: It is a virtual table (logical table) whose content is defined by a query (only SQL query statements are saved). Like real tables, views contain a series of named columns and rows of data. However, the view does not actually store this data, but dynamically generates the required data by querying the original table
    2. Advantages of views: views simplify user operations, let users see the same data from multiple perspectives, provide a degree of logical independence when the database is restructured, and can provide security protection for confidential data
    3. Materialized view: It is not a virtual view in the traditional sense, but a materialized view, which stores data itself. At the same time, when the data in the original table is updated, the materialized view will also be updated.

2. Relational Algebra

  1. Union: the result contains the tuples of both relations, with duplicate rows removed
  2. Intersection: the result contains the rows that appear in both relations
  3. Difference: the result contains the tuples of the first relation that do not appear in the second
  4. Cartesian product: the number of result columns is the sum of the column counts of the two relations, and the number of rows is the product of their row counts. In the Cartesian product of two tables, each result tuple is formed by concatenating a tuple of the first table with a tuple of the second; different combinations give different result tuples.
  5. Projection: selects the qualifying attribute columns
  6. Selection: selects the qualifying tuples; attribute names can be referred to by their ordinal numbers, which then appear directly in the expression as numbers.
  7. Natural join: the number of result columns is the sum of the two relations' column counts minus the duplicated columns, and the result rows are the tuples whose identically named attributes have equal values. A combination of Cartesian product, selection and projection is equivalent to a natural join (see the sketch after this list)
  8. An ordinary (theta) join has its condition written out; if no condition is written, it is treated as a natural join.
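
A small sketch showing that a natural join can be built from Cartesian product, selection and projection, as item 7 says; the two relations are hypothetical:

```python
# Natural join = Cartesian product + selection on equal common attributes + projection.
R = [{"sno": 1, "name": "Li"}, {"sno": 2, "name": "Wang"}]
S = [{"sno": 1, "course": "DB"}, {"sno": 1, "course": "OS"}, {"sno": 3, "course": "Net"}]

common = {"sno"}                                      # attributes with the same name
product = [(r, s) for r in R for s in S]              # Cartesian product
selected = [(r, s) for r, s in product
            if all(r[a] == s[a] for a in common)]     # selection: equal on common attrs
joined = [{**r, **s} for r, s in selected]            # projection: keep each column once

print(joined)   # [{'sno': 1, 'name': 'Li', 'course': 'DB'}, {'sno': 1, 'name': 'Li', 'course': 'OS'}]
```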

3. Normalization Theory

  1. Problems with non-normalized schemas:
    1. The normalization process is meant to eliminate data redundancy, deletion anomalies, insertion anomalies and update anomalies (consistency problems when modifying data), etc.
    2. Data redundancy: the same data is stored repeatedly, wasting storage space
    3. Update anomaly (inconsistency caused by modification): if one is not careful, some copies of the data are modified and others are not, so the data become inconsistent
    4. Insertion anomaly: a primary key value must be provided; when the primary key would be empty, the insertion cannot be performed
    5. Deletion anomaly: deleting part of the information deletes the whole record, so the rest of the original record can no longer be found
  1. normalization
    1. functional dependencies
      1. partial functional dependency
      2. transitive functional dependencies
    1. Armstrong's axioms
      1. Reflexivity: if Y ⊆ X ⊆ U, then X → Y holds
      2. Augmentation: if Z ⊆ U and X → Y, then XZ → YZ holds
      3. Transitivity: if X → Y and Y → Z, then X → Z holds
      4. Union rule
      5. Pseudo-transitivity rule
      6. Decomposition rule
    1. Keys and attributes:
      1. Concept: a candidate key (candidate code) is a minimal combination of attributes that uniquely identifies a tuple, with no redundant attribute. There may be several different candidate keys, and one of them is chosen as the primary key. Candidate keys can be found with the graphical method: find the attributes with in-degree 0, expand from them, and the minimal attribute combination that can reach the whole graph is a candidate key; "in-degree 0" can be read as "never appears on the right-hand side of an arrow" (the closure-based sketch after this block checks the same thing).
      2. The attributes that appear in some candidate key are the prime attributes; all the others are non-prime attributes
      3. A foreign key is the primary key of another relational schema
      4. Finding the primary key: use the graphical method
      5. Finding foreign keys: a foreign key is the primary key of another relational schema
      6. All-key: when all the attributes of the relational schema together form a candidate key of that schema, it is called an all-key
      7. Integrity constraints: understand the concepts related to primary keys and foreign keys, and make the relevant judgments from the question stem
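
A minimal attribute-closure sketch for checking candidate keys; the relation R and the dependency set are hypothetical. Note that D never appears on the right of a dependency (in-degree 0 in the graphical method), and indeed it must be part of every candidate key:

```python
# X is a candidate key of R if X+ (the closure of X under F) covers all attributes
# and no proper subset of X does.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

R = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A", "D"}, {"C"})]   # A->B, B->C, AD->C

print(closure({"A", "D"}, fds) == R)   # True: {A, D}+ = {A, B, C, D}, so AD is a candidate key
print(closure({"A"}, fds) == R)        # False: {A}+ = {A, B, C}, D is missing
```
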
    1. Normal forms: normalization is meant to address data redundancy, deletion anomalies, insertion anomalies, update anomalies, etc.
      1. First normal form (1NF): relational schema R is in first normal form if and only if every field contains only atomic values, i.e. each attribute is an indivisible data item
      2. Second normal form (2NF): relational schema R is in second normal form if and only if R is in first normal form (1NF) and every non-prime attribute is fully functionally dependent on the candidate keys (there is no partial dependency)
      3. Third normal form (3NF): relational schema R is in third normal form if and only if R is in second normal form (2NF) and no non-prime attribute of R is transitively dependent on a candidate key
      4. Boyce-Codd normal form (BCNF): let R be a relational schema and F its dependency set; R is in BCNF if and only if the determinant of every dependency in F contains a candidate key of R
      5. Problem-solving approach: understand the concepts of normalization theory. When the degree of normalization has not reached 3NF, it is generally considered that data redundancy, update anomalies, insertion anomalies and deletion anomalies will occur; the usual solution is to decompose the table (schema) so as to raise its normalization degree to 3NF
    1. Schema decomposition: the normalization process of splitting tables is the decomposition of relational schemas
      1. Lossless decomposition: after a relational schema is decomposed into several relational schemas, the original relational schema can still be recovered through operations such as natural join and projection:
        1. Formula method, theorem: if R is decomposed as p = {R1, R2} and F is the set of functional dependencies satisfied by R, then the decomposition p is lossless if and only if R1 ∩ R2 → (R1 - R2) or R1 ∩ R2 → (R2 - R1) belongs to F+
      1. Dependency preservation

4. Concurrency control:

  1. Characteristics of Transactions (ACID)
    1. Atomicity: All operations in the entire transaction are either completed or not completed, and it is impossible to stagnate in a certain link in the middle.
    2. Consistency: A transaction can encapsulate state changes (unless it is a read-only one). Regardless of how many concurrent transactions are at any given time, transactions must always keep the system in a consistent state
    3. Isolation: Execute transactions in isolation, making them appear to be the only operations performed by the system at a given time
    4. Durability: After the transaction is completed, the changes made by the transaction to the database are permanently saved in the database and will not be rolled back.
  1. concurrency issues
    1. lost update
    2. non-repeatable read
    3. read dirty data
  1. Classification of locks:
    1. S lock: also called a read lock or shared lock; after transaction T1 places an S lock on data item R, other transactions can still successfully place S locks on R, but cannot place X locks
    2. X lock: also called a write lock or exclusive lock; after transaction T1 places an X lock on data item R, other transactions can place neither S locks nor X locks on R
  1. Locking protocols:
    1. First-level locking protocol: transaction T must place an X lock on data R before modifying it and must not release it until the end of the transaction. This prevents lost updates.
    2. Second-level locking protocol: the first-level protocol plus the rule that transaction T must place an S lock on data R before reading it and may release the S lock after reading; this prevents lost updates and reading "dirty" data
    3. Third-level locking protocol: the first-level protocol plus the rule that transaction T must place an S lock on data R before reading it and must not release it until the end of the transaction; this prevents lost updates, reading "dirty" data, and non-repeatable reads.

Origin blog.csdn.net/qq_25580555/article/details/129669558