Introduction to Requirements Analysis: Architecture Talk (2) Non-functional Requirements

In the previous article, I briefly introduced the concept of architecture, the architecture design process, and what requirements analysis covers.
I closed by pointing out that the output of requirements analysis should include non-functional requirements. Common non-functional requirements include:

  • Speed of task completion
  • Precision of results
  • Operational security
  • Product capacity
  • Range of allowed values
  • Throughput, such as TPS
  • Resource usage efficiency
  • Reliability
  • Fault tolerance and robustness
  • Scalability
  • Extensibility

The items marked in red in the original post are the key focus of typical projects (note: this varies with the project background; for a security product, for example, the most important item is security).
For a project, stricter non-functional requirements mean higher system complexity and higher cost (time, labor, hardware, and maintenance), so they need to be set through a comprehensive assessment of the company, project, staffing, and environment.

Non-functional requirements

The non-functional requirements output after the requirements analysis must be quantifiable, and the basic requirements are:

  • Non-functional requirements must be unambiguous, understandable, and testable.
  • Standards come first, otherwise there will be no basis for design, implementation or testing.
  • All requirements are measurable, and the scale of measurement is the unit used to test product conformity.

As an architect, the most important thing to pay attention to is non-functional requirements. A requirement that cannot be measured is not a real requirement.

  • Measurement is defined as follows:
    • The process of evaluating software quality and performance through data collection and data analysis during software development.
      This process helps the development team understand the current state of the software and the progress of development, discover potential problems and risks, assess the software quantitatively, and formulate improvement measures.

Examples of invalid non-functional requirements:

  • "High concurrency must be supported": is the system expected to handle 1,000 concurrent users, or 10,000?
  • "Interface responses should be as fast as possible": is the acceptable response time 100 milliseconds? 200 milliseconds? Or 1 second? Is it acceptable for 5% of users to wait longer than 1 second?
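
One way to turn vague statements like these into testable requirements is to record each one as a metric, a limit, and a unit. The following is a minimal Python sketch of that idea. The class name, its fields, and the example limits (200 ms at the 95th percentile, 10,000 concurrent users) are illustrative assumptions, not figures from a real project:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """A quantified non-functional requirement (hypothetical structure)."""
    name: str
    metric: str
    limit: float
    unit: str
    higher_is_better: bool

    def is_met(self, measured: float) -> bool:
        # A capacity-style metric must reach the limit;
        # a latency-style metric must stay at or below it.
        if self.higher_is_better:
            return measured >= self.limit
        return measured <= self.limit


# Assumed example values, chosen only for illustration.
response_time = Requirement("API response time", "p95 latency", 200, "ms", higher_is_better=False)
concurrency = Requirement("Concurrent users", "simultaneous sessions", 10_000, "users", higher_is_better=True)

print(response_time.is_met(180))
print(concurrency.is_met(8_000))
```

Written this way, each requirement can be checked against a measured value in an automated test instead of being argued about.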

The following introduces the concepts of these non-functional requirements one by one, as well as some metrics. These metrics should be evaluated and collected during architecture design to evaluate the current system situation and future optimization directions.

1. The speed at which the task is completed

  • Explanation:
    That is, performance/efficiency, which usually refers to the "time/space" efficiency of the software, not just the running speed of the software.
  • Metrics:
    • Response time (average time/maximum time/95th percentile time)
    • CPU usage
    • Cache hit rate
    • I/O count and I/O duration (disk I/O, network I/O)
    • Maximum number of connections / concurrency
  • Example:
    The maximum response time for a single user operation should not exceed 2 seconds;
    the product must read the sensor value every 10 seconds;
    the system should identify whether an aircraft is friendly or hostile within 0.25 seconds;
    after the cloud configuration is updated, the product must pick up the latest configuration and finish refreshing within 1 minute.
  • Data example:
    On a 2.5 GHz CPU, each clock cycle takes 0.4 nanoseconds.
    If one clock cycle is scaled up to 1 second, the time taken by common computer operations can be compared on the same scale.
    [Image: comparison of the actual time taken by common operations and the equivalent time on the 1-second-per-cycle scale]
    The results are quite striking.
  • Memory is about 200 times faster than a gigabit network, so where an in-process memory cache will do, do not reach for Redis;
  • Context switching also costs CPU time, so do not use multi-threading for tasks with little I/O;
  • Switch to SSD drives as soon as possible;
  • When building high availability across data centers, try to deploy a complete set of services in each data center so that service-to-service calls stay within the same data center.
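
The response time metrics listed above (average, maximum, 95th percentile) can be computed from raw latency samples. The sketch below uses the nearest-rank method for the percentile. The function name and the sample values are made up for illustration:

```python
import math


def summarize_latency(samples_ms):
    """Return the average, maximum, and nearest-rank 95th percentile of latency samples."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # Nearest-rank method: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based rank
    return {
        "avg": sum(ordered) / len(ordered),
        "max": ordered[-1],
        "p95": ordered[rank - 1],
    }


# Assumed sample data: nine fast requests and one very slow one.
samples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]
summary = summarize_latency(samples)
print(summary)
```

With these samples the average is 145 ms even though nine of the ten requests finished in 90 ms or less, which is why a requirement usually names a percentile or a maximum and not only an average.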

2. The accuracy of the result

  • Explanation:
    Usually refers to the accuracy requirements of software output results (such as calculation results, measurement results, data analysis results, etc.).
  • Metrics:
    Its measurement standard is the size of the error between the output value and the true value.
  • Note:
    Precision is not a prominent concern in most systems, but in some industries it is a particularly important indicator:
    • Banking systems: although the interest a bank pays to customers is rounded to the cent (fen, 0.01 yuan), internal calculations must be accurate to at least one tenth of a cent (li, 0.001 yuan), or finer, so that aggregated figures remain correct;
    • Autonomous driving systems: accuracy is obviously critical here. The system must judge the distance between the car and obstacles correctly and avoid misclassifying non-obstacles such as ghost images.
      Accuracy is usually expressed as a tolerance range, such as ±1 cm.
  • Note:
    Use integers for all storage and calculation involving monetary amounts. For example, for ordinary RMB calculations, write amounts to the DB in cents (fen);
    for the interest calculations mentioned above, store and compute in li, both in the DB and in memory, to avoid the precision problems of binary floating-point decimals.
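
The integer-storage advice can be illustrated with a short Python sketch. It assumes amounts arrive as decimal strings and are stored as integer counts of the smallest unit: fen (0.01 yuan) for ordinary amounts and li (0.001 yuan) for interest. The function names are hypothetical:

```python
from decimal import Decimal

FEN_PER_YUAN = 100   # 1 yuan = 100 fen
LI_PER_YUAN = 1000   # 1 yuan = 1000 li


def to_minor_units(amount_yuan: str, units_per_yuan: int) -> int:
    """Convert a decimal string in yuan to an exact integer count of minor units."""
    scaled = Decimal(amount_yuan) * units_per_yuan
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount_yuan} has more precision than the storage unit allows")
    return int(scaled)


def format_fen(fen: int) -> str:
    """Render an integer amount in fen as a yuan string with two decimals."""
    sign = "-" if fen < 0 else ""
    yuan, rest = divmod(abs(fen), FEN_PER_YUAN)
    return f"{sign}{yuan}.{rest:02d}"


# Binary floating point cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2 == 0.3)

# Integer arithmetic in fen stays exact.
total_fen = to_minor_units("0.10", FEN_PER_YUAN) + to_minor_units("0.20", FEN_PER_YUAN)
print(format_fen(total_fen))
```

Conversion to a display string happens only at the edge of the system. Everything stored or summed stays an integer.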

3. Operational security

  • Explanation:
    Refers to the ability to protect a system from unintended access, destruction, and misuse. It covers authentication and authorization, access control, data protection, communication security, security logging, regular audits, and so on.
    It is confirmed mainly through design reviews and test cases, covering for example:
    password complexity, the algorithmic security of two-step verification, role permissions and access control policies, evaluation of the data encryption scheme, backup frequency and integrity, the disaster recovery plan, communication encryption strength, the integrity and confidentiality of audit logs, and the frequency and depth of security audits.
  • Remark:
    • Consider encryption strength carefully. For example, conventional salted MD5 is not safe enough. A better approach is: MD5(MD5(plaintext + user-specific salt 1) + user-specific salt 2)
    • When data involving amounts is written to the DB, add a check field computed by hashing all fields of the row together with a salt. Verify this
      field whenever the row is used; if it does not match, block the transaction. This guards against data tampering by internal employees;
    • For systems with high security requirements, deploy an independent encryption and decryption service that takes the user ID and the source data. The service looks up a user-specific key from the user ID, encrypts with it, and returns the encrypted data;
    • Every view of or operation on sensitive data should require a second authentication; all exported data should be desensitized (masked);
    • Product lines with high security requirements should be network-isolated from other product lines, so that a vulnerability in another product cannot be used to attack this one from the intranet.
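
The row check field described above could be sketched as follows. This is only one possible realization: it assumes HMAC-SHA256 with a server-side secret key playing the role of the "hash and salt", and a JSON serialization of the row as the canonical form. The function names and the example row are hypothetical:

```python
import hashlib
import hmac
import json


def row_check_value(row: dict, secret_key: bytes) -> str:
    """Compute a keyed hash over all fields of a row (assumed scheme: HMAC-SHA256)."""
    # Sorted keys and fixed separators give the same bytes for the same row every time.
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hmac.new(secret_key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def row_is_intact(row: dict, stored_check: str, secret_key: bytes) -> bool:
    """Return True only if the stored check value matches the current field values."""
    return hmac.compare_digest(row_check_value(row, secret_key), stored_check)


# Illustrative key and row; a real key would come from a secrets store.
key = b"example-key-for-illustration-only"
row = {"id": 1, "user_id": 42, "amount_fen": 1999}
check = row_check_value(row, key)

tampered = dict(row, amount_fen=999_999)
print(row_is_intact(row, check, key))
print(row_is_intact(tampered, check, key))
```

Because the key lives outside the database, someone who can edit a row directly cannot recompute a matching check value.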

4. Product capacity

  • Explanation:
    Usually refers to the data capacity requirements of the software within a certain period of time.
  • Metrics:
    Defined according to the software's requirements, such as the number of registered users or the size of stored files.
  • Example:
    The system can support 10,000 registered users within a month, and 100,000 registered users within a year;
    S3 storage can support 100,000 files/100GB within a month, and 1 billion files/5PB within a year.

5. Range of allowed values

  • Explanation:
    During the use of the software system, the scope of some input data is limited.
  • Example:
    the maximum length of a user's name;
    the minimum and maximum transaction amount, for example, a restaurant menu price may not exceed 10,000 yuan.
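
Range limits like these are easy to state as data and enforce in one place. The sketch below is a minimal illustration. The 10,000 yuan ceiling comes from the example above, while the minimum price, the name length limits, and all identifiers are assumptions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowedRange:
    """An inclusive range of allowed integer values (hypothetical helper)."""
    minimum: int
    maximum: int

    def check(self, value: int, field: str) -> None:
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                f"{field} must be between {self.minimum} and {self.maximum}, got {value}"
            )


# Prices are held in fen; 10,000 yuan = 1,000,000 fen. The minimum of 1 fen is assumed.
MENU_PRICE_FEN = AllowedRange(minimum=1, maximum=10_000 * 100)
# Assumed limits for the length of a user's name.
USER_NAME_LENGTH = AllowedRange(minimum=1, maximum=50)


def validate_menu_price(price_fen: int) -> None:
    MENU_PRICE_FEN.check(price_fen, "menu price (fen)")


def validate_user_name(name: str) -> None:
    USER_NAME_LENGTH.check(len(name), "user name length")


validate_menu_price(8_800)
validate_user_name("Alice")
```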

6. Throughput

  • Explanation:
    Usually refers to the amount of data the software transfers or processes per unit of time.
  • Metrics:
    • TPS: Transactions Per Second (the number of transactions completed per second)
    • QPS: Queries Per Second (the number of queries per second)
      Each request the server receives counts as one query.
      Suppose opening the website's homepage sends 3 requests to the server: homepage, login status, and news list.
      TPS then increases by 1 and QPS by 3.
    • RT: Response Time (the time taken to respond to each request)
    • Concurrency: the number of requests the system is processing at the same time.
      For example, if a service has received 100 requests and has not yet responded to any of them, the concurrency at that moment is 100.
      Many articles say that concurrency = QPS × average response time. That relationship (Little's Law) holds only for long-run averages of a system in a steady state; it does not convert instantaneous or peak values into one another.
      It is true that higher concurrency tends to lengthen response times and lower QPS,
      but all of these indicators depend on server status, network status, disk, and so on, so each should be measured directly and not derived from the others.
  • Example:
    The payment system sustains a TPS above 4,000;
    the shopping mall system sustains a TPS above 2,000 and a QPS above 10,000;
    the marketing system supports more than 10,000 concurrent requests;
    the RT of the order system API stays within 100 ms.
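
QPS, RT, and concurrency can each be measured directly from a request log. The sketch below assumes the log provides a (start, end) timestamp pair in milliseconds for every request in a known time window. The function name and the sample log are invented for illustration:

```python
def throughput_metrics(requests_ms, window_ms):
    """Compute QPS, average RT, and peak concurrency from (start_ms, end_ms) pairs."""
    if not requests_ms or window_ms <= 0:
        raise ValueError("need at least one request and a positive window")
    count = len(requests_ms)
    qps = count / (window_ms / 1000)
    avg_rt_ms = sum(end - start for start, end in requests_ms) / count

    # Sweep over start (+1) and end (-1) events in time order.
    events = []
    for start, end in requests_ms:
        events.append((start, 1))
        events.append((end, -1))
    events.sort()  # at equal timestamps, -1 sorts first, so a finished request is released first
    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return {"qps": qps, "avg_rt_ms": avg_rt_ms, "peak_concurrency": peak}


# Assumed sample: four requests observed during a one-second window.
log = [(0, 300), (100, 200), (150, 500), (600, 700)]
metrics = throughput_metrics(log, window_ms=1000)
print(metrics)
```

For this sample, QPS is 4 and the average RT is 212.5 ms, so QPS × average RT is 0.85, while the measured peak concurrency is 3. Peak concurrency therefore has to be measured; it cannot be read off the other two numbers.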

7. Efficiency of resource use

  • Explanation:
    Usually specifies the usage, occupancy, and release of particular hardware resources, for example a CPU with 4 or more cores, 16 GB or more of memory, or a requirement for gigabit bandwidth.
  • Metrics:
    • CPU usage time
    • Memory usage
    • Disk space
    • I/O average, peak, total volume of data transferred, and so on
    • Transactions that can be processed per unit of time (TPS)
  • Note:
    Good architecture design and better algorithm implementations can improve resource usage efficiency.
    In special scenarios where machines cannot be scaled out horizontally, capacity can still be improved simply by upgrading the hardware of a single machine.

8. Reliability

  • Explanation:
    It refers to the ability of the product to complete the predetermined function within the specified time and under the specified conditions.
  • Metrics:
    • MTBF: Mean Time Between Failures, the average time between failures.
    • MTTR: Mean Time To Repair, the average time taken to recover from a failure.
    • MTTF: Mean Time To Failure, the average time the system runs before a failure occurs.
    • Availability = UpTime / (UpTime + DownTime) = MTBF / (MTBF + MTTR)
  • Example:
    Service availability should be higher than 99.9%, where a request error rate above 1% within 5 minutes counts as a failure.
  • Note:
    • Some articles say that MTBF = MTTF + MTTR.
      In my understanding, MTBF is the mean time between failures, meaning the average time the system operates normally, so MTTR should not be included.
    • MTTF refers to how long the system can run normally before a failure occurs. The more reliable the system, the longer its mean time to failure.
      Later, I will write a dedicated article on reliability and availability.
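
The availability formula above, and what a target such as 99.9% means in practice, can be worked through in a few lines of Python. The function names and the sample MTBF and MTTR figures are illustrative assumptions:

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), with MTBF taken as average uptime."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def allowed_downtime_hours(target: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget over a period for a given availability target."""
    return (1 - target) * period_hours


# Assumed figures: on average 999 hours of normal operation, then 1 hour to recover.
print(availability(mtbf_hours=999, mttr_hours=1))

# Downtime allowed per year by a 99.9% availability target.
print(round(allowed_downtime_hours(0.999), 2))
```

A 99.9% target leaves about 8.76 hours of downtime per year, which gives the team a concrete budget to design and test against.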

9. Fault tolerance and robustness

  • Explanation:
    Refers to the processing ability and stability of the system in the face of errors and abnormal situations.
  • Metrics:
    Usually assessed through development specification constraints, code inspection quality, and design reviews,
    together with data such as the number of system crashes and error log statistics.
  • Example:
    Whatever data the user enters, the system either processes it normally or reports an error; it does not crash, return an incorrect response, or produce wrong data.
    When the system does fail, it records the exception details for troubleshooting and still gives users a meaningful response (service degradation).
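
The two behaviors in the example can be sketched as a request handler that rejects bad input with an error response and degrades gracefully when a dependency fails. This is a minimal illustration: the handler, the logger name, the response shape, and the status codes are all assumptions:

```python
import logging

logger = logging.getLogger("order_service")  # hypothetical logger name


def handle_order(raw_quantity, load_unit_price_fen):
    """Validate input, then compute a total; never let an exception escape to the caller."""
    try:
        quantity = int(raw_quantity)
        if quantity <= 0:
            raise ValueError("quantity must be positive")
    except (TypeError, ValueError):
        # Invalid input: report an error and do not process it.
        return {"status": 400, "error": "quantity must be a positive integer"}

    try:
        return {"status": 200, "total_fen": load_unit_price_fen() * quantity}
    except Exception:
        # Dependency failure: record the details for troubleshooting, then degrade.
        logger.exception("price lookup failed for quantity=%s", quantity)
        return {"status": 503, "error": "service temporarily unavailable, please retry"}


def broken_lookup():
    raise ConnectionError("price database unreachable")


print(handle_order("3", lambda: 500))
print(handle_order("abc", lambda: 500))
print(handle_order("3", broken_lookup))
```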

10. Scalability

  • Explanation:
    The ability to adjust how much business load the system can handle without changing the system itself, simply by adjusting configuration or adding and removing servers.
  • Metrics:
    • Response time: as the number of users or requests grows, does the response time grow with it?
    • Number of error responses: as the number of users or requests grows, do error responses such as timeouts, network errors, and rate-limiting errors grow with it?
  • Example:
    When traffic grows steadily, the system scales out automatically and service availability is not affected;
    when traffic declines, the system scales in automatically to reduce costs, and service availability is not affected.

11. Extensibility

  • Explanation:
    The ability to launch new products or features transparently, without affecting existing products, when new products or requirements are added.
  • Example:
    The system supports blue-green release, rolling release, and grayscale (canary) release.

Reference material

Many industry standards for non-functional requirements are already available for study and reference. They cover more non-functional requirements and describe them in more detail, for example:

  • Jim McCall Software Quality Model (1977)

  • Barry W. Boehm Software Quality Model (1978)

  • FURPS/FURPS+ Software Quality Model

  • R. Geoff Dromey Software Quality Model

  • ISO/IEC 9126 Software Quality Model (1991)

  • ISO/IEC 25010 Software Quality Model (2011)

The following introduces two commonly used software quality models, which are worth reading and understanding carefully:

Software Quality Model ISO/IEC 25010

A software quality standard jointly formulated by ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission), replacing the ISO/IEC 9126 standard. For details, refer to:
https://iso25000.com/index.php/en/iso-25000-standards/iso-25010

Software Quality Model FURPS/+

The FURPS model was first proposed by Robert Grady and Caswell of Hewlett-Packard and was later extended to FURPS+ by Rational Software.
For a detailed introduction, see: https://sceweb.uhcl.edu/helm/RationalUnifiedProcess/process/workflow/requirem/co_req.htm

Epilogue

This article briefly introduced some non-functional requirements and their related metrics. The next two articles will cover the availability and performance optimization of software systems.

Origin blog.csdn.net/youbl/article/details/131265878