Software Testing: White-Box Testing

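[Introduction] This post grew out of my current work situation: what used to be an independent testing department is now merging with the development team into a single Scrum team.

Because earlier project systems had many problems, senior management wants to raise the quality of the code the development team commits. Besides the necessary unit tests, developers are now also required to do some E2E functional testing, commonly called Dev Testing, while QA's SIT testing can focus on a broader E2E scope or even go straight to regression.

In recent meetings with the developers, many of them repeatedly asked what Dev Testing should focus on and how far its scope should extend, and pointed out that it would very likely duplicate QA testing.

As testers who came from an independent QA/testing department, our stock reply when developers hand the question back to us is to suggest that developers do "white-box testing" while the test team takes care of "black-box testing".

But those of us who entered software testing through black-box testing are not really clear on what truly efficient and effective white-box testing looks like. So I need to fill in the theory, look at the more mature industry practices, and then go back to the development team with, hopefully, some good ideas about white-box testing.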

References

ISTQB – White Box Testing Techniques in Software Testing

Structure Based or White Box techniques

White-box test design techniques (also called structural or structure-based techniques) are used to derive test cases from an analysis of the program (code). Deriving test cases after analyzing or understanding the program is white box testing.

In contrast, black box test cases are designed from the specification.

How do we derive test cases from a program? And how do we derive coverage metrics for the tests undertaken?

Test cases derived from code look much like black box test cases, but the focus is on test data: which values you provide to drive execution through the code. The emphasis is on which lines are executed with that data, and on the minimum number of test cases needed to execute them all.

Black Box testing limitations: During black box testing, depending on the tester's experience, the proportion of lines of code covered after a full round of system testing varies between 30% and 70%, which leaves a lot of code untested.

Coverage metrics help understand what lines are not covered and help us design test cases to increase the coverage. There are certain tools available to measure the lines of code covered when tests are run.

Statement coverage: If the test cases execute every executable statement in the program, that is 100% statement coverage.

Decision coverage: If the test cases exercise both the true and the false outcome of every decision, that is 100% decision coverage.

Statement Coverage = (Number of executable statements executed / Total number of executable statements) x 100%

Decision Coverage = (Number of decision outcomes exercised / Total number of decision outcomes) x 100%
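As a rough illustration of the two ratios above, here is a minimal Python sketch. The helper name coverage_percent and the sample counts are made up for the example, and each decision is assumed to contribute two outcomes (true and false).

```python
def coverage_percent(exercised: int, total: int) -> float:
    """Return exercised/total as a percentage (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * exercised / total


# Made-up numbers: 3 of 4 executable statements were run,
# and 1 of the 2 outcomes of a single IF was taken.
statement_coverage = coverage_percent(3, 4)
decision_coverage = coverage_percent(1, 2)
print(statement_coverage, decision_coverage)
```

In practice a coverage tool produces these counts; the sketch only shows the arithmetic.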

Coding Structures: A program code can be sequential where statements execute one after another. Execution of code can also be based on a condition being true or false.

Sequential or Linear statements: Example of a sequential code, where the program is executed line after line.

1 READ A
2 READ B
3 Sum = A + B

Test cases for sequential statements are simple: assign values to A and B.

For example, the test data A=1, B=2 runs through statements 1, 2 and 3, that is, every line of this program. In this case, one test case is enough for 100% statement coverage.

Selection or Decisions: In this case the computer has to decide if a condition is true or false. If it is true the computer takes one route, and if it is false the computer takes a different route.

IF (condition is true)
   Do this
ELSE  (condition is False)
   Do something else
END IF

Eg:

1 IF Age  > 16
2              Process License application
3 ELSE
4              Decline the License application process
5 END IF

Test cases: Age =18 tests lines 1,2,5 and Age =14 tests lines 1,3,4,5.

So, 2 test cases are needed to execute every line of code. And 2 test cases to execute the True and False conditions. So 2 test cases are needed for 100% Statement and Decision coverage.
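The IF/ELSE example can be written as a small Python function with the two test cases as assertions. This is only a sketch: the function name license_decision and the returned strings are invented for illustration.

```python
def license_decision(age: int) -> str:
    """Mirror of the IF/ELSE pseudocode (hypothetical function name)."""
    if age > 16:
        return "Process License application"
    else:
        return "Decline the License application process"


# Age = 18 takes the true branch, Age = 14 takes the false branch.
# Together they execute every statement and both decision outcomes.
assert license_decision(18) == "Process License application"
assert license_decision(14) == "Decline the License application process"
```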

There may not always be an ELSE part, as below:

 1 IF Age  >16
 2    THEN
 3     "Process License Application"
 4  ENDIF

Test cases for the above, assigning Age =18 tests lines 1,2,3,4, so only one test case is needed to execute every line of code.

But 2 test cases are needed to test the decision: one for the True outcome (Age > 16) and one for the False outcome (Age <= 16). Why?

Because for Age <= 16 we need to ensure the program does nothing and goes straight to the ENDIF.

Hence decision coverage is stronger than statement coverage: 100% decision coverage guarantees 100% statement coverage, but the converse is not true.
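A short Python sketch of the IF-without-ELSE case may make the difference concrete. The function name process_if_eligible and the log list are assumptions introduced only to make the "does nothing" outcome observable.

```python
def process_if_eligible(age: int, log: list) -> None:
    """IF without ELSE: act only when age > 16 (hypothetical example)."""
    if age > 16:
        log.append("Process License Application")
    # No else branch: for age <= 16 nothing should happen.


# Test 1: the true outcome. This alone executes every statement.
log = []
process_if_eligible(18, log)
assert log == ["Process License Application"]

# Test 2: the false outcome. It adds no statement coverage,
# but it is required for 100% decision coverage.
log = []
process_if_eligible(16, log)
assert log == []
```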

Consider the following:

1  IF (Age  > 16) and (Gender = Female)
2              Process License application
3 ELSE
4              Decline the License application process
5 END IF

1  IF (Age  > 16) or (Gender = Female)
2              Process License application
3 ELSE
4              Decline the License application process
5 END IF

Assign (Age =18 and Gender=Female) to test lines 1,2,5 and (Age=14, Gender = Male) to test lines 1,3,4,5. So, 2 test cases are needed to execute every line of code(100% Statement coverage). And 2 test cases to test both the decisions – true and false (100% Decision Coverage).

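The examples above show why decision coverage alone is inadequate: the individual conditions are not fully tested. The two test cases give the same results for the "and" version and the "or" version, so they could not tell one from the other.

Four combinations cover every case: Age > 16 and Female, Age > 16 and Male, Age <= 16 and Female, and Age <= 16 and Male. Hence the need for stronger metrics.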
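To see what the two-test-case set misses, the sketch below encodes both versions of the decision in Python and enumerates all four combinations. The function names approve_and and approve_or are hypothetical, and ages 18 and 14 simply stand in for "above 16" and "not above 16".

```python
from itertools import product


def approve_and(age: int, gender: str) -> bool:
    return age > 16 and gender == "Female"


def approve_or(age: int, gender: str) -> bool:
    return age > 16 or gender == "Female"


# The two test cases from the text: both versions agree on them,
# so these tests cannot reveal an "and" mistakenly coded as "or".
for age, gender in [(18, "Female"), (14, "Male")]:
    assert approve_and(age, gender) == approve_or(age, gender)

# All four combinations of the two conditions.
combinations = list(product([18, 14], ["Female", "Male"]))
differing = [c for c in combinations if approve_and(*c) != approve_or(*c)]
print(differing)  # the combinations that tell the two versions apart
```

Only the mixed combinations separate the two versions, which is why the full set of four is needed.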

Nested Ifs or multiple IFs within an IF:

1 IF (Age >= 17)
2   IF (Age <= 50)
3     Print "Age is between 17 and 50"
4   ELSE
5     Print "Age is greater than 50"
6   END IF
7 ELSE
8   Print "Age is less than 17"
9 END IF

Test cases for the above: assigning Age =16, Age =51 and Age =29 (3 test cases) will test every line/statement and every decision.

CASE Structure: The nested If can also be expressed as a CASE structure.

Case Age > 50:
Report “Age is greater than 50”
Case Age >= 17:
Report “Age is between 17 and 50”
Default:
Report “Age is less than 17”
End Case

Again, assigning Age =16, Age =51 and Age =29 (3 test cases) will test every line and every decision.
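The nested-IF logic and its three test cases can be sketched in Python as follows. The function name classify_age is made up, and the sketch assumes that 17 and 50 themselves belong to the middle range.

```python
def classify_age(age: int) -> str:
    """Nested IF version (hypothetical; 17 and 50 count as the middle range)."""
    if age >= 17:
        if age <= 50:
            return "Age is between 17 and 50"
        else:
            return "Age is greater than 50"
    else:
        return "Age is less than 17"


# Three test cases: one per leaf, covering both outcomes of both decisions.
assert classify_age(16) == "Age is less than 17"
assert classify_age(29) == "Age is between 17 and 50"
assert classify_age(51) == "Age is greater than 50"
```

Boundary values such as 17 and 50 are not needed for coverage, but they are where off-by-one mistakes in the comparisons tend to hide.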

Iterations or loops: When an action needs to be repeated until a condition becomes false, loops are used. There are 2 types of loops: While-Do and Repeat-Until.

DO WHILE condition
  sequence
ENDWHILE

The loop is entered only if the condition is true. The “sequence” is performed for each iteration. At the conclusion of each iteration, the condition is evaluated and the loop continues as long as the condition is true.

Eg:

WHILE (Shopping Trolley not empty)
DO
     Take the next item out of the trolley
     Add cost of item to TotalCost
     Print the item name and item cost
END DO
Print TotalCost

Test cases for Do-While: the loop is entered only if the condition is true, so one test case whose condition starts out true is enough to execute all the lines of code. The same test case also covers the decision, because the condition eventually evaluates to false when the loop exits.

There is another variation of the loop structure known as a REPEAT UNTIL loop.

This loop is similar to the WHILE loop except that the test is performed at the bottom of the loop instead of at the top. Two keywords, REPEAT and UNTIL are used. The general form is:

REPEAT
   sequence
UNTIL condition

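Again, one test case is sufficient to execute all lines of code. For decision coverage that test case must make the loop iterate at least twice, so that the UNTIL condition is evaluated as both false and true.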
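Both loop forms can be sketched in Python. Python has no REPEAT UNTIL statement, so the second function emulates one with `while True` and `break`; all names here (trolley_total, count_until_limit) are hypothetical.

```python
def trolley_total(items: list) -> float:
    """WHILE loop: the condition is tested before each iteration."""
    remaining = list(items)
    total = 0.0
    while remaining:                      # WHILE trolley not empty
        _name, cost = remaining.pop(0)    # take the next item out
        total += cost
    return total


def count_until_limit(limit: int) -> int:
    """REPEAT UNTIL emulation: the body always runs at least once."""
    n = 0
    while True:
        n += 1
        if n >= limit:                    # UNTIL condition, tested at the bottom
            break
    return n


# A single non-empty trolley enters the loop and later leaves it,
# so the loop condition is evaluated as both true and false.
assert trolley_total([("milk", 1.5), ("bread", 2.0)]) == 3.5
# limit = 3 makes the UNTIL condition false twice and then true.
assert count_until_limit(3) == 3
```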

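The up-front cost of white-box testing is higher than one might imagine: first the training of white-box testers, second the development of tools.
Training means knowing statement coverage, condition coverage, decision coverage, condition/decision coverage and independent path coverage. Going a little deeper, testers also need to understand data flow testing (definition-use, program slicing, ...). It is fairly rare to design white-box test cases from scratch, because there is not enough design information. Usually test cases are first written from the functional requirements using black-box design methods. After running those, white-box methods are used to prune cases and add new ones. Adding cases with white-box methods requires being able to read the code. Otherwise, how can you guarantee that a case reaches a particular if branch or the required number of while iterations? And if you test a module at its interface, you have to write drivers and stubs.
Doing all of the above also requires a test framework that implements common functions unrelated to any specific test case, such as parameterization, result assertions and report generation, so that testers can concentrate on writing test code for their cases. Fortunately the open-source wave offers plenty of choices, such as xUnit and googletest. There are also coverage tools such as EMMA, CodeCover and NCover, although most of them only report statement and decision coverage, and a few report loop coverage. I have not come across an open-source data flow testing tool.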

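Author: Zhihu user (知乎用户)
Link: https://www.zhihu.com/question/28700827/answer/41799444
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for authorization; for non-commercial reproduction, credit the source.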
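The driver and stub mentioned in the answer above can be sketched with Python's built-in unittest and unittest.mock. Everything about the module under test (the checkout function and the gateway's charge method) is invented for this example. The TestCase plays the role of the driver, and the Mock object stands in for the real collaborator as a stub.

```python
import unittest
from unittest.mock import Mock


def checkout(cart: list, payment_gateway) -> str:
    """Hypothetical module under test: pay the cart total via a gateway."""
    total = sum(cart)
    if total <= 0:
        return "nothing to pay"
    return "paid" if payment_gateway.charge(total) else "declined"


class CheckoutTest(unittest.TestCase):
    """Driver: calls the module with chosen inputs and checks the results."""

    def test_paid(self):
        gateway = Mock()                      # stub for the real gateway
        gateway.charge.return_value = True
        self.assertEqual(checkout([5, 7], gateway), "paid")
        gateway.charge.assert_called_once_with(12)

    def test_declined(self):
        gateway = Mock()
        gateway.charge.return_value = False
        self.assertEqual(checkout([5, 7], gateway), "declined")

    def test_empty_cart_skips_gateway(self):
        gateway = Mock()
        self.assertEqual(checkout([], gateway), "nothing to pay")
        gateway.charge.assert_not_called()
```

Saved in a file, the tests can be run with `python -m unittest`. The three cases were chosen by reading the code so that each branch of checkout is taken once.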

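By its nature, white-box testing is precise, but its drawbacks are that it takes too much time, the tools are mostly foreign products and fairly expensive, and it demands too much coding skill from testers. Internet companies today compete on speed, so it is a poor fit and most testing ends up being black-box. There are tools that sit between the two, though. I recommend one called threadingtest, which also has a cloud version called 星云测试; it is made by a Chinese team and is quite good. It lets you do white-box testing with black-box methods, which lowers the demands on testers considerably, and its reports let you measure code coverage without understanding the code in depth.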

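Author: 一骑
Link: https://www.zhihu.com/question/28700827/answer/78493610
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for authorization; for non-commercial reproduction, credit the source.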

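To measure how thoroughly tests cover the code, some criteria are needed. The commonly used coverage criteria, from weakest to strongest, are:

Statement coverage: a fairly weak criterion. Choose enough test cases that every statement in the program is executed at least once. It is the weakest form of logic coverage and of limited effect, so it must be combined with other methods.
Decision coverage (also called branch coverage): run enough test cases that every branch in the program is taken at least once. Decision coverage is only slightly stronger than statement coverage; in practice, decision coverage alone cannot guarantee that errors inside a decision's conditions will be found. Stronger logic coverage criteria are therefore needed to check the conditions inside each decision.
Condition coverage: run enough test cases that every possible value of every condition in every decision is exercised at least once. Condition coverage looks inside each decision at its individual conditions, but it does not necessarily satisfy decision coverage.
Decision/condition coverage: run enough test cases that every condition in a decision takes each of its possible values and every decision takes each of its possible outcomes. Decision/condition coverage has a weakness. On the surface it tests all the condition values, but in fact some conditions often mask others, so errors in certain condition values can be missed. To check all condition values thoroughly, the compound condition expression in a decision has to be decomposed into a flow graph of several nested simple decisions; then every condition can be checked effectively.
Condition combination coverage: run enough test cases that every possible combination of conditions in each decision occurs at least once. This is a fairly strong criterion that effectively checks whether each combination of condition values is handled correctly. It covers all combinations of condition values and all branches of every decision, but some paths may still be missed, so testing is still not complete.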
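The remark that condition coverage does not necessarily satisfy decision coverage can be checked with a tiny Python sketch. The decision `a and b` and the chosen test set are assumptions picked only to demonstrate the point.

```python
def decision(a: bool, b: bool) -> bool:
    """A compound decision with two conditions (hypothetical example)."""
    return a and b


# Each condition takes both True and False across these two tests,
# so condition coverage is 100%.
condition_tests = [(True, False), (False, True)]

a_values = {a for a, _ in condition_tests}
b_values = {b for _, b in condition_tests}
outcomes = {decision(a, b) for a, b in condition_tests}

assert a_values == {True, False}
assert b_values == {True, False}
# ...yet the decision itself is never True, so decision coverage is only 50%.
assert outcomes == {False}
```

Adding the case (True, True) would reach the missing outcome, which is what decision/condition coverage asks for.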

Control structure testing / white-box testing:

Basis path testing / branch testing
Condition testing
Data flow testing
Loop testing

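Basis path testing is a method that, working from the program's control flow graph, analyzes the cyclomatic complexity of the control structure, derives a basis set of executable paths, and designs test cases from that set. The resulting test cases must ensure that every executable statement in the program is executed at least once.

It consists of the following four steps and one tool method: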

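Control flow graph of the program: a graphical notation for describing the program's control flow.
Cyclomatic complexity of the program: the McCabe complexity measure. The cyclomatic complexity gives the number of independent paths in the program's basis path set, which is an upper bound on the number of test cases needed to ensure every executable statement is executed at least once.
Derive test cases: design the input data and expected results from the cyclomatic complexity and the program structure.
Prepare test cases: ensure that every path in the basis path set is executed.

Tool method:

1. Graph matrix: a software aid for basis path testing that makes it possible to determine a basis path set automatically.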

1) Control flow graph

Figure 3: Flow graph notation

2. Independent paths

Independent path: a path that moves along at least one new edge.

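Step 2: Compute the cyclomatic complexity

Cyclomatic complexity is a software metric that gives a quantitative measure of a program's logical complexity. Used to count the independent paths in a program's basis set, it provides an upper bound on the number of tests needed to ensure every statement is executed at least once.

An independent path must contain at least one edge that has not been traversed by any previously defined path.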

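There are three ways to compute the cyclomatic complexity:

The number of regions in the flow graph equals the cyclomatic complexity;
The cyclomatic complexity V(G) of a flow graph G is defined as V(G) = E - N + 2, where E is the number of edges and N is the number of nodes in the flow graph;
The cyclomatic complexity V(G) of a flow graph G is defined as V(G) = P + 1, where P is the number of decision nodes in the flow graph G.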
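The two formulas can be cross-checked on a small flow graph. The sketch below uses a made-up graph for a nested IF (nodes 1 and 2 are the decisions) and assumes a connected graph with a single entry and exit in which every decision is binary, which is the condition under which P + 1 matches E - N + 2.

```python
# Hypothetical flow graph of a nested IF, as (from_node, to_node) edges.
# Node 1: outer IF, node 2: inner IF, node 7: exit.
EDGES = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 5), (4, 5), (5, 7), (6, 7)]


def cyclomatic_by_edges(edges: list) -> int:
    """V(G) = E - N + 2."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


def cyclomatic_by_predicates(edges: list) -> int:
    """V(G) = P + 1, where P counts nodes with two or more outgoing edges."""
    out_degree = {}
    for source, _target in edges:
        out_degree[source] = out_degree.get(source, 0) + 1
    predicates = sum(1 for degree in out_degree.values() if degree >= 2)
    return predicates + 1


print(cyclomatic_by_edges(EDGES), cyclomatic_by_predicates(EDGES))
```

For this graph E = 8, N = 7 and P = 2, so both formulas give 3, which matches the three test cases used for the nested IF earlier.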

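Step 3: Derive test cases

Using the calculation methods above, four independent paths can be derived. (An independent path is a path through the program that, compared with the other independent paths, introduces at least one new processing statement or one new decision. The value of V(G) equals the number of independent paths in the program.)

Step 4: Prepare test cases

To ensure that every path in the basis path set is executed, choose suitable data, based on the conditions at the decision nodes, so that each path is exercised.

Note that some independent paths are often not completely isolated; sometimes a path is part of the program's normal control flow, in which case it can be tested as part of the test of another path.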

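4) Tool method: graph matrix

Deriving the control flow graph and determining the basis test paths both need to be mechanized. A data structure called a graph matrix is useful for building software tools that assist basis path testing.

A graph matrix makes it possible to determine a basis path set automatically. A graph matrix is a square matrix

- whose number of rows and columns equals the number of nodes in the control flow graph, with each row and each column corresponding to an identified node, and
- whose entries correspond to the connections (edges) between nodes.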

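By adding a link weight to each matrix entry, the graph matrix can be used to evaluate the program's control structure during testing; the link weights provide additional information about the control flow. In the simplest case the link weight is 1 (a connection exists) or 0 (no connection exists), but link weights can also be given more interesting properties:

- The probability that a link (edge) will be executed.
- The processing time spent traversing a link.
- The memory required while traversing a link.
- The resources required while traversing a link.

Figure 9: Graph matrix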

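A link weight of 1 means that a connection exists. If a row of the matrix contains two or more 1 entries, the node represented by that row must be a decision node, so counting the rows of the connection matrix that contain two or more 1 entries gives another way to determine the cyclomatic complexity of the graph.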
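A graph matrix and the row-based complexity calculation can be sketched in Python as follows. The flow graph is a made-up nested IF with nodes numbered from 1. The calculation follows the commonly described variant: for every row with two or more connections, add (connections - 1), then add 1 to the sum. For binary decisions this equals the number of such rows plus 1.

```python
# Hypothetical flow graph of a nested IF, as (from_node, to_node) edges.
EDGES = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 5), (4, 5), (5, 7), (6, 7)]
NODE_COUNT = 7


def build_graph_matrix(edges: list, node_count: int) -> list:
    """Square matrix; entry [i][j] is 1 if node i+1 connects to node j+1."""
    matrix = [[0] * node_count for _ in range(node_count)]
    for source, target in edges:
        matrix[source - 1][target - 1] = 1
    return matrix


def cyclomatic_from_matrix(matrix: list) -> int:
    """Sum (connections - 1) over rows with 2+ connections, then add 1."""
    total = 0
    for row in matrix:
        connections = sum(row)
        if connections >= 2:          # this row is a decision node
            total += connections - 1
    return total + 1


matrix = build_graph_matrix(EDGES, NODE_COUNT)
decision_rows = [i + 1 for i, row in enumerate(matrix) if sum(row) >= 2]
print(decision_rows, cyclomatic_from_matrix(matrix))
```

Here rows 1 and 2 are the decision nodes and the complexity comes out as 3. Replacing the 1 entries with probabilities or costs would turn the same matrix into the weighted form described above.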