A/B Testing (A/B Test)

What is A/B testing?

A/B testing is a method for optimizing a product. Two schemes (for example, two versions of a page) are developed for the same optimization goal. One portion of users is served scheme A (the control group), while another portion is served scheme B (the variation or experimental group). Metrics such as conversion rate, traffic, and retention are then collected and compared across the schemes to judge which option is better and to support a decision.

 

The nature of A/B testing:

A/B testing uses sample data from the control version and the test version to examine whether the two underlying populations differ. In essence it is hypothesis testing, typically an independent two-sample t-test.

 

Null hypothesis: the population parameter of the test version is less than or equal to the population parameter of the control version.

Alternative hypothesis: the population parameter of the test version is greater than the population parameter of the control version.
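The one-sided hypotheses above can be checked numerically. The following is a minimal sketch rather than a definitive implementation: it assumes the compared metric is a conversion rate, and it swaps the t-test for a pooled two-proportion z-test, a common large-sample approximation that needs only the Python standard library. The function name and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(conv_control, n_control, conv_test, n_test):
    """Test H0: p_test <= p_control against H1: p_test > p_control.

    Returns the z statistic and the one-sided p-value.
    Assumes large samples and a conversion-rate (binary) metric.
    """
    p_control = conv_control / n_control
    p_test = conv_test / n_test
    # Pooled rate under the null hypothesis of no improvement.
    pooled = (conv_control + conv_test) / (n_control + n_test)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_test))
    z = (p_test - p_control) / std_err
    # Upper-tail probability, because H1 says the test version is better.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical data: 100/1000 conversions on control, 130/1000 on test.
z, p_value = one_sided_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

If the p-value falls below the chosen significance level (0.05 is a common choice), the null hypothesis is rejected and the test version is considered better than the control.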

 

Two approaches to A/B testing:

1. Consecutive testing: version A is served to all users in the first stage, and version B is served in the second stage.

The advantage is that it is easy to deploy and track, since users do not need to be split. The drawback is that the results are less accurate than those of synchronous testing, because many uncontrollable changes can occur over the course of the test.

2. Synchronous testing: during the same period, users are split across the different schemes, with some users served version A and others served version B.

The advantage is that it is more accurate than consecutive testing. The drawback is that it is more complex to implement: valid test groups must be selected and a traffic-splitting plan must be designed.

 

In general, A/B testing refers to the synchronous testing approach.

 

Steps of A/B testing:

1. Through data analysis, identify possible problems in the existing product and propose a hypothesis for improving it. Example: "Suppose the registration flow currently uses an image CAPTCHA; switching to an SMS verification code may raise the registration conversion rate."

2. Define the optimization goal and the metrics used for comparison. Goals should be quantifiable, actionable, and small enough to map to a specific feature. For example: "By optimizing the registration flow, increase the registration conversion rate by 20%."

3. Design the optimized scheme and finish developing the new version.

4. Determine the length of the test.

5. Determine the traffic-splitting plan (the share of traffic each test version receives).

6. Release the test online, splitting traffic according to the chosen proportions.

7. Collect the experimental data and evaluate both the validity of the test and the effect of the change.

8. Depending on the results, there are several possible outcomes: ① release the new version; ② adjust the split ratio and continue testing; ③ if the target effect was not reached, keep iterating on the scheme, then redevelop and relaunch the test.

 

Other common questions:

1. How long should the test run?

The test should not be too short. Users who enter a new scheme are often more active at first out of curiosity; over time they settle down and the data regresses to its normal level. If the observation window ends too early, it is easy to draw the wrong conclusion. An adaptation period of 2-3 days after enough users have joined the test is usually appropriate. Beyond the adaptation period, the test length depends on the required sample size and on the cycle of user behavior. For example, purchases on e-commerce platforms follow a strong weekly cycle, with weekends differing markedly from weekdays, so the test should cover at least one complete cycle, that is, more than 1 week.

The test should not be too long either. A/B testing runs multiple versions online at once, so the production system has to keep all of them available, and a long-running test inevitably adds to system complexity.

 

2. How should users be split?

Splitting is a form of sampling, and it should guarantee simultaneity, homogeneity, uniqueness, and stability.

① Simultaneity: the versions should receive traffic during the same period.

② Homogeneity: the resulting user groups should be similar along every dimension. Users can be grouped by device characteristics (e.g. phone model, operating system version, phone language) and by other user labels (e.g. gender, age, new vs. returning user, membership level), and each A/B test can target a specific user group.

③ Uniqueness: a user must not be counted in the test more than once.

④ Stability: each user should always be assigned to the same experimental version. This keeps the user experience consistent and lets users behave stably once they have adapted to the new version.
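One common way to get uniqueness and stability is to derive the assignment from a hash of the user ID instead of storing a random draw. The following is a minimal sketch of that idea; the function name, the experiment key, and the 50/50 weights are hypothetical, and a production system would also need the homogeneity checks described above.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("A", "B"),
                   weights=(0.5, 0.5)):
    """Deterministically map a user to one variant of an experiment.

    The same (experiment, user_id) pair always yields the same variant,
    so a user is never placed in two groups of one experiment.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Turn the first 32 bits of the hash into a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    cumulative = 0.0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if bucket < cumulative:
            return variant
    return variants[-1]  # Guards against floating-point rounding.


# Hypothetical usage: the assignment is repeatable across calls.
print(assign_variant("user-42", "signup-sms-code"))
print(assign_variant("user-42", "signup-sms-code"))
```

Including the experiment name in the hashed key means the same user can land in different groups across different experiments, which avoids carrying one experiment's split into the next.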

 

3. What is an A/A test?

An A/A test splits the original version's traffic again into two streams, both of which are served the same original version. It is used to check whether the two experimental groups start at the same level, which validates the event tracking (instrumentation), traffic splitting, and experiment statistics, and thereby increases the reliability of A/B test conclusions. If the A/A test shows no significant difference, the experimental setup can be considered valid, and the comparison between the new and old versions can then be judged.
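An A/A check can be sketched as a two-sided significance test between the two identical groups. The example below is an illustration only: it assumes a conversion-rate metric and uses a pooled two-proportion z-test as a large-sample approximation. The function name, the 0.05 threshold, and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def aa_test_passes(conv_1, n_1, conv_2, n_2, alpha=0.05):
    """Return True when two A/A groups show no significant difference.

    Two-sided test: a deviation in either direction is suspicious,
    since both groups were served the same version.
    """
    pooled = (conv_1 + conv_2) / (n_1 + n_2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if std_err == 0:
        # Both groups converted at exactly 0% or 100%: no difference.
        return True
    z = (conv_1 / n_1 - conv_2 / n_2) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value >= alpha


# Hypothetical counts from two streams of the same original version.
print(aa_test_passes(100, 1000, 104, 1000))  # small gap
print(aa_test_passes(100, 1000, 160, 1000))  # large gap
```

A failed A/A check points to a problem in tracking or splitting rather than in the product, so it is worth resolving before any A/B result is trusted. Note that with a 0.05 threshold, a correctly configured setup is still expected to fail this check about 5% of the time by chance.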

 

4. Can an A/B test compare only two schemes at a time?

A/B testing is not limited to scheme A and scheme B. A single test can include multiple versions (A/B/C/D/E/...), as long as only a single variable is changed. Take button color as an example: red, orange, yellow, green, cyan, blue, and purple can be tested as seven schemes at the same time. But if one scheme also adds another button, then even a significant difference in the results cannot be attributed to a specific cause.

 

References:

https://zhuanlan.zhihu.com/p/68019926

http://www.appadhoc.com/blog/ab-test-in-dianrong/

Origin www.cnblogs.com/HuZihu/p/11178068.html