[Data Analysis] A/B Testing

Reprinted Source: https://mp.weixin.qq.com/s/PQqPghR2-5GsL8px9Y9WDg

 

Foreword

The importance of A/B testing needs little explanation: almost everyone working in data or product roles knows it. A good data scientist, I think, knows that understanding the business matters more than the model, and the A/B test is a powerful tool that accompanies business growth.

If in your mind A/B testing has little to do with the central limit theorem, hypothesis testing, the z distribution, or the t distribution, this article is recommended reading.

Contents of this article:

What an A/B test is

How an A/B test works

Why run A/B tests

The A/B test process (a common interview question)

A simple A/B test example (implemented in Python)

Points to note in A/B testing

Statistical knowledge needed for A/B testing

 

1, What an A/B test is

An A/B test (also known as a split test or bucket test) is a method of comparing two versions of a web page or application against each other to determine which one performs better. It is essentially an experiment in which two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variant performs better for a given conversion goal (for example, a metric such as CTR).

 

2, How an A/B test works

In an A/B test, you take a web page or application screen and modify it to create a second version of the same page. The change can be as simple as a single headline or button, or it can be a complete redesign of the page. Then half of the traffic is shown the original version of the page (called the control) and the other half is shown the modified version (called the variant).
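The 50/50 split described above is often done deterministically, so that a returning user keeps seeing the same version. The following is a minimal sketch of one common approach (hash-based bucketing), not something the original article prescribes; the function name assign_group, the experiment label "button_color", and the user-N ids are all hypothetical.

```python
import hashlib

def assign_group(user_id: str, experiment: str = "button_color",
                 variant_share: float = 0.5) -> str:
    """Deterministically map a user to 'control' or 'variant' (hypothetical helper)."""
    # Hash the experiment name together with the user id so the same user
    # always lands in the same group for this experiment.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "variant" if bucket < variant_share else "control"

# Made-up user ids, used only to check that the split is close to 50/50.
groups = [assign_group(f"user-{i}") for i in range(10_000)]
variant_fraction = groups.count("variant") / len(groups)
print(f"share of users in variant: {variant_fraction:.3f}")
```

Because the assignment depends only on the user id and the experiment name, no lookup table is needed to remember which version a user saw.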

 

 

When users visit the page, for example one version with a gray button (control) and one with a red button (variant), their click behavior is collected through event tracking and analyzed by a statistical engine (the A/B test itself). You can then determine whether the change (variant) had a positive, negative, or no effect on a given metric (here, the user click-through rate, CTR).

The experimental data results might look like this:
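As a purely hypothetical illustration (the visitor and click counts below are invented for this sketch and do not come from any real experiment), such a result summary could be tabulated like this:

```python
# Hypothetical counts, invented purely for illustration.
results = {
    "control": {"visitors": 10_000, "clicks": 1_000},
    "variant": {"visitors": 10_000, "clicks": 1_150},
}

# Click-through rate per group: clicks divided by visitors.
ctr = {name: r["clicks"] / r["visitors"] for name, r in results.items()}

print(f"{'group':<10}{'visitors':>10}{'clicks':>8}{'CTR':>8}")
for name, r in results.items():
    print(f"{name:<10}{r['visitors']:>10}{r['clicks']:>8}{ctr[name]:>8.2%}")

# Relative lift of the variant over the control.
relative_lift = (ctr["variant"] - ctr["control"]) / ctr["control"]
print(f"relative lift of variant over control: {relative_lift:.1%}")
```

A table like this only describes the observed difference. Whether that difference is statistically significant is a separate question, which is what the hypothesis test is for.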

 

3, Why run A/B tests

A/B testing lets individuals, teams, and companies make careful changes to their user experience while collecting data on the results. This allows them to construct hypotheses and to better understand why certain elements of the experience affect user behavior. Those hypotheses may be proven wrong: an A/B test can show that an individual's or a team's opinion about the best experience for a given goal does not hold for users. Of course, they may also be proven right.

So A/B testing is not just for settling a one-off disagreement. It can be used continuously to improve the user experience and a given goal, such as conversion rate, over time.

For example, a B2B technology company may want to improve the quality and quantity of sales leads from its campaign landing pages. To achieve this, the team would A/B test changes to the headline, visual imagery, form fields, call to action, and overall page layout.

Testing one change at a time helps them pinpoint which changes affected visitor behavior and which did not. Over time, they can combine the effects of multiple winning changes to demonstrate a measurable improvement of the variant over the control.

 

Product developers and designers can likewise use A/B tests to demonstrate the impact of new features or changes on the user experience. As long as the goals are clearly defined and there is a clear hypothesis, user engagement, product experience, and so on can all be optimized through A/B testing.

 

4, The A/B test process

Determine the goal: the goal is the metric used to decide whether the variant is more successful than the original version. It can be the click-through rate of a button, the open rate of a link to a product purchase, the sign-up rate of an email registration, and so on.

Create variants: make the desired changes to an element of the original version of the site. The change might be the color of a button, the order of elements on the page, hiding navigation elements, or something entirely custom.

Generate hypotheses: once the goal is set, you can start generating A/B test ideas and hypotheses about why they should outperform the current version, to be checked by statistical analysis.

Collect data: for the designated area the hypothesis concerns, collect the corresponding data for the A/B test analysis.

Run the test: at this point, visitors to the site or application are randomly assigned to the control or the variant. Their interaction with each experience is measured, counted, and compared to determine how each experience performs.

Analyze the results: after the experiment is complete, the results can be analyzed. The A/B test analysis shows whether there is a statistically significant difference between the two versions.
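For a rate metric such as CTR, the analysis step can be sketched with a two-proportion z-test. This is one common choice and an assumption of this sketch, not a method named by the article; the helper two_proportion_ztest and the click counts are hypothetical, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(clicks_a, n_a, clicks_b, n_b):
    """Return (z, one-sided p-value) for H1: rate_b > rate_a (hypothetical helper)."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    # Pooled rate under H0, where both versions share the same true rate.
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical counts: A is the control, B is the variant.
z, p_value = two_proportion_ztest(clicks_a=1_000, n_a=10_000,
                                  clicks_b=1_150, n_b=10_000)
alpha = 0.05
print(f"z = {z:.3f}, one-sided p = {p_value:.5f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The normal approximation behind this test relies on the central limit theorem, so it is only reasonable when both groups are large.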

 

Whatever the outcome, the test results should be used as a learning experience to generate new hypotheses for future tests, and to keep iteratively optimizing the user experience of the application's elements or the site.

 

5, A simple A/B test example (implemented in Python)

Background of the example:

A company's "guess what you want to watch" service is adopting a new recommendation algorithm. After the new recommendation strategy has been developed, its merits must be assessed before it goes live on full traffic, and the assessment method is an A/B test. Concretely, two small samples are drawn from the full traffic: one goes through the branch with the new recommendation strategy, and the other through the branch with the old strategy. By comparing the metric on these two traffic samples (here, user clicks), we can judge whether the new strategy is better or worse, and then decide whether to roll it out to full traffic.

A/B test steps for the example:

Metric: CTR

Variant: the new recommendation strategy

Hypothesis: the new recommendation strategy brings more user clicks.

Data collection: in the data below, group B is the data produced by the new strategy that we want to verify, and group A is the data produced by the old strategy. The data are fabricated for this example.

Analysis of the results (Python):

In Python, scipy.stats.ttest_ind runs a two-sided t-test on two samples, and the result is straightforward to read. For a one-sided test (greater than or less than), however, some extra processing is needed to get the correct result.

from scipy import stats
import numpy as np

# Group A: old strategy, group B: new strategy (fabricated data)
A = np.array([1, 4, 2, 3, 5, 5, 5, 7, 8, 9, 10, 18])
B = np.array([1, 2, 5, 6, 8, 10, 13, 14, 17, 20, 13, 8])

print('Mean of strategy A:', np.mean(A))
print('Mean of strategy B:', np.mean(B))

Output:
Mean of strategy A: 6.416666666666667
Mean of strategy B: 9.75

Clearly, the mean of strategy B is larger than the mean of strategy A. But does that show that strategy B really brings more business conversions, or is the difference simply due to random factors?

We want to show that the newly developed strategy B performs better, so the null and alternative hypotheses can be set as:

H0: mean(A) >= mean(B)

H1: mean(A) < mean(B)

scipy.stats.ttest_ind(x, y) computes its statistic from x.mean() - y.mean(). To get a positive statistic when B is better, pass B as the first argument:

stats.ttest_ind(B, A, equal_var=False)

Output:
Ttest_indResult(statistic=1.556783470104261, pvalue=0.13462981561745652)

According to the documentation of scipy.stats.ttest_ind(x, y), this is the result of a two-sided test. To obtain a one-sided result, the two-sided p-value must be divided by 2 (the significance threshold used here is 0.05). Halving is valid here because the statistic is positive, that is, it points in the direction of H1.

With pvalue = 0.13462981561745652, p / 2 ≈ 0.067 > alpha (0.05), so we cannot reject the null hypothesis: for now, we cannot conclude that strategy B brings more user clicks.
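The conversion from the two-sided result to a one-sided decision can be written out explicitly. This is a minimal sketch using the same fabricated data; the variable names p_two_sided and p_one_sided are introduced here for illustration.

```python
import numpy as np
from scipy import stats

A = np.array([1, 4, 2, 3, 5, 5, 5, 7, 8, 9, 10, 18])      # old strategy
B = np.array([1, 2, 5, 6, 8, 10, 13, 14, 17, 20, 13, 8])  # new strategy

alpha = 0.05
# Welch's t-test (equal_var=False); B first so a positive t favors B.
t_stat, p_two_sided = stats.ttest_ind(B, A, equal_var=False)

# H1 is mean(B) > mean(A). Halving the two-sided p-value is only valid when
# the statistic points in the direction of H1 (t > 0); otherwise the
# one-sided p-value is the complementary tail.
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2

print(f"t = {t_stat:.4f}, one-sided p = {p_one_sided:.4f}")
print("reject H0" if p_one_sided < alpha else "fail to reject H0")
```

This prints t = 1.5568 and a one-sided p of 0.0673, so H0 is not rejected at the 0.05 level. SciPy 1.6 and later also accept alternative='greater' in ttest_ind, which returns the one-sided p-value directly.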

 

6, Points to note in A/B testing

1, Test first: run the experiment at low cost on a small share of traffic, and only then roll out to all users.

2, Parallelism: when different versions or different schemes are being validated, make sure all other conditions are kept the same.

3, Sound traffic splitting and sound data analysis: sound splitting means that the users assigned to groups A and B are comparable; sound data analysis means that A/B test decisions are not made directly from the average conversion rate or average click-through rate, but are reached through confidence intervals, hypothesis testing, and the degree of convergence.
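One simple sanity check on the split in point 3 is to test whether the observed group sizes match the intended allocation. The sketch below uses a chi-square goodness-of-fit test with one degree of freedom. It is an illustrative check chosen for this example, not a procedure from the article; the helper name split_balance_pvalue and the group sizes are hypothetical.

```python
from math import erfc, sqrt

def split_balance_pvalue(n_a: int, n_b: int, expected_share_a: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for group sizes."""
    total = n_a + n_b
    exp_a = total * expected_share_a
    exp_b = total - exp_a
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))

# Hypothetical group sizes for an intended 50/50 split.
p_ok = split_balance_pvalue(5_020, 4_980)   # small imbalance
p_bad = split_balance_pvalue(5_300, 4_700)  # large imbalance
print(f"balanced-looking split: p = {p_ok:.3f}")
print(f"suspicious split:       p = {p_bad:.2e}")
```

A very small p-value suggests the assignment itself is biased, in which case comparing the two groups' metrics is not trustworthy. Balanced group sizes alone do not prove the groups are comparable, so this is a necessary check rather than a sufficient one.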

 

7, Statistical knowledge needed for A/B testing

The article so far has described A/B testing from an application point of view. Once the data has been collected and you need to do statistical inference, you may need the knowledge listed below. Space is limited, so it is not covered here; please consult statistics references on your own, for example "Statistics study" (Guyue Ping), the Khan Academy statistics course, and other books and videos.

1, Point estimation

2, Interval estimation

3, The central limit theorem (the core of estimating a population from a sample; compare it with the law of large numbers)

4, Hypothesis testing

Of these, hypothesis testing is the core, and the others help you understand it better. For example, interval estimation can be understood as inferential statistics by direct argument, and hypothesis testing as inferential statistics by contradiction. For hypothesis testing itself, you may also need to know about small-probability events, the t distribution, the z distribution, the chi-square distribution, p-values, alpha (Type I) error, beta (Type II) error, and so on.
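To make the link between interval estimation and hypothesis testing concrete, here is a minimal sketch that computes a 95% confidence interval for mean(B) - mean(A) on the fabricated data from section 5. It assumes Welch's unequal-variance approach, matching equal_var=False in the earlier t-test; the helper name welch_confidence_interval is hypothetical.

```python
import numpy as np
from scipy import stats

A = np.array([1, 4, 2, 3, 5, 5, 5, 7, 8, 9, 10, 18])      # old strategy
B = np.array([1, 2, 5, 6, 8, 10, 13, 14, 17, 20, 13, 8])  # new strategy

def welch_confidence_interval(x, y, confidence=0.95):
    """Confidence interval for mean(x) - mean(y) without assuming equal variances."""
    # Squared standard errors of the two sample means.
    vx, vy = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
    se = np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx**2 / (len(x) - 1) + vy**2 / (len(y) - 1))
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    diff = x.mean() - y.mean()
    return diff - t_crit * se, diff + t_crit * se

low, high = welch_confidence_interval(B, A)
print(f"point estimate of mean(B) - mean(A): {B.mean() - A.mean():.3f}")
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

The interval contains 0, which agrees with the two-sided t-test above not rejecting equal means at the 0.05 level: the point estimate favors B, but the data are too few to rule out no difference.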

 

Summary:

The first four sections of this article are based on a translation of:

https://www.optimizely.com/optimization-glossary/ab-testing/

Parts of it have been modified; in particular, the core of the A/B test process steps was changed. The last three sections are my own takeaways from study and reflection. I hope this helps.


Origin blog.csdn.net/YYIverson/article/details/105078618