10 Basic Skills for Product Managers (5): Building and Analyzing RFM Models with Python (Part 1)

There are a thousand Hamlets in the eyes of a thousand viewers, and a thousand views of Python in the eyes of a thousand product managers!
Life is short, so I use Python for product decision analysis.

Learning a data analysis tool should not eat up a product manager's time, and neither should hunting for one. In this article the author, LineLian, tries to explain in a single piece how to use Python to build an RFM model for data-driven product analysis.

This article first covers the role Python plays in helping product managers analyze products and make more scientific product decisions. It then explains in detail the method and steps for RFM analysis in Python, and finally builds and analyzes an RFM model. The output is a set of product optimization recommendations based on visual analysis in Python.

In addition, product managers in general are not advised to write code, but data product managers and AI product managers must be able to read Python code. After all, large companies such as Tencent have publicly listed an understanding of Python as a requirement in their job descriptions (JDs) for product managers, as shown below:
Job Offers

What is the RFM model?

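In the simplest terms, RFM describes each customer on three dimensions: R (Recency), the time since the last purchase; F (Frequency), how often the customer purchases; and M (Monetary), the total amount the customer has spent.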

The meaning of the RFM model

The role of the RFM model:

The results of RFM analysis help the product team drive operations: they make it possible to formulate suitable promotion and operation plans, and to offer the right products or services to more precisely targeted users.

Prerequisites for RFM analysis:

  1. Customers who have transacted recently are more likely to transact again than customers who have not.
  2. Customers with a high transaction frequency are more likely to transact again than customers with a low transaction frequency.
  3. Customers with a large total transaction amount in the past are more inclined to spend than customers with a small total transaction amount in the past.

How to analyze the RFM model?

There are many ways to analyze an RFM model. The author recommends two tools: Python and Excel. This article is the first of two, so it covers RFM analysis in Python; the next article covers RFM analysis in Excel. Excel suits a small amount of user data, generally up to about 50,000 records. For larger data sets (more than 50,000 user records) or big data, Python is recommended for building the RFM model. Python also works with a small amount of data, of course. You can even build a reusable Python model that outputs useful decision information for product managers no matter how much data is fed into it.

How many steps are there to build and analyze RFM models with Python?

Step 1: Determine the product data source to be analyzed

Source data: if needed, the data set for this article can be obtained by contacting the author through the official account LineLian. If you think this article is well written, you can follow the account for more articles.
The original data set is shown here first, to give an overall impression of the data before it is processed.
source data

As the figure above shows, the data has 9 columns, and the order status column includes refunds.

Step 2: Data cleaning

1. Import the source data above into Python for data cleaning.

2. After importing the source data, delete the rows for refunded orders, then extract the key fields needed for the analysis.
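
The import and refund-removal steps might look like the minimal sketch below, which uses only Python's standard library. The column names (`buyer`, `pay_time`, `amount`, `status`) and the `refunded` label are hypothetical stand-ins, because the real data set's headers are not listed in the text. A dataframe library such as pandas could do the same job in fewer lines.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in place of the real export; the column names and
# the "refunded" label are assumptions, not the article's actual headers.
RAW = """buyer,pay_time,amount,status,city
alice,2019-01-05 10:00:00,120.0,success,Beijing
bob,2019-02-11 09:30:00,35.5,refunded,Shanghai
alice,2019-03-20 18:45:00,80.0,success,Beijing
carol,2019-03-02 12:10:00,15.0,success,Shenzhen
"""

def load_orders(text):
    """Parse CSV text, drop refunded rows, and keep only the RFM fields."""
    orders = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["status"] == "refunded":  # a refund is not a real purchase
            continue
        orders.append({
            "buyer": row["buyer"],
            "pay_time": datetime.strptime(row["pay_time"], "%Y-%m-%d %H:%M:%S"),
            "amount": float(row["amount"]),
        })
    return orders

orders = load_orders(RAW)
print(len(orders), "orders kept")
```

For a real file, the same function could be fed the contents of the exported CSV instead of the inline sample.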

3. Construct the last purchase time R
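
As a rough sketch of this step, R can be computed as the number of days between each buyer's most recent order and a reference date. The sample orders and the reference date are assumptions made for illustration; a common choice for the reference date is the day the data was exported.

```python
from datetime import datetime

# Hypothetical cleaned orders: (buyer, payment time).
orders = [
    ("alice", datetime(2019, 1, 5)),
    ("alice", datetime(2019, 3, 20)),
    ("carol", datetime(2019, 3, 2)),
]

def recency_days(orders, as_of):
    """Days between each buyer's most recent order and the reference date."""
    last_seen = {}
    for buyer, paid_at in orders:
        if buyer not in last_seen or paid_at > last_seen[buyer]:
            last_seen[buyer] = paid_at
    return {buyer: (as_of - paid_at).days for buyer, paid_at in last_seen.items()}

# The reference date is an assumption chosen for this example.
R = recency_days(orders, as_of=datetime(2019, 3, 31))
print(R)
```

A smaller R means a more recent purchase, which is why R is usually scored in the opposite direction from F and M.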

4. Construct purchase frequency F
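
One simple way to sketch F is to count the orders each buyer placed. Whether several orders on the same day should count once or separately is a modelling choice the text does not specify; this example counts every order, and the sample data is hypothetical.

```python
from collections import Counter

# Hypothetical cleaned orders: (buyer, amount).
orders = [
    ("alice", 120.0),
    ("alice", 80.0),
    ("carol", 15.0),
]

def purchase_frequency(orders):
    """Number of orders per buyer."""
    return Counter(buyer for buyer, _amount in orders)

F = purchase_frequency(orders)
print(dict(F))
```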

5. Count the purchase amount M
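
M can be sketched as the sum of order amounts per buyer. The sample below is hypothetical and assumes refunded orders have already been removed.

```python
from collections import defaultdict

# Hypothetical cleaned orders: (buyer, amount).
orders = [
    ("alice", 120.0),
    ("alice", 80.0),
    ("carol", 15.0),
]

def purchase_amount(orders):
    """Total spend per buyer."""
    totals = defaultdict(float)
    for buyer, amount in orders:
        totals[buyer] += amount
    return dict(totals)

M = purchase_amount(orders)
print(M)
```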

6. Merge RFM
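
Merging amounts to joining the three per-buyer results on the buyer key. A minimal sketch, assuming R, F, and M are already available as dictionaries keyed by a hypothetical buyer ID:

```python
# Hypothetical per-buyer metrics from the three previous steps.
R = {"alice": 11, "carol": 29}
F = {"alice": 2, "carol": 1}
M = {"alice": 200.0, "carol": 15.0}

def merge_rfm(R, F, M):
    """Join the three metrics into one record per buyer (inner join on buyer)."""
    buyers = R.keys() & F.keys() & M.keys()
    return {b: {"R": R[b], "F": F[b], "M": M[b]} for b in sorted(buyers)}

rfm = merge_rfm(R, F, M)
for buyer, row in rfm.items():
    print(buyer, row)
```

Because all three metrics come from the same cleaned orders, the keys normally match exactly; the inner join is just a safeguard.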

Step 3: Confirm the dimension table used to segment and score users

Step 4: Calculate the RFM-SCORE score

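Calculate the R score first, then the F and M scores, and compare each score with its average. Reducing every dimension to "above average" or "not above average" keeps the number of user categories small: with two levels per dimension there are at most 2 × 2 × 2 = 8 categories. Then segment the users and build the combined indicator.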
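
The scoring could be sketched as follows. The bin edges, the 1 to 5 scale, and the sample users are all assumptions made for illustration; real thresholds depend on the product and its data.

```python
from bisect import bisect_left
from statistics import mean

# Hypothetical bin edges; each list splits a metric into five bands.
R_EDGES = [30, 60, 90, 180]    # days since last purchase
F_EDGES = [1, 2, 3, 5]         # number of orders
M_EDGES = [50, 100, 200, 500]  # total spend

def score_rfm(r_days, f_orders, m_spend):
    """Return (R, F, M) scores from 1 to 5; a recent purchase scores high."""
    r = 5 - bisect_left(R_EDGES, r_days)    # fewer days -> higher score
    f = 1 + bisect_left(F_EDGES, f_orders)  # more orders -> higher score
    m = 1 + bisect_left(M_EDGES, m_spend)   # more spend -> higher score
    return r, f, m

# Hypothetical merged RFM values per user: (R days, F orders, M spend).
rfm = {
    "alice": (11, 2, 200.0),
    "bob": (95, 1, 40.0),
    "carol": (29, 6, 650.0),
    "dave": (200, 1, 520.0),
}

scores = {buyer: score_rfm(*values) for buyer, values in rfm.items()}
averages = tuple(mean(s[i] for s in scores.values()) for i in range(3))
print(scores)
print(averages)
```

Each user's three scores are then compared against these averages to decide which of the eight categories the user falls into.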

Step 5: Count the number of people and amount

1. Count the number of people

2. Total the amount
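
Both tallies in this step can be sketched in one pass over the segmented users. The segment labels and figures below are hypothetical sample data, not results from the article's data set.

```python
from collections import defaultdict

# Hypothetical per-user results: (segment label, total spend).
users = {
    "alice": ("general user to develop", 200.0),
    "bob": ("lost user", 40.0),
    "carol": ("important value user", 650.0),
    "dave": ("high-consumption user to recall", 520.0),
    "erin": ("lost user", 25.0),
}

def summarize(users):
    """Return {segment: {"people": head count, "amount": total spend}}."""
    summary = defaultdict(lambda: {"people": 0, "amount": 0.0})
    for segment, spend in users.values():
        summary[segment]["people"] += 1
        summary[segment]["amount"] += spend
    return dict(summary)

summary = summarize(users)
for segment, stats in sorted(summary.items()):
    print(segment, stats["people"], stats["amount"])
```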

Step 6: Construct the conversion function

Determine whether each of the R, F, and M values is greater than its average, then look the result up in the user segmentation dimension table from Step 3 to determine the customer type.
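
A conversion function along these lines might look like the sketch below. The dimension table here is a hypothetical one modelled on commonly used RFM segment names; it is not the author's table, and the sample scores and averages are made up for illustration.

```python
# Hypothetical dimension table keyed by (R high?, F high?, M high?).
SEGMENTS = {
    (1, 1, 1): "important value user",
    (0, 1, 1): "important user to retain",
    (1, 0, 1): "important user to develop",
    (0, 0, 1): "high-consumption user to recall",
    (1, 1, 0): "general value user",
    (0, 1, 0): "general user to retain",
    (1, 0, 0): "general user to develop",
    (0, 0, 0): "lost user",
}

def customer_type(scores, averages):
    """Flag each score as above average (1) or not (0), then look up the type."""
    key = tuple(int(score > avg) for score, avg in zip(scores, averages))
    return SEGMENTS[key]

# Hypothetical averages of the R, F, and M scores across all users.
averages = (3.25, 2.25, 3.5)
print(customer_type((1, 1, 5), averages))  # long absence, few orders, high spend
print(customer_type((2, 1, 1), averages))  # below average on every dimension
```

Keeping the table as data rather than as a chain of if/else branches makes it easy to rename or regroup segments later.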

Step 7: Get the RFM result in Python

Step 8: Visualize the data

1. First, visualize the number of people in each segment and each segment's share of all users.

2. Then visualize the consumption amount of each segment and each segment's share of total consumption.
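
Both charts plot the same kind of quantity: a per-segment total and its share of the whole. The sketch below computes the shares and prints plain text bars so that it runs without extra dependencies; in practice a plotting library such as matplotlib would usually draw the bar or pie charts. The segment names and figures are hypothetical sample data.

```python
def shares(totals):
    """Convert {segment: value} into {segment: percentage of the grand total}."""
    grand_total = sum(totals.values())
    return {segment: 100 * value / grand_total for segment, value in totals.items()}

def text_bars(percentages, width=40):
    """Render one text bar per segment, largest share first."""
    rows = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    return "\n".join(
        f"{segment:<32} {'#' * round(width * pct / 100)} {pct:.2f}%"
        for segment, pct in rows
    )

# Hypothetical per-segment head counts and consumption amounts.
people = {
    "lost user": 2,
    "important value user": 1,
    "high-consumption user to recall": 1,
}
amount = {
    "lost user": 65.0,
    "important value user": 650.0,
    "high-consumption user to recall": 520.0,
}

people_share = shares(people)
amount_share = shares(amount)
print(text_bars(people_share))
print(text_bars(amount_share))
```

Comparing the two outputs side by side shows at a glance when a segment is large by head count but small by spend, or the reverse.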

Step 9: Product or operations staff interpret the RFM charts produced in Python

1. The analysis above shows that lost users make up a relatively high share, 54.13%. Lost users are those whose last purchase was a long time ago, who spent little, and who placed few orders. This indicates that the product has achieved a certain degree of user acquisition, but its retention rate is relatively low. The next focus of the product should be to design for the needs of lost customers.
2. The analysis above also shows that customers with high consumption who are waiting to be recalled account for a relatively high share, 68.49%. These are users who have spent a lot but have not come back to spend again for a long time, and they are already close to churning. For this group, the product can drive operations to improve the reach of information about products and services, so that users truly feel the warmth of the service and the product.

Summary:

The basic logic of LineLian's series of articles for product managers is: first write about the product manager, then about the data product manager, and finally about the AI product manager. This is a progressive process. A product gradually accumulates data, and that data needs to be analyzed. Once the data is being analyzed to optimize the product, the data product manager is born. But data alone is not enough. Data needs intelligence, and intelligence calls for AI product managers.

Remarks: If you found this article useful and want to practice with the data source, share the article to your WeChat Moments first, then request the data source through the WeChat official account.

Origin blog.csdn.net/weixin_42457814/article/details/105154740