Credit card customer risk analysis and evaluation

1. Dealing with outliers in credit card data
1. Training points
(1) Become familiar with the basic business knowledge of credit cards.
(2) Master the methods for identifying and handling outliers.
2. Requirements description
To promote the sound development of the credit card business and reduce the risk of bad debts, major banks have worked on credit card customer risk identification and built corresponding customer risk identification models. One bank needs to rebuild its model because, over time, the old risk identification model no longer meets the needs of business development. The description of the credit card data currently provided by the bank is shown in Table 7-11.
3. Implementation ideas
(1) Read credit card data.
(2) Discard records in which overdue, bad debt, forced card suspension, refund, or refusal record is 1 while defective account is 2.
(3) Discard records in which bad debt, forced card suspension, or refund is 1 while refusal record is 2.
(4) Discard records whose frequency is 5 and whose card swiping amount is not equal to 1.
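
The filtering steps above can be sketched with pandas as follows. This is only an illustration under assumptions: the column names are hypothetical (the real ones are defined in Table 7-11, which is not reproduced here), the toy values are made up, and steps (2) and (3) are read as "any of the listed columns equals 1".

```python
import pandas as pd

# Toy data with hypothetical column names and made-up values.
# The real columns are described in Table 7-11.
toy = pd.DataFrame({
    "overdue":      [1, 2, 2, 2, 2],
    "bad_debt":     [2, 2, 1, 2, 2],
    "forced_stop":  [2, 2, 2, 2, 2],
    "refund":       [2, 2, 2, 2, 2],
    "refusal":      [2, 2, 2, 2, 1],
    "defective":    [2, 2, 1, 2, 1],
    "frequency":    [3, 2, 4, 5, 1],
    "swipe_amount": [2, 3, 1, 4, 1],
})


def drop_inconsistent(df):
    """Drop rows that match any of the three discard rules."""
    history = ["overdue", "bad_debt", "forced_stop", "refund", "refusal"]
    # Step (2): any history column is 1 while defective account is 2
    rule2 = df[history].eq(1).any(axis=1) & df["defective"].eq(2)
    # Step (3): bad debt, forced suspension, or refund is 1 while refusal is 2
    serious = ["bad_debt", "forced_stop", "refund"]
    rule3 = df[serious].eq(1).any(axis=1) & df["refusal"].eq(2)
    # Step (4): frequency is 5 and card swiping amount is not 1
    rule4 = df["frequency"].eq(5) & df["swipe_amount"].ne(1)
    return df.loc[~(rule2 | rule3 | rule4)]


cleaned = drop_inconsistent(toy)
print(cleaned)
```

In this toy table, rows 0, 2 and 3 each match one rule, so only rows 1 and 4 remain. Building each rule as a separate boolean mask makes it easy to count how many records each rule removes before combining them.
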
2. Construct key features of credit card customer risk assessment

1. Training points
(1) Master the principles of the credit card model.
(2) Construct the key features for credit card customer risk analysis.
2. Requirements description
In credit-card-related credit investigation, a customer's credit rating is judged mainly along three dimensions. The first is the customer's historical credit risk, that is, the customer's credit history, including whether the customer has overdue, bad debt, or forced card suspension records. The second is the customer's current economic situation, which considers loan balance, personal monthly income, personal monthly expenses, monthly family income, and monthly card swiping amount, all of which are closely related to personal economic level. The third is the customer's future economic income and the stability of current income: customers differ in occupation, age, and real estate, so their economic stability differs as well.
3. Implementation ideas and steps
(1) Construct the historical behavior feature from the defective account, overdue, bad debt, forced card suspension, refund, and refusal record features.
(2) Construct the economic risk feature from the loan balance, personal monthly income, personal monthly expenses, family monthly income, and monthly card swiping amount features.
(3) Construct the income risk feature from the occupation, age, and residence features.
(4) Standardize the historical behavior, economic risk, and income risk features.
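
A minimal sketch of steps (1) to (4) is given below. The source does not say how the raw columns are combined, so the row-wise sum used here is an assumption, as are the column names, the toy values, and the idea that every column is already an integer-coded level. Standardization is the usual z-score with the population standard deviation.

```python
import pandas as pd

# Toy records; column names and values are hypothetical.
toy = pd.DataFrame({
    "defective": [1, 2, 2, 1], "overdue": [1, 2, 1, 2],
    "bad_debt": [2, 2, 1, 2], "forced_stop": [2, 2, 2, 1],
    "refund": [2, 2, 2, 2], "refusal": [2, 2, 1, 2],
    "loan_balance": [1, 3, 2, 4], "personal_income": [2, 3, 1, 4],
    "personal_expenses": [1, 2, 2, 3], "family_income": [2, 4, 1, 3],
    "swipe_amount": [1, 2, 3, 4],
    "occupation": [1, 2, 3, 2], "age": [2, 3, 1, 4],
    "residence": [1, 2, 2, 3],
})

# Which raw columns feed each of the three constructed features
GROUPS = {
    "historical_behavior": ["defective", "overdue", "bad_debt",
                            "forced_stop", "refund", "refusal"],
    "economic_risk": ["loan_balance", "personal_income",
                      "personal_expenses", "family_income", "swipe_amount"],
    "income_risk": ["occupation", "age", "residence"],
}


def build_features(df):
    # Assumption: each feature is the row-wise sum of its coded columns
    return pd.DataFrame({name: df[cols].sum(axis=1)
                         for name, cols in GROUPS.items()})


def standardize(features):
    # z-score per column, using the population standard deviation (ddof=0)
    return (features - features.mean()) / features.std(ddof=0)


features = build_features(toy)
std = standardize(features).to_numpy()
print(std)
```

After standardization every column has mean 0 and standard deviation 1, so no single feature dominates the distance computation in the clustering step that follows.
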
3. Construct K-Means clustering model
1. Training points
(1) Master the application of the K-Means clustering algorithm.
(2) Master the method of analyzing clustering results.
2. Requirements description
Building a model to identify high-risk credit card customers can be divided into two parts. The first part groups customers by clustering on the three constructed features. The second part analyzes the characteristics of each customer group in light of the business, assesses its risk, and ranks the customer groups.
3. Implementation ideas and steps
(1) Construct a K-Means clustering model with 5 clusters.
(2) Train the K-Means clustering model, and find the cluster centers and the number of users in each cluster.

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans

# Load the standardized feature matrix
std = np.load("…/tmp/standard.npy")
print(std[:5])
kmeans_model = KMeans(n_clusters=5,random_state=123)
fit_kmeans = kmeans_model.fit(std)
kmeans_model.cluster_centers_

# View the cluster centers and the category label of each sample

print("cluster center\n", kmeans_model.cluster_centers_)
print("category label of the sample\n", kmeans_model.labels_)

# Count the number of samples in each cluster

r1 = pd.Series(kmeans_model.labels_).value_counts()
print('The final number of each category:\n', r1)

#Draw radar charts for clustering results
import matplotlib.pyplot as plt
#Set Chinese display
plt.rcParams['font.sans-serif'] = 'SimHei'
plt.rcParams['axes.unicode_minus'] = False

#Draw radar chart
N = len(kmeans_model.cluster_centers_[0])
print("N value\n",N)

#Set the angles of the radar chart, which divide the circle into N equal parts
angles = np.linspace(0, 2 * np.pi, N, endpoint=False)
print("value of angles\n", angles)
#Repeat the first angle so that the radar chart closes into a loop
angles = np.concatenate((angles, [angles[0]]))
print("value of angles\n", angles)
#Drawing
fig = plt.figure(figsize=(7, 7))
ax = fig.add_subplot(111, polar=True)
colors = ["r", "g", "b", "y", "k"]
feature = ["Historical Behavior Feature", "Economic Risk Feature", "Income Risk Feature"]
for i in range(len(kmeans_model.cluster_centers_)):
    values = kmeans_model.cluster_centers_[i]
    #Repeat the first value so that the polygon closes
    values = np.concatenate((values, [values[0]]))
    print("values\n", values)
    #Legend entry: cluster number and its sample count
    label = "Customer group " + str(i + 1) + ", " + str(r1[i]) + " people"
    #Draw the outline of the cluster center
    ax.plot(angles, values, colors[i], linestyle="-", linewidth=2, markersize=10, label=label)
    #Fill the enclosed area with the same color
    ax.fill(angles, values, color=colors[i], alpha=0.5)
#Add a label for each feature (skip the repeated closing angle)
ax.set_thetagrids(angles[:-1] * 180 / np.pi, feature, fontsize=15)
#Add title
plt.title("Customer group characteristic distribution map")
#Add grid
ax.grid(True)
#Add legend (only the labeled outlines appear in it)
plt.legend()
plt.savefig(".../tmp/customer group characteristic distribution map.png")
plt.show()
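
The second part of the requirement, ranking the customer groups, is not covered by the code above. One possible sketch follows. It runs on synthetic random data rather than the real standard.npy, and it assumes that larger standardized feature values mean higher risk and that the three features are weighted equally. Both assumptions would need to be checked against the business meaning of each feature.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the standardized three-feature matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

features = ["historical_behavior", "economic_risk", "income_risk"]
model = KMeans(n_clusters=5, n_init=10, random_state=123).fit(X)

# One row per cluster: its center and the number of customers in it
summary = pd.DataFrame(model.cluster_centers_, columns=features)
summary["n_customers"] = np.bincount(model.labels_, minlength=5)

# Assumption: higher standardized values mean higher risk, equal weights
summary["risk_score"] = summary[features].sum(axis=1)

# Highest-risk customer group first
ranking = summary.sort_values("risk_score", ascending=False)
print(ranking)
```

The row index of `ranking` is the cluster label, so the table can be read next to the radar chart to see which customer group each polygon corresponds to.
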

Origin blog.csdn.net/qq_31391601/article/details/127412310