Article directory
- Installation Notes
- model training
- Model results
-
- score card
- Each variable type and IV value
- cut point for continuous variables
- Model effect test
- predict
- Supplement - Model Debugging
-
- Variable IV value and binning analysis
- Model effect analysis
- Supplement - Explanation of parameters for packages
Installation Notes
The package has been uploaded to the PYPI official website, see the CreditScoreModel package for details
For the first time, you can directly use the following code to install
pip install CreditScoreModel
model training
from CreditScoreModel.LogisticScoreCard import *
data=pd.read_csv('C:\\Users\\HP\\Desktop\\give me some credit\\data\\cs-training.csv')
data_predict=pd.read_csv('C:\\Users\\HP\\Desktop\\give me some credit\\data\\cs-test.csv')del data['Unnamed: 0']
data.columns=['y','RevolvingUtilizationOfUnsecuredLines', 'age','NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome','NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate','NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse','NumberOfDependents']
del data_predict['Unnamed: 0']ls=logistic_score_card()
data_train, data_test = ls.get_data_train_test(data,test_size=0.25,random_state=1234)
ls.fit(data_train)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\fixes.py:313: FutureWarning: numpy not_equal will not check object identity in the future. The comparison did not return the same result as suggested by the identity (`is`)) and will change._nan_object_mask = _nan_object_array != _nan_object_array
2019 16:15:06 INFO 任务开始。。。
2019 16:15:06 INFO 连续和离散变量划分中。。。
2019 16:15:06 INFO 连续和离散变量划分完成!
2019 16:15:06 INFO 连续变量最优分组进行中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 23.53it/s]
2019 16:15:06 INFO 连续变量最优分组完成!
2019 16:15:06 INFO 根据cut离散化连续变量进行中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:04<00:00, 2.38it/s]
2019 16:15:11 INFO 根据cut离散化连续变量完成!
2019 16:15:11 INFO IV值计算中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 37.45it/s]
2019 16:15:11 INFO IV值计算完成!
2019 16:15:11 INFO WOE转换中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 11.36it/s]
2019 16:15:12 INFO WOE转换完成!
2019 16:15:12 INFO 根据IV值大于 0.1 且 相关性小于 0.6 ,以及l1正则选取变量进行中。。。
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.FutureWarning)
2019 16:15:12 INFO 变量选取完成,总共 10 个变量,最终筛选出 5 个变量
2019 16:15:12 INFO 评分卡制作中。。。
2019 16:15:12 INFO 连续和离散变量划分中。。。
2019 16:15:12 INFO 连续和离散变量划分完成!
2019 16:15:12 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00, 2.36it/s]
2019 16:15:14 INFO 根据cut离散化连续变量完成!
2019 16:15:14 INFO WOE转换中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 11.81it/s]
2019 16:15:15 INFO WOE转换完成!
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.FutureWarning)
2019 16:15:18 INFO 评分卡制作完成!
2019 16:15:18 INFO 任务完成!
Model results
score card
Corresponding variable Chinese name
['Variable name', 'Variable type', 'Splitting point', 'Splitting grouping', 'The number of y is 1', 'The number of y is 0', 'Total', 'The number of y is 1 accounted for Ratio', 'the proportion of the number of y is 0', 'total proportion', 'y is 1 proportion of the total', 'woe', 'each group iv', 'variable iv value', 'logistic parameter col_coef', 'logistic parameter lr_intercept', 'group score']
ls.score_card
col | type | cuts | cut_points | 1_num | 0_num | total_num | 1_pct | 0_pct | total_pct | 1_rate | woe | iv | total_iv | col_coef | lr_intercept | score | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NumberOfTime30-59DaysPastDueNotWorse | continuous | [-inf, 0.0, 1.0, inf] | (-inf, 0.0] | 3803 | 90648 | 94451 | 0.503509 | 0.863750 | 0.839564 | 0.040264 | -0.539683 | 0.194416 | 0.737079 | 0.538172 | -2.598254 | 20.950981 |
1 | NumberOfTime30-59DaysPastDueNotWorse | continuous | [-inf, 0.0, 1.0, inf] | (0.0, 1.0] | 1801 | 10249 | 12050 | 0.238448 | 0.097659 | 0.107111 | 0.149461 | 0.892673 | 0.125679 | 0.737079 | 0.538172 | -2.598254 | -34.654352 |
2 | NumberOfTime30-59DaysPastDueNotWorse | continuous | [-inf, 0.0, 1.0, inf] | (1.0, inf] | 1949 | 4050 | 5999 | 0.258043 | 0.038591 | 0.053324 | 0.324887 | 1.900110 | 0.416983 | 0.737079 | 0.538172 | -2.598254 | -73.763987 |
3 | NumberOfTime60-89DaysPastDueNotWorse | continuous | [-inf, 0.0, inf] | (-inf, 0.0] | 5469 | 101314 | 106783 | 0.724083 | 0.965383 | 0.949182 | 0.051216 | -0.287618 | 0.069402 | 0.570277 | 0.403517 | -2.598254 | 8.371879 |
4 | NumberOfTime60-89DaysPastDueNotWorse | continuous | [-inf, 0.0, inf] | (0.0, inf] | 2084 | 3633 | 5717 | 0.275917 | 0.034617 | 0.050818 | 0.364527 | 2.075741 | 0.500875 | 0.570277 | 0.403517 | -2.598254 | -60.419867 |
5 | NumberOfTimes90DaysLate | continuous | [-inf, 0.0, inf] | (-inf, 0.0] | 4953 | 101333 | 106286 | 0.655766 | 0.965564 | 0.944764 | 0.046601 | -0.386908 | 0.119863 | 0.833081 | 0.528354 | -2.598254 | 14.746114 |
6 | NumberOfTimes90DaysLate | continuous | [-inf, 0.0, inf] | (0.0, inf] | 2600 | 3614 | 6214 | 0.344234 | 0.034436 | 0.055236 | 0.418410 | 2.302207 | 0.713218 | 0.833081 | 0.528354 | -2.598254 | -87.743345 |
7 | RevolvingUtilizationOfUnsecuredLines | continuous | [-inf, 0.22, 0.49, 0.86, inf] | (-inf, 0.22] | 1320 | 62341 | 63661 | 0.174765 | 0.594024 | 0.565876 | 0.020735 | -1.223477 | 0.512953 | 1.071693 | 0.643205 | -2.598254 | 56.766152 |
8 | RevolvingUtilizationOfUnsecuredLines | continuous | [-inf, 0.22, 0.49, 0.86, inf] | (0.22, 0.49] | 916 | 16760 | 17676 | 0.121276 | 0.159700 | 0.157120 | 0.051822 | -0.275223 | 0.010575 | 1.071693 | 0.643205 | -2.598254 | 12.769650 |
9 | RevolvingUtilizationOfUnsecuredLines | continuous | [-inf, 0.22, 0.49, 0.86, inf] | (0.49, 0.86] | 1695 | 13027 | 14722 | 0.224414 | 0.124129 | 0.130862 | 0.115134 | 0.592169 | 0.059386 | 1.071693 | 0.643205 | -2.598254 | -27.475114 |
10 | RevolvingUtilizationOfUnsecuredLines | continuous | [-inf, 0.22, 0.49, 0.86, inf] | (0.86, inf] | 3622 | 12819 | 16441 | 0.479545 | 0.122147 | 0.146142 | 0.220303 | 1.367609 | 0.488779 | 1.071693 | 0.643205 | -2.598254 | -63.453483 |
11 | age | continuous | [-inf, 35.0, 55.0, 62.0, inf] | (-inf, 35.0] | 1801 | 14288 | 16089 | 0.238448 | 0.136145 | 0.143013 | 0.111940 | 0.560433 | 0.057334 | 0.239843 | 0.462719 | -2.598254 | -18.706199 |
12 | age | continuous | [-inf, 35.0, 55.0, 62.0, inf] | (35.0, 55.0] | 4061 | 45818 | 49879 | 0.537667 | 0.436582 | 0.443369 | 0.081417 | 0.208263 | 0.021052 | 0.239843 | 0.462719 | -2.598254 | -6.951426 |
13 | age | continuous | [-inf, 35.0, 55.0, 62.0, inf] | (55.0, 62.0] | 898 | 17050 | 17948 | 0.118893 | 0.162463 | 0.159538 | 0.050033 | -0.312225 | 0.013604 | 0.239843 | 0.462719 | -2.598254 | 10.421482 |
14 | age | continuous | [-inf, 35.0, 55.0, 62.0, inf] | (62.0, inf] | 793 | 27791 | 28584 | 0.104991 | 0.264810 | 0.254080 | 0.027743 | -0.925134 | 0.147853 | 0.239843 | 0.462719 | -2.598254 | 30.879240 |
各变量类型以及IV值
ls.col_type_iv
col | type | iv | |
---|---|---|---|
0 | RevolvingUtilizationOfUnsecuredLines | continuous | 1.071693 |
1 | age | continuous | 0.239843 |
2 | NumberOfTime30-59DaysPastDueNotWorse | continuous | 0.737079 |
3 | DebtRatio | continuous | 0.069471 |
4 | MonthlyIncome | continuous | 0.076410 |
5 | NumberOfOpenCreditLinesAndLoans | continuous | 0.073217 |
6 | NumberOfTimes90DaysLate | continuous | 0.833081 |
7 | NumberRealEstateLoansOrLines | continuous | 0.055378 |
8 | NumberOfTime60-89DaysPastDueNotWorse | continuous | 0.570277 |
9 | NumberOfDependents | continuous | 0.031616 |
连续变量的切分点
ls.col_continuous_cut_points
[['RevolvingUtilizationOfUnsecuredLines', [-inf, 0.22, 0.49, 0.86, inf]],['age', [-inf, 35.0, 55.0, 62.0, inf]],['NumberOfTime30-59DaysPastDueNotWorse', [-inf, 0.0, 1.0, inf]],['DebtRatio', [-inf, 0.41, 0.67, 2.66, inf]],['MonthlyIncome', [-inf, 1297.0, 4838.0, 6596.0, inf]],['NumberOfOpenCreditLinesAndLoans', [-inf, 2.0, 3.0, 13.0, inf]],['NumberOfTimes90DaysLate', [-inf, 0.0, inf]],['NumberRealEstateLoansOrLines', [-inf, 0.0, 1.0, 2.0, inf]],['NumberOfTime60-89DaysPastDueNotWorse', [-inf, 0.0, inf]],['NumberOfDependents', [-inf, 0.0, 1.0, 2.0, inf]]]
模型效果检验
ls.plot_roc_ks(data_train,ls.score_card)
2019 16:15:18 INFO 预测用户分数中。。。
2019 16:15:18 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00, 2.37it/s]
2019 16:15:20 INFO 根据cut离散化连续变量完成!
ls.plot_roc_ks(data_test,ls.score_card)
2019 16:15:21 INFO 预测用户分数中。。。
2019 16:15:21 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 6.97it/s]
2019 16:15:22 INFO 根据cut离散化连续变量完成!
预测
ls.predict_score_proba(data_test,ls.score_card)
2019 16:15:22 INFO 预测用户分数中。。。
2019 16:15:22 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 6.99it/s]
2019 16:15:23 INFO 根据cut离散化连续变量完成!
NumberOfTime30-59DaysPastDueNotWorsescore | NumberOfTime60-89DaysPastDueNotWorsescore | NumberOfTimes90DaysLatescore | RevolvingUtilizationOfUnsecuredLinesscore | agescore | score | proba | |
---|---|---|---|---|---|---|---|
0 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
1 | 20.950981 | 8.371879 | 14.746114 | -63.453483 | -18.706199 | 561.909292 | 0.112027 |
2 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
3 | -34.654352 | -60.419867 | -87.743345 | -27.475114 | -6.951426 | 382.755896 | 0.601900 |
4 | -73.763987 | 8.371879 | 14.746114 | -63.453483 | 10.421482 | 496.322007 | 0.238491 |
5 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
6 | 20.950981 | 8.371879 | -87.743345 | -63.453483 | -6.951426 | 471.174606 | 0.307389 |
7 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | -6.951426 | 649.887198 | 0.035921 |
8 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 10.421482 | 711.256608 | 0.015664 |
9 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | -6.951426 | 609.642434 | 0.061116 |
10 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | -6.951426 | 609.642434 | 0.061116 |
11 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
12 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
13 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
14 | -34.654352 | 8.371879 | 14.746114 | 56.766152 | 10.421482 | 655.651276 | 0.033255 |
15 | 20.950981 | 8.371879 | 14.746114 | -63.453483 | 10.421482 | 591.036974 | 0.077701 |
16 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | -6.951426 | 609.642434 | 0.061116 |
17 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
18 | 20.950981 | 8.371879 | -87.743345 | -63.453483 | -18.706199 | 459.419833 | 0.343125 |
19 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
20 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
21 | 20.950981 | 8.371879 | -87.743345 | -63.453483 | -6.951426 | 471.174606 | 0.307389 |
22 | -34.654352 | 8.371879 | -87.743345 | 56.766152 | 30.879240 | 573.619574 | 0.096866 |
23 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
24 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | -6.951426 | 649.887198 | 0.035921 |
25 | -73.763987 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 636.999398 | 0.042649 |
26 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
27 | 20.950981 | 8.371879 | 14.746114 | -63.453483 | 30.879240 | 611.494731 | 0.059659 |
28 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | 10.421482 | 667.260107 | 0.028452 |
29 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
... | ... | ... | ... | ... | ... | ... | ... |
37470 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | -6.951426 | 649.887198 | 0.035921 |
37471 | -34.654352 | 8.371879 | 14.746114 | 12.769650 | -6.951426 | 594.281865 | 0.074538 |
37472 | 20.950981 | 8.371879 | -87.743345 | -27.475114 | -6.951426 | 507.152975 | 0.212299 |
37473 | -73.763987 | -60.419867 | -87.743345 | 12.769650 | -18.706199 | 372.136252 | 0.636593 |
37474 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 10.421482 | 711.256608 | 0.015664 |
37475 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -18.706199 | 682.128926 | 0.023276 |
37476 | 20.950981 | 8.371879 | 14.746114 | -63.453483 | -6.951426 | 573.664065 | 0.096812 |
37477 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -18.706199 | 682.128926 | 0.023276 |
37478 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 10.421482 | 711.256608 | 0.015664 |
37479 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | -6.951426 | 609.642434 | 0.061116 |
37480 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
37481 | -34.654352 | 8.371879 | -87.743345 | -63.453483 | -6.951426 | 415.569274 | 0.489626 |
37482 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
37483 | -73.763987 | 8.371879 | 14.746114 | 56.766152 | -18.706199 | 587.413959 | 0.081378 |
37484 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
37485 | 20.950981 | 8.371879 | 14.746114 | -63.453483 | -18.706199 | 561.909292 | 0.112027 |
37486 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -18.706199 | 682.128926 | 0.023276 |
37487 | -73.763987 | -60.419867 | 14.746114 | -27.475114 | -6.951426 | 446.135720 | 0.385743 |
37488 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -18.706199 | 682.128926 | 0.023276 |
37489 | 20.950981 | 8.371879 | -87.743345 | -63.453483 | -6.951426 | 471.174606 | 0.307389 |
37490 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
37491 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 693.883700 | 0.019845 |
37492 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
37493 | -34.654352 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 638.278367 | 0.041931 |
37494 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | 10.421482 | 627.015343 | 0.048671 |
37495 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | 30.879240 | 687.717864 | 0.021578 |
37496 | -34.654352 | 8.371879 | 14.746114 | 56.766152 | -6.951426 | 638.278367 | 0.041931 |
37497 | 20.950981 | 8.371879 | 14.746114 | 12.769650 | -6.951426 | 649.887198 | 0.035921 |
37498 | 20.950981 | 8.371879 | 14.746114 | 56.766152 | 30.879240 | 731.714366 | 0.011842 |
37499 | 20.950981 | 8.371879 | 14.746114 | -27.475114 | 30.879240 | 647.473100 | 0.037099 |
37500 rows × 7 columns
补充-模型调试
变量IV值以及分箱分析
#默认决策树分箱
ls.plot_col_woe_iv(data,'age')
cut_points | cut_points_id | 1_num | 0_num | total_num | 1_pct | 0_pct | total_pct | 1_rate | woe | iv | total_iv | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | (-inf, 36.0] | 0 | 2628 | 21237 | 23865 | 0.262118 | 0.151721 | 0.159100 | 0.110119 | 0.546753 | 0.060360 | 0.250005 |
0 | (36.0, 55.0] | 1 | 5177 | 58953 | 64130 | 0.516357 | 0.421171 | 0.427533 | 0.080727 | 0.203760 | 0.019395 | 0.250005 |
3 | (55.0, 63.0] | 2 | 1345 | 26409 | 27754 | 0.134151 | 0.188671 | 0.185027 | 0.048461 | -0.341036 | 0.018593 | 0.250005 |
2 | (63.0, inf] | 3 | 876 | 33375 | 34251 | 0.087373 | 0.238437 | 0.228340 | 0.025576 | -1.003921 | 0.151657 | 0.250005 |
# 手动分箱
ls.plot_col_woe_iv(data,'age',[-inf,20,30,40,inf])
C:\ProgramData\Anaconda3\lib\site-packages\CreditScoreModel\LogisticScoreCard.py:152: RuntimeWarning: divide by zero encountered in logresult['woe'] = np.log(result['1_pct'] / result['0_pct']) # WOE
cut_points | cut_points_id | 1_num | 0_num | total_num | 1_pct | 0_pct | total_pct | 1_rate | woe | iv | total_iv | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | (-inf, 20.0] | 0 | 0 | 1 | 1 | 0.000000 | 0.000007 | 0.000007 | 0.000000 | 0.000000 | 0.000000 | 0.0 |
2 | (20.0, 30.0] | 1 | 1244 | 9513 | 10757 | 0.124077 | 0.067963 | 0.071713 | 0.115646 | 0.601948 | 0.033778 | 0.0 |
1 | (30.0, 40.0] | 2 | 2390 | 21949 | 24339 | 0.238380 | 0.156808 | 0.162260 | 0.098196 | 0.418847 | 0.034166 | 0.0 |
0 | (40.0, inf] | 3 | 6392 | 108511 | 114903 | 0.637542 | 0.775223 | 0.766020 | 0.055630 | -0.195529 | 0.026921 | 0.0 |
# 不输出具体数据
ls.plot_col_woe_iv(data,'age',[-inf,20,30,40,inf],return_data=False)
C:\ProgramData\Anaconda3\lib\site-packages\CreditScoreModel\LogisticScoreCard.py:152: RuntimeWarning: divide by zero encountered in logresult['woe'] = np.log(result['1_pct'] / result['0_pct']) # WOE
模型效果分析
#默认参数跑出的结果
col_result=ls.col_result
col_continuous_cut_points=[col for col in ls.col_continuous_cut_points if col[0] in ls.col_result]
data_new=data_train[ls.col_result+['y']]score_card=ls.get_logistic_socre_card(data_new,col_continuous_cut_points)
ls.plot_roc_ks(data_new,score_card)
2019 16:15:35 INFO 评分卡制作中。。。
2019 16:15:35 INFO 连续和离散变量划分中。。。
2019 16:15:35 INFO 连续和离散变量划分完成!
2019 16:15:35 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00, 2.50it/s]
2019 16:15:38 INFO 根据cut离散化连续变量完成!
2019 16:15:38 INFO WOE转换中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 11.89it/s]
2019 16:15:38 INFO WOE转换完成!
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.FutureWarning)
2019 16:15:40 INFO 评分卡制作完成!
2019 16:15:40 INFO 预测用户分数中。。。
2019 16:15:40 INFO 根据cut离散化连续变量进行中。。。
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00, 2.41it/s]
2019 16:15:43 INFO 根据cut离散化连续变量完成!
# 例如:选取全部变量,并按以下切分点分箱
col_result=['y', 'RevolvingUtilizationOfUnsecuredLines', 'age','NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome','NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate','NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse','NumberOfDependents']
col_continuous_cut_points=[['RevolvingUtilizationOfUnsecuredLines', [-inf, 0.22, 0.49, 0.86, inf]],['age', [-inf, 35.0, 55.0, 62.0, inf]],['NumberOfTime30-59DaysPastDueNotWorse', [-inf, 0.0, 1.0, inf]],['DebtRatio', [-inf, 0.41, 0.67, 2.66, inf]],['MonthlyIncome', [-inf, 1297.0, 4838.0, 6596.0, inf]],['NumberOfOpenCreditLinesAndLoans', [-inf, 2.0, 3.0, 13.0, inf]],['NumberOfTimes90DaysLate', [-inf, 0.0, inf]],['NumberRealEstateLoansOrLines', [-inf, 0.0, 1.0, 2.0, inf]],['NumberOfTime60-89DaysPastDueNotWorse', [-inf, 0.0, inf]],['NumberOfDependents', [-inf, 0.0, 1.0, 2.0, inf]]]
data_new=data_train[col_result]score_card=ls.get_logistic_socre_card(data_new,col_continuous_cut_points)
ls.plot_roc_ks(data_new,score_card)
2019 16:15:43 INFO 评分卡制作中。。。
2019 16:15:43 INFO 连续和离散变量划分中。。。
2019 16:15:43 INFO 连续和离散变量划分完成!
2019 16:15:43 INFO 根据cut离散化连续变量进行中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:04<00:00, 2.49it/s]
2019 16:15:48 INFO 根据cut离散化连续变量完成!
2019 16:15:48 INFO WOE转换中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 12.17it/s]
2019 16:15:49 INFO WOE转换完成!
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.FutureWarning)
2019 16:15:54 INFO 评分卡制作完成!
2019 16:15:54 INFO 预测用户分数中。。。
2019 16:15:54 INFO 根据cut离散化连续变量进行中。。。
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:04<00:00, 2.13it/s]
2019 16:15:58 INFO 根据cut离散化连续变量完成!
补充-包的参数解释
def __init__(self,max_depth=None, # 决策树的深度max_leaf_nodes=4, # 决策树的子节点数min_samples_l
文章来源:https://www.rstk.cn/news/857619.html
一行代码搞定信用评分模型(python)就为大家介绍到这里,《python金融风控评分卡模型和数据分析(加强版)》更多实战案例会定期更新,用于银行培训,大家扫一扫下面二维码,记得收藏课程。