[Tech Sharing] Getui (个推) at ArchSummit: big data algorithms in financial risk control practice

Author: Xiao Chun, Senior Data Engineer at Getui

As is well known, finance is one of the most data-intensive industries and an important application field for artificial intelligence and big data technology. As the techniques for collecting, storing, analyzing, and modeling big data mature, big data technology is gradually being applied to every aspect of financial risk control. Getui (个推), a professional data intelligence service provider with massive data resources, has launched a corresponding data solution for smart finance, LIVE. It gives customers full-process data services covering intelligent anti-fraud, multi-dimensional credit risk assessment, and intelligent screening of high-intent users, helping financial institutions strengthen their risk control capabilities. This article focuses on big data risk control and, drawing on Getui's practice, introduces the basic machine learning process for financial risk control, the algorithms used in practice, and the productization of the technology.

Risk control process and multi-dimensional features

What big data risk control involves
Data is the core element of risk control; big data risk control is essentially the process of processing, modeling, and applying data. The process has four stages: data acquisition, data analysis, data modeling, and risk control product application. The massive data acquired is first cleaned and mined, and finance-specific features are then engineered in depth. Next, rule strategies and model algorithms are built, and the corresponding risk control services are delivered to external customers.

Getui started out as a push notification service. It provides efficient and stable push services to hundreds of thousands of apps and, in the process, has accumulated rich data resources covering more than 4 billion terminal devices, with data that is comprehensive, broad, and deep. Using basic device information, online app preference data, offline scene data, and supplementary external data, Getui has built 350+ features across eight dimensions, and the features are updated dynamically. The eight dimensions are basic attributes, assets, finance, behavioral preferences, social attributes, consumer preferences, risk, and financial stability. Getui builds models on these eight dimensions and 350+ features and applies them to every stage of financial risk control.

Basic machine learning process for financial risk control
The entire risk control modeling process runs on Getui's big data platform. First, massive first-hand data, which is continuously updated, is collected, cleaned, and stored; IDs are linked across sources before the data is stored. Second, features are built from the cleaned raw data. Finally, the multi-dimensional features are used to build financial risk control models. The techniques involved include collaborative recommendation algorithms, LR (logistic regression), XGBoost, marketing models, multi-lender borrowing (多头) models, and credit score models.

Modeling process

Building features efficiently is a crucial issue in risk control modeling. In practice, Getui performs feature stability analysis, dirty and abnormal data handling, feature binning, feature aggregation, and feature validation. Feature evaluation metrics include IV (information value), gain, monotonicity, stability, and saturation.
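To make the IV and monotonicity metrics concrete, here is a minimal sketch of how they might be computed for one binned feature. The function names, the `(good_count, bad_count)` input layout, and the `eps` smoothing constant are illustrative assumptions, not Getui's actual implementation.

```python
import math

def information_value(bins, eps=0.5):
    """Return (woe_per_bin, iv) for a binned feature.

    bins: list of (good_count, bad_count) tuples, one per bin.
    eps:  additive smoothing so empty bins do not cause log(0)
          (an assumption of this sketch).
    """
    n = len(bins)
    total_good = sum(good for good, _ in bins) + eps * n
    total_bad = sum(bad for _, bad in bins) + eps * n
    woes, iv = [], 0.0
    for good, bad in bins:
        good_share = (good + eps) / total_good
        bad_share = (bad + eps) / total_bad
        woe = math.log(good_share / bad_share)  # weight of evidence
        woes.append(woe)
        iv += (good_share - bad_share) * woe
    return woes, iv

def is_monotonic(values):
    """True if values never increase or never decrease across bins."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Three bins of a hypothetical feature, ordered from low to high value.
woes, iv = information_value([(80, 10), (60, 30), (20, 60)])
print(round(iv, 3), is_monotonic(woes))
```

A feature whose bins all share the same good/bad mix gets an IV of zero, while a feature that separates the two groups gets a larger value; the monotonicity check confirms that WOE moves in one direction across the ordered bins.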

Machine learning algorithms in risk control scenarios
Using its multi-dimensional features and modeling capabilities, Getui supports the whole lending process (pre-loan, in-loan, and post-loan) across five stages: acquire (拉), select (选), assess (评), manage (管), and collect (催).


Data empowerment across the whole process

Acquire (拉): marketing models, screening out fake registrations, assessing willingness to borrow

In the customer acquisition stage, Getui has developed marketing models suited to two marketing scenarios, large loans and small loans. It combines rule strategies, model strategies, and risk control strategies to help customers identify "real" users, reduce acquisition costs, and raise registration and conversion rates. Customers can provide sample data and have Getui do the modeling. When no sample data is available, Getui can draw on its own accumulated sample data to build generic models for a variety of marketing scenarios for customers to use.

Select (选): pre-loan review, detecting fraudulent users, guarding against malicious loan fraud

In the pre-loan review stage we usually use two strategies: a fraud score model and risk population screening. With the fraud score model, the information provided by the customer is converted and matched against features on Getui's data platform, the risk features that are hit are scored with preset rules, and the result is a fraud score. From its 350+ features, Getui has selected several dozen risk features. For example, a user who has installed more than a certain number of small-loan apps, who visits abnormal scenes, or who hits a blacklist is flagged with the corresponding risk feature. Ranking users by fraud score yields the lists of people the customer should deny or watch closely.
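The rule-based scoring described above can be sketched roughly as follows. The rule names, field names, thresholds, weights, and the cap of 100 are all hypothetical placeholders chosen for illustration; the article does not disclose the real rules.

```python
# Hypothetical risk rules: (name, predicate over a user record, weight).
RULES = [
    ("many_small_loan_apps", lambda u: u.get("small_loan_app_count", 0) >= 5, 40),
    ("abnormal_visit_scene", lambda u: u.get("abnormal_scene", False), 30),
    ("blacklist_hit", lambda u: u.get("blacklist_hit", False), 50),
]

def fraud_score(user):
    """Sum the weights of all rules the user hits, capped at 100."""
    hits = [(name, weight) for name, predicate, weight in RULES if predicate(user)]
    score = min(sum(weight for _, weight in hits), 100)
    return score, [name for name, _ in hits]

def rank_by_fraud_score(users):
    """Sort users from highest to lowest fraud score."""
    return sorted(users, key=lambda u: fraud_score(u)[0], reverse=True)

users = [
    {"id": "u1"},
    {"id": "u2", "small_loan_app_count": 6, "blacklist_hit": True},
    {"id": "u3", "abnormal_scene": True},
]
for user in rank_by_fraud_score(users):
    print(user["id"], fraud_score(user))
```

Returning the names of the rules that fired alongside the score keeps the result explainable, which matters when a reviewer has to justify why an applicant was flagged.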

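Risk population screening identifies risky individuals based on the number and severity of the risk features a user exhibits. Using the selected 8 dimensions and 350+ features, Getui combines model predictions with rules to output three risk groups: a blacklist, a graylist, and a multi-lender list. As the name suggests, a user who frequently installs or uninstalls multiple lending apps is classified by the model system as a multi-lender borrower. The graylist covers users with poor stability, and the blacklist covers abnormal users. During pre-loan review, blacklisted users can be denied outright, while graylisted and multi-lender users need close attention.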
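As a rough illustration of the three risk groups (blacklist, graylist, multi-lender list) and the pre-loan decision that follows from them, consider the sketch below. The field names (`anomaly_flag`, `stability_score`, `lending_app_changes_30d`) and the thresholds are assumptions made for this example; in the real system the labels come from model predictions and rules that the article does not detail.

```python
def risk_lists(user):
    """Return the set of risk lists a user falls into (may be empty)."""
    labels = set()
    if user.get("anomaly_flag", False):               # abnormal user
        labels.add("black")
    if user.get("stability_score", 1.0) < 0.3:        # poor stability
        labels.add("gray")
    if user.get("lending_app_changes_30d", 0) >= 4:   # frequent installs/uninstalls
        labels.add("multi_lender")
    return labels

def pre_loan_decision(labels):
    """Blacklisted users are denied; gray and multi-lender users are reviewed."""
    if "black" in labels:
        return "deny"
    if labels:
        return "review"
    return "pass"

applicant = {"stability_score": 0.2, "lending_app_changes_30d": 6}
labels = risk_lists(applicant)
print(sorted(labels), pre_loan_decision(labels))
```

A user can sit on more than one list at once, so the function returns a set rather than a single label.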

Assess (评): credit score model, pre-loan credit evaluation, supporting loan amount decisions

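In the assess stage, Getui uses a credit score model to provide customers with users' credit scores. The credit score is built from five dimensions: assets, identity, stability, relationships, and behavior. The model first derives a score for each dimension from trained models and rule models, then feeds the five dimension scores into a further model as features to obtain the overall personal credit score.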

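The credit score model is an ensemble of several models. The first layer consists of classification models (LR + XGBoost) that produce the scores; the second layer runs a regression on top of the dimension scores to produce the final credit score.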
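The two-layer structure can be sketched as below. To stay dependency-free, a single logistic function stands in for the first-layer LR + XGBoost classifiers, and all weights, coefficients, the intercept, and the 0 to 100 dimension scale are hypothetical values rather than anything taken from Getui's model.

```python
import math

DIMENSIONS = ["assets", "identity", "stability", "relationships", "behavior"]

def dimension_score(features, weights, bias=0.0):
    """Layer 1 stand-in: a logistic model mapped onto a 0-100 dimension score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 100.0 / (1.0 + math.exp(-z))

def credit_score(dim_scores, coefficients, intercept):
    """Layer 2: linear regression over the five dimension scores."""
    return intercept + sum(coefficients[d] * dim_scores[d] for d in DIMENSIONS)

# Hypothetical per-dimension inputs: (feature vector, first-layer weights).
raw = {
    "assets": ([1.2, 0.4], [0.8, 0.5]),
    "identity": ([0.9], [1.1]),
    "stability": ([0.3, -0.2], [0.7, 0.6]),
    "relationships": ([0.1], [0.9]),
    "behavior": ([-0.5, 0.8], [0.4, 0.3]),
}
dim_scores = {d: dimension_score(x, w) for d, (x, w) in raw.items()}
coefficients = {d: 1.0 for d in DIMENSIONS}  # hypothetical second-layer fit
print(round(credit_score(dim_scores, coefficients, intercept=300.0), 1))
```

Feeding dimension scores rather than raw features into the second layer keeps the final score decomposable: each of the five dimensions contributes a visible share of the total.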

Manage (管): in-loan control, monitoring abnormal features, providing early risk warning

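In the manage stage, Getui uses an in-loan monitoring model. From the overall population it selects users who resemble (or are related to) overdue borrowers, then combines real-time data with anomaly monitoring of high-risk features to identify highly suspicious users. Taking the customer's actual needs into account, these users are then assessed more precisely to pinpoint those at risk of going overdue, and the customer is notified so that it can watch or investigate them closely.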

Collect (催): post-loan collection management, evaluating recovery value, improving collection efficiency

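In the collect stage, Getui's self-built recovery scoring system helps financial institutions design differentiated collection strategies and complete collection work more efficiently. For example, when overdue loans and bad debts occur, a financial institution can use Getui's recovery score to assess users' repayment ability and willingness to repay, and so decide which users to pursue first.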

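Productizing the risk control system
The preceding stages describe how Getui builds risk control models on its own from multi-dimensional features. In many business scenarios, however, customers want to build features quickly and get risk control results back quickly. To that end, we developed and launched the 个真 decision engine. Its rule design layer provides customers with risk control rules, and its rule execution layer lets business staff work flexibly through rule-based processing. It is currently available to some customers on a trial basis.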
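The split between a rule design layer and a rule execution layer can be illustrated with a small sketch in which rules are plain data and a separate function evaluates them. The rule schema, field names, thresholds, and the first-match-wins policy are assumptions of this example and do not describe the actual engine.

```python
import operator

# Comparison operators a rule may reference by name.
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# Rule design layer: rules are data, so they can be edited without code changes.
RULESET = [
    {"name": "blacklist", "field": "blacklist_hit", "op": "==", "value": True, "action": "reject"},
    {"name": "high_fraud", "field": "fraud_score", "op": ">=", "value": 80, "action": "reject"},
    {"name": "low_credit", "field": "credit_score", "op": "<", "value": 500, "action": "review"},
]

def execute(ruleset, applicant, default="pass"):
    """Rule execution layer: return (action, rule_name) of the first rule that fires."""
    for rule in ruleset:
        if rule["field"] not in applicant:
            continue  # skip rules whose input is missing
        if OPS[rule["op"]](applicant[rule["field"]], rule["value"]):
            return rule["action"], rule["name"]
    return default, None

print(execute(RULESET, {"fraud_score": 35, "credit_score": 480}))
```

Because the rules live in a data structure rather than in code, business staff could reorder or retune them without touching the execution logic, which is the kind of flexibility the paragraph above describes.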

Risk control decision engine

Today, the era of deep integration between technology and finance has arrived, and financial risk control still has a long way to go. Getui will keep mining its rich data assets and refining its technology to help the financial industry improve its operational efficiency and service capabilities.

(All images in this article are from Getui.)

For more technical content, follow the official account: Getui Institute of Technology.

Origin blog.51cto.com/13031991/2424378