2020 Higher Education Press Cup National Undergraduate Mathematical Modeling Contest, Problem C: credit decision-making for micro, small and medium-sized enterprises, with the solution write-up and code for the whole process

2020 Higher Education Press Cup National Undergraduate Mathematical Modeling Contest

Problem C: Credit decision-making for micro, small and medium-sized enterprises

Original problem statement

  In practice, because micro, small and medium-sized enterprises are relatively small and lack collateral assets, banks usually rely on credit policies, the enterprises' transaction bill (invoice) information, and the influence of upstream and downstream enterprises to lend to enterprises that are financially strong and have stable supply and demand relationships, and they can offer preferential interest rates to enterprises with a good reputation and low credit risk. Banks first assess the credit risk of micro, small and medium-sized enterprises based on their strength and reputation, and then decide, based on credit risk and other factors, whether to lend and what credit strategy to use (loan amount, interest rate, and term).
  For an enterprise the bank has decided to lend to, the loan amount is 100,000 to 1,000,000 yuan, the annual interest rate is 4% to 15%, and the loan term is 1 year. Attachments 1 to 3 give, respectively, the relevant data of 123 enterprises with credit records, the relevant data of 302 enterprises without credit records, and the 2019 statistics on the relationship between loan interest rate and customer churn rate. The bank asks your team to build a mathematical model, based on the actual situation and the data in the attachments, to study the credit strategy for micro, small and medium-sized enterprises, mainly to solve the following problems:
  (1) Quantitatively analyze the credit risk of the 123 enterprises in Attachment 1, and give the bank's credit strategy for these enterprises when the total annual credit amount is fixed.
  (2) Building on problem 1, quantitatively analyze the credit risk of the 302 enterprises in Attachment 2, and give the bank's credit strategy for these enterprises when the total annual credit is 100 million yuan.
  (3) An enterprise's production, operation, and economic performance may be affected by sudden factors, and such factors often affect different industries and different types of enterprises differently. Taking into account both the credit risk of each enterprise in Attachment 2 and the possible impact of sudden factors (such as the COVID-19 epidemic) on each enterprise, give the bank's adjusted credit strategy when the total annual credit is 100 million yuan.
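
  The constraints in the problem statement (each approved loan between 100,000 and 1,000,000 yuan, with a fixed total annual credit) can be illustrated with a minimal sketch. The proportional rule, the function name `allocate_credit`, and the sample scores are assumptions made for illustration; this is not necessarily the allocation method used in the paper.

```python
def allocate_credit(scores, total, low=100_000, high=1_000_000):
    """Split `total` yuan across enterprises in proportion to `scores`,
    keeping every granted loan inside [low, high].
    Enterprises with a score <= 0 receive nothing (hypothetical rule).
    """
    eligible = {k: s for k, s in scores.items() if s > 0}
    loans = {k: 0.0 for k in scores}
    if not eligible:
        return loans
    if low * len(eligible) > total:
        raise ValueError("total too small to give every eligible enterprise the minimum loan")
    # Every eligible enterprise first receives the minimum loan.
    for k in eligible:
        loans[k] = float(low)
    remaining = total - low * len(eligible)
    open_set = dict(eligible)
    # Distribute the rest proportionally; cap at `high` and redistribute the overflow.
    while remaining > 1e-6 and open_set:
        weight = sum(open_set.values())
        capped, spent = [], 0.0
        for k, s in open_set.items():
            share = remaining * s / weight
            room = high - loans[k]
            if share >= room:
                share = room
                capped.append(k)
            loans[k] += share
            spent += share
        remaining -= spent
        for k in capped:
            del open_set[k]
        if not capped:
            break
    return loans


# Hypothetical scores: E3 is treated as ineligible (score 0).
print(allocate_credit({"E1": 3, "E2": 1, "E3": 0}, 600_000))
```

  If the total exceeds what the caps allow, the surplus simply stays unallocated in this sketch.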

  Attachment 1: Relevant data of 123 enterprises with credit records
  Attachment 2: Relevant data of 302 enterprises without credit records
  Attachment 3: 2019 statistics on the relationship between the bank's annual loan interest rate and customer churn rate

  Data description in the attachment:
  (1) Input invoice: the invoice issued by the seller when the enterprise purchases products.
  (2) Sales invoice: the invoice issued by the enterprise to the buyer when selling products.
  (3) Valid invoices: invoices issued for normal trading activities.
  (4) Void invoice: After issuing an invoice for a transaction, the transaction is canceled for some reason, making the invoice invalid.
  (5) Negative invoice: after an invoice has been issued and the enterprise has booked it and recorded the tax, the buyer returns the goods and is refunded for some reason; a negative invoice must then be issued.
  (6) Credit rating: assigned manually by the bank based on the enterprise's actual situation. The bank does not lend to enterprises with a credit rating of D.
  (7) Customer churn rate: The rate at which a bank loses potential customers due to factors such as loan interest rates.
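
  The invoice definitions above suggest a simple preprocessing step: keep only valid invoices and total them per enterprise. The sketch below is illustrative only. The records are made up, the status string `作废发票` for void invoices is an assumption (only `有效发票` appears in the code later in this post), and it assumes a negative invoice shows up as a valid record with a negative amount.

```python
from collections import defaultdict

# Hypothetical invoice records: (enterprise code, price-plus-tax total, status).
invoices = [
    ("E1", 1000.0, "有效发票"),   # valid invoice
    ("E1", 500.0, "作废发票"),    # void invoice (assumed status string), ignored
    ("E1", -200.0, "有效发票"),   # negative invoice, assumed to be valid with a negative amount
    ("E2", 300.0, "有效发票"),
]


def valid_totals(records):
    """Sum the amounts of valid invoices per enterprise."""
    totals = defaultdict(float)
    for code, amount, status in records:
        if status == "有效发票":
            totals[code] += amount
    return dict(totals)


print(valid_totals(invoices))
```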

Overview of the overall solution process (abstract)

  The credit risk of micro, small and medium-sized enterprises is a major risk in commercial banks' investments. It stems from information asymmetry between borrowers and lenders, and building a quantitative credit risk model is an important way to reduce that asymmetry. This paper builds a credit decision-making model that quantifies the credit risk of each enterprise with a naive Bayes algorithm and multivariate factor analysis, and gives credit strategies such as loan amount and interest rate.
  For Problem 1, we first preprocess the data in Attachment 1, then quantify indicators such as credit rating, profitability, and operating capability, and build a naive Bayes classification model. Loan amounts are then allocated according to each enterprise's credit rating, profitability, and operating capability, which yields the credit allocation strategy.
  For Problem 2, we fit the results obtained in Problem 1 to predict the loan amount and annual interest rate for the enterprises without credit records in Attachment 2. Building on the naive Bayes model from Problem 1, each enterprise's loan amount is set in proportion to its profit, which yields the credit allocation strategy.
  For Problem 3, the enterprises in Attachment 2 are grouped into industries based on keywords and tax rates. The impact of COVID-19 on each industry is considered from several angles, and a multivariate factor analysis is carried out in SPSS. The results show that the pharmaceutical and medical industries and the Internet industry grew rapidly, while the remaining industries were hit harder by the epidemic. Based on the industry category of each of the 302 enterprises in Attachment 2 and on how each enterprise's business rose or fell, an adjusted, tiered loan strategy is given. Finally, precision, recall, and score are used to verify the accuracy of the model.
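
  The precision and recall check mentioned at the end of the abstract can be sketched as follows. This is an illustrative sketch only: the function name `precision_recall_f1` and the toy labels are hypothetical, and reading "score" as the F1 score is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels: true classes versus classes predicted by some classifier.
y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "A", "B"]
print(precision_recall_f1(y_true, y_pred, positive="A"))
```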

Model assumptions:

  1. Assume that the bank is a general commercial bank with the purpose of obtaining profits;
  2. Ignore the impact of moral hazard on bank credit business;
  3. Assume that all taxes in the invoices are paid on time;
  4. Assume that the enterprises in problems 1 and 2 are not affected by sudden factors;
  5. Assume that the effect of expenses on enterprise profit can be ignored.

Problem analysis:

  Analysis of Question 1
  We use a machine learning algorithm to analyze the data in Attachment 1 quantitatively, since quantitative analysis lets us measure risk and return more directly. First, the enterprises are divided into 6 classes by combining credit rating with default record. Then, after filtering the attached tables down to valid invoices, the profit of each of the 123 enterprises with credit records is calculated. Finally, based on each enterprise's profit, credit rating, and number of invoices issued, a naive Bayes classifier is programmed to derive the bank's credit strategy for these enterprises when the total annual credit amount is fixed.
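
  To make the naive Bayes step concrete, here is a minimal from-scratch sketch of a categorical naive Bayes classifier with Laplace smoothing. It is not the paper's code: the class name `CategoricalNB`, the discretized features (rating, profit level, invoice volume), the "large"/"small" loan tiers, and the training rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Naive Bayes for categorical features with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        self.n = len(y)
        n_features = len(X[0])
        # Distinct values observed for each feature (used in the smoothing denominator).
        self.values = [set(row[j] for row in X) for j in range(n_features)]
        # counts[(label, feature index)][value] = number of training rows
        self.counts = defaultdict(Counter)
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[(label, j)][v] += 1
        return self

    def predict(self, row):
        best, best_score = None, -math.inf
        for c in self.classes:
            # log P(c) + sum_j log P(x_j | c)
            score = math.log(self.class_count[c] / self.n)
            for j, v in enumerate(row):
                num = self.counts[(c, j)][v] + 1
                den = self.class_count[c] + len(self.values[j])
                score += math.log(num / den)
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical training rows: (credit rating, profit level, invoice volume) -> loan tier.
X = [
    ("A", "high", "many"), ("A", "high", "few"), ("B", "high", "many"),
    ("B", "low", "many"), ("C", "low", "few"), ("C", "low", "many"),
]
y = ["large", "large", "large", "small", "small", "small"]

model = CategoricalNB().fit(X, y)
print(model.predict(("A", "high", "few")))
print(model.predict(("C", "low", "few")))
```

  Continuous inputs such as profit would have to be binned first, or handled with a Gaussian variant of the classifier.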
  Analysis of Question 2
  For Question 2, the model from Question 1 is used to analyze the data in Attachment 2 quantitatively. Using the enterprise profit and the number of invoices issued in Attachment 1, we fit curves against the bank loan amount and the annual interest rate respectively, and analyze the correlation between the features. Since the enterprises in Attachment 2 have no credit rating, we can only use machine learning algorithms on each enterprise's profit or loss and number of invoices issued to derive the bank's loan allocation strategy.
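
  The fitting and correlation step can be sketched with ordinary least squares. The source does not state the form of the fitted curve, so the straight line here is an assumption, as are the function names and the sample numbers (in units of 10,000 yuan).

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def predict_loan(profit, slope, intercept, low=10.0, high=100.0):
    """Predicted loan, clamped to the 100,000 to 1,000,000 yuan range (10 to 100 in these units)."""
    return min(high, max(low, slope * profit + intercept))


# Hypothetical (profit, loan amount) pairs for enterprises with credit records.
profits = [10.0, 20.0, 30.0, 40.0]
loans = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_line(profits, loans)
print(slope, intercept, pearson_r(profits, loans))
print(predict_loan(50.0, slope, intercept))
```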
  Analysis of Question 3
  In this question, the enterprises in Attachment 2 are grouped into industries based on keywords and tax rates. We consider the impact of the COVID-19 epidemic across industries such as real estate, manufacturing, infrastructure, services, tourism, medicine, and the Internet, and use SPSS to run a multivariate factor analysis. The medical and Internet industries grew rapidly, while other industries were hit hard by the epidemic. Based on the industry category of each of the 302 enterprises in Attachment 2 and on how each enterprise's business rose or fell, an adjusted, tiered loan strategy is given.
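
  The keyword-based industry grouping could look roughly like the sketch below. The keyword lists, the industry labels, and the sample enterprise names are hypothetical, and the tax-rate criterion mentioned above is left out because the source does not spell it out.

```python
# Hypothetical keyword rules: the first industry whose keyword occurs in the name wins.
INDUSTRY_KEYWORDS = [
    ("medicine", ("医药", "药业", "医疗")),
    ("internet", ("网络", "信息技术", "软件")),
    ("construction", ("建筑", "建设", "工程")),
    ("tourism", ("旅游", "旅行")),
]


def classify_industry(name, rules=INDUSTRY_KEYWORDS, default="other"):
    """Return the industry label of the first rule matching the enterprise name."""
    for industry, keywords in rules:
        if any(keyword in name for keyword in keywords):
            return industry
    return default


for sample in ["某某医药有限公司", "某某建筑工程有限公司", "某某贸易有限公司"]:
    print(sample, classify_industry(sample))
```

  Because the first matching rule wins, the order of the rules matters when a name contains keywords from more than one industry.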

Model establishment and solution: thumbnail of the full paper


Program code:

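# Import the Excel-handling library
import openpyxl

# Load the data of the enterprises with credit records
wb = openpyxl.load_workbook('附件1:123家有信贷记录企业的相关数据.xlsx')
# Select the enterprise-information sheet
df1 = wb['企业信息']
# List of company identifiers (values such as 'E1')
name_list = []
# Read the company identifiers from column 1, rows 2 to 124
for row in df1.iter_rows(min_row=2, max_row=124, min_col=1, max_col=1):
    for cell in row:
        name_list.append(cell.value)
# Lists of credit ratings and default flags
leavel_list = []
weiyue_list = []
# Read the credit rating and the default flag from columns 3 and 4
for row in df1.iter_rows(min_row=2, max_row=124, min_col=3, max_col=4):
    # Values of the current row
    tmp = []
    for cell in row:
        tmp.append(cell.value)
    # The first item is the credit rating
    leavel_list.append(tmp[0])
    # The last item is the default flag
    weiyue_list.append(tmp[-1])
# Combine rating and default flag into one label such as 'A否'.
# (The original quoted the variable names, which produced the same literal string for every row.)
cls = [leavel_list[i] + weiyue_list[i] for i in range(len(leavel_list))]
# The 6 standard classes (rating + whether the enterprise defaulted: 是 = yes, 否 = no)
cls_normal = ['A否', 'B是', 'B否', 'C是', 'C否', 'D是']
# Companies belonging to each class
cls_gs = []
for item in cls_normal:
    tmp = []
    for i in range(len(cls)):
        # Collect the company if its label matches this class
        if item == cls[i]:
            tmp.append(name_list[i])
    # Store the class together with its companies
    cls_gs.append([item, tmp])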
 
'''
Classification result (class label, companies in that class)
['A否', ['E1', 'E2', 'E6', 'E7', 'E8', 'E9', 'E13', 'E15', 'E16', 'E17', 'E18', 'E19', 'E22', 'E24', 
'E26', 'E27', 'E31', 'E42', 'E48', 'E54', 'E59', 'E64', 'E81', 'E84', 'E88', 'E89', 'E91']]
['B是', ['E45']]
['B否', ['E5', 'E10', 'E12', 'E20', 'E21', 'E23', 'E28', 'E30', 'E32', 'E33', 'E34', 'E35', 'E37', 
'E38', 'E43', 'E51', 'E57', 'E58', 'E60', 'E61', 'E62', 'E63', 'E65', 'E66', 'E67', 'E70', 'E71', 
'E74', 'E76', 'E79', 'E83', 'E85', 'E93', 'E95', 'E97', 'E98', 'E106']]
['C是', ['E29', 'E87']]
['C否', ['E3', 'E4', 'E11', 'E14', 'E25', 'E39', 'E40', 'E41', 'E44', 'E46', 'E47', 'E49', 'E50', 
'E53', 'E55', 'E56', 'E68', 'E69', 'E72', 'E73', 'E75', 'E77', 'E78', 'E80', 'E86', 'E90', 'E92', 
'E94', 'E96', 'E104', 'E105', 'E110']]
['D是', ['E36', 'E52', 'E82', 'E99', 'E100', 'E101', 'E102', 'E103', 'E107', 'E108', 'E109', 
'E111', 'E112', 'E113', 'E114', 'E115', 'E116', 'E117', 'E118', 'E119', 'E120', 'E121', 
'E122', 'E123']]
'''
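# Import pandas to read the Excel files
import pandas as pd

# Return the Excel row number of the first row that contains the searched value
def find_row(num_value, file_name):
    # Read the file (note: it is re-read on every call, which is slow but simple)
    demo_df = pd.read_excel(file_name)
    # Walk over the row index
    for indexs in demo_df.index:
        for i in range(len(demo_df.loc[indexs].values)):
            # If a cell in the row equals the searched value, return that row's number
            if str(demo_df.loc[indexs].values[i]) == num_value:
                # +2 because the pandas index starts at 0 and sheet row 1 is the header.
                # Returned as int so it can be passed directly to iter_rows.
                return int(indexs) + 2
    # Value not found
    return None

# The input and sales invoice sheets were exported to separate Excel files for the row lookup
filejin_name = '进项发票信息.xlsx'
filechu_name = '销项发票信息.xlsx'
# Row numbers of each company's first input / sales invoice
jin_point = []
chu_point = []
# For each company, find the row where its invoices start, used later to read the price-plus-tax totals
for item in name_list:
    # Progress indicator
    print(name_list.index(item))
    jin_point.append(find_row(item, filejin_name))
    chu_point.append(find_row(item, filechu_name))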
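# Sheet 2 of Attachment 1: input invoice information
sheet2 = wb['进项发票信息']
# Sheet 3 of Attachment 1: sales invoice information
sheet3 = wb['销项发票信息']

# Return the mean price-plus-tax total of the valid invoices in rows min_row..max_row of a sheet.
# (The original always read sheet2, so the sales figures were also taken from the input sheet.)
def getMoney(sheet, min_row, max_row):
    money = []
    # Column 7 holds the total and column 8 the invoice status
    for row in sheet.iter_rows(min_row=min_row, max_row=max_row, min_col=7, max_col=8):
        tmp = [cell.value for cell in row]
        # Keep the total only for valid invoices
        if tmp[1] == '有效发票':
            money.append(tmp[0])
    return round(sum(money) / len(money), 2)

# Last row of company i's block: the row before the next company's first row,
# or the last row of the sheet for the final company.
# Assumes each sheet is sorted by company in the same order as name_list.
def block_end(points, i, sheet):
    if i + 1 < len(points):
        return int(points[i + 1]) - 1
    return sheet.max_row

# Mean input-invoice total per company
In_name = []
for i in range(len(jin_point)):
    try:
        In_name.append(getMoney(sheet2, int(jin_point[i]), block_end(jin_point, i, sheet2)))
    except (TypeError, ValueError, ZeroDivisionError):
        # Company missing from the sheet or no valid invoices: keep a placeholder so the lists stay aligned
        In_name.append(None)

# Mean sales-invoice total per company
chu_mean = []
for i in range(len(chu_point)):
    try:
        chu_mean.append(getMoney(sheet3, int(chu_point[i]), block_end(chu_point, i, sheet3)))
    except (TypeError, ValueError, ZeroDivisionError):
        chu_mean.append(None)
print(In_name, chu_mean)
# Profit or loss = mean sales total - mean input total (lists cannot be subtracted directly)
profit_loss = [
    round(c - j, 2) if c is not None and j is not None else None
    for c, j in zip(chu_mean, In_name)
]
print(profit_loss)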


Origin blog.csdn.net/weixin_43292788/article/details/131531123