Geology: drawing a success rate curve

Article Source

"Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines."

The introduction to the success rate curve in this article is as follows:

Translation:

Figure 5a shows the success rate in predicting known gold deposits for different percentages of prospective area. The area classified as highly prospective in the RF map is much smaller than in the maps of the other MLA models, so the other MLA methods need to delineate larger prospective areas to reach success rates similar to RF. RF and SVM start with similar success rates, although the RF success rate curve has a steeper slope. (Explanation: the slope reflects how well a model concentrates mineralization potential; the steeper the slope, the more deposits are captured within a smaller area.) For prospective-area thresholds above 10%, the success rates of RF and SVM exceed 90%, while that of ANN is only 70%. When 15% of the study area is considered prospective, the RF success rate converges to 98%, whereas SVM needs to delineate 35% of the area to reach the same value. RT had the worst success rate, exceeding 95% only when more than 75% of the area is considered prospective.
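The comparison above amounts to reading, for each model, the smallest prospective-area percentage at which its curve first reaches a target success rate. Below is a minimal sketch of that lookup, assuming each curve is stored as two equal-length lists (area percentages in ascending order and the matching success rates). The function name `min_area_for_success` and the sample numbers are made up for illustration and are not values from the article.

```python
def min_area_for_success(area_pct, success_pct, target):
    """Return the smallest area percentage whose success rate reaches `target`.

    Returns None if the curve never reaches the target.
    """
    for area, success in zip(area_pct, success_pct):
        if success >= target:
            return area
    return None


# Made-up curves for two hypothetical models, sampled at the same area percentages.
area_pct = [0, 5, 10, 15, 20, 35, 50, 75, 100]
steep_curve = [0, 60, 92, 98, 98, 99, 100, 100, 100]
shallow_curve = [0, 30, 55, 70, 80, 90, 96, 99, 100]

print(min_area_for_success(area_pct, steep_curve, 95))    # 15
print(min_area_for_success(area_pct, shallow_curve, 95))  # 50
```

The steeper curve reaches the 95% target with a much smaller prospective area, which is the sense in which a steeper slope is better.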

Algorithm for the success rate curve

The success rate is calculated by reclassifying the gold potential map using different thresholds for the percentage of prospective area, and then computing the success rate (true positive rate; TPR) of these prospective zones against the known gold deposits (Agterberg and Bonham-Carter, 2005). The success rate is the percentage of training deposits that are correctly delineated within the prospective zone. In this study, it was crucial to achieve a high success rate in the smallest possible prospective area, given that mining costs are directly related to the extent of the prospect.

Here, TPR = (predicted positive and actually positive) / (all known positives), that is, TPR = TP / (TP + FN).
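The procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the prediction map is a small list of rows, the deposits are given as (row, col) tuples, and the function name `success_rate_curve` and its `n_steps` parameter are hypothetical. It counts every known deposit that falls inside the top-ranked area and does not apply the extra 0.5 probability cut-off used in the listing in the Code section.

```python
def success_rate_curve(prob_map, deposits, n_steps=10):
    """Success rate (TPR) at evenly spaced prospective-area fractions.

    prob_map : list of rows, each a list of predicted probabilities
    deposits : iterable of (row, col) coordinates of known deposits
    Returns a list of (area_fraction, success_rate) pairs.
    """
    deposits = set(deposits)
    # Rank every cell from most to least prospective.
    cells = [(r, c) for r, row in enumerate(prob_map) for c in range(len(row))]
    cells.sort(key=lambda rc: prob_map[rc[0]][rc[1]], reverse=True)

    curve = []
    for step in range(n_steps + 1):
        fraction = step / n_steps
        n_cells = round(len(cells) * fraction)
        # Known deposits captured inside the top `fraction` of the area.
        captured = sum(1 for cell in cells[:n_cells] if cell in deposits)
        curve.append((fraction, captured / len(deposits)))
    return curve


# Toy 2 x 5 map with three known deposits (made-up numbers).
prob_map = [
    [0.9, 0.1, 0.8, 0.2, 0.3],
    [0.4, 0.7, 0.05, 0.6, 0.5],
]
deposits = [(0, 0), (0, 2), (1, 3)]
curve = success_rate_curve(prob_map, deposits, n_steps=5)
print(curve[2])  # (0.4, 1.0): the top 40% of the area captures all three deposits
```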

Code

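import time

import numpy as np
import matplotlib.pyplot as plt

# Classify the whole study area into two classes: mineralized / non-mineralized
# positive_location (coordinates of the known deposits), TP_plus_FN (number of known
# deposits) and fileName (output file name and plot title) must also be defined.
data_sum = ...    # placeholder: supply the neural network's predicted mineralization probability for every cell of the study area (2-D array, values in [0, 1])
high = data_sum.shape[0]
width = data_sum.shape[1]
total_area = width * high    # area of the whole study region, in cells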

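TPR = []

# Represent each cell by its coordinate, e.g. (0, 0) is the top-left cell.
# Cells are listed from highest to lowest predicted probability, so that the
# first x% of `total` is the most prospective x% of the study area.
order = np.argsort(-np.asarray(data_sum), axis=None)
rows, cols = np.unravel_index(order, (high, width))
total = [(int(r), int(c)) for r, c in zip(rows, cols)]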

def calculate_TPR():
    start = time.time()
    positives = set(map(tuple, positive_location))    # positive_location: coordinates of the known positive samples
    for i in np.linspace(0, 1.0, 101):
        TP = 0
        end_index = int(round(total_area * i))
        temp = total[:end_index]        # cells inside the current area-percentage threshold
        new_list = [item for item in temp if item in positives]
        for (n, p) in new_list:         # count the positives in this area that are correctly predicted
            if data_sum[n][p] >= 0.5:
                TP += 1
        TPR.append(TP / TP_plus_FN)     # TP_plus_FN: total number of known positive samples
    end = time.time()
    print('Elapsed time: {:.2f} s'.format(end - start))
    np.save('./numpyData/SuccessRate/' + fileName + '.npy', TPR)    # save the result

calculate_TPR()
# TPR = np.load('./numpyData/SuccessRate/'+fileName+'.npy')

# Set the figure size to 8 x 8 inches
fig, ax1 = plt.subplots(figsize=(8, 8), dpi=100)

# Draw grid lines
ax1.grid(axis='both', linestyle='-.')

# Coordinates of the three points to mark
first_index = (0.27,0.259)
second_index = (0.62,0.659)
third_index = (0.90,0.907)

# Plot the curve
ax1.plot(np.linspace(0, 1.0, 101), TPR, color="red", alpha=0.5, linewidth= 2)

# Mark the three points, with dashed guide lines to both axes and a coordinate label
for point in (first_index, second_index, third_index):
    ax1.plot(point[0], point[1], 'ko')
    plt.hlines(point[1], -0.04, point[0], color="skyblue", linestyle='--')    # horizontal line to the point
    plt.vlines(point[0], -0.04, point[1], color="skyblue", linestyle='--')    # vertical line to the point
    plt.annotate(str(point), xy=point, xytext=(point[0] + 0.02, point[1] - 0.02))


# ax1.set_yticks(np.linspace(0, 100, 11), custom_y_left)     # custom y-axis ticks
ax1.set_yticks(np.linspace(0, 1.0, 11))     # y-axis ticks
plt.ylim((-0.04,1.04))                      # y-axis display range
ax1.set_xticks(np.linspace(0, 1.0, 11))     # x-axis ticks
plt.xlim((-0.04,1.04))                      # x-axis display range

# Axis labels
ax1.set_xlabel('Percentage of prospective areas (%)', fontdict={'size': 16})
ax1.set_ylabel('Success rate (%)', fontdict={'size': 16})

plt.title(fileName)
plt.show(block=True)
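
The three marked points in the listing above are typed in by hand. As a sketch, they could instead be read from the computed curve, assuming `TPR` holds 101 values sampled at area fractions 0.00, 0.01, ..., 1.00 as in the listing. The helper name `curve_point` is hypothetical, and the diagonal curve used here is only a stand-in for real results.

```python
def curve_point(tpr, area_fraction):
    """Return (area_fraction, success_rate) for a curve sampled evenly on [0, 1]."""
    index = round(area_fraction * (len(tpr) - 1))
    return (area_fraction, round(tpr[index], 3))


# Stand-in curve: a straight diagonal with 101 samples, for demonstration only.
demo_tpr = [i / 100 for i in range(101)]
points = [curve_point(demo_tpr, x) for x in (0.27, 0.62, 0.90)]
print(points)  # [(0.27, 0.27), (0.62, 0.62), (0.9, 0.9)]
```

Reading the points from the curve keeps the annotations consistent when the model or the data changes.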

Result

[Figure: success rate curve produced by the code above]

Origin blog.csdn.net/qq_56039091/article/details/127169180