Plotting ROC Curves in R

In this article, we take a close look at an important error metric in machine learning: plotting the ROC curve in R programming.

So, let us begin!!

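Necessity of the ROC curve

Error metrics let us evaluate and justify how well a model performs on a particular dataset.

The ROC plot is one such error metric.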

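**The ROC plot, also known as the ROC AUC curve, is a classification error metric. That is, it measures the functioning and results of classification machine learning algorithms.**

To be precise, the ROC curve shows how a classifier behaves across every possible threshold on its predicted probabilities, whereas the AUC (area under the curve) measures how well the model separates the different classes/labels.

**The higher the AUC score, the better the classification of the predicted values.**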

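For example, consider a model that predicts and classifies whether the outcome of a coin toss is heads or tails.

If the AUC score is high, the model is more effective at classifying heads as heads and tails as tails.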
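One way to make the AUC concrete: it equals the probability that a randomly chosen positive case (here, heads) receives a higher score than a randomly chosen negative case (tails). The sketch below checks this on a toy example in base R. The labels and the `prob_heads` scores are made up purely for illustration and are not taken from any dataset in this article.

```r
# Toy data (made up for illustration): 1 = heads, 0 = tails
actual <- c(1, 1, 1, 0, 0, 0)
# Hypothetical predicted probabilities of "heads"
prob_heads <- c(0.9, 0.8, 0.4, 0.6, 0.3, 0.1)

pos <- prob_heads[actual == 1]  # scores given to true heads
neg <- prob_heads[actual == 0]  # scores given to true tails

# Compare every heads score with every tails score:
# 1 if heads is ranked higher, 0.5 for a tie, 0 otherwise
pairs <- outer(pos, neg, function(p, n) (p > n) + 0.5 * (p == n))

# The AUC is the share of correctly ordered pairs
auc_manual <- mean(pairs)
print(auc_manual)
```

Here 8 of the 9 heads/tails pairs are ordered correctly, so the AUC is 8/9, or about 0.89. An AUC of 0.5 would mean the scores order the pairs no better than chance.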

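Technically, the ROC curve is plotted between the True Positive Rate (on the y-axis) and the False Positive Rate (on the x-axis) of a model, with one point per classification threshold.

Let us now implement the ROC curve in the upcoming sections!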
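To see where the points on the curve come from, the following base R sketch sweeps a threshold over a handful of predicted probabilities and records the True Positive Rate and False Positive Rate at each step. The data and the helper name `rates_at` are hypothetical and chosen only to keep the example small.

```r
# Toy labels and predicted probabilities (made up for illustration)
actual <- c(1, 1, 1, 0, 0, 0)
prob   <- c(0.9, 0.8, 0.4, 0.6, 0.3, 0.1)

# TPR and FPR when we predict 1 for every prob >= threshold
rates_at <- function(threshold) {
  predicted <- as.integer(prob >= threshold)
  TP <- sum(predicted == 1 & actual == 1)
  FP <- sum(predicted == 1 & actual == 0)
  FN <- sum(predicted == 0 & actual == 1)
  TN <- sum(predicted == 0 & actual == 0)
  c(threshold = threshold,
    TPR = TP / (TP + FN),   # sensitivity / recall
    FPR = FP / (FP + TN))   # 1 - specificity
}

# Start above the highest score (nothing predicted positive),
# then lower the threshold one observed score at a time
thresholds <- c(Inf, sort(unique(prob), decreasing = TRUE))
roc_points <- as.data.frame(t(sapply(thresholds, rates_at)))
print(roc_points)

# Joining the points gives the ROC curve; the diagonal is a random guess
plot(roc_points$FPR, roc_points$TPR, type = "b",
     xlab = "False Positive Rate", ylab = "True Positive Rate",
     main = "ROC curve built by hand")
abline(0, 1, lty = 2)
```

The curve starts at (0, 0), where nothing is predicted positive, and ends at (1, 1), where everything is. A model whose curve hugs the top-left corner separates the classes well.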

Method I: Using the plot() function

As discussed above, we can use the ROC curve to evaluate machine learning models. So, let us apply it to a logistic regression model.

Let us begin!! :)

In this example, we model the Bank Loan Defaulter dataset using logistic regression. We then plot the ROC curve using the plot() function from the **pROC** library.

1. First, we load the dataset into the environment using the read.csv() function.
2. Splitting the dataset is a crucial step before modelling. So, we use the createDataPartition() function from the caret package to sample the dataset into training and test data.
3. We define certain error metrics to evaluate the functioning of the model, including precision and recall.

    # Clear the workspace
    rm(list = ls())
    # Set the working directory
    setwd("D:/Edwisor_Project - Loan_Defaulter/")
    getwd()
    # Load the dataset
    data = read.csv("bank-loan.csv", header = TRUE)

    ### Data SAMPLING ####
    library(caret)
    set.seed(101)
    split = createDataPartition(data$default, p = 0.80, list = FALSE)
    train_data = data[split,]
    test_data = data[-split,]

    # Error metrics from a confusion matrix (rows = actual, columns = predicted)
    err_metric = function(CM)
    {
      TN = CM[1,1]
      TP = CM[2,2]
      FP = CM[1,2]
      FN = CM[2,1]
      precision = (TP)/(TP+FP)
      recall_score = (TP)/(TP+FN)
      f1_score = 2*((precision*recall_score)/(precision+recall_score))
      accuracy_model = (TP+TN)/(TP+TN+FP+FN)
      False_positive_rate = (FP)/(FP+TN)
      False_negative_rate = (FN)/(FN+TP)
      print(paste("Precision value of the model: ", round(precision,2)))
      print(paste("Accuracy of the model: ", round(accuracy_model,2)))
      print(paste("Recall value of the model: ", round(recall_score,2)))
      print(paste("False Positive rate of the model: ", round(False_positive_rate,2)))
      print(paste("False Negative rate of the model: ", round(False_negative_rate,2)))
      print(paste("f1 score of the model: ", round(f1_score,2)))
    }

    # 1. Logistic regression
    logit_m = glm(formula = default~. , data = train_data, family = 'binomial')
    summary(logit_m)
    # Predicted probabilities of default on the test set
    logit_prob = predict(logit_m, newdata = test_data, type = 'response')
    # Class labels at a 0.5 cutoff, used only for the confusion matrix
    logit_P = factor(ifelse(logit_prob > 0.5, 1, 0), levels = c(0, 1))
    CM = table(test_data$default, logit_P)
    print(CM)
    err_metric(CM)

    # ROC curve using the pROC library
    library(pROC)
    # Pass the probabilities, not the 0/1 labels, so every threshold is evaluated
    roc_score = roc(test_data$default, logit_prob) # ROC object, includes the AUC
    plot(roc_score, main = "ROC curve -- Logistic Regression")

Output:

[Figure: ROC curve for the logistic regression model]
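Alongside the plot, it is often handy to report the AUC as a single number. The sketch below shows one way to do that with pROC's auc() function and to print the value on the plot itself. Since the bank loan file is not bundled here, it uses simulated labels and probabilities as stand-ins (`actual` and `prob` are assumptions, not the article's data), and it only runs the pROC part if the package is installed.

```r
set.seed(42)

# Simulated stand-ins for test-set labels and predicted probabilities
actual <- rbinom(200, size = 1, prob = 0.3)
prob   <- plogis(-1 + 2 * actual + rnorm(200))

if (requireNamespace("pROC", quietly = TRUE)) {
  # levels = c(control, case); direction "<" means cases score higher
  roc_obj <- pROC::roc(response = actual, predictor = prob,
                       levels = c(0, 1), direction = "<")

  # Area under the curve as a single number
  print(pROC::auc(roc_obj))

  # legacy.axes = TRUE labels the x-axis as 1 - specificity (the FPR)
  plot(roc_obj, print.auc = TRUE, legacy.axes = TRUE,
       main = "ROC curve with AUC")
}
```

Setting `levels` and `direction` explicitly avoids relying on pROC's automatic detection of which class is the positive one.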

Method II: Using the roc.plot() function

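R provides another library, named verification, for plotting the ROC-AUC curve of a model.

To make use of it, we need to install the verification library and import it into our environment.

Having done that, we plot the data using the roc.plot() function for a clear evaluation of the **sensitivity** and **specificity** of the data values, as shown below.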

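    install.packages("verification")
    library(verification)
    # Observed binary outcomes (0 = no event, 1 = event)
    x <- c(0, 0, 0, 1, 1, 1)
    # Predicted probabilities, one per observation (each must lie in [0, 1])
    y <- c(.7, .7, 0, 1, .5, .6)
    data <- data.frame(x, y)
    names(data) <- c("obs", "pred")
    # roc.plot() takes the observations first, then the predictions
    roc.plot(data$obs, data$pred)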

Output:

[Figure: ROC plot produced with the verification package]
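The verification package can also report the area under this curve through its roc.area() function. The following is a minimal sketch under the assumption that the package is installed. The `obs` and `pred` vectors are small illustrative values, and the code is skipped when the package is missing.

```r
# Illustrative observations (0/1) and predicted probabilities
obs  <- c(0, 0, 0, 1, 1, 1)
pred <- c(0.7, 0.7, 0, 1, 0.5, 0.6)

if (requireNamespace("verification", quietly = TRUE)) {
  # roc.area() returns a list; element A holds the area under the ROC curve
  area <- verification::roc.area(obs, pred)
  print(area$A)
}
```

A value close to 0.5, as these toy numbers give, signals that the predictions barely separate events from non-events.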

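Conclusion

With this, we have come to the end of this topic. Feel free to comment below if you come across any questions.

Try implementing the ROC plot with other machine learning models, and let us know your understanding in the comment section.

Till then, stay tuned and happy learning!! :)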