Calculate some representative metrics for binary classification model evaluation.

performance_metric(
  pred,
  actual,
  positive,
  metric = c("ZeroOneLoss", "Accuracy", "Precision", "Recall", "Sensitivity",
    "Specificity", "F1_Score", "Fbeta_Score", "LogLoss", "AUC", "Gini", "PRAUC",
    "LiftAUC", "GainAUC", "KS_Stat", "ConfusionMatrix"),
  cutoff = 0.5,
  beta = 1
)

Arguments

pred

numeric. Predicted probabilities of the positive class of the target variable.

actual

factor. The actual values of the target variable.

positive

character. The level of the target variable that represents the positive class in binary classification.

metric

character. The performance metric to calculate. The Value section lists the available metrics, and the Details section describes which of them use the cutoff.

cutoff

numeric. Threshold for classifying predicted probability values into positive and negative classes.

beta

numeric. Weighting parameter of the harmonic mean for the F-Beta Score. Under the standard definition, recall is weighted beta times as much as precision.
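
To make the role of beta concrete, the sketch below computes the F-Beta Score in base R. It assumes the standard textbook definition, (1 + beta^2) * precision * recall / (beta^2 * precision + recall); the helper name fbeta_sketch is hypothetical and the package's internal computation may differ.

```r
# Hypothetical helper, not part of the package: textbook F-Beta Score.
fbeta_sketch <- function(precision, recall, beta = 1) {
  (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
}

# With beta = 1 the score reduces to the F1 Score (plain harmonic mean).
fbeta_sketch(precision = 0.5, recall = 1, beta = 1)
#> [1] 0.6666667

# With beta = 2 the score moves toward recall.
fbeta_sketch(precision = 0.5, recall = 1, beta = 2)
#> [1] 0.8333333
```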

Value

numeric or table object. The Confusion Matrix is returned as a table object; every other metric is returned as a numeric value. The available performance metrics are as follows:

  • ZeroOneLoss : Normalized Zero-One Loss (Classification Error Loss).

  • Accuracy : Accuracy.

  • Precision : Precision.

  • Recall : Recall.

  • Sensitivity : Sensitivity.

  • Specificity : Specificity.

  • F1_Score : F1 Score.

  • Fbeta_Score : F-Beta Score.

  • LogLoss : Log loss / Cross-Entropy Loss.

  • AUC : Area Under the Receiver Operating Characteristic Curve (ROC AUC).

  • Gini : Gini Coefficient.

  • PRAUC : Area Under the Precision-Recall Curve (PR AUC).

  • LiftAUC : Area Under the Lift Chart.

  • GainAUC : Area Under the Gain Chart.

  • KS_Stat : Kolmogorov-Smirnov Statistic.

  • ConfusionMatrix : Confusion Matrix.
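
As an illustration of how the threshold-based metrics relate to one another, the sketch below derives several of them from the four cells of a confusion matrix using the standard textbook definitions. The counts are taken from the confusion matrix shown in the Examples section; the variable names are illustrative, and the package's internal computation may differ.

```r
# Cell counts from the confusion matrix in the Examples section
# (the positive class is "present").
tp <- 5   # predicted present, actual present
fp <- 10  # predicted present, actual absent
fn <- 0   # predicted absent,  actual present
tn <- 9   # predicted absent,  actual absent

accuracy      <- (tp + tn) / (tp + fp + fn + tn)
zero_one_loss <- 1 - accuracy
precision     <- tp / (tp + fp)
recall        <- tp / (tp + fn)   # same quantity as sensitivity
specificity   <- tn / (tn + fp)
f1_score      <- 2 * precision * recall / (precision + recall)

round(c(Accuracy = accuracy, ZeroOneLoss = zero_one_loss,
        Precision = precision, Recall = recall,
        Specificity = specificity, F1_Score = f1_score), 4)
```

The accuracy obtained this way, 14 / 24, agrees with the 0.5833333 reported in the Examples section.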

Details

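The cutoff argument applies only when the metric argument is one of "ZeroOneLoss", "Accuracy", "Precision", "Recall", "Sensitivity", "Specificity", "F1_Score", "Fbeta_Score", or "ConfusionMatrix". These metrics are computed from predicted classes, so the predicted probabilities must first be converted to classes with the cutoff.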
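
A minimal sketch of how a cutoff turns predicted probabilities into class labels is shown below. The probability vector is made up for illustration, the helper name classify_sketch is hypothetical, and the use of >= for a probability exactly equal to the cutoff is an assumption rather than the package's documented behavior.

```r
# Made-up predicted probabilities of the positive class.
prob <- c(0.10, 0.52, 0.60, 0.90)

# Hypothetical helper, not part of the package: label a prediction positive
# when its probability reaches the cutoff (>= is an assumption).
classify_sketch <- function(prob, positive, negative, cutoff = 0.5) {
  factor(ifelse(prob >= cutoff, positive, negative),
         levels = c(negative, positive))
}

# Three of the four predictions are labeled "present" at cutoff = 0.5.
table(predict = classify_sketch(prob, "present", "absent", cutoff = 0.5))

# Only two remain "present" at cutoff = 0.55.
table(predict = classify_sketch(prob, "present", "absent", cutoff = 0.55))
```

Raising the cutoff from 0.5 to 0.55 moves the 0.52 prediction from "present" to "absent", which changes every metric derived from the predicted classes.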

Examples

library(dplyr)

# Split the data set into a train set and a test set.
sb <- rpart::kyphosis %>%
  split_by(Kyphosis)

# Extract the train data set from the original data set.
train <- sb %>%
  extract_set(set = "train")

# Extract the test data set from the original data set.
test <- sb %>%
  extract_set(set = "test")

# Sample the unbalanced data set using SMOTE (synthetic minority over-sampling technique).
train <- sb %>%
  sampling_target(seed = 1234L, method = "ubSMOTE")

# Cleanse the train data set.
train <- train %>%
  cleanse
#> ── Checking unique value ─────────────────────────── unique value is one ──
#> No variables that unique value is one.
#> 
#> ── Checking unique rate ─────────────────────────────── high unique rate ──
#> No variables that high unique rate.
#> 
#> ── Checking character variables ─────────────────────── categorical data ──
#> No character variables.
#> 
#> 

# Run the model fitting.
result <- run_models(.data = train, target = "Kyphosis", positive = "present")
result
#> # A tibble: 7 × 7
#>   step     model_id     target   is_factor positive negative fitted_model
#>   <chr>    <chr>        <chr>    <lgl>     <chr>    <chr>    <list>      
#> 1 1.Fitted logistic     Kyphosis TRUE      present  absent   <glm>       
#> 2 1.Fitted rpart        Kyphosis TRUE      present  absent   <rpart>     
#> 3 1.Fitted ctree        Kyphosis TRUE      present  absent   <BinaryTr>  
#> 4 1.Fitted randomForest Kyphosis TRUE      present  absent   <rndmFrs.>  
#> 5 1.Fitted ranger       Kyphosis TRUE      present  absent   <ranger>    
#> 6 1.Fitted xgboost      Kyphosis TRUE      present  absent   <xgb.Bstr>  
#> 7 1.Fitted lasso        Kyphosis TRUE      present  absent   <lognet>    

# Predict on the test data set with the fitted models.
pred <- run_predict(result, test)
pred
#> # A tibble: 7 × 8
#>   step       model_id target is_factor positive negative fitted_model predicted 
#>   <chr>      <chr>    <chr>  <lgl>     <chr>    <chr>    <list>       <list>    
#> 1 2.Predict… logistic Kypho… TRUE      present  absent   <glm>        <prdct_cl>
#> 2 2.Predict… rpart    Kypho… TRUE      present  absent   <rpart>      <prdct_cl>
#> 3 2.Predict… ctree    Kypho… TRUE      present  absent   <BinaryTr>   <prdct_cl>
#> 4 2.Predict… randomF… Kypho… TRUE      present  absent   <rndmFrs.>   <prdct_cl>
#> 5 2.Predict… ranger   Kypho… TRUE      present  absent   <ranger>     <prdct_cl>
#> 6 2.Predict… xgboost  Kypho… TRUE      present  absent   <xgb.Bstr>   <prdct_cl>
#> 7 2.Predict… lasso    Kypho… TRUE      present  absent   <lognet>     <prdct_cl>

# Calculate Accuracy.
performance_metric(attr(pred$predicted[[1]], "pred_prob"), test$Kyphosis,
  "present", "Accuracy")
#> [1] 0.5833333
# Calculate Confusion Matrix.
performance_metric(attr(pred$predicted[[1]], "pred_prob"), test$Kyphosis,
  "present", "ConfusionMatrix")
#>          actual
#> predict   absent present
#>   absent       9       0
#>   present     10       5
# Calculate Confusion Matrix with cutoff = 0.55.
performance_metric(attr(pred$predicted[[1]], "pred_prob"), test$Kyphosis,
  "present", "ConfusionMatrix", cutoff = 0.55)
#>          actual
#> predict   absent present
#>   absent       9       0
#>   present     10       5