Summary method for "optimal_bins" objects. It reports summary metrics for evaluating the performance of a binomial classification model.
# S3 method for class 'optimal_bins'
summary(object, ...)
Returns NULL.
print() prints only the binning table information of "optimal_bins" objects. summary.performance_bin() includes general metrics and the results of significance tests, as follows:
Binning Table: metrics by bin.
  CntRec, CntPos, CntNeg, RatePos, RateNeg, Odds, WoE, IV, JSD, AUC.
General Metrics.
  Gini index.
  Jeffrey's Information Value.
  Jensen-Shannon Divergence.
  Kolmogorov-Smirnov Statistics.
  Herfindahl-Hirschman Index.
  Normalized Herfindahl-Hirschman Index.
  Cramer's V Statistics.
Table of Significance Tests.
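The per-bin columns of the binning table can be checked by hand. The following is a minimal sketch, not the package's implementation: it assumes the conventional definitions Odds = CntPos / CntNeg, WoE = ln(RatePos / RateNeg) and IV = (RatePos - RateNeg) * WoE, and uses the bin counts from the creatinine example on this page. The data frame `bins` and the variables `total_pos` and `total_neg` are illustrative names only.

```r
# Bin counts copied from the binning table in the example on this page
# (creatinine vs. death_event). The <NA> bin is left out: it has no
# positive records, so its WoE is undefined (the package reports NA).
bins <- data.frame(
  Bin    = c("[0.5,0.9]", "(0.9,1.8]", "(1.8,9.4]"),
  CntPos = c(9, 52, 35),
  CntNeg = c(69, 116, 13)
)

# Totals from the "Total" row; they include the 5 negatives in the <NA> bin
total_pos <- 96
total_neg <- 203

# Share of all positives / negatives that fall in each bin
bins$RatePos <- bins$CntPos / total_pos
bins$RateNeg <- bins$CntNeg / total_neg

# Assumed definitions (see the lead-in); log() is the natural logarithm
bins$Odds <- bins$CntPos / bins$CntNeg
bins$WoE  <- log(bins$RatePos / bins$RateNeg)
bins$IV   <- (bins$RatePos - bins$RateNeg) * bins$WoE

print(bins, digits = 5)
```

Under these assumptions the computed Odds, WoE and IV agree with the values printed in the example's binning table up to rounding.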
library(dplyr)
# Generate data for the example
heartfailure2 <- heartfailure
heartfailure2[sample(seq(NROW(heartfailure2)), 5), "creatinine"] <- NA
# optimal binning
bin <- binning_by(heartfailure2, "death_event", "creatinine")
#> Warning: The factor y has been changed to a numeric vector consisting of 0 and 1.
#> 'Yes' changed to 1 (positive) and 'No' changed to 0 (negative).
bin
#> binned type: optimal
#> number of bins: 3
#> x
#> [0.5,0.9] (0.9,1.8] (1.8,9.4] <NA>
#> 78 168 48 5
# summary optimal_bins class
summary(bin)
#> ── Binning Table ──────────────────────── Several Metrics ──
#> Bin CntRec CntPos CntNeg RatePos RateNeg Odds WoE IV
#> 1 [0.5,0.9] 78 9 69 0.09375 0.33990 0.13043 -1.28802 0.31705
#> 2 (0.9,1.8] 168 52 116 0.54167 0.57143 0.44828 -0.05349 0.00159
#> 3 (1.8,9.4] 48 35 13 0.36458 0.06404 2.69231 1.73926 0.52272
#> 4 <NA> 5 0 5 0.00000 0.02463 0.00000 NA NA
#> 5 Total 299 96 203 1.00000 1.00000 0.47291 NA NA
#> JSD AUC
#> 1 0.03710 0.01593
#> 2 0.00020 0.20833
#> 3 0.05818 0.05237
#> 4 NA 0.02463
#> 5 NA 0.30126
#>
#> ── General Metrics ─────────────────────────────────────────
#> • Gini index : -0.39748
#> • IV (Jeffrey) : NA
#> • JS (Jensen-Shannon) Divergence : NA
#> • Kolmogorov-Smirnov Statistics : 0.27591
#> • HHI (Herfindahl-Hirschman Index) : 0.40981
#> • HHI (normalized) : 0.21307
#> • Cramer's V : 0.41821
#>
#> ── Significance Tests ──────────────────── Chisquare Test ──
#> Bin A Bin B statistics p_value
#> 1 [0.5,0.9] (0.9,1.8] 10.76624 1.033685e-03
#> 2 (0.9,1.8] (1.8,9.4] 27.33097 1.714438e-07
#>
# performance table
attr(bin, "performance")
#> Bin CntRec CntPos CntNeg CntCumPos CntCumNeg RatePos RateNeg RateCumPos
#> 1 [0.5,0.9] 78 9 69 9 69 0.09375 0.33990 0.09375
#> 2 (0.9,1.8] 168 52 116 61 185 0.54167 0.57143 0.63542
#> 3 (1.8,9.4] 48 35 13 96 198 0.36458 0.06404 1.00000
#> 4 <NA> 5 0 5 96 203 0.00000 0.02463 1.00000
#> 5 Total 299 96 203 NA NA 1.00000 1.00000 NA
#> RateCumNeg Odds LnOdds WoE IV JSD AUC
#> 1 0.33990 0.13043 -2.03688 -1.28802 0.31705 0.03710 0.01593
#> 2 0.91133 0.44828 -0.80235 -0.05349 0.00159 0.00020 0.20833
#> 3 0.97537 2.69231 0.99040 1.73926 0.52272 0.05818 0.05237
#> 4 1.00000 0.00000 -Inf NA NA NA 0.02463
#> 5 NA 0.47291 -0.74886 NA NA NA 0.30126
# extract binned results
if (!is.null(bin)) {
  extract(bin) %>%
    head(20)
}
#> [1] (1.8,9.4] (0.9,1.8] (0.9,1.8] (1.8,9.4] (1.8,9.4] (1.8,9.4] (0.9,1.8]
#> [8] (0.9,1.8] (0.9,1.8] (1.8,9.4] (1.8,9.4] [0.5,0.9] (0.9,1.8] (0.9,1.8]
#> [15] (0.9,1.8] (0.9,1.8] [0.5,0.9] [0.5,0.9] (0.9,1.8] (1.8,9.4]
#> Levels: [0.5,0.9] < (0.9,1.8] < (1.8,9.4]
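Several of the general metrics can be recomputed from the performance table shown above. This is a hedged sketch under assumed textbook definitions, not the package's internal code: the Kolmogorov-Smirnov statistic is taken as the largest gap between RateCumPos and RateCumNeg, the Herfindahl-Hirschman Index as the sum of squared bin shares (counting the <NA> bin), its normalized form as (HHI - 1/n) / (1 - 1/n) for n bins, and the Gini index as 2 * AUC - 1. All variable names are illustrative.

```r
# Values copied from the performance table above (attr(bin, "performance"))
cnt_rec      <- c(78, 168, 48, 5)                 # CntRec, <NA> bin included
rate_cum_pos <- c(0.09375, 0.63542, 1.00000, 1)   # RateCumPos
rate_cum_neg <- c(0.33990, 0.91133, 0.97537, 1)   # RateCumNeg
auc_total    <- 0.30126                           # AUC in the Total row

# Kolmogorov-Smirnov statistic: largest gap between the cumulative rates
ks <- max(abs(rate_cum_pos - rate_cum_neg))

# Herfindahl-Hirschman Index: sum of squared bin shares
share <- cnt_rec / sum(cnt_rec)
hhi   <- sum(share^2)

# Normalized HHI: rescaled to [0, 1] for the given number of bins
n_bins   <- length(cnt_rec)
hhi_norm <- (hhi - 1 / n_bins) / (1 - 1 / n_bins)

# Gini index derived from the total AUC
gini <- 2 * auc_total - 1

print(round(c(KS = ks, HHI = hhi, HHI_norm = hhi_norm, Gini = gini), 5))
```

With these definitions the results agree with the "General Metrics" block printed by summary(bin) up to rounding, which suggests (but does not confirm) that the package uses the same formulas.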