binning_rgr() finds intervals for a numerical variable using recursive information gain ratio maximization.
binning_rgr(.data, y, x, min_perc_bins = 0.1, max_n_bins = 5, ordered = TRUE)
.data | a data frame. |
---|---|
y | character. name of the binary response variable. The variable must be a character or a factor. |
x | character. name of the continuous characteristic variable. It must have at least 5 distinct values, and Inf is not allowed. |
min_perc_bins | numeric. minimum percentage of rows for each split or segment (controls the sample size), 0.1 (or 10 percent) as default. |
max_n_bins | integer. maximum number of bins or segments to split the input variable, 5 bins as default. |
ordered | logical. whether to build an ordered factor or not. |
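
As a rough illustration of what the `ordered` argument controls, the sketch below uses base R's `cut()` rather than `binning_rgr()` itself. The values in `x` and the cut points in `breaks` are hypothetical and chosen only for the example; how `binning_rgr()` builds its factor internally is not shown here and may differ.

```r
# Hypothetical values and cut points, chosen only for illustration
x <- c(0.6, 0.9, 1.0, 1.1, 1.3, 1.8, 2.5, 9.4)
breaks <- c(0.5, 1.0, 1.2, 1.7, 9.4)

# right = FALSE gives left-closed intervals; include.lowest = TRUE then
# closes the last interval on the right so the maximum value is kept
unordered_bins <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
ordered_bins <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE,
                    ordered_result = TRUE)

is.ordered(unordered_bins)
is.ordered(ordered_bins)

# Comparisons between bins are only defined for the ordered factor
ordered_bins[1] < ordered_bins[8]
```

An ordered factor keeps the natural ranking of the intervals, which matters for models and summaries that treat the bins as ordinal rather than nominal.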
An object of class "infogain_bins". The attributes of the "infogain_bins" class are as follows.
class : "infogain_bins".
type : binning type, "infogain".
breaks : numeric. the number of intervals into which x is to be cut.
levels : character. levels of binned value.
raw : numeric. raw data, x argument value.
target : integer. binary response variable.
x_var : character. name of x variable.
y_var : character. name of y variable.
This function is useful when developing a model that predicts y.
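
To make the idea of information gain ratio maximization more concrete, here is a minimal base R sketch of a single split step. It is not the implementation used by `binning_rgr()`: the helpers `entropy()`, `gain_ratio()` and `best_cut()` are hypothetical names introduced for this example, the candidate cut points (midpoints between distinct values) are an assumption, and the toy data are made up. A recursive binning would presumably repeat a search like this within each resulting segment.

```r
# Shannon entropy (in bits) of a discrete label vector
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                        # drop empty classes to avoid log2(0)
  -sum(p * log2(p))
}

# Information gain ratio of splitting x at `cut` with respect to y
gain_ratio <- function(x, y, cut) {
  side <- x < cut
  w <- mean(side)                      # share of rows in the left segment
  if (w == 0 || w == 1) return(0)      # degenerate split: nothing gained
  cond <- w * entropy(y[side]) + (1 - w) * entropy(y[!side])
  gain <- entropy(y) - cond
  split_info <- -(w * log2(w) + (1 - w) * log2(1 - w))
  gain / split_info
}

# Best single cut among midpoints of distinct values (an assumption),
# keeping only cuts that leave at least min_perc_bins of rows on each side
best_cut <- function(x, y, min_perc_bins = 0.1) {
  v <- sort(unique(x))
  candidates <- (head(v, -1) + tail(v, -1)) / 2
  ok <- vapply(candidates, function(cp) {
    w <- mean(x < cp)
    w >= min_perc_bins && (1 - w) >= min_perc_bins
  }, logical(1))
  candidates <- candidates[ok]
  if (length(candidates) == 0) return(NA_real_)
  scores <- vapply(candidates, function(cp) gain_ratio(x, y, cp), numeric(1))
  candidates[which.max(scores)]
}

# Toy data (hypothetical): y switches from "no" to "yes" once x exceeds 4
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c("no", "no", "no", "no", "yes", "yes", "yes", "yes")
best_cut(x, y)
```

Dividing the gain by the split information penalizes very unbalanced splits, which is the usual motivation for using the gain ratio instead of the raw information gain.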
# \donttest{
library(dplyr)

# binning by recursive information gain ratio maximization using character
bin <- binning_rgr(heartfailure, "death_event", "creatinine")

# binning by recursive information gain ratio maximization using name
bin <- binning_rgr(heartfailure, death_event, creatinine)
bin
#> binned type: infogain
#> number of bins: 5
#> x
#> [0.5,1.0)       1.0 [1.1,1.2) [1.2,1.7) [1.7,9.4]
#>        81        50        43        64        61

#>      levels freq      rate
#> 1 [0.5,1.0)   81 0.2709030
#> 2       1.0   50 0.1672241
#> 3 [1.1,1.2)   43 0.1438127
#> 4 [1.2,1.7)   64 0.2140468
#> 5 [1.7,9.4]   61 0.2040134

#>  [1] [1.7,9.4] [1.1,1.2) [1.2,1.7) [1.7,9.4] [1.7,9.4] [1.7,9.4] [1.2,1.7)
#>  [8] [1.1,1.2) [1.2,1.7) [1.7,9.4] [1.7,9.4] [0.5,1.0) [1.1,1.2) [1.1,1.2)
#> [15] 1.0       [1.2,1.7) [0.5,1.0) [0.5,1.0) 1.0       [1.7,9.4]
#> Levels: [0.5,1.0) < 1.0 < [1.1,1.2) < [1.2,1.7) < [1.7,9.4]
# }