compare_category() computes information for examining the relationship between categorical variables.

compare_category(.data, ...)

# S3 method for data.frame
compare_category(.data, ...)

Arguments

.data

a data.frame or a tbl_df.

...

one or more unquoted expressions separated by commas. You can treat variable names as if they were positions. Positive values select variables; negative values drop variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.

Value

A list-based object of class compare_category. Each component describes the relationship between one pair of categorical variables and has the following columns.

  • var1 : factor. The levels of the first variable in the pair. The column is named after that variable, so 'var1' stands for the name of the first variable being compared.

  • var2 : factor. The levels of the second variable in the pair. The column is named after that variable, so 'var2' stands for the name of the second variable being compared.

  • n : integer. Frequency of each combination of var1 and var2 levels.

  • rate : double. Relative frequency of the combination among all observations.

  • var1_rate : double. Relative frequency within the level of the first variable.

  • var2_rate : double. Relative frequency within the level of the second variable.
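
The columns described above can be reproduced by hand for a single pair of variables. The following base R sketch uses a made-up data frame (toy, with hypothetical columns smoker and event) and is not the package's implementation; the var1_rate and var2_rate names follow the column headers printed in the Examples section.

```r
# Hypothetical toy data; not taken from the heartfailure data set
toy <- data.frame(
  smoker = factor(c("No", "No", "Yes", "Yes", "No", "Yes")),
  event  = factor(c("No", "Yes", "No", "Yes", "No", "No"))
)

# Joint frequency of every level pair, in long form
res <- as.data.frame(table(smoker = toy$smoker, event = toy$event),
                     responseName = "n")

# Share of all observations
res$rate <- res$n / sum(res$n)
# Share within each level of the first variable
res$var1_rate <- res$n / ave(res$n, res$smoker, FUN = sum)
# Share within each level of the second variable
res$var2_rate <- res$n / ave(res$n, res$event, FUN = sum)

# Sort like the printed tibbles: first variable, then second
res <- res[order(res$smoker, res$event), ]
print(res)
```

As a quick sanity check, the var1_rate values sum to 1 within each level of the first variable, and the var2_rate values sum to 1 within each level of the second.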

Details

Understanding the relationships between categorical variables is an important part of EDA. compare_category() compares every pairwise combination of the categorical variables and returns a list-based object of class compare_category.
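
The pairwise enumeration described above can be sketched in base R with combn(). This is only an illustration under stated assumptions: the toy data frame and its column names are hypothetical, and treating only factor columns as categorical is a simplification that may differ from what compare_category() does internally.

```r
# Hypothetical toy data with three factor columns and one numeric column
toy <- data.frame(
  a   = factor(c("x", "y", "x")),
  b   = factor(c("u", "u", "v")),
  d   = factor(c("p", "q", "q")),
  num = c(1.5, 2.5, 3.5)
)

# Assumption: only factor columns count as categorical here
cat_vars <- names(toy)[vapply(toy, is.factor, logical(1))]

# One row per unordered pair of categorical variables
pairs <- t(combn(cat_vars, 2))

# Label each pair in the "var1 vs var2" style used in the printed output
pair_names <- paste(pairs[, 1], "vs", pairs[, 2])

# One two-way frequency table per pair
tables <- lapply(seq_len(nrow(pairs)), function(i) {
  table(toy[[pairs[i, 1]]], toy[[pairs[i, 2]]],
        dnn = c(pairs[i, 1], pairs[i, 2]))
})
names(tables) <- pair_names
print(pair_names)
```

With k categorical variables this yields choose(k, 2) pairs, which matches the 15 tables reported in the Examples for the six categorical variables of the heartfailure data.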

Attributes of return object

The attributes of the compare_category class are as follows.

  • variables : character. List of variables selected for comparison.

  • combination : matrix. It consists of pairs of variables to compare.

Examples

# \donttest{
# Generate data for the example
heartfailure2 <- heartfailure
heartfailure2[sample(seq(NROW(heartfailure2)), 5), "smoking"] <- NA

library(dplyr)

# Compare all categorical variables
all_var <- compare_category(heartfailure2)

# Print compare_category class objects
all_var
#> $`anaemia vs diabetes`
#> # A tibble: 4 × 6
#>   anaemia diabetes     n  rate var1_rate var2_rate
#>   <fct>   <fct>    <int> <dbl>     <dbl>     <dbl>
#> 1 No      No          98 0.328     0.576     0.563
#> 2 No      Yes         72 0.241     0.424     0.576
#> 3 Yes     No          76 0.254     0.589     0.437
#> 4 Yes     Yes         53 0.177     0.411     0.424
#> 
#> $`anaemia vs hblood_pressure`
#> # A tibble: 4 × 6
#>   anaemia hblood_pressure     n  rate var1_rate var2_rate
#>   <fct>   <fct>           <int> <dbl>     <dbl>     <dbl>
#> 1 No      No                113 0.378     0.665     0.582
#> 2 No      Yes                57 0.191     0.335     0.543
#> 3 Yes     No                 81 0.271     0.628     0.418
#> 4 Yes     Yes                48 0.161     0.372     0.457
#> 
#> $`anaemia vs sex`
#> # A tibble: 4 × 6
#>   anaemia sex        n  rate var1_rate var2_rate
#>   <fct>   <fct>  <int> <dbl>     <dbl>     <dbl>
#> 1 No      Female    53 0.177     0.312     0.505
#> 2 No      Male     117 0.391     0.688     0.603
#> 3 Yes     Female    52 0.174     0.403     0.495
#> 4 Yes     Male      77 0.258     0.597     0.397
#> 
#> $`anaemia vs smoking`
#> # A tibble: 5 × 6
#>   anaemia smoking     n   rate var1_rate var2_rate
#>   <fct>   <fct>   <int>  <dbl>     <dbl>     <dbl>
#> 1 No      No        105 0.351     0.618      0.525
#> 2 No      Yes        60 0.201     0.353      0.638
#> 3 No      NA          5 0.0167    0.0294     1    
#> 4 Yes     No         95 0.318     0.736      0.475
#> 5 Yes     Yes        34 0.114     0.264      0.362
#> 
#> $`anaemia vs death_event`
#> # A tibble: 4 × 6
#>   anaemia death_event     n  rate var1_rate var2_rate
#>   <fct>   <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No      No            120 0.401     0.706     0.591
#> 2 No      Yes            50 0.167     0.294     0.521
#> 3 Yes     No             83 0.278     0.643     0.409
#> 4 Yes     Yes            46 0.154     0.357     0.479
#> 
#> $`diabetes vs hblood_pressure`
#> # A tibble: 4 × 6
#>   diabetes hblood_pressure     n  rate var1_rate var2_rate
#>   <fct>    <fct>           <int> <dbl>     <dbl>     <dbl>
#> 1 No       No                112 0.375     0.644     0.577
#> 2 No       Yes                62 0.207     0.356     0.590
#> 3 Yes      No                 82 0.274     0.656     0.423
#> 4 Yes      Yes                43 0.144     0.344     0.410
#> 
#> $`diabetes vs sex`
#> # A tibble: 4 × 6
#>   diabetes sex        n  rate var1_rate var2_rate
#>   <fct>    <fct>  <int> <dbl>     <dbl>     <dbl>
#> 1 No       Female    50 0.167     0.287     0.476
#> 2 No       Male     124 0.415     0.713     0.639
#> 3 Yes      Female    55 0.184     0.44      0.524
#> 4 Yes      Male      70 0.234     0.56      0.361
#> 
#> $`diabetes vs smoking`
#> # A tibble: 6 × 6
#>   diabetes smoking     n    rate var1_rate var2_rate
#>   <fct>    <fct>   <int>   <dbl>     <dbl>     <dbl>
#> 1 No       No        107 0.358     0.615       0.535
#> 2 No       Yes        66 0.221     0.379       0.702
#> 3 No       NA          1 0.00334   0.00575     0.2  
#> 4 Yes      No         93 0.311     0.744       0.465
#> 5 Yes      Yes        28 0.0936    0.224       0.298
#> 6 Yes      NA          4 0.0134    0.032       0.8  
#> 
#> $`diabetes vs death_event`
#> # A tibble: 4 × 6
#>   diabetes death_event     n  rate var1_rate var2_rate
#>   <fct>    <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No       No            118 0.395     0.678     0.581
#> 2 No       Yes            56 0.187     0.322     0.583
#> 3 Yes      No             85 0.284     0.68      0.419
#> 4 Yes      Yes            40 0.134     0.32      0.417
#> 
#> $`hblood_pressure vs sex`
#> # A tibble: 4 × 6
#>   hblood_pressure sex        n  rate var1_rate var2_rate
#>   <fct>           <fct>  <int> <dbl>     <dbl>     <dbl>
#> 1 No              Female    61 0.204     0.314     0.581
#> 2 No              Male     133 0.445     0.686     0.686
#> 3 Yes             Female    44 0.147     0.419     0.419
#> 4 Yes             Male      61 0.204     0.581     0.314
#> 
#> $`hblood_pressure vs smoking`
#> # A tibble: 6 × 6
#>   hblood_pressure smoking     n    rate var1_rate var2_rate
#>   <fct>           <fct>   <int>   <dbl>     <dbl>     <dbl>
#> 1 No              No        125 0.418     0.644       0.625
#> 2 No              Yes        65 0.217     0.335       0.691
#> 3 No              NA          4 0.0134    0.0206      0.8  
#> 4 Yes             No         75 0.251     0.714       0.375
#> 5 Yes             Yes        29 0.0970    0.276       0.309
#> 6 Yes             NA          1 0.00334   0.00952     0.2  
#> 
#> $`hblood_pressure vs death_event`
#> # A tibble: 4 × 6
#>   hblood_pressure death_event     n  rate var1_rate var2_rate
#>   <fct>           <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No              No            137 0.458     0.706     0.675
#> 2 No              Yes            57 0.191     0.294     0.594
#> 3 Yes             No             66 0.221     0.629     0.325
#> 4 Yes             Yes            39 0.130     0.371     0.406
#> 
#> $`sex vs smoking`
#> # A tibble: 6 × 6
#>   sex    smoking     n    rate var1_rate var2_rate
#>   <fct>  <fct>   <int>   <dbl>     <dbl>     <dbl>
#> 1 Female No        100 0.334     0.952      0.5   
#> 2 Female Yes         4 0.0134    0.0381     0.0426
#> 3 Female NA          1 0.00334   0.00952    0.2   
#> 4 Male   No        100 0.334     0.515      0.5   
#> 5 Male   Yes        90 0.301     0.464      0.957 
#> 6 Male   NA          4 0.0134    0.0206     0.8   
#> 
#> $`sex vs death_event`
#> # A tibble: 4 × 6
#>   sex    death_event     n  rate var1_rate var2_rate
#>   <fct>  <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 Female No             71 0.237     0.676     0.350
#> 2 Female Yes            34 0.114     0.324     0.354
#> 3 Male   No            132 0.441     0.680     0.650
#> 4 Male   Yes            62 0.207     0.320     0.646
#> 
#> $`smoking vs death_event`
#> # A tibble: 6 × 6
#>   smoking death_event     n    rate var1_rate var2_rate
#>   <fct>   <fct>       <int>   <dbl>     <dbl>     <dbl>
#> 1 No      No            135 0.452       0.675   0.665  
#> 2 No      Yes            65 0.217       0.325   0.677  
#> 3 Yes     No             66 0.221       0.702   0.325  
#> 4 Yes     Yes            28 0.0936      0.298   0.292  
#> 5 NA      No              2 0.00669     0.4     0.00985
#> 6 NA      Yes             3 0.0100      0.6     0.0312 
#> 

# Extract the comparisons that involve the death_event variable
all_var %>% 
  "["(grep("death_event", names(all_var)))
#> $`anaemia vs death_event`
#> # A tibble: 4 × 6
#>   anaemia death_event     n  rate var1_rate var2_rate
#>   <fct>   <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No      No            120 0.401     0.706     0.591
#> 2 No      Yes            50 0.167     0.294     0.521
#> 3 Yes     No             83 0.278     0.643     0.409
#> 4 Yes     Yes            46 0.154     0.357     0.479
#> 
#> $`diabetes vs death_event`
#> # A tibble: 4 × 6
#>   diabetes death_event     n  rate var1_rate var2_rate
#>   <fct>    <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No       No            118 0.395     0.678     0.581
#> 2 No       Yes            56 0.187     0.322     0.583
#> 3 Yes      No             85 0.284     0.68      0.419
#> 4 Yes      Yes            40 0.134     0.32      0.417
#> 
#> $`hblood_pressure vs death_event`
#> # A tibble: 4 × 6
#>   hblood_pressure death_event     n  rate var1_rate var2_rate
#>   <fct>           <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 No              No            137 0.458     0.706     0.675
#> 2 No              Yes            57 0.191     0.294     0.594
#> 3 Yes             No             66 0.221     0.629     0.325
#> 4 Yes             Yes            39 0.130     0.371     0.406
#> 
#> $`sex vs death_event`
#> # A tibble: 4 × 6
#>   sex    death_event     n  rate var1_rate var2_rate
#>   <fct>  <fct>       <int> <dbl>     <dbl>     <dbl>
#> 1 Female No             71 0.237     0.676     0.350
#> 2 Female Yes            34 0.114     0.324     0.354
#> 3 Male   No            132 0.441     0.680     0.650
#> 4 Male   Yes            62 0.207     0.320     0.646
#> 
#> $`smoking vs death_event`
#> # A tibble: 6 × 6
#>   smoking death_event     n    rate var1_rate var2_rate
#>   <fct>   <fct>       <int>   <dbl>     <dbl>     <dbl>
#> 1 No      No            135 0.452       0.675   0.665  
#> 2 No      Yes            65 0.217       0.325   0.677  
#> 3 Yes     No             66 0.221       0.702   0.325  
#> 4 Yes     Yes            28 0.0936      0.298   0.292  
#> 5 NA      No              2 0.00669     0.4     0.00985
#> 6 NA      Yes             3 0.0100      0.6     0.0312 
#> 

# Compare two categorical variables
two_var <- compare_category(heartfailure2, smoking, death_event)

# Print compare_category class objects
two_var
#> $`smoking vs death_event`
#> # A tibble: 6 × 6
#>   smoking death_event     n    rate var1_rate var2_rate
#>   <fct>   <fct>       <int>   <dbl>     <dbl>     <dbl>
#> 1 No      No            135 0.452       0.675   0.665  
#> 2 No      Yes            65 0.217       0.325   0.677  
#> 3 Yes     No             66 0.221       0.702   0.325  
#> 4 Yes     Yes            28 0.0936      0.298   0.292  
#> 5 NA      No              2 0.00669     0.4     0.00985
#> 6 NA      Yes             3 0.0100      0.6     0.0312 
#> 

# Filter out the cases where smoking is NA
two_var %>%
  "[["(1) %>% 
  filter(!is.na(smoking))
#> # A tibble: 4 × 6
#>   smoking death_event     n   rate var1_rate var2_rate
#>   <fct>   <fct>       <int>  <dbl>     <dbl>     <dbl>
#> 1 No      No            135 0.452      0.675     0.665
#> 2 No      Yes            65 0.217      0.325     0.677
#> 3 Yes     No             66 0.221      0.702     0.325
#> 4 Yes     Yes            28 0.0936     0.298     0.292

# Summarize all cases: the summary object is returned invisibly
stat <- summary(all_var)
#> ── Contingency tables ──────────────────────────── Number of table is 15 ── 
#> $`anaemia vs diabetes`
#>        diabetes
#> anaemia No Yes
#>     No  98  72
#>     Yes 76  53
#> 
#> $`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia  No Yes
#>     No  113  57
#>     Yes  81  48
#> 
#> $`anaemia vs sex`
#>        sex
#> anaemia Female Male
#>     No      53  117
#>     Yes     52   77
#> 
#> $`anaemia vs smoking`
#>        smoking
#> anaemia  No Yes
#>     No  105  60
#>     Yes  95  34
#> 
#> $`anaemia vs death_event`
#>        death_event
#> anaemia  No Yes
#>     No  120  50
#>     Yes  83  46
#> 
#> $`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes  No Yes
#>      No  112  62
#>      Yes  82  43
#> 
#> $`diabetes vs sex`
#>         sex
#> diabetes Female Male
#>      No      50  124
#>      Yes     55   70
#> 
#> $`diabetes vs smoking`
#>         smoking
#> diabetes  No Yes
#>      No  107  66
#>      Yes  93  28
#> 
#> $`diabetes vs death_event`
#>         death_event
#> diabetes  No Yes
#>      No  118  56
#>      Yes  85  40
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male
#>             No      61  133
#>             Yes     44   61
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes
#>             No  125  65
#>             Yes  75  29
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes
#>             No  137  57
#>             Yes  66  39
#> 
#> $`sex vs smoking`
#>         smoking
#> sex       No Yes
#>   Female 100   4
#>   Male   100  90
#> 
#> $`sex vs death_event`
#>         death_event
#> sex       No Yes
#>   Female  71  34
#>   Male   132  62
#> 
#> $`smoking vs death_event`
#>        death_event
#> smoking  No Yes
#>     No  135  65
#>     Yes  66  28
#> 
#> ── Relative contingency tables ─────────────────── Number of table is 15 ── 
#> $`anaemia vs diabetes`
#>        diabetes
#> anaemia        No       Yes
#>     No  0.3277592 0.2408027
#>     Yes 0.2541806 0.1772575
#> 
#> $`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia        No       Yes
#>     No  0.3779264 0.1906355
#>     Yes 0.2709030 0.1605351
#> 
#> $`anaemia vs sex`
#>        sex
#> anaemia    Female      Male
#>     No  0.1772575 0.3913043
#>     Yes 0.1739130 0.2575251
#> 
#> $`anaemia vs smoking`
#>        smoking
#> anaemia        No       Yes
#>     No  0.3571429 0.2040816
#>     Yes 0.3231293 0.1156463
#> 
#> $`anaemia vs death_event`
#>        death_event
#> anaemia        No       Yes
#>     No  0.4013378 0.1672241
#>     Yes 0.2775920 0.1538462
#> 
#> $`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes        No       Yes
#>      No  0.3745819 0.2073579
#>      Yes 0.2742475 0.1438127
#> 
#> $`diabetes vs sex`
#>         sex
#> diabetes    Female      Male
#>      No  0.1672241 0.4147157
#>      Yes 0.1839465 0.2341137
#> 
#> $`diabetes vs smoking`
#>         smoking
#> diabetes        No       Yes
#>      No  0.3639456 0.2244898
#>      Yes 0.3163265 0.0952381
#> 
#> $`diabetes vs death_event`
#>         death_event
#> diabetes        No       Yes
#>      No  0.3946488 0.1872910
#>      Yes 0.2842809 0.1337793
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure    Female      Male
#>             No  0.2040134 0.4448161
#>             Yes 0.1471572 0.2040134
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure         No        Yes
#>             No  0.42517007 0.22108844
#>             Yes 0.25510204 0.09863946
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure        No       Yes
#>             No  0.4581940 0.1906355
#>             Yes 0.2207358 0.1304348
#> 
#> $`sex vs smoking`
#>         smoking
#> sex              No        Yes
#>   Female 0.34013605 0.01360544
#>   Male   0.34013605 0.30612245
#> 
#> $`sex vs death_event`
#>         death_event
#> sex             No       Yes
#>   Female 0.2374582 0.1137124
#>   Male   0.4414716 0.2073579
#> 
#> $`smoking vs death_event`
#>        death_event
#> smoking        No       Yes
#>     No  0.4591837 0.2210884
#>     Yes 0.2244898 0.0952381
#> 
#> ── Chi-squared contingency table tests ─────────── Number of table is 15 ── 
#>         variable_1      variable_2    statistic      p.value df
#> 1          anaemia        diabetes 1.035093e-02 9.189634e-01  1
#> 2          anaemia hblood_pressure 2.893564e-01 5.906333e-01  1
#> 3          anaemia             sex 2.299464e+00 1.294186e-01  1
#> 4          anaemia         smoking 2.889091e+00 8.918122e-02  1
#> 5          anaemia     death_event 1.042175e+00 3.073161e-01  1
#> 6         diabetes hblood_pressure 9.476710e-03 9.224497e-01  1
#> 7         diabetes             sex 6.783853e+00 9.198613e-03  1
#> 8         diabetes         smoking 6.701186e+00 9.634881e-03  1
#> 9         diabetes     death_event 2.161684e-30 1.000000e+00  1
#> 10 hblood_pressure             sex 2.829289e+00 9.255934e-02  1
#> 11 hblood_pressure         smoking 9.628388e-01 3.264727e-01  1
#> 12 hblood_pressure     death_event 1.543461e+00 2.141034e-01  1
#> 13             sex         smoking 5.654892e+01 5.481762e-14  1
#> 14             sex     death_event 0.000000e+00 1.000000e+00  1
#> 15         smoking     death_event 1.102361e-01 7.398755e-01  1

# Print the returned summary object
stat
#> $table
#> $table$`anaemia vs diabetes`
#>        diabetes
#> anaemia No Yes
#>     No  98  72
#>     Yes 76  53
#> 
#> $table$`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia  No Yes
#>     No  113  57
#>     Yes  81  48
#> 
#> $table$`anaemia vs sex`
#>        sex
#> anaemia Female Male
#>     No      53  117
#>     Yes     52   77
#> 
#> $table$`anaemia vs smoking`
#>        smoking
#> anaemia  No Yes
#>     No  105  60
#>     Yes  95  34
#> 
#> $table$`anaemia vs death_event`
#>        death_event
#> anaemia  No Yes
#>     No  120  50
#>     Yes  83  46
#> 
#> $table$`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes  No Yes
#>      No  112  62
#>      Yes  82  43
#> 
#> $table$`diabetes vs sex`
#>         sex
#> diabetes Female Male
#>      No      50  124
#>      Yes     55   70
#> 
#> $table$`diabetes vs smoking`
#>         smoking
#> diabetes  No Yes
#>      No  107  66
#>      Yes  93  28
#> 
#> $table$`diabetes vs death_event`
#>         death_event
#> diabetes  No Yes
#>      No  118  56
#>      Yes  85  40
#> 
#> $table$`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male
#>             No      61  133
#>             Yes     44   61
#> 
#> $table$`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes
#>             No  125  65
#>             Yes  75  29
#> 
#> $table$`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes
#>             No  137  57
#>             Yes  66  39
#> 
#> $table$`sex vs smoking`
#>         smoking
#> sex       No Yes
#>   Female 100   4
#>   Male   100  90
#> 
#> $table$`sex vs death_event`
#>         death_event
#> sex       No Yes
#>   Female  71  34
#>   Male   132  62
#> 
#> $table$`smoking vs death_event`
#>        death_event
#> smoking  No Yes
#>     No  135  65
#>     Yes  66  28
#> 
#> 
#> $relative
#> $relative$`anaemia vs diabetes`
#>        diabetes
#> anaemia        No       Yes
#>     No  0.3277592 0.2408027
#>     Yes 0.2541806 0.1772575
#> 
#> $relative$`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia        No       Yes
#>     No  0.3779264 0.1906355
#>     Yes 0.2709030 0.1605351
#> 
#> $relative$`anaemia vs sex`
#>        sex
#> anaemia    Female      Male
#>     No  0.1772575 0.3913043
#>     Yes 0.1739130 0.2575251
#> 
#> $relative$`anaemia vs smoking`
#>        smoking
#> anaemia        No       Yes
#>     No  0.3571429 0.2040816
#>     Yes 0.3231293 0.1156463
#> 
#> $relative$`anaemia vs death_event`
#>        death_event
#> anaemia        No       Yes
#>     No  0.4013378 0.1672241
#>     Yes 0.2775920 0.1538462
#> 
#> $relative$`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes        No       Yes
#>      No  0.3745819 0.2073579
#>      Yes 0.2742475 0.1438127
#> 
#> $relative$`diabetes vs sex`
#>         sex
#> diabetes    Female      Male
#>      No  0.1672241 0.4147157
#>      Yes 0.1839465 0.2341137
#> 
#> $relative$`diabetes vs smoking`
#>         smoking
#> diabetes        No       Yes
#>      No  0.3639456 0.2244898
#>      Yes 0.3163265 0.0952381
#> 
#> $relative$`diabetes vs death_event`
#>         death_event
#> diabetes        No       Yes
#>      No  0.3946488 0.1872910
#>      Yes 0.2842809 0.1337793
#> 
#> $relative$`hblood_pressure vs sex`
#>                sex
#> hblood_pressure    Female      Male
#>             No  0.2040134 0.4448161
#>             Yes 0.1471572 0.2040134
#> 
#> $relative$`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure         No        Yes
#>             No  0.42517007 0.22108844
#>             Yes 0.25510204 0.09863946
#> 
#> $relative$`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure        No       Yes
#>             No  0.4581940 0.1906355
#>             Yes 0.2207358 0.1304348
#> 
#> $relative$`sex vs smoking`
#>         smoking
#> sex              No        Yes
#>   Female 0.34013605 0.01360544
#>   Male   0.34013605 0.30612245
#> 
#> $relative$`sex vs death_event`
#>         death_event
#> sex             No       Yes
#>   Female 0.2374582 0.1137124
#>   Male   0.4414716 0.2073579
#> 
#> $relative$`smoking vs death_event`
#>        death_event
#> smoking        No       Yes
#>     No  0.4591837 0.2210884
#>     Yes 0.2244898 0.0952381
#> 
#> 
#> $chisq
#>         variable_1      variable_2    statistic      p.value df
#> 1          anaemia        diabetes 1.035093e-02 9.189634e-01  1
#> 2          anaemia hblood_pressure 2.893564e-01 5.906333e-01  1
#> 3          anaemia             sex 2.299464e+00 1.294186e-01  1
#> 4          anaemia         smoking 2.889091e+00 8.918122e-02  1
#> 5          anaemia     death_event 1.042175e+00 3.073161e-01  1
#> 6         diabetes hblood_pressure 9.476710e-03 9.224497e-01  1
#> 7         diabetes             sex 6.783853e+00 9.198613e-03  1
#> 8         diabetes         smoking 6.701186e+00 9.634881e-03  1
#> 9         diabetes     death_event 2.161684e-30 1.000000e+00  1
#> 10 hblood_pressure             sex 2.829289e+00 9.255934e-02  1
#> 11 hblood_pressure         smoking 9.628388e-01 3.264727e-01  1
#> 12 hblood_pressure     death_event 1.543461e+00 2.141034e-01  1
#> 13             sex         smoking 5.654892e+01 5.481762e-14  1
#> 14             sex     death_event 0.000000e+00 1.000000e+00  1
#> 15         smoking     death_event 1.102361e-01 7.398755e-01  1
#> 

# Contingency table component
stat$table
#> $`anaemia vs diabetes`
#>        diabetes
#> anaemia No Yes
#>     No  98  72
#>     Yes 76  53
#> 
#> $`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia  No Yes
#>     No  113  57
#>     Yes  81  48
#> 
#> $`anaemia vs sex`
#>        sex
#> anaemia Female Male
#>     No      53  117
#>     Yes     52   77
#> 
#> $`anaemia vs smoking`
#>        smoking
#> anaemia  No Yes
#>     No  105  60
#>     Yes  95  34
#> 
#> $`anaemia vs death_event`
#>        death_event
#> anaemia  No Yes
#>     No  120  50
#>     Yes  83  46
#> 
#> $`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes  No Yes
#>      No  112  62
#>      Yes  82  43
#> 
#> $`diabetes vs sex`
#>         sex
#> diabetes Female Male
#>      No      50  124
#>      Yes     55   70
#> 
#> $`diabetes vs smoking`
#>         smoking
#> diabetes  No Yes
#>      No  107  66
#>      Yes  93  28
#> 
#> $`diabetes vs death_event`
#>         death_event
#> diabetes  No Yes
#>      No  118  56
#>      Yes  85  40
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male
#>             No      61  133
#>             Yes     44   61
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes
#>             No  125  65
#>             Yes  75  29
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes
#>             No  137  57
#>             Yes  66  39
#> 
#> $`sex vs smoking`
#>         smoking
#> sex       No Yes
#>   Female 100   4
#>   Male   100  90
#> 
#> $`sex vs death_event`
#>         death_event
#> sex       No Yes
#>   Female  71  34
#>   Male   132  62
#> 
#> $`smoking vs death_event`
#>        death_event
#> smoking  No Yes
#>     No  135  65
#>     Yes  66  28
#> 

# Chi-squared test component
stat$chisq
#>         variable_1      variable_2    statistic      p.value df
#> 1          anaemia        diabetes 1.035093e-02 9.189634e-01  1
#> 2          anaemia hblood_pressure 2.893564e-01 5.906333e-01  1
#> 3          anaemia             sex 2.299464e+00 1.294186e-01  1
#> 4          anaemia         smoking 2.889091e+00 8.918122e-02  1
#> 5          anaemia     death_event 1.042175e+00 3.073161e-01  1
#> 6         diabetes hblood_pressure 9.476710e-03 9.224497e-01  1
#> 7         diabetes             sex 6.783853e+00 9.198613e-03  1
#> 8         diabetes         smoking 6.701186e+00 9.634881e-03  1
#> 9         diabetes     death_event 2.161684e-30 1.000000e+00  1
#> 10 hblood_pressure             sex 2.829289e+00 9.255934e-02  1
#> 11 hblood_pressure         smoking 9.628388e-01 3.264727e-01  1
#> 12 hblood_pressure     death_event 1.543461e+00 2.141034e-01  1
#> 13             sex         smoking 5.654892e+01 5.481762e-14  1
#> 14             sex     death_event 0.000000e+00 1.000000e+00  1
#> 15         smoking     death_event 1.102361e-01 7.398755e-01  1

# Summarize only the chi-squared tests
summary(all_var, "chisq")
#> ── Chi-squared contingency table tests ─────────── Number of table is 15 ── 
#>         variable_1      variable_2    statistic      p.value df
#> 1          anaemia        diabetes 1.035093e-02 9.189634e-01  1
#> 2          anaemia hblood_pressure 2.893564e-01 5.906333e-01  1
#> 3          anaemia             sex 2.299464e+00 1.294186e-01  1
#> 4          anaemia         smoking 2.889091e+00 8.918122e-02  1
#> 5          anaemia     death_event 1.042175e+00 3.073161e-01  1
#> 6         diabetes hblood_pressure 9.476710e-03 9.224497e-01  1
#> 7         diabetes             sex 6.783853e+00 9.198613e-03  1
#> 8         diabetes         smoking 6.701186e+00 9.634881e-03  1
#> 9         diabetes     death_event 2.161684e-30 1.000000e+00  1
#> 10 hblood_pressure             sex 2.829289e+00 9.255934e-02  1
#> 11 hblood_pressure         smoking 9.628388e-01 3.264727e-01  1
#> 12 hblood_pressure     death_event 1.543461e+00 2.141034e-01  1
#> 13             sex         smoking 5.654892e+01 5.481762e-14  1
#> 14             sex     death_event 0.000000e+00 1.000000e+00  1
#> 15         smoking     death_event 1.102361e-01 7.398755e-01  1

# Summarize only the chi-squared tests for the first and third pairs
summary(all_var, "chisq", pos = c(1, 3))
#> ── Chi-squared contingency table tests ──────────── Number of table is 2 ── 
#>   variable_1 variable_2  statistic   p.value df
#> 1    anaemia   diabetes 0.01035093 0.9189634  1
#> 2    anaemia        sex 2.29946450 0.1294186  1
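
The statistic reported for anaemia vs diabetes (0.01035093) appears consistent with what base R's chisq.test() returns, with its default continuity correction, for the corresponding 2 x 2 counts printed under the contingency tables. The sketch below is a cross-check under that assumption, not a statement of how summary() is implemented; the counts are copied from the output shown earlier.

```r
# Counts copied from the printed `anaemia vs diabetes` contingency table
tab <- matrix(
  c(98, 76,   # diabetes = No  (anaemia No, Yes)
    72, 53),  # diabetes = Yes (anaemia No, Yes)
  nrow = 2,
  dimnames = list(anaemia = c("No", "Yes"), diabetes = c("No", "Yes"))
)

# Pearson chi-squared test; correct = TRUE (the default) applies
# Yates' continuity correction to 2 x 2 tables
chk <- chisq.test(tab)

print(unname(chk$statistic))
print(chk$p.value)
print(unname(chk$parameter))  # degrees of freedom
```

The continuity correction would also explain why pairs whose observed counts sit very close to the expected counts, such as diabetes vs death_event, report a statistic of essentially zero.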

# Summarize only the relative frequency tables
summary(all_var, "relative")
#> ── Relative contingency tables ─────────────────── Number of table is 15 ── 
#> $`anaemia vs diabetes`
#>        diabetes
#> anaemia        No       Yes
#>     No  0.3277592 0.2408027
#>     Yes 0.2541806 0.1772575
#> 
#> $`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia        No       Yes
#>     No  0.3779264 0.1906355
#>     Yes 0.2709030 0.1605351
#> 
#> $`anaemia vs sex`
#>        sex
#> anaemia    Female      Male
#>     No  0.1772575 0.3913043
#>     Yes 0.1739130 0.2575251
#> 
#> $`anaemia vs smoking`
#>        smoking
#> anaemia        No       Yes
#>     No  0.3571429 0.2040816
#>     Yes 0.3231293 0.1156463
#> 
#> $`anaemia vs death_event`
#>        death_event
#> anaemia        No       Yes
#>     No  0.4013378 0.1672241
#>     Yes 0.2775920 0.1538462
#> 
#> $`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes        No       Yes
#>      No  0.3745819 0.2073579
#>      Yes 0.2742475 0.1438127
#> 
#> $`diabetes vs sex`
#>         sex
#> diabetes    Female      Male
#>      No  0.1672241 0.4147157
#>      Yes 0.1839465 0.2341137
#> 
#> $`diabetes vs smoking`
#>         smoking
#> diabetes        No       Yes
#>      No  0.3639456 0.2244898
#>      Yes 0.3163265 0.0952381
#> 
#> $`diabetes vs death_event`
#>         death_event
#> diabetes        No       Yes
#>      No  0.3946488 0.1872910
#>      Yes 0.2842809 0.1337793
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure    Female      Male
#>             No  0.2040134 0.4448161
#>             Yes 0.1471572 0.2040134
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure         No        Yes
#>             No  0.42517007 0.22108844
#>             Yes 0.25510204 0.09863946
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure        No       Yes
#>             No  0.4581940 0.1906355
#>             Yes 0.2207358 0.1304348
#> 
#> $`sex vs smoking`
#>         smoking
#> sex              No        Yes
#>   Female 0.34013605 0.01360544
#>   Male   0.34013605 0.30612245
#> 
#> $`sex vs death_event`
#>         death_event
#> sex             No       Yes
#>   Female 0.2374582 0.1137124
#>   Male   0.4414716 0.2073579
#> 
#> $`smoking vs death_event`
#>        death_event
#> smoking        No       Yes
#>     No  0.4591837 0.2210884
#>     Yes 0.2244898 0.0952381
#> 

# Summarize only the contingency tables, excluding missing values
summary(all_var, "table", na.rm = TRUE)
#> ── Contingency tables ──────────────────────────── Number of table is 15 ── 
#> $`anaemia vs diabetes`
#>        diabetes
#> anaemia No Yes
#>     No  98  72
#>     Yes 76  53
#> 
#> $`anaemia vs hblood_pressure`
#>        hblood_pressure
#> anaemia  No Yes
#>     No  113  57
#>     Yes  81  48
#> 
#> $`anaemia vs sex`
#>        sex
#> anaemia Female Male
#>     No      53  117
#>     Yes     52   77
#> 
#> $`anaemia vs smoking`
#>        smoking
#> anaemia  No Yes
#>     No  105  60
#>     Yes  95  34
#> 
#> $`anaemia vs death_event`
#>        death_event
#> anaemia  No Yes
#>     No  120  50
#>     Yes  83  46
#> 
#> $`diabetes vs hblood_pressure`
#>         hblood_pressure
#> diabetes  No Yes
#>      No  112  62
#>      Yes  82  43
#> 
#> $`diabetes vs sex`
#>         sex
#> diabetes Female Male
#>      No      50  124
#>      Yes     55   70
#> 
#> $`diabetes vs smoking`
#>         smoking
#> diabetes  No Yes
#>      No  107  66
#>      Yes  93  28
#> 
#> $`diabetes vs death_event`
#>         death_event
#> diabetes  No Yes
#>      No  118  56
#>      Yes  85  40
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male
#>             No      61  133
#>             Yes     44   61
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes
#>             No  125  65
#>             Yes  75  29
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes
#>             No  137  57
#>             Yes  66  39
#> 
#> $`sex vs smoking`
#>         smoking
#> sex       No Yes
#>   Female 100   4
#>   Male   100  90
#> 
#> $`sex vs death_event`
#>         death_event
#> sex       No Yes
#>   Female  71  34
#>   Male   132  62
#> 
#> $`smoking vs death_event`
#>        death_event
#> smoking  No Yes
#>     No  135  65
#>     Yes  66  28
#> 

# Summarize only the contingency tables, including marginal totals
margin <- summary(all_var, "table", marginal = TRUE)
#> ── Contingency tables ──────────────────────────── Number of table is 15 ── 
#> $`anaemia vs diabetes`
#>          diabetes
#> anaemia    No Yes <Total>
#>   No       98  72     170
#>   Yes      76  53     129
#>   <Total> 174 125     299
#> 
#> $`anaemia vs hblood_pressure`
#>          hblood_pressure
#> anaemia    No Yes <Total>
#>   No      113  57     170
#>   Yes      81  48     129
#>   <Total> 194 105     299
#> 
#> $`anaemia vs sex`
#>          sex
#> anaemia   Female Male <Total>
#>   No          53  117     170
#>   Yes         52   77     129
#>   <Total>    105  194     299
#> 
#> $`anaemia vs smoking`
#>          smoking
#> anaemia    No Yes <Total>
#>   No      105  60     165
#>   Yes      95  34     129
#>   <Total> 200  94     294
#> 
#> $`anaemia vs death_event`
#>          death_event
#> anaemia    No Yes <Total>
#>   No      120  50     170
#>   Yes      83  46     129
#>   <Total> 203  96     299
#> 
#> $`diabetes vs hblood_pressure`
#>          hblood_pressure
#> diabetes   No Yes <Total>
#>   No      112  62     174
#>   Yes      82  43     125
#>   <Total> 194 105     299
#> 
#> $`diabetes vs sex`
#>          sex
#> diabetes  Female Male <Total>
#>   No          50  124     174
#>   Yes         55   70     125
#>   <Total>    105  194     299
#> 
#> $`diabetes vs smoking`
#>          smoking
#> diabetes   No Yes <Total>
#>   No      107  66     173
#>   Yes      93  28     121
#>   <Total> 200  94     294
#> 
#> $`diabetes vs death_event`
#>          death_event
#> diabetes   No Yes <Total>
#>   No      118  56     174
#>   Yes      85  40     125
#>   <Total> 203  96     299
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male <Total>
#>         No          61  133     194
#>         Yes         44   61     105
#>         <Total>    105  194     299
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes <Total>
#>         No      125  65     190
#>         Yes      75  29     104
#>         <Total> 200  94     294
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes <Total>
#>         No      137  57     194
#>         Yes      66  39     105
#>         <Total> 203  96     299
#> 
#> $`sex vs smoking`
#>          smoking
#> sex        No Yes <Total>
#>   Female  100   4     104
#>   Male    100  90     190
#>   <Total> 200  94     294
#> 
#> $`sex vs death_event`
#>          death_event
#> sex        No Yes <Total>
#>   Female   71  34     105
#>   Male    132  62     194
#>   <Total> 203  96     299
#> 
#> $`smoking vs death_event`
#>          death_event
#> smoking    No Yes <Total>
#>   No      135  65     200
#>   Yes      66  28      94
#>   <Total> 201  93     294
#> 
margin
#> $`anaemia vs diabetes`
#>          diabetes
#> anaemia    No Yes <Total>
#>   No       98  72     170
#>   Yes      76  53     129
#>   <Total> 174 125     299
#> 
#> $`anaemia vs hblood_pressure`
#>          hblood_pressure
#> anaemia    No Yes <Total>
#>   No      113  57     170
#>   Yes      81  48     129
#>   <Total> 194 105     299
#> 
#> $`anaemia vs sex`
#>          sex
#> anaemia   Female Male <Total>
#>   No          53  117     170
#>   Yes         52   77     129
#>   <Total>    105  194     299
#> 
#> $`anaemia vs smoking`
#>          smoking
#> anaemia    No Yes <Total>
#>   No      105  60     165
#>   Yes      95  34     129
#>   <Total> 200  94     294
#> 
#> $`anaemia vs death_event`
#>          death_event
#> anaemia    No Yes <Total>
#>   No      120  50     170
#>   Yes      83  46     129
#>   <Total> 203  96     299
#> 
#> $`diabetes vs hblood_pressure`
#>          hblood_pressure
#> diabetes   No Yes <Total>
#>   No      112  62     174
#>   Yes      82  43     125
#>   <Total> 194 105     299
#> 
#> $`diabetes vs sex`
#>          sex
#> diabetes  Female Male <Total>
#>   No          50  124     174
#>   Yes         55   70     125
#>   <Total>    105  194     299
#> 
#> $`diabetes vs smoking`
#>          smoking
#> diabetes   No Yes <Total>
#>   No      107  66     173
#>   Yes      93  28     121
#>   <Total> 200  94     294
#> 
#> $`diabetes vs death_event`
#>          death_event
#> diabetes   No Yes <Total>
#>   No      118  56     174
#>   Yes      85  40     125
#>   <Total> 203  96     299
#> 
#> $`hblood_pressure vs sex`
#>                sex
#> hblood_pressure Female Male <Total>
#>         No          61  133     194
#>         Yes         44   61     105
#>         <Total>    105  194     299
#> 
#> $`hblood_pressure vs smoking`
#>                smoking
#> hblood_pressure  No Yes <Total>
#>         No      125  65     190
#>         Yes      75  29     104
#>         <Total> 200  94     294
#> 
#> $`hblood_pressure vs death_event`
#>                death_event
#> hblood_pressure  No Yes <Total>
#>         No      137  57     194
#>         Yes      66  39     105
#>         <Total> 203  96     299
#> 
#> $`sex vs smoking`
#>          smoking
#> sex        No Yes <Total>
#>   Female  100   4     104
#>   Male    100  90     190
#>   <Total> 200  94     294
#> 
#> $`sex vs death_event`
#>          death_event
#> sex        No Yes <Total>
#>   Female   71  34     105
#>   Male    132  62     194
#>   <Total> 203  96     299
#> 
#> $`smoking vs death_event`
#>          death_event
#> smoking    No Yes <Total>
#>   No      135  65     200
#>   Yes      66  28      94
#>   <Total> 201  93     294
#> 
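
# A minimal base-R sketch (not dlookr's own code) of how marginal totals
# like the ones above can be obtained. It uses only the cell counts printed
# for `anaemia vs diabetes`; `tab` and `tab_margin` are hypothetical names.
# Note that addmargins() labels the totals "Sum", whereas the output above
# uses "<Total>".
tab <- as.table(matrix(
  c(98, 76, 72, 53),           # filled column-wise: diabetes = No, then Yes
  nrow = 2,
  dimnames = list(anaemia = c("No", "Yes"), diabetes = c("No", "Yes"))
))
tab_margin <- addmargins(tab)  # append row, column, and grand totals
tab_margin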

# Chi-squared test results for a pair of variables
summary(two_var, method = "chisq")
#> ── Chi-squared contingency table tests ──────────── Number of table is 1 ── 
#>   variable_1  variable_2 statistic   p.value df
#> 1    smoking death_event 0.1102361 0.7398755  1
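
# A hedged cross-check using base R only: stats::chisq.test() on the
# `smoking vs death_event` counts printed earlier. For 2 x 2 tables,
# chisq.test() applies Yates' continuity correction by default, and the
# statistic and p-value printed above appear consistent with that default.
# This is a sketch, not dlookr's implementation; `tab_sd` and `res` are
# hypothetical names.
tab_sd <- matrix(
  c(135, 66, 65, 28),          # filled column-wise: death_event = No, then Yes
  nrow = 2,
  dimnames = list(smoking = c("No", "Yes"), death_event = c("No", "Yes"))
)
res <- chisq.test(tab_sd)      # correct = TRUE is the default
unname(res$statistic)          # X-squared statistic
res$p.value                    # p-value
unname(res$parameter)          # degrees of freedom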

# With verbose = FALSE, the header line is not printed
summary(all_var, "chisq", verbose = FALSE)
#>         variable_1      variable_2    statistic      p.value df
#> 1          anaemia        diabetes 1.035093e-02 9.189634e-01  1
#> 2          anaemia hblood_pressure 2.893564e-01 5.906333e-01  1
#> 3          anaemia             sex 2.299464e+00 1.294186e-01  1
#> 4          anaemia         smoking 2.889091e+00 8.918122e-02  1
#> 5          anaemia     death_event 1.042175e+00 3.073161e-01  1
#> 6         diabetes hblood_pressure 9.476710e-03 9.224497e-01  1
#> 7         diabetes             sex 6.783853e+00 9.198613e-03  1
#> 8         diabetes         smoking 6.701186e+00 9.634881e-03  1
#> 9         diabetes     death_event 2.161684e-30 1.000000e+00  1
#> 10 hblood_pressure             sex 2.829289e+00 9.255934e-02  1
#> 11 hblood_pressure         smoking 9.628388e-01 3.264727e-01  1
#> 12 hblood_pressure     death_event 1.543461e+00 2.141034e-01  1
#> 13             sex         smoking 5.654892e+01 5.481762e-14  1
#> 14             sex     death_event 0.000000e+00 1.000000e+00  1
#> 15         smoking     death_event 1.102361e-01 7.398755e-01  1

# Using pipes & dplyr -------------------------
# If you want to use dplyr, set verbose to FALSE
summary(all_var, "chisq", verbose = FALSE) %>% 
  filter(p.value < 0.26)
#>        variable_1  variable_2 statistic      p.value df
#> 1         anaemia         sex  2.299464 1.294186e-01  1
#> 2         anaemia     smoking  2.889091 8.918122e-02  1
#> 3        diabetes         sex  6.783853 9.198613e-03  1
#> 4        diabetes     smoking  6.701186 9.634881e-03  1
#> 5 hblood_pressure         sex  2.829289 9.255934e-02  1
#> 6 hblood_pressure death_event  1.543461 2.141034e-01  1
#> 7             sex     smoking 56.548915 5.481762e-14  1
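
# The same kind of filter without dplyr, as a base-R sketch. `chisq_tbl` is
# a hypothetical data frame holding three rows copied from the output above;
# it stands in for what summary(all_var, "chisq", verbose = FALSE)
# presumably returns.
chisq_tbl <- data.frame(
  variable_1 = c("anaemia", "diabetes", "hblood_pressure"),
  variable_2 = c("diabetes", "sex", "death_event"),
  statistic  = c(1.035093e-02, 6.783853, 1.543461),
  p.value    = c(9.189634e-01, 9.198613e-03, 2.141034e-01),
  df         = c(1, 1, 1)
)
small_p <- chisq_tbl[chisq_tbl$p.value < 0.26, ]  # keep rows below the cutoff
small_p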

# Extract component from list by index
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>% 
  "[["(1)
#>        diabetes
#> anaemia No Yes
#>     No  98  72
#>     Yes 76  53

# Extract component from list by name
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>% 
  "[["("smoking vs death_event")
#>        death_event
#> smoking  No Yes
#>     No  135  65
#>     Yes  66  28
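
# Piping into "[["() is equivalent to ordinary double-bracket indexing.
# A small sketch with a hypothetical named list `tabs` standing in for the
# list returned by summary(all_var, "table", ...); the counts are copied
# from the output above.
tabs <- list(
  `anaemia vs diabetes`    = matrix(c(98, 76, 72, 53), nrow = 2),
  `smoking vs death_event` = matrix(c(135, 66, 65, 28), nrow = 2)
)
by_index <- tabs[[1]]                         # extract by position
by_name  <- tabs[["smoking vs death_event"]]  # extract by name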

# plot all pairs of variables
plot(all_var)

# plot a pair of variables
plot(two_var)


# plot all pairs of variables, prompting before each plot
plot(all_var, prompt = TRUE)
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:
#> Hit <Return> to see next plot:


# plot a pair of variables
plot(two_var, las = 1)

# }