Print and summary methods for the "compare_numeric" class.
# S3 method for compare_numeric
summary(
  object,
  method = c("all", "correlation", "linear"),
  thres_corr = 0.3,
  thres_rs = 0.1,
  verbose = TRUE,
  ...
)

# S3 method for compare_numeric
print(x, ...)
object | an object of class "compare_numeric", usually, a result of a call to compare_numeric(). |
---|---|
method | character. Selects the statistics to aggregate. "correlation" returns Pearson's correlation coefficients, "linear" returns the linear model summaries, and "all" returns both. Unlike compare_numeric(), summary.compare_numeric() returns only the cases that exceed the specified threshold: "correlation" returns only cases whose absolute correlation coefficient is greater than thres_corr, and "linear" returns only cases whose R^2 is greater than thres_rs. |
thres_corr | numeric. Threshold for the correlation coefficient; only cases whose absolute correlation coefficient exceeds this value are returned. The default is 0.3. |
thres_rs | numeric. Threshold for R^2; only linear model summaries whose R^2 exceeds this value are returned. The default is 0.1. |
verbose | logical. Specifies whether to print additional information during the calculation. The default is TRUE, in which case the information is printed and the value is returned with invisible(). If FALSE, the value is returned with return(). |
... | further arguments passed to or from other methods. |
x | an object of class "compare_numeric", usually, a result of a call to compare_numeric(). |
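The thresholding described for `method` and `thres_corr` can be illustrated in base R. This is a minimal sketch, not the package's implementation: it uses the built-in `mtcars` data rather than `heartfailure`, and the helper name `pairwise_corr` is hypothetical.

```r
# Hypothetical helper: Pearson correlation for every pair of numeric columns.
pairwise_corr <- function(df) {
  pairs <- combn(names(df), 2)  # 2 x n character matrix of column-name pairs
  data.frame(
    var1 = pairs[1, ],
    var2 = pairs[2, ],
    coef_corr = apply(pairs, 2, function(p) cor(df[[p[1]]], df[[p[2]]])),
    stringsAsFactors = FALSE
  )
}

dat <- mtcars[, c("mpg", "wt", "qsec")]
all_pairs <- pairwise_corr(dat)

# Keep only pairs whose absolute correlation exceeds the threshold,
# mirroring the "abs(r) > 0.3" check described for thres_corr.
thres_corr <- 0.3
strong <- all_pairs[abs(all_pairs$coef_corr) > thres_corr, ]
print(strong)
```

Filtering on the absolute value keeps strong negative correlations as well as strong positive ones.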
A list-based object. Its components hold the information for examining the relationship between numerical variables, as follows.
- correlation component : Pearson's correlation coefficient.
var1 : factor. The level of the first variable to compare. 'var1' is the name of the first variable to be compared.
var2 : factor. The level of the second variable to compare. 'var2' is the name of the second variable to be compared.
coef_corr : double. Pearson's correlation coefficient.
- linear component : linear model summaries
var1 : factor. The level of the first variable to compare. 'var1' is the name of the first variable to be compared.
var2 : factor. The level of the second variable to compare. 'var2' is the name of the second variable to be compared.
r.squared : double. The proportion of variance explained by the model.
adj.r.squared : double. r.squared adjusted based on the degrees of freedom.
sigma : double. The square root of the estimated residual variance.
statistic : double. F-statistic.
p.value : double. p-value from the F test, describing whether the full regression is significant.
df : integer. Degrees of freedom.
logLik : double. The log-likelihood of the data under the model.
AIC : double. The Akaike Information Criterion.
BIC : double. The Bayesian Information Criterion.
deviance : double. Deviance.
df.residual : integer. Residual degrees of freedom.
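To make the columns of the linear component concrete, the sketch below derives the same kinds of quantities from a single `lm()` fit in base R. It assumes the columns carry the usual linear-model summary statistics; the package may compute them differently, and the choice of `mpg` as the response and `wt` as the predictor (from the built-in `mtcars` data) is purely illustrative.

```r
# Fit a simple linear model for one pair of variables.
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)
fstat <- s$fstatistic  # named vector: value, numdf, dendf

linear_row <- data.frame(
  var1 = "mpg",
  var2 = "wt",
  r.squared = s$r.squared,          # proportion of variance explained
  adj.r.squared = s$adj.r.squared,  # adjusted for degrees of freedom
  sigma = s$sigma,                  # residual standard error
  statistic = unname(fstat["value"]),
  p.value = unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                      lower.tail = FALSE)),
  df = unname(fstat["numdf"]),
  logLik = as.numeric(logLik(fit)),
  AIC = AIC(fit),
  BIC = BIC(fit),
  deviance = deviance(fit),         # residual sum of squares for lm
  df.residual = df.residual(fit)
)

# Keep the row only if R^2 exceeds the threshold, as thres_rs does.
thres_rs <- 0.1
kept <- linear_row[linear_row$r.squared > thres_rs, ]
print(kept)
```

For a model with one predictor, `r.squared` equals the squared Pearson correlation of the two variables, so the two components describe the same relationship from different angles.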
print.compare_numeric() displays only the comparison information for the variables included in the compare_numeric object. With summary.compare_numeric(), set verbose = TRUE when you only want to view the results in the console, and set verbose = FALSE when you want to manipulate the returned results.
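The difference between an invisible and a visible return value can be sketched with a toy function in base R. The function `summarise_pairs` and the data frame `toy` are hypothetical and only mimic the `verbose` switch; they are not the package's code.

```r
# Hypothetical function mimicking the verbose switch.
summarise_pairs <- function(x, thres_corr = 0.3, verbose = TRUE) {
  result <- x[abs(x$coef_corr) > thres_corr, ]
  if (verbose) {
    # Print a report, then return invisibly so the value is not printed twice.
    cat("Number of pairs is", nrow(result), "/", nrow(x), "\n")
    print(result)
    return(invisible(result))
  }
  result  # visible value: auto-prints when called at the console
}

toy <- data.frame(
  var1 = c("a", "a"),
  var2 = c("b", "c"),
  coef_corr = c(0.5, -0.1)
)

res <- summarise_pairs(toy)  # prints the report; res holds the value
quiet <- withVisible(summarise_pairs(toy, verbose = FALSE))
loud <- withVisible(summarise_pairs(toy, verbose = TRUE))
```

In both cases the value can be assigned and used afterwards; the switch only changes whether a report is printed and whether the result auto-prints at the console.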
# \donttest{
# Generate data for the example
heartfailure2 <- heartfailure[, c("platelets", "creatinine", "sodium")]

library(dplyr)

# Compare all numerical variables
all_var <- compare_numeric(heartfailure2)

# Print the compare_numeric class object
all_var
#> $correlation
#> # A tibble: 3 x 3
#>   var1       var2       coef_corr
#>   <chr>      <chr>          <dbl>
#> 1 platelets  creatinine   -0.0412
#> 2 platelets  sodium        0.0621
#> 3 creatinine sodium       -0.189
#>
#> $linear
#> # A tibble: 3 x 14
#>   var1     var2    r.squared adj.r.squared  sigma statistic p.value    df logLik
#>   <chr>    <chr>       <dbl>         <dbl>  <dbl>     <dbl>   <dbl> <dbl>  <dbl>
#> 1 platele… creati…   0.00170     -0.00166  9.79e4     0.505 0.478       1 -3859.
#> 2 platele… sodium    0.00386      0.000505 9.78e4     1.15  0.284       1 -3859.
#> 3 creatin… sodium    0.0358       0.0325   1.02e0    11.0   0.00102     1  -428.
#> # … with 5 more variables: AIC <dbl>, BIC <dbl>, deviance <dbl>,
#> #   df.residual <int>, nobs <int>
#>

# Compare the correlations for pairs that include the sodium variable
all_var %>%
  "$"(correlation) %>%
  filter(var1 == "sodium" | var2 == "sodium") %>%
  arrange(desc(abs(coef_corr)))
#> # A tibble: 2 x 3
#>   var1       var2   coef_corr
#>   <chr>      <chr>      <dbl>
#> 1 creatinine sodium   -0.189
#> 2 platelets  sodium    0.0621

# Compare the correlations for pairs with abs(coef_corr) > 0.1
all_var %>%
  "$"(correlation) %>%
  filter(abs(coef_corr) > 0.1)
#> # A tibble: 1 x 3
#>   var1       var2   coef_corr
#>   <chr>      <chr>      <dbl>
#> 1 creatinine sodium    -0.189

# Compare the linear models for pairs that include the sodium variable
all_var %>%
  "$"(linear) %>%
  filter(var1 == "sodium" | var2 == "sodium") %>%
  arrange(desc(r.squared))
#> # A tibble: 2 x 14
#>   var1      var2  r.squared adj.r.squared  sigma statistic p.value    df logLik
#>   <chr>     <chr>     <dbl>         <dbl>  <dbl>     <dbl>   <dbl> <dbl>  <dbl>
#> 1 creatini… sodi…   0.0358       0.0325   1.02e0     11.0  0.00102     1  -428.
#> 2 platelets sodi…   0.00386      0.000505 9.78e4      1.15 0.284       1 -3859.
#> # … with 5 more variables: AIC <dbl>, BIC <dbl>, deviance <dbl>,
#> #   df.residual <int>, nobs <int>

# Compare two numerical variables
two_var <- compare_numeric(heartfailure2, sodium, creatinine)

# Print the compare_numeric class object
two_var
#> $correlation
#> # A tibble: 1 x 3
#>   var1   var2       coef_corr
#>   <chr>  <chr>          <dbl>
#> 1 sodium creatinine    -0.189
#>
#> $linear
#> # A tibble: 1 x 14
#>   var1  var2  r.squared adj.r.squared sigma statistic p.value    df logLik   AIC
#>   <chr> <chr>     <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl>
#> 1 sodi… crea…    0.0358        0.0325  4.34      11.0 0.00102     1  -862. 1730.
#> # … with 4 more variables: BIC <dbl>, deviance <dbl>, df.residual <int>,
#> #   nobs <int>
#>

# Summarize all cases with the default thresholds
summary(all_var)
#> ── Correlation check : abs(r) > 0.3 ───────────── Number of pairs is 0/3 ──
#> # A tibble: 0 x 3
#> # … with 3 variables: var1 <chr>, var2 <chr>, coef_corr <dbl>
#> ── R.squared check : R^2 > 0.1 ────────────────── Number of pairs is 0/3 ──
#> # A tibble: 0 x 14
#> # … with 14 variables: var1 <chr>, var2 <chr>, r.squared <dbl>,
#> #   adj.r.squared <dbl>, sigma <dbl>, statistic <dbl>, p.value <dbl>, df <dbl>,
#> #   logLik <dbl>, AIC <dbl>, BIC <dbl>, deviance <dbl>, df.residual <int>,
#> #   nobs <int>

# Correlation only
summary(all_var, method = "correlation")
#> ── Correlation check : abs(r) > 0.3 ───────────── Number of pairs is 0/3 ──
#> # A tibble: 0 x 3
#> # … with 3 variables: var1 <chr>, var2 <chr>, coef_corr <dbl>

# Correlation only, with the threshold abs(r) > 0.1
summary(all_var, method = "correlation", thres_corr = 0.1)
#> ── Correlation check : abs(r) > 0.1 ───────────── Number of pairs is 1/3 ──
#> # A tibble: 1 x 3
#>   var1       var2   coef_corr
#>   <chr>      <chr>      <dbl>
#> 1 creatinine sodium    -0.189

# Linear model summaries with the threshold R^2 > 0.05
summary(all_var, thres_rs = 0.05)
#> ── Correlation check : abs(r) > 0.3 ───────────── Number of pairs is 0/3 ──
#> # A tibble: 0 x 3
#> # … with 3 variables: var1 <chr>, var2 <chr>, coef_corr <dbl>
#> ── R.squared check : R^2 > 0.05 ───────────────── Number of pairs is 0/3 ──
#> # A tibble: 0 x 14
#> # … with 14 variables: var1 <chr>, var2 <chr>, r.squared <dbl>,
#> #   adj.r.squared <dbl>, sigma <dbl>, statistic <dbl>, p.value <dbl>, df <dbl>,
#> #   logLik <dbl>, AIC <dbl>, BIC <dbl>, deviance <dbl>, df.residual <int>,
#> #   nobs <int>

# With verbose = FALSE
summary(all_var, verbose = FALSE)
#> $correlation
#> # A tibble: 0 x 3
#> # … with 3 variables: var1 <chr>, var2 <chr>, coef_corr <dbl>
#>
#> $linear
#> # A tibble: 0 x 14
#> # … with 14 variables: var1 <chr>, var2 <chr>, r.squared <dbl>,
#> #   adj.r.squared <dbl>, sigma <dbl>, statistic <dbl>, p.value <dbl>, df <dbl>,
#> #   logLik <dbl>, AIC <dbl>, BIC <dbl>, deviance <dbl>, df.residual <int>,
#> #   nobs <int>
#>
# }