Find the numerical variables that contain outliers in a data.frame or an object that inherits from data.frame.
find_outliers(.data, index = TRUE, rate = FALSE)
.data: a data.frame or a tbl_df.
index: logical. Specifies how the variables that contain outliers are reported. Returns column indices if TRUE, or variable names if FALSE.
rate: logical. If TRUE, returns the percentage of outliers in each variable.
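Information on the variables that contain outliers: column indices by default, variable names if index = FALSE, or the percentage of outliers per variable if rate = TRUE.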
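This page does not say how outliers are detected. A minimal base R sketch, assuming the boxplot rule (values more than 1.5 times the interquartile range beyond the hinges, as computed by grDevices::boxplot.stats), might look like the following. The helper sketch_find_outliers and the toy data frame are hypothetical and are not part of the package.

```r
# Hypothetical helper, not the package's implementation.
# Assumption: an outlier is any value flagged by grDevices::boxplot.stats().
sketch_find_outliers <- function(df, index = TRUE, rate = FALSE) {
  # Percentage of flagged values per column; non-numeric columns count as 0.
  pct <- vapply(df, function(col) {
    if (!is.numeric(col)) return(0)
    100 * length(grDevices::boxplot.stats(col)$out) / length(col)
  }, numeric(1))

  has_outlier <- pct > 0
  if (rate) return(round(pct[has_outlier], 3))
  if (index) unname(which(has_outlier)) else names(df)[has_outlier]
}

# Toy data: only `spiky` has a value far outside the whiskers.
toy <- data.frame(
  id    = letters[1:10],
  clean = 1:10,
  spiky = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
)

sketch_find_outliers(toy)                 # column index of `spiky`
sketch_find_outliers(toy, index = FALSE)  # its name
sketch_find_outliers(toy, rate = TRUE)    # its outlier percentage
```

The sketch mirrors the three return shapes described above, so it can serve as a mental model for the index and rate arguments even if the real detection rule differs.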
library(dlookr)
find_outliers(heartfailure)
#> [1] 3 5 7 8 9
find_outliers(heartfailure, index = FALSE)
#> [1] "cpk_enzyme" "ejection_fraction" "platelets"
#> [4] "creatinine" "sodium"
find_outliers(heartfailure, rate = TRUE)
#>        cpk_enzyme ejection_fraction         platelets        creatinine
#>             9.699             0.669             7.023             9.699
#>            sodium
#>             1.338
## using dplyr -------------------------------------
library(dplyr)
# Perform simple data quality diagnosis of variables with outliers.
heartfailure %>%
  select(find_outliers(.)) %>%
  diagnose()
#> # A tibble: 5 × 6
#>   variables         types missing_count missing_percent unique_count unique_rate
#>   <chr>             <chr>         <int>           <dbl>        <int>       <dbl>
#> 1 cpk_enzyme        nume…             0               0          208      0.696
#> 2 ejection_fraction nume…             0               0           17      0.0569
#> 3 platelets         nume…             0               0          176      0.589
#> 4 creatinine        nume…             0               0           40      0.134
#> 5 sodium            nume…             0               0           27      0.0903
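The named vector returned with rate = TRUE can be filtered like any other numeric vector. As a small sketch, the rates printed in the rate = TRUE example above are copied into a literal vector here so the snippet runs without the package; the 5 percent cutoff is an arbitrary, hypothetical threshold chosen only for illustration.

```r
# Outlier percentages copied from the rate = TRUE output shown earlier.
rates <- c(
  cpk_enzyme        = 9.699,
  ejection_fraction = 0.669,
  platelets         = 7.023,
  creatinine        = 9.699,
  sodium            = 1.338
)

# Hypothetical cutoff: keep variables where more than 5% of values are outliers.
threshold <- 5
heavy <- names(rates)[rates > threshold]
heavy
```

With the real function, the literal vector would be replaced by the result of find_outliers(heartfailure, rate = TRUE).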