Find the numerical variables that contain outliers in a data.frame or in an object that inherits from data.frame.

find_outliers(.data, index = TRUE, rate = FALSE)

Arguments

.data

a data.frame or a tbl_df.

index

logical. Specifies how the variables that contain outliers are identified in the result. Returns column indices if TRUE, or variable names if FALSE.

rate

logical. If TRUE, returns the percentage of outliers in each variable that contains outliers.

Value

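A vector identifying the numerical variables that contain outliers: column indices if index = TRUE, variable names if index = FALSE, or a named numeric vector of outlier percentages if rate = TRUE.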

Examples

find_outliers(heartfailure)
#> [1] 3 5 7 8 9

find_outliers(heartfailure, index = FALSE)
#> [1] "cpk_enzyme"        "ejection_fraction" "platelets"        
#> [4] "creatinine"        "sodium"           

find_outliers(heartfailure, rate = TRUE)
#>        cpk_enzyme ejection_fraction         platelets        creatinine 
#>             9.699             0.669             7.023             9.699 
#>            sodium 
#>             1.338 

## using dplyr -------------------------------------
library(dplyr)

# Perform simple data quality diagnosis of variables with outliers.
heartfailure %>%
  select(find_outliers(.)) %>%
  diagnose()
#> # A tibble: 5 × 6
#>   variables         types missing_count missing_percent unique_count unique_rate
#>   <chr>             <chr>         <int>           <dbl>        <int>       <dbl>
#> 1 cpk_enzyme        nume…             0               0          208      0.696 
#> 2 ejection_fraction nume…             0               0           17      0.0569
#> 3 platelets         nume…             0               0          176      0.589 
#> 4 creatinine        nume…             0               0           40      0.134 
#> 5 sodium            nume…             0               0           27      0.0903
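
This page does not state how an outlier is defined, so the following is only a minimal sketch of how the three return forms relate to each other. It assumes the boxplot rule (values beyond 1.5 times the interquartile range from the hinges, as computed by `grDevices::boxplot.stats()`). The function `sketch_find_outliers()` and the data frame `toy` are hypothetical names introduced for illustration; they are not part of the package and may not match its actual implementation.

```r
# Hypothetical re-implementation for illustration only.
# ASSUMPTION: outliers are the values flagged by the boxplot rule.
sketch_find_outliers <- function(.data, index = TRUE, rate = FALSE) {
  # Only numeric columns are candidates.
  is_num <- vapply(.data, is.numeric, logical(1))

  # Number of values flagged as outliers in each numeric column.
  n_out <- vapply(
    .data[is_num],
    function(x) length(grDevices::boxplot.stats(x)$out),
    integer(1)
  )
  has_out <- n_out > 0

  if (rate) {
    # Named vector: percentage of rows flagged, per affected column.
    return(round(n_out[has_out] / nrow(.data) * 100, 3))
  }

  # Column positions (named) of the affected numeric columns.
  idx <- which(is_num)[has_out]
  if (index) unname(idx) else names(idx)
}

# Toy data: only column "a" contains an extreme value.
toy <- data.frame(
  id = letters[1:10],
  a  = c(1:9, 100),  # 100 lies far beyond the upper whisker
  b  = 1:10,         # no outliers
  stringsAsFactors = FALSE
)

sketch_find_outliers(toy)                 # column index: 2
sketch_find_outliers(toy, index = FALSE)  # column name: "a"
sketch_find_outliers(toy, rate = TRUE)    # named vector: a = 10 (1 of 10 rows)
```

In this sketch `rate = TRUE` takes precedence over `index`, which is consistent with the `rate = TRUE` example above returning a named vector even though `index` defaults to TRUE.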