diagnose_report() generates a report containing the information needed to diagnose the quality of the data.

diagnose_report(.data, output_format, output_file, output_dir, ...)

# S3 method for data.frame
diagnose_report(
  .data,
  output_format = c("pdf", "html"),
  output_file = NULL,
  output_dir = tempdir(),
  font_family = NULL,
  browse = TRUE,
  ...
)

Arguments

.data

a data.frame or a tbl_df.

output_format

report output type. Choose either "pdf" or "html". "pdf" creates a pdf file with knitr::knit(). "html" creates an html file with rmarkdown::render().

output_file

name of generated file. default is NULL.

output_dir

name of directory to generate report file. default is tempdir().

...

arguments to be passed to methods.

font_family

character. font family name for figures in the pdf.

browse

logical. whether to open the generated report in the browser.

Value

No return value. This function only generates a report.

Details

Generates a general-purpose data diagnostic report automatically. The report can be written as a pdf or an html file. This is more useful for diagnosing a data frame with a large number of variables than one with only a few. For pdf output on a Korean operating system, the Korean Gothic font must be installed.
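
The way the output arguments combine can be sketched in base R. This is only an illustration under stated assumptions: resolve_report_path() is a hypothetical helper, not part of dlookr, and the fallback file name "report.<format>" is made up for the sketch. It assumes the vector default c("pdf", "html") is resolved with match.arg(), which is the usual R idiom for such defaults and makes the first element the default.

```r
# Hypothetical helper, NOT part of dlookr: shows how arguments shaped like
# those of diagnose_report() could be resolved into a format and a file path.
resolve_report_path <- function(output_format = c("pdf", "html"),
                                output_file = NULL,
                                output_dir = tempdir()) {
  # match.arg() returns the first choice ("pdf") when the argument is not
  # supplied, and raises an error for values outside the listed choices.
  output_format <- match.arg(output_format)

  # Assumed fallback name for this sketch only; the real default file name
  # is chosen by dlookr.
  if (is.null(output_file)) {
    output_file <- paste0("report.", output_format)
  }

  list(format = output_format,
       path = file.path(output_dir, output_file))
}

res <- resolve_report_path(output_format = "html", output_dir = ".")
res$format
res$path
```

Calling the helper with no arguments selects "pdf" and places the file under tempdir(), which mirrors the defaults listed in the usage section.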

Reported information

The data diagnosis report contains the following information.

  • Diagnose Data

    • Overview of Diagnosis

      • List of all variables quality

      • Diagnosis of missing data

      • Diagnosis of unique data(Text and Category)

      • Diagnosis of unique data(Numerical)

    • Detailed data diagnosis

      • Diagnosis of categorical variables

      • Diagnosis of numerical variables

      • List of numerical diagnosis (zero)

      • List of numerical diagnosis (minus)

  • Diagnose Outliers

    • Overview of Diagnosis

      • Diagnosis of numerical variable outliers

      • Detailed outliers diagnosis

See vignette("diagonosis") for an introduction to these concepts.

Examples

if (FALSE) {
# reporting the diagnosis information -------------------------
# create pdf file. file name is DataDiagnosis_Report.pdf
diagnose_report(heartfailure)

# create pdf file. file name is Diagn.pdf
diagnose_report(heartfailure, output_file = "Diagn.pdf")

# create pdf file. file name is ./Diagn.pdf and the report is not opened in the browser
diagnose_report(heartfailure, output_dir = ".", output_file = "Diagn.pdf", 
  browse = FALSE)

# create html file. file name is Diagnosis_Report.html
diagnose_report(heartfailure, output_format = "html")

# create html file. file name is Diagn.html
diagnose_report(heartfailure, output_format = "html", output_file = "Diagn.html")
}