tidymodels

Biostat 203B

Author

Dr. Hua Zhou @ UCLA

Published

March 12, 2023

1 Overview

  • A typical data science project:

  • tidymodels is an ecosystem of R packages for:

    1. Building and fitting a model;
    2. Feature engineering: coding qualitative predictors, transforming predictors (e.g., log), extracting key features from raw variables (e.g., getting the day of the week out of a date variable), creating interaction terms, …;
    3. Evaluating the model using resampling (such as cross-validation);
    4. Tuning model parameters.
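The four tasks above can be sketched end to end as follows. This is a minimal illustration, not the course's workflow: the simulated data set `sim`, its column names, the choice of the `glmnet` engine, and the penalty grid are all assumptions made for the example.

```r
library(tidymodels)

set.seed(203)

# Hypothetical toy data: a date, a qualitative predictor, two numeric
# predictors, and a binary outcome y.
sim <- tibble(
  visit_date = as.Date("2023-01-01") + sample(0:89, 300, replace = TRUE),
  sex        = factor(sample(c("F", "M"), 300, replace = TRUE)),
  dose       = rexp(300, rate = 1 / 50) + 1,
  age        = round(runif(300, 30, 80))
) |>
  mutate(
    p = plogis(-4 + 0.6 * log(dose) + 0.03 * age),
    y = factor(rbinom(n(), 1, p), levels = c(0, 1), labels = c("no", "yes"))
  ) |>
  select(-p)

# Feature engineering with a recipe.
rec <- recipe(y ~ ., data = sim) |>
  step_date(visit_date, features = "dow") |>    # day of week from the date
  step_rm(visit_date) |>                        # drop the raw date
  step_log(dose) |>                             # log-transform a predictor
  step_dummy(all_nominal_predictors()) |>       # code qualitative predictors
  step_interact(terms = ~ age:dose) |>          # interaction (dose already logged)
  step_normalize(all_numeric_predictors())

# Model specification: lasso-penalized logistic regression, penalty to be tuned.
mod <- logistic_reg(penalty = tune(), mixture = 1) |>
  set_engine("glmnet")

# Bundle the recipe and the model.
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(mod)

# Resampling: 5-fold cross-validation, stratified by the outcome.
folds <- vfold_cv(sim, v = 5, strata = y)

# Tuning: evaluate 10 penalty values on each fold.
grid <- tibble(penalty = 10^seq(-3, 0, length.out = 10))
res <- tune_grid(
  wf,
  resamples = folds,
  grid      = grid,
  metrics   = metric_set(roc_auc, accuracy)
)

# Pick the best penalty by cross-validated AUC and refit on all the data.
best      <- select_best(res, metric = "roc_auc")
final_fit <- wf |>
  finalize_workflow(best) |>
  fit(data = sim)
```

Once fitted, the finalized workflow applies the recipe and the model together, for example `predict(final_fit, new_data = head(sim), type = "prob")`.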

2 Heart data example

We illustrate binary classification using the heart disease dataset from the Cleveland Clinic Foundation.

2.1 Logistic regression (with elastic net regularization) workflow

qmd, html
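The linked workflow is not reproduced here. As a rough sketch of how an elastic net logistic regression might be specified in tidymodels, assuming the `glmnet` engine and an illustrative grid (both are assumptions, not taken from the course files):

```r
library(tidymodels)

# Elastic net logistic regression: both the overall penalty and the
# lasso/ridge mixing proportion are marked for tuning.
logit_mod <- logistic_reg(penalty = tune(), mixture = tune()) |>
  set_engine("glmnet")

# Illustrative grid: 10 penalties from 1e-4 to 1 (the range is given on the
# log10 scale) crossed with 5 mixture values between 0 (ridge) and 1 (lasso).
logit_grid <- grid_regular(
  penalty(range = c(-4, 0)),
  mixture(range = c(0, 1)),
  levels = c(10, 5)
)
```

The specification and grid would then be passed to `tune_grid()` together with a recipe and a set of resamples.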

2.2 Random forest workflow

qmd, html
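For orientation, a random forest classifier could be specified along these lines. The `ranger` engine and the tuning ranges are assumptions chosen for illustration; the linked files may use different settings.

```r
library(tidymodels)

# Random forest for classification: tune the number of predictors sampled
# at each split (mtry) and the number of trees.
rf_mod <- rand_forest(
  mode  = "classification",
  mtry  = tune(),
  trees = tune()
) |>
  set_engine("ranger")

# Illustrative 3 x 3 grid; the ranges are hypothetical.
rf_grid <- grid_regular(
  mtry(range = c(2L, 6L)),
  trees(range = c(100L, 500L)),
  levels = c(3, 3)
)
```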

2.3 Boosting (XGBoost) workflow

qmd, html
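A gradient boosting specification with the `xgboost` engine might look like the sketch below. Which parameters are tuned and over what ranges are assumptions made for the example.

```r
library(tidymodels)

# Boosted trees for classification: tune the number of trees, the tree depth,
# and the learning rate.
gb_mod <- boost_tree(
  mode       = "classification",
  trees      = tune(),
  tree_depth = tune(),
  learn_rate = tune()
) |>
  set_engine("xgboost")

# Illustrative grid; learn_rate's range is given on the log10 scale,
# so c(-2, -1) covers 0.01 to 0.1.
gb_grid <- grid_regular(
  trees(range = c(100L, 500L)),
  tree_depth(range = c(1L, 3L)),
  learn_rate(range = c(-2, -1)),
  levels = c(3, 3, 2)
)
```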

2.4 SVM (with radial basis kernel) workflow

qmd, html
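An SVM with a radial basis kernel could be specified as follows. The `kernlab` engine and the grid ranges are assumptions for illustration only.

```r
library(tidymodels)

# SVM with radial basis function kernel: tune the cost of constraint
# violations and the kernel width parameter.
svm_mod <- svm_rbf(
  mode      = "classification",
  cost      = tune(),
  rbf_sigma = tune()
) |>
  set_engine("kernlab")

# Illustrative 4 x 4 grid; cost's range is on the log2 scale and
# rbf_sigma's range is on the log10 scale.
svm_grid <- grid_regular(
  cost(range = c(-3, 3)),
  rbf_sigma(range = c(-3, 0)),
  levels = c(4, 4)
)
```

Because SVMs are sensitive to predictor scale, such a model is typically paired with a recipe that normalizes the numeric predictors.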

2.5 Multi-layer perceptron (MLP) workflow

qmd, html
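A single-hidden-layer MLP can be sketched as below. The `nnet` engine is an assumption chosen here because it needs no deep learning backend; the linked workflow may use a different engine, and the ranges are hypothetical.

```r
library(tidymodels)

# Multi-layer perceptron for classification: tune the number of hidden units
# and the weight decay penalty; the number of epochs is fixed.
mlp_mod <- mlp(
  mode         = "classification",
  hidden_units = tune(),
  penalty      = tune(),
  epochs       = 100
) |>
  set_engine("nnet")

# Illustrative 3 x 4 grid; penalty's range is on the log10 scale.
mlp_grid <- grid_regular(
  hidden_units(range = c(2L, 10L)),
  penalty(range = c(-4, -1)),
  levels = c(3, 4)
)
```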