Biostat 203B Homework 4

Due Mar 24 @ 11:59PM

Author

YOUR NAME and UID

Display machine information:

sessionInfo()

Load the tidyverse, tidymodels, and lubridate libraries:

suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(tidymodels))
suppressPackageStartupMessages(library(lubridate))

1 Predicting 30-day mortality

Using the ICU cohort icu_cohort.rds you built in Homework 3, develop at least three analytic approaches for predicting the 30-day mortality of patients admitted to the ICU using demographic information (gender, age, marital status, ethnicity), first lab measurements during the ICU stay, and first vital measurements during the ICU stay. For example, you can use (1) logistic regression with an elastic net (lasso + ridge) penalty (e.g., the glmnet or keras package), (2) random forest, (3) boosting, (4) support vector machines, or (5) an MLP neural network (keras package).

  1. Partition the data into a 50% training set and a 50% test set. Stratify the partitioning according to the 30-day mortality status.

  2. Train and tune the models using the training set.

  3. Compare model classification performance on the test set. Report both the area under the ROC curve and the accuracy for each model.
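
The three steps above can be sketched for a single model as follows. This is a minimal illustration, not a reference solution. It assumes the glmnet package is installed and uses a small simulated data frame in place of icu_cohort.rds. The column names (age, heart_rate, gender, thirty_day_mort), the seed, the number of folds, and the grid size are all hypothetical choices made for the example.

```r
suppressPackageStartupMessages(library(tidymodels))

set.seed(203)

# Simulated stand-in for the ICU cohort; all column names are hypothetical.
n <- 400
toy <- tibble(
  age        = rnorm(n, mean = 65, sd = 15),
  heart_rate = rnorm(n, mean = 85, sd = 15),
  gender     = factor(sample(c("F", "M"), n, replace = TRUE))
) %>%
  mutate(
    p = plogis(-1.5 + 0.03 * (age - 65) + 0.02 * (heart_rate - 85)),
    # "yes" is the first level, so yardstick treats it as the event
    thirty_day_mort = factor(if_else(runif(n) < p, "yes", "no"),
                             levels = c("yes", "no"))
  ) %>%
  select(-p)

# Step 1: 50/50 split, stratified by the outcome
data_split <- initial_split(toy, prop = 0.5, strata = thirty_day_mort)
train <- training(data_split)
test  <- testing(data_split)

# Preprocessing: dummy-code factors, then center and scale predictors
rec <- recipe(thirty_day_mort ~ ., data = train) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_numeric_predictors())

# Logistic regression with elastic net penalty; both knobs are tuned
mod <- logistic_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mod)

# Step 2: tune with cross-validation on the training set only
folds <- vfold_cv(train, v = 5, strata = thirty_day_mort)
grid  <- grid_regular(penalty(), mixture(), levels = 4)
tuned <- tune_grid(wf, resamples = folds, grid = grid,
                   metrics = metric_set(roc_auc))

# Step 3: refit the best configuration on the full training set
# and evaluate once on the held-out test set
best <- select_best(tuned, metric = "roc_auc")
final_fit <- wf %>%
  finalize_workflow(best) %>%
  last_fit(data_split, metrics = metric_set(roc_auc, accuracy))

test_metrics <- collect_metrics(final_fit)
test_metrics
```

The same split and resampling folds can be reused for the other models (for example, by swapping logistic_reg() for rand_forest() or boost_tree() with a suitable engine) so that the test-set AUC and accuracy are comparable across approaches.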