echo $SHELL
Lab 1: R, RStudio, Quarto, Git, SSH, Keys, and MIMIC
Biostat 203B
This tutorial goes through the process of installing the software environment (R, Quarto, RStudio, Git, SSH, Docker) for reproducing 203B course materials and working on 203B homework.
1 R, Quarto, and RStudio
- Download and install R v4.2.2 or later https://cran.rstudio.com.
Mac users be aware that R-x.x.x-arm64.pkg
is for Apple Silicon CPUs (M1/M2) and R-x.x.x.pkg
is for Intel CPUs.
Windows users also need to install the Rtools.
Download and install Quarto CLI v1.2.280 or later https://quarto.org/docs/get-started/.
Download and install RStudio Desktop v2022.12 or later https://rstudio.com/products/rstudio/#Desktop
2 Terminal and Bash shell
Open the
Terminal
app. You can findTerminal
fromLaunchPad
->Other
, or fromSpotlight Search
(type Terminal). For convenience, you can pin the Terminal program to the Dock.Check that current shell is Bash
- If necessary, change default shell to Bash by the following command in Terminal. Then close the Terminal window and reopen it.
chsh -s /bin/bash
Download and install
Git for Windows
v2.39 or later. Accept the default settings during the installation process.Git Bash
program is available as a component ofGit for Windows
. It provides a basic Bash shell and packages many commonly used Linux programs.Instruct RStudio to use
Git Bash
as terminal: Tools -> Global Options… -> Terminal -> New terminals open with -> Git Bash.Note that
Git Bash
is not a Linux system. It’s a Windows program that emulates a Bash shell, but lacks many Linux commands.Git Bash
does not have a separate Linux file system. It piggybacks on the Windows file system, thus cannot do Linux-style file permission control. The user home ofGit Bash
is same as the user home on Windows, e.g.,/c/Users/[USERNAME]
.
The WSL (Windows Subsystem for Linux) is a more complete Linux solution for Windows users.
Install WSL following instructions.
A standalone Ubuntu system is available after WSL installation. Look for icon .
Instruct RStudio to use WSL as terminal: Tools -> Global Options… -> Terminal -> New terminals open with -> WSL.
WSL is full Linux system. It has its own Linux file system, separate from the Windows file system. The user home is at
/home/[USERNAME]
within the Linux file system.Within WSL, it is easy to access the files in Windows file system, which is mounted at
/mnt/c
on WSL. For the example, the file/c/Users/[USERNAME]/file
is available within WSL at/mnt/c/Users/[USERNAME]/file
.Within Windows, it is easy to access the files in the WSL file system from the Linux tab in File Explorer.
3 Git and GitHub
- Check whether
git
command is available in Terminal.
If git
is not available, follow the instructions Installing on macOS
section to install it.
git
is included in either Git Bash
or WSL Ubuntu.
3.1 git clone
and git pull
course materials
- In Terminal (Mac Terminal, Windows Git Bash, or Windows WSL), you can
git clone
a copy of course materials by
git clone https://github.com/ucla-biostat-203b/2023winter.git ~/203b-lecture
This command clones the GitHub repository to a folder called 203b-lecture
under your home directory. You can clone multiple copies to different locations on the same machine if you want. Since this GitHub repository is frequently updated, you can sync the local copy on your computer with the repository on GitHub by running
git pull
within the 203b-lecture
folder.
Navigate to the
203b-lecture
folder, double click the2023winter.Rproj
file to open the project in RStudio. Within RStudio, you can navigate to theslides
folder to open and renderqmd
files.You can also perform the majority of daily Git operations within RStudio.
3.2 Set up GitHub repo for 203B homework
Apply for the Student Developer Pack at GitHub using your UCLA email. You’ll get GitHub Pro account with unlimited public and private repositories for free.
On https://github.com/, create a private repository
biostat-203b-2023-winter
and addHua-Zhou
andtomokiokuno0528
as your collaborators with write permission. This repository is for submitting your 203B homework.In Terminal (Mac Terminal, Windows Git Bash, or Windows WSL), you can run
git clone [SSH_ADDRESS] ~/203b-hw
to clone your homework repository to a folder called 203b-hw
under your home directory on your computer. [SSH_ADDRESS]
is obtained by clicking the menu <> Code
-> SSH
on the repository page on GitHub.
- Alternatively, you can also use RStudio to git clone:
File
->New Project...
->Version Control
->Git
-> inputRepository URL:
,Project directory name:
(203b-hw).
4 SSH and keys
4.1 SSH client
SSH client should be available in Terminal (Mac Terminal, Windows Git Bash, or Windows WSL) by default.
In Terminal, the command to connect to a Linux machine is
ssh [USERNAME]@[IP_ADDRESS]
Replace [USERNAME]
in the command by your actual user name on the Linux machine you are connecting to. If you cannot connect, you may not have an account. For example, to connect to my account on the Hoffman2 cluster at UCLA
ssh huazhou@hoffman2.idre.ucla.edu
4.2 SSH keys
- First check whether you already have keys on your local machine. If you don’t have
~/.ssh
folder, that means you have never used SSH before.
ls -al ~/.ssh
If no SSH keys yet, generate a pair of RSA keys
- Method 1: Generate keys on Terminal, following the instructions in lecture notes.
- Method 2: Use RStudio to generate keys.
Tools
->Global Options...
->Git/SVN
.
Using either method, make sure keys are in the default location
~/.ssh/
- Method 1: Generate keys on Terminal, following the instructions in lecture notes.
Make sure the permission for the key files are correct.
The permission for the
~/.ssh
folder should be700 (drwx------)
.The permission for the private key
~/.ssh/id_rsa
should be600 (-rw-------)
.The permission for the public key
~/.ssh/id_rsa.pub
should be644 (-rw-r--r--)
.
The permission for the
~/.ssh
folder can be755 (drwxr-xr-x)
.The permission for the private key
~/.ssh/id_rsa
can be644 (-rw-r--r--)
.The permission for the public key
~/.ssh/id_rsa.pub
can be644 (-rw-r--r--)
.
The permission for the
~/.ssh
folder should be700 (drwx------)
.The permission for the private key
~/.ssh/id_rsa
should be600 (-rw-------)
.The permission for the public key
~/.ssh/id_rsa.pub
should be644 (-rw-r--r--)
.
- Upload the public SSH key to GitHub: Click the user Avatar on top right corner -> Settings -> SSH and GPG keys -> New SSH key.
In your local machine terminal, you may copy the public key to the clipboard by following command.
pbcopy ~/.ssh/id_rsa.pub
clip < ~/.ssh/id_rsa.pub
If the xclip
and xsel
are installed, then we can use
xclip -selection clipboard ~/.ssh/id_rsa.pub
On local machine, instruct RStudio to use the public key just generated:
Tools
->Global Options...
->Git/SVN
->SSH key:
.After setting up SSH key, you can
git push
your local commits to GitHub repo seamlessly without inputting passwords.
5 Docker (optional)
A customized Docker container provides a self-contained Linux Ubuntu environment for reproducing materials in this course.
Download and install the Docker Desktop https://www.docker.com/products/docker-desktop/.
Open Terminal (Mac Terminal, Windows Git Bash, or Windows WSL) at the
/Docker
folder of course material.To run a Docker container, we first modify the
volumes
section of thedocker-compose.yml
file to map the203b-lecture
,203-hw
, andmimic
folders on your computer to the home directory in the Ubuntu system in Docker container. Then type
docker compose up
to run the Docker container. This can take up to 10 minutes depending on internet connection.
Point your browser to
localhost:8787
to connect to the RStudio Server running on the Ubuntu system.Advanced: Open Terminal (Mac Terminal, Windows Git Bash, or Windows WSL) at the
/Docker
folder of course material. Modify theDockerfile
according to your needs. Type
docker build . -t huazhou/ucla_biostat_203b_2023w
to build the new Docker image.
6 MIMIC Data
Much of homework and in class examples are demonstrated on the MIMIC-IV v1.0 data set. Download the data to your computer from here (7.76 GB), and make it available at ~/mimic
. For example, you can create a symbolic link by
ln -s /PATH/TO/YOUR/MIMIC_FOLDER ~/mimic
Your homework solution should always read data from ~/mimic
. This is critical for Tomoki and me to reproduce your homework.
ls ~/mimic
LICENSE.txt
MIMIC-reduce-chartevents
SHA256SUMS.txt
core
hosp
icu
index.html