I'm relatively new to coding and would appreciate any guidance at all!
I am trying to pull a random sample of 100 patients per group (mortality vs. no mortality) in hopes of comparing them. However, I need patient demographics to be comparable between the two groups, e.g. I need the same number of females to be pulled in "group mortality" as in "group no mortality". I need to do this for age and ethnicity as well.
I have made subsets of the two groups (my dataset is called "newdata"):
datalive <- subset(newdata, DISPOSITION_EXPIRED_YN == 0)
datadeath <- subset(newdata, DISPOSITION_EXPIRED_YN == 1)
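To see how far apart the two groups are before any sampling, a cross-tabulation of outcome by demographic might help. This is only a minimal sketch: the toy data frame below is a made-up stand-in for "newdata", and only the column names DISPOSITION_EXPIRED_YN and GENDER_CODE come from my real data.

```r
# Toy stand-in for newdata; the values are invented for illustration only.
set.seed(1)
newdata <- data.frame(
  DISPOSITION_EXPIRED_YN = rbinom(500, 1, 0.3),
  GENDER_CODE = sample(c("F", "M"), 500, replace = TRUE)
)

# Counts of each gender within each outcome group (rows: 0 = alive, 1 = died)
gender_by_outcome <- table(newdata$DISPOSITION_EXPIRED_YN, newdata$GENDER_CODE)
print(gender_by_outcome)

# Row proportions: share of each gender within each outcome group
print(prop.table(gender_by_outcome, margin = 1))
```

The same call could presumably be repeated for the age and ethnicity columns.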
I have also attempted a stratified sample:
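library(dplyr)
stratified_sample <- newdata %>%
  # group by the bare column name, not newdata$GENDER_CODE
  group_by(GENDER_CODE) %>%
  # num_rows is constant within each gender group, so as a weight it has no effect
  mutate(num_rows = n()) %>%
  # draws 50% of the rows within each gender group
  sample_frac(0.5, weight = num_rows) %>%
  ungroup()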
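What I am after might look something like the frequency-matching sketch below: draw 100 patients from the mortality group, count how many fall into each gender / age band / ethnicity stratum, then draw the same number per stratum from the no-mortality group. This is a rough sketch under assumptions, not a definitive method. The columns AGE and ETHNICITY, the age band cut points, and the toy data are all hypothetical, and it assumes the no-mortality group has enough patients in every stratum.

```r
library(dplyr)

set.seed(42)

# Toy stand-in for newdata; AGE and ETHNICITY are hypothetical column names.
n <- 5000
newdata <- data.frame(
  DISPOSITION_EXPIRED_YN = rbinom(n, 1, 0.2),
  GENDER_CODE = sample(c("F", "M"), n, replace = TRUE),
  AGE = sample(18:95, n, replace = TRUE),
  ETHNICITY = sample(c("A", "B", "C"), n, replace = TRUE)
) %>%
  # Bin age so that strata are not too small (cut points are an assumption)
  mutate(AGE_BAND = cut(AGE, breaks = c(17, 40, 65, Inf),
                        labels = c("18-40", "41-65", "66+")))

# Step 1: simple random sample of 100 patients from the mortality group
death_sample <- newdata %>%
  filter(DISPOSITION_EXPIRED_YN == 1) %>%
  slice_sample(n = 100)

# Step 2: how many sampled patients fall into each demographic stratum
targets <- death_sample %>%
  count(GENDER_CODE, AGE_BAND, ETHNICITY, name = "n_target")

# Step 3: draw the same number per stratum from the no-mortality group
live_sample <- newdata %>%
  filter(DISPOSITION_EXPIRED_YN == 0) %>%
  inner_join(targets, by = c("GENDER_CODE", "AGE_BAND", "ETHNICITY")) %>%
  group_by(GENDER_CODE, AGE_BAND, ETHNICITY) %>%
  # n_target is constant within a stratum; cap at the rows actually available
  group_modify(~ slice_sample(.x, n = min(.x$n_target[1], nrow(.x)))) %>%
  ungroup() %>%
  select(-n_target)

# Combined, demographically balanced sample
matched_sample <- bind_rows(death_sample, live_sample)
```

If a stratum in the no-mortality group has fewer patients than needed, this sketch would return fewer than 100 rows for that group, so the stratum counts would be worth checking afterwards.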
Any help at all would be greatly appreciated!