Channel: Active questions tagged r - Stack Overflow

Pull sample from dataset with comparable demographics

I'm relatively new to coding and would appreciate any guidance at all!

I am trying to pull a random sample of 100 patients per group (mortality / no mortality) so that I can compare them. However, I need patient demographics to be comparable between the two groups, e.g. I need the same number of females to be pulled into "group mortality" as into "group no mortality". I need to do the same for age and ethnicity as well.

I have made subsets of the two groups (my dataset is called "newdata"):

# subset() evaluates the condition inside newdata, so the newdata$ prefix is not needed
datalive  <- subset(newdata, DISPOSITION_EXPIRED_YN == 0)
datadeath <- subset(newdata, DISPOSITION_EXPIRED_YN == 1)

I have also attempted a stratified sample:

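library(dplyr)
stratified_sample <- newdata %>%
  group_by(GENDER_CODE) %>%                   # refer to the column by name, not newdata$GENDER_CODE
  mutate(num_rows = n()) %>%                  # rows per gender group
  sample_frac(0.5, weight = num_rows) %>%     # weights are constant within a group, so this is a uniform 50% draw
  ungroup()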

Any help at all would be greatly appreciated!
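
For illustration, here is a minimal base R sketch of one way the goal described above could be approached: form strata from the demographic variables, then draw the same number of patients from each stratum in both outcome groups. It is only a sketch under assumptions. The toy data frame stands in for the real `newdata`, and the columns `AGE_GROUP` and `ETHNICITY`, the helper `draw()`, and the variable `target` are hypothetical names that do not appear in the question. A continuous age column would first need to be binned, for example with `cut()`.

```r
set.seed(42)  # reproducible draws

# Toy stand-in for newdata; AGE_GROUP and ETHNICITY are assumed column names
n <- 2000
newdata <- data.frame(
  DISPOSITION_EXPIRED_YN = rbinom(n, 1, 0.3),
  GENDER_CODE = sample(c("F", "M"), n, replace = TRUE),
  AGE_GROUP   = sample(c("<40", "40-64", "65+"), n, replace = TRUE),
  ETHNICITY   = sample(c("A", "B", "C"), n, replace = TRUE)
)

# One stratum label per row: each observed combination of the matching variables
newdata$stratum <- interaction(newdata$GENDER_CODE, newdata$AGE_GROUP,
                               newdata$ETHNICITY, drop = TRUE)

datalive  <- subset(newdata, DISPOSITION_EXPIRED_YN == 0)
datadeath <- subset(newdata, DISPOSITION_EXPIRED_YN == 1)

# Rows available per stratum in each group; table() keeps every factor level,
# so the two count vectors line up position by position
n_live  <- table(datalive$stratum)
n_death <- table(datadeath$stratum)
per_stratum <- setNames(pmin(as.vector(n_live), as.vector(n_death)),
                        names(n_live))

# Allocate the target of 100 across strata at random without exceeding
# what both groups can supply
target <- 100
pool   <- rep(names(per_stratum), times = per_stratum)
picked <- sample(pool, min(target, length(pool)))
tab    <- table(factor(picked, levels = names(per_stratum)))
quota  <- setNames(as.vector(tab), names(tab))

# Hypothetical helper: draw quota[s] rows from every stratum s of d
draw <- function(d, quota) {
  rows <- unlist(lapply(names(quota)[quota > 0], function(s) {
    idx <- which(d$stratum == s)
    idx[sample.int(length(idx), quota[[s]])]
  }))
  d[rows, ]
}

sample_live  <- draw(datalive, quota)
sample_death <- draw(datadeath, quota)

# The two samples now have identical counts in every stratum
table(sample_live$GENDER_CODE)
table(sample_death$GENDER_CODE)
```

Because both samples use the same per-stratum quota, the gender, age-group, and ethnicity counts match exactly between them. If some strata are sparse in the real data, fewer than 100 matched patients per group may be available, and coarser age bins or fewer matching variables would be needed.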

