dplyr sample_n by group with unique size argument per group

I am trying to draw a stratified sample from a data set that contains a variable indicating how large the sample should be for each group.

library(dplyr)
# example data 
df <- data.frame(id = 1:15,
                 grp = rep(1:3,each = 5), 
                 frq = rep(c(3,2,4), each = 5))

In this example, grp is the group I want to sample by and frq is the sample size specified for that group.

Using split, I came up with this possible solution, which gives the desired result but seems rather inefficient:

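# split into one data frame per group
s <- split(df, df$grp)
# sample each group with its own size, then bind the pieces back together
lapply(s, function(x) sample_n(x, size = unique(x$frq))) %>%
  do.call(what = rbind)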

Is there a way using just dplyr's group_by and sample_n to do this?

My first thought was:

df %>% group_by(grp) %>% sample_n(size = frq)

but this gives the error:

Error in is_scalar_integerish(size) : object 'frq' not found
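
For reference, here is a minimal sketch of two ways this might be done with grouped dplyr verbs. It assumes a dplyr version that provides group_modify() (0.8.1 or later, as far as I know) and that frq is constant within each group, as in the example data. The names out and out2 are only illustrative. The first variant swaps sample_n for slice() with base sample(), so it is not strictly "just group_by and sample_n".

```r
library(dplyr)

# same example data as in the question
df <- data.frame(id = 1:15,
                 grp = rep(1:3, each = 5),
                 frq = rep(c(3, 2, 4), each = 5))

set.seed(1)  # only so that the draw is reproducible

# Variant 1: within each group, draw first(frq) row positions out of n()
# and keep those rows with slice()
out <- df %>%
  group_by(grp) %>%
  slice(sample(n(), first(frq))) %>%
  ungroup()

# Variant 2: apply sample_n() to each group's rows via group_modify();
# .x holds the rows of the current group (without the grouping column)
out2 <- df %>%
  group_by(grp) %>%
  group_modify(~ sample_n(.x, size = .x$frq[1])) %>%
  ungroup()
```

Both variants read the size from the group's own frq column, which is what the plain sample_n(size = frq) call fails to do in the error above.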

