Channel: Active questions tagged r - Stack Overflow

R - keep random rows per group, but different numbers per group

The function sample_n() from the dplyr package lets you randomly keep a specific number of rows. Combined with group_by(), it can, for instance, keep 2 observations per group:

library(dplyr)

mtcars %>% 
  select(vs, drat) %>% 
  group_by(vs) %>% 
  sample_n(2)

# A tibble: 4 x 2
# Groups:   vs [2]
     vs  drat
  <dbl> <dbl>
1     0  3.07
2     0  3.9 
3     1  4.22
4     1  3.08

Question: is there an easy way to select a different number of observations per group? For instance, I would like to keep 2 observations for the first group and 3 for the second. If I pass a vector to sample_n(), it only uses the first value (the result is the same as above).

mtcars %>% 
  select(vs, drat) %>% 
  group_by(vs) %>% 
  sample_n(c(2,3))

Thanks in advance.
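
For illustration, here is a rough sketch of one way this might be done. It is not necessarily the most idiomatic approach. It assumes dplyr 1.0.0 or later (for slice_sample()), and the names n_per_group and sampled are made up for this example. The idea is to look up the desired sample size from the group key inside group_modify(), so each group gets its own n:

```r
library(dplyr)

# Hypothetical lookup table: rows to keep for each value of vs.
# The names must match the group keys once converted to character.
n_per_group <- c("0" = 2L, "1" = 3L)

set.seed(42)  # only so that repeated runs give the same sample

sampled <- mtcars %>%
  select(vs, drat) %>%
  group_by(vs) %>%
  # group_modify() calls the function once per group:
  # `rows` holds that group's rows, `key` is a one-row tibble with its vs value.
  group_modify(function(rows, key) {
    n_keep <- n_per_group[[as.character(key$vs)]]
    slice_sample(rows, n = n_keep)
  }) %>%
  ungroup()

sampled
```

Under these assumptions, the result should hold 2 rows with vs == 0 and 3 rows with vs == 1. A group whose key is missing from n_per_group would raise an error rather than being silently skipped.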

