Channel: Active questions tagged r - Stack Overflow

Selecting all rows that match a randomly selected criterion within dplyr

I am trying to select all rows in a repeated measures dataset that belong to a randomly selected group of people. I am trying to do it entirely in the tidyverse (for my own edification) but find myself having to fall back on base R functions. Here is how I do it with a combination of base R and dplyr commands.

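library(dplyr)  # provides %>% and filter()

set.seed(145)
df <- data.frame(id = rep(letters[1:10], each = 4),
                 score = rnorm(40))
# Base R step: draw 3 of the 10 ids at random
ids <- sample(unique(df$id), 3)
# dplyr step: keep every row belonging to the sampled ids
smallDF <- df %>% dplyr::filter(id %in% ids)
smallDF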

#    id      score
# 1   a  0.6869129
# 2   a  1.0663631
# 3   a  0.5367006
# 4   a  1.9060287
# 5   c  1.1677516
# 6   c  0.7926794
# 7   c -1.2135038
# 8   c -1.0056141
# 9   d  0.2085696
# 10  d  0.4461776
# 11  d -0.6208060
# 12  d  0.4413429

I can sample randomly from the id column using dplyr...

df %>% distinct(id) %>% sample_n(3)

#   id
# 1  e
# 2  c
# 3  b

...but because the output is a data frame/tibble rather than a vector, I am having trouble with the next step: filtering the original df by the randomly selected ids.
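For concreteness, here is a minimal sketch of two ways that next step might be written. Both are untested guesses at the shape of a solution rather than a confirmed answer. The names sampled_ids, small_join, id_vec, and small_pull are placeholders introduced here. The first keeps the sampled ids as a one-column data frame and uses a filtering join; the second pulls the column out as a plain vector first.

```r
library(dplyr)

set.seed(145)
df <- data.frame(id = rep(letters[1:10], each = 4),
                 score = rnorm(40))

# Option 1 (assumed approach): keep the sampled ids as a one-column
# data frame and use semi_join(), which keeps the rows of df whose id
# appears in sampled_ids without adding any columns.
sampled_ids <- df %>% distinct(id) %>% sample_n(3)
small_join <- df %>% semi_join(sampled_ids, by = "id")

# Option 2 (assumed approach): extract the column as a plain vector
# with pull(), then filter with %in% as in the base R version.
id_vec <- df %>% distinct(id) %>% sample_n(3) %>% pull(id)
small_pull <- df %>% filter(id %in% id_vec)
```

The two options draw separate random samples, so they will generally select different people. In dplyr 1.0.0 and later, slice_sample(n = 3) is the newer equivalent of sample_n(3).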

Can anyone help?