I have and experiment where I am looking at bird visits to feeders. If they go to the correct feeder, it counts as a success. They are deemed to have learnt which feeder to visit once they are successful 16/20 visits (80%).
I want to write a piece of R
code that goes through the full data frame by bird and identifies the first instance of 80% success with a moving window of 20 visits.
The code I have below works but it is clumsy. The end result is to obtain a tibble with each row representing a unique bird and the point at which they first reach 80% success.
# My attempt
library(tidyverse)
# Some sample data
data <- tibble(
bird = rep("a121", 99),
success = sample(c(1,0), 99, replace = TRUE, prob = c(0.7, 0.3)),
visit = 1:99) %>%
bind_rows(tibble(
bird = rep("b232", 99),
success = sample(c(1,0), 99, replace = TRUE, prob = c(0.75, 0.15)),
visit = 1:99)) %>%
bind_rows(tibble(
bird = rep("c211", 99),
success = sample(c(1,0), 99, replace = TRUE, prob = c(0.7, 0.3)),
visit = 1:99)) %>%
mutate(observation = 1:297) %>%
select(observation, everything())
# Identify first time 80% success rate met with a sliding window of 20
res <- NULL
slide.funct <- function(data, window, step){
birds <- unique(pull(data, bird)) # Identify the birds in the data
for (j in birds) {
sub_data <- filter(data, bird == j) # Work with one bird at a time
# Might need a line to ensure the data is ordered by time
total <- nrow(sub_data)
spots <- seq(from = 1, to = (total - window), by = step)
result <- vector(length = length(spots))
for(i in 1:length(spots)){
result[i] <- mean(sub_data$success[spots[i]:(spots[i] + window)]) # Success rate by window
position <- Position(function(x) x >= 0.8 , result) + 19 # First point at which success > 80%
}
res <- bind_rows(res, sub_data[position, ]) # Build up info for each bird
}
res <- res %>% mutate(observation = observation + 19) %>%
select(observation, bird)
return(res)
}
slide.funct(data, 19, 1)
#> # A tibble: 3 x 2
#> observation bird
#> <dbl> <chr>
#> 1 53 a121
#> 2 138 b232
#> 3 237 c211
This is only the first stage of the process as there is more to do downstream. However, I'm not sure the above function robust enough. I looked at the 'zoo' package but didn't think it could help here. I'm also looking to use tidyverse
code where I can, so if there are better options with respect to achieving the above, I'd be very happy to hear them.