Suppose we have two columns in a dataframe. Both columns contain lists of dates. The number of dates in any given cell is not fixed (i.e., can vary), as illustrated below:
library(tidyverse)
set.seed(41)

# Draw n distinct random dates between date1 and date2 (inclusive)
randomDate <- function(date1, date2, n) {
  sample(seq(as.Date(date1), as.Date(date2), by = "day"), n)
}

# Two list-columns: each cell holds a Date vector of varying length
df <- data.frame(dates1 = I(map(sample(1:25, 1000, replace = TRUE),
                                randomDate,
                                date1 = "1999/01/01",
                                date2 = "2000/01/01")),
                 dates2 = I(map(sample(1:10, 1000, replace = TRUE),
                                randomDate,
                                date1 = "1999/01/01",
                                date2 = "2000/01/01")))
To further clarify, in this reproducible example, the first observation (i.e., row 1) has 8 dates for the dates1 variable and 2 dates for the dates2 variable. The second observation contains 3 dates for the dates1 variable and 9 dates for the dates2 variable.
My goal is as follows:
For each observation (row), check whether the observation has at least x dates in dates2 within y days of any single date from dates1, and return a logical (TRUE/FALSE).
For example, if we consider x=2 and y=14 for an observation where:
dates1: 1999/01/05,1999/02/05
dates2: 1999/01/02,1999/01/30,1999/07/02,1999/02/09,1999/07/02
I would want to return TRUE, since 1999/01/30 and 1999/02/09 are both within 14 days of 1999/02/05.
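To make the rule concrete, here is a minimal sketch of the check for this single observation, using base R only. The helper name has_cluster and its argument names are hypothetical, and "within y days" is assumed to mean an absolute difference of at most y days in either direction:

```r
# Hypothetical helper (not from any package): returns TRUE if at least x dates
# in d2 lie within y days of any single date in d1.
# Assumption: "within y days" means |d2 - d1| <= y, before or after.
has_cluster <- function(d1, d2, x, y) {
  d1 <- as.Date(d1)
  d2 <- as.Date(d2)
  # For each date in d1, count how many dates in d2 are at most y days away
  counts <- vapply(
    seq_along(d1),
    function(i) sum(abs(as.numeric(d2 - d1[i])) <= y),
    integer(1)
  )
  any(counts >= x)
}

# The example observation from above
d1 <- c("1999/01/05", "1999/02/05")
d2 <- c("1999/01/02", "1999/01/30", "1999/07/02", "1999/02/09", "1999/07/02")

has_cluster(d1, d2, x = 2, y = 14)  # TRUE: 1999/01/30 and 1999/02/09 are near 1999/02/05
has_cluster(d1, d2, x = 3, y = 14)  # FALSE: no date in d1 has three dates in d2 nearby
```

Presumably the same helper could be applied row by row to the two list-columns, for instance with something like mapply(function(a, b) has_cluster(a, b, x = 2, y = 14), df$dates1, df$dates2), but I have not checked how well that scales beyond this example.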