I found a question that is very similar to mine (Find overlapping dates for each ID and create a new row for the overlap), except that in my data the start and stop dates can sometimes be the same. I would have commented there to ask directly, but I don't have more than 15 reputation.
Suppose the data looks like this (copied from Uwe):
library(data.table)
DT <- fread(
"ID date1 date2 Value
15 2003-04-05 2003-05-06 1
15 2003-04-20 2003-06-20 1
16 2001-01-02 2002-03-04 2
17 2003-03-05 2007-02-22 1
17 2005-04-15 2014-05-19 2"
)
cols <- c("date1", "date2")
DT[, (cols) := lapply(.SD, as.IDate), .SDcols = cols]
What I want to get is:
ID date1 date2 Value
15 2003-04-05 2003-04-19 1
15 2003-04-20 2003-05-06 2
15 2003-05-07 2003-06-20 1
17 2003-03-05 2005-04-14 1
17 2005-04-15 2007-02-22 3
17 2007-02-23 2014-05-19 2
where the date ranges within each ID no longer overlap.
Also, this is the first step from Uwe. Could someone tell me what -1L means here? I understand that tmp is a temporary vector, but why do we need the -1L?
library(data.table)
options(datatable.print.class = TRUE)
breaks <- DT[, {
  tmp <- unique(sort(c(date1, date2)))
  .(start = head(tmp, -1L), end = tail(tmp, -1L))
}, by = ID]
breaks
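To make the question concrete, here is a small base R sketch of what I think head() and tail() do with a negative n, using the four dates of ID 15 from the data above. The names x, starts and ends are only for illustration and are not part of Uwe's code. If I read ?head correctly, a negative n drops elements from the other end instead of keeping them, but I may be missing why that is needed here.

```r
# The four sorted, unique dates of ID 15 (what tmp would hold for that group)
x <- as.Date(c("2003-04-05", "2003-04-20", "2003-05-06", "2003-06-20"))

starts <- head(x, -1L)  # everything except the last element
ends   <- tail(x, -1L)  # everything except the first element

# Pairing the two vectors row by row gives consecutive (start, end) intervals
data.frame(start = starts, end = ends)
```

So it looks like four break points become three consecutive intervals, each starting at one date and ending at the next.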
Thank you!