Channel: Active questions tagged r - Stack Overflow

Sum values for overlapping dates for each ID

I found a question that is very similar to mine (Find overlapping dates for each ID and create a new row for the overlap), but in my case the start and stop dates can sometimes be the same. I wanted to comment there and ask directly, but I don't have reputation over 15.

Suppose the data looks like this (copied from Uwe's answer):

library(data.table)
DT <- fread(
  "ID    date1         date2       Value
15  2003-04-05  2003-05-06      1
15  2003-04-20  2003-06-20      1
16  2001-01-02  2002-03-04      2
17  2003-03-05  2007-02-22      1   
17  2005-04-15  2014-05-19      2"
)
cols <- c("date1", "date2")
DT[, (cols) := lapply(.SD, as.IDate), .SDcols = cols]

What I want to get is:

ID    date1         date2       Value
15  2003-04-05  2003-04-19      1
15  2003-04-20  2003-05-06      2
15  2003-05-07  2003-06-20      1
17  2003-03-05  2005-04-14      1   
17  2005-04-15  2007-02-22      3
17  2007-02-23  2014-05-19      2

where the date ranges within each ID no longer overlap, and Value is the sum of the values of all original rows that cover each range.
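
To make the target concrete, here is a rough sketch of one way the split could be done. It is not Uwe's method. It assumes that both date1 and date2 are inclusive, so a range ending on a given day and a range starting the next day do not overlap. The names cuts, covers, n and res are made up for this illustration, and the sample data is rebuilt with data.table() instead of fread() so the block runs on its own.

```r
library(data.table)

# Same sample data as above, built directly so the block is self-contained.
DT <- data.table(
  ID    = c(15L, 15L, 16L, 17L, 17L),
  date1 = as.IDate(c("2003-04-05", "2003-04-20", "2001-01-02",
                     "2003-03-05", "2005-04-15")),
  date2 = as.IDate(c("2003-05-06", "2003-06-20", "2002-03-04",
                     "2007-02-22", "2014-05-19")),
  Value = c(1, 1, 2, 1, 2)
)

res <- DT[, {
  # Assumption: date2 is inclusive, so a new segment can begin on date2 + 1.
  cuts  <- sort(unique(c(date1, date2 + 1L)))
  start <- head(cuts, -1L)        # every cut point except the last
  end   <- tail(cuts, -1L) - 1L   # day before the next cut point
  # For each segment, flag the original rows that cover it completely.
  covers <- lapply(seq_along(start),
                   function(i) date1 <= start[i] & date2 >= end[i])
  .(start = start, end = end,
    Value = vapply(covers, function(k) sum(Value[k]), numeric(1)),
    n     = vapply(covers, sum, integer(1)))
}, by = ID][n > 0L][, n := NULL][]   # drop gaps that no original row covers

res
```

On the sample data this should give the six rows listed above plus the single unchanged row for ID 16, which the wanted output leaves out. A row whose date1 equals date2 becomes a one-day segment, because date2 + 1 is always a separate cut point.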

Also, this is the first step from Uwe's answer. Could someone tell me what -1L means here? I understand that tmp is a temporary vector, but why do we need the -1?

library(data.table)
options(datatable.print.class = TRUE)
breaks <- DT[, {
  tmp <- unique(sort(c(date1, date2)))
  .(start = head(tmp, -1L), end = tail(tmp, -1L))
  }, by = ID]
breaks
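
As far as I can tell from the documentation, a negative n in head() drops that many elements from the end, and a negative n in tail() drops that many from the start. A tiny base R check with a made-up vector x (not part of the data above):

```r
x <- c(10L, 20L, 30L, 40L)

head(x, -1L)  # all but the last element:  10 20 30
tail(x, -1L)  # all but the first element: 20 30 40

# Pairing the two gives consecutive (start, end) pairs,
# which seems to be how the breaks table is built.
pairs <- data.frame(start = head(x, -1L), end = tail(x, -1L))
pairs
```

So four sorted break points would become three consecutive intervals, if I read it correctly.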

Thank you!

