I have some time data:
library(data.table); library(lubridate); set.seed(42)
dat <- data.table(
  time = as.POSIXct("2019-01-01 08:00:00") + round(runif(10, 60, 1e4)),
  val  = runif(10)
)[order(time), id := seq_len(.N)]
> dat[order(id)]
                   time        val id
 1: 2019-01-01 08:12:42 0.93400648  1
 2: 2019-01-01 08:14:33 0.29466004  2
 3: 2019-01-01 08:24:47 0.27195012  3
 4: 2019-01-01 08:43:43 0.52270421  4
 5: 2019-01-01 08:55:16 0.98264672  5
 6: 2019-01-01 09:01:20 0.02388521  6
 7: 2019-01-01 09:39:06 0.89681397  7
 8: 2019-01-01 09:47:16 0.82138170  8
 9: 2019-01-01 10:09:31 0.06342926  9
10: 2019-01-01 10:20:01 0.67328881 10
I would like to calculate, for each value of `time`, the sum of `val` during the following hour. For example, for ID 1 this would be the sum of `val` for IDs 1 to 6, for ID 2 the sum of `val` for IDs 2 to 6, and so forth. In my actual problem there is an additional grouping variable, which I omitted to keep things focused. I will then adapt the solution by adding `[..., by=group]`.
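To make the target concrete, here is a slow brute-force sketch of the result I am after. It is not the idiomatic solution I am looking for. It assumes the window is closed on both ends, i.e. from `time` up to and including `time + 3600` seconds, and the column name `val_next_hour` is just a placeholder of mine.

```r
library(data.table)
set.seed(42)
dat <- data.table(
  time = as.POSIXct("2019-01-01 08:00:00") + round(runif(10, 60, 1e4)),
  val  = runif(10)
)[order(time), id := seq_len(.N)]

# Brute force: for every row i, sum val over all rows whose time falls
# within [time[i], time[i] + 1 hour]. Assumption: both ends are inclusive.
dat[, val_next_hour := sapply(
  seq_len(.N),
  function(i) sum(val[time >= time[i] & time <= time[i] + 3600])
)]

dat[order(id)]
```

With a grouping column, the same expression with `by=group` added should restrict each window to rows of the same group, since `val` and `time` inside `j` then refer to the per-group subsets. It compares every row against every other row, though, so I doubt it scales.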
I understand that solving this requires subsetting within `j`, but this is a kind of problem I frequently run into and can't solve. I have not yet understood the general approach, if there is one.