I have seen many questions about time differences, but this one is different because I want the time difference between each pair of adjacent, similar timestamps. I have not been able to do this successfully, and I am wondering whether I need a loop.
Goal: Find total durations for activities A, B and C. I realize that a last/first difftime calculation will not be accurate, because it does not account for the time 'in between' the sets of timestamps.
For example, if we group A and calculate its duration, it SHOULD NOT be 4:07:01 PM - 4:45:10 AM (last minus first). That would give an incorrect duration. A more accurate approach is to take the time difference between consecutive rows, for example 4:45:11 - 4:45:10 and then 4:07:01 - 4:06:59, and so on.
Would I have to create a loop for this? Does dplyr have a function within its package to accurately calculate the time differences for similar datetime stamps, whilst grouping?
Here is my data:
ID TIME
A 12/18/2019 4:45:10 AM
A 12/18/2019 4:45:11 AM
A 12/18/2019 4:06:59 PM
A 12/18/2019 4:07:01 PM
B 12/18/2019 4:14:13 AM
B 12/18/2019 4:14:14 AM
B 12/18/2019 4:14:15 AM
C 12/18/2019 4:59:49 AM
C 12/18/2019 4:59:50 AM
C 12/18/2019 4:59:51 AM
I would like this:
ID TIME DELTA
A 12/18/2019 4:45:10 AM NA
A 12/18/2019 4:45:11 AM 1 sec
A 12/18/2019 4:06:59 PM
A 12/18/2019 4:07:01 PM 2 secs
B 12/18/2019 4:14:13 AM
B 12/18/2019 4:14:14 AM 1 sec
B 12/18/2019 4:14:15 AM
C 12/18/2019 4:59:49 AM
C 12/18/2019 4:59:50 AM 1 sec
C 12/18/2019 4:59:51 AM NA
I would like accurate time differences within a 2-5 minute interval inside each ID group, so that the summed durations are accurate. Would I need a loop for this?
Any suggestion is helpful. I have been researching this for a week now. Thank you.
So far I have this loop:
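# Assumes column 1 of grouped_data is the ID and column 2 is already
# a date-time (POSIXct), not a factor.
col <- 2
sum_time <- 0
for (row in seq_len(nrow(grouped_data))) {
  # On every second row, add the gap to the previous row (in seconds)
  if (row %% 2 == 0) {
    gap <- difftime(grouped_data[[col]][row], grouped_data[[col]][row - 1], units = "secs")
    sum_time <- as.numeric(gap) + sum_time
  }
  # Reset the running total whenever the ID changes
  if (row > 1 && grouped_data[[1]][row] != grouped_data[[1]][row - 1]) {
    sum_time <- 0.0
  }
}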
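For reference, here is a loop-free sketch in base R of what the loop above seems to be aiming at: pairing rows 1-2, 3-4, and so on, and summing the gaps per ID. It is only a sketch under assumptions: the pairing is done within each ID (the loop pairs on the overall row number), `pair_total` is a hypothetical helper name, and the small `grouped_data` frame is made up from the sample rows. The `%p` format assumes an English locale for AM/PM.

```r
# Small stand-in for grouped_data, built from the sample rows (assumption)
grouped_data <- data.frame(
  ID = c("A", "A", "A", "A", "B", "B", "B"),
  TIME = as.POSIXct(
    c("12/18/2019 4:45:10 AM", "12/18/2019 4:45:11 AM",
      "12/18/2019 4:06:59 PM", "12/18/2019 4:07:01 PM",
      "12/18/2019 4:14:13 AM", "12/18/2019 4:14:14 AM",
      "12/18/2019 4:14:15 AM"),
    format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC"
  )
)

# Hypothetical helper: sum (row 2 - row 1) + (row 4 - row 3) + ... in seconds.
# An unpaired last row is ignored.
pair_total <- function(secs) {
  n_pairs <- length(secs) %/% 2
  if (n_pairs == 0) return(0)
  starts <- secs[seq(1, by = 2, length.out = n_pairs)]
  ends   <- secs[seq(2, by = 2, length.out = n_pairs)]
  sum(ends - starts)
}

# Apply the helper to each ID; times are converted to seconds first
totals <- tapply(as.numeric(grouped_data$TIME), grouped_data$ID, pair_total)
print(totals)
```

With strict pairing, the third B row has no partner and contributes nothing, which may or may not be what is wanted.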
str() output:
'data.frame': 1047588 obs. of 2 variables:
$ ID: Factor w/ 30 levels "","A",..: 7 11 11 11 11 11 7 12 12 12 ...
$ DATE : Factor w/ 2563 levels "","12/18/2019 1:26:07 AM",..: 2 3 4 5 6 7 7 8 9 10 ...
library(tidyverse)
sample_df <- tribble(~ID, ~TIME,
'A', '12/18/2019 4:45:10 AM',
'A', '12/18/2019 4:45:11 AM',
'A', '12/18/2019 4:06:59 PM',
'A', '12/18/2019 4:07:01 PM',
'B', '12/18/2019 4:14:13 AM',
'B', '12/18/2019 4:14:14 AM',
'B', '12/18/2019 4:14:15 AM',
'C', '12/18/2019 4:59:49 AM',
'C', '12/18/2019 4:59:50 AM',
'C', '12/18/2019 4:59:51 AM') %>%
mutate_all(as.factor)
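To make the consecutive-row idea concrete, this is a rough dplyr sketch of what I am after, not a confirmed solution. It assumes the factor timestamps parse with the format string shown (the `%p` part assumes an English locale), and it treats any gap longer than 5 minutes as time 'in between' sets that should not count. The names `deltas`, `totals` and `max_gap_secs` are made up for illustration.

```r
library(dplyr)

# Same sample rows, stored as factors like the real data
sample_df <- data.frame(
  ID = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  TIME = c("12/18/2019 4:45:10 AM", "12/18/2019 4:45:11 AM",
           "12/18/2019 4:06:59 PM", "12/18/2019 4:07:01 PM",
           "12/18/2019 4:14:13 AM", "12/18/2019 4:14:14 AM",
           "12/18/2019 4:14:15 AM", "12/18/2019 4:59:49 AM",
           "12/18/2019 4:59:50 AM", "12/18/2019 4:59:51 AM"),
  stringsAsFactors = TRUE
)

# Assumed cutoff: gaps longer than this are not counted as activity time
max_gap_secs <- 5 * 60

deltas <- sample_df %>%
  # Convert factor -> character -> POSIXct
  mutate(TIME = as.POSIXct(as.character(TIME),
                           format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")) %>%
  arrange(ID, TIME) %>%
  group_by(ID) %>%
  # Gap to the previous row within the same ID, in seconds (NA for the first row)
  mutate(DELTA = as.numeric(difftime(TIME, dplyr::lag(TIME), units = "secs"))) %>%
  ungroup()

# Total duration per ID, skipping NA and any gap above the cutoff
totals <- deltas %>%
  group_by(ID) %>%
  summarise(TOTAL_SECS = sum(DELTA[DELTA <= max_gap_secs], na.rm = TRUE))

print(totals)
```

For the sample rows, the large AM-to-PM gap in A is dropped by the cutoff, so only the short gaps inside each burst of timestamps are summed.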