I have two devices that record similar data in different ways: V13 and X16. V13 measures the activity of an animal over a 57-second window that starts 57 seconds before the DateTime recorded in our dataset. X16 also measures activity, but at a higher resolution: it records acceleration on three axes (X, Y and Z) every 333 milliseconds (that is, about 3 samples per second, or roughly 3 Hz).
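As a quick sanity check on these numbers, the number of X16 samples that should fall inside one V13 window can be worked out directly. This is only a back-of-the-envelope sketch; the variable names are mine and are not used elsewhere.

```r
# Assumed from the description above: one X16 sample every 1/3 s,
# and one V13 record covering 57 s.
sampling_interval <- 1 / 3   # seconds between X16 samples
window_length     <- 57      # seconds covered by one V13 record

# Expected number of X16 samples per V13 window (about 171).
samples_per_window <- window_length / sampling_interval
samples_per_window
```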
Here I simulate data for both devices:
V13<- data.frame(DateTime_V13=c("2017-08-10 12:01:54.487","2017-08-10 12:11:13.457","2017-08-10 12:18:36.652","2017-08-10 12:22:46.652","2017-08-10 12:27:21.652","2017-08-10 12:33:05.987"),
ID=c("A","A","A","A","A","A"),
Act.V13=c(0.8,1.7,2.5,1.3,0.6,1.2))
V13$DateTime_V13<- as.POSIXct(V13$DateTime_V13, format="%Y-%m-%d %H:%M:%OS",tz="UTC")
V13
DateTime_V13 ID Act.V13
1 2017-08-10 12:01:54 A 0.8
2 2017-08-10 12:11:13 A 1.7
3 2017-08-10 12:18:36 A 2.5
4 2017-08-10 12:22:46 A 1.3
5 2017-08-10 12:27:21 A 0.6
6 2017-08-10 12:33:05 A 1.2
library(data.table)
options("digits.secs" = 3)
cols <- c("x", "y", "z")
set.seed(100)
base_data <- fread("DateTime_X16, ID, N
2017-08-10 12:00:00.000, A, 10000")[
, DateTime := lubridate::ymd_hms(DateTime_X16)]
X16 <- fread("DateTime_X16, ID, N
2017-08-10 12:00:00.000, A, 10000")[
, DateTime_X16 := lubridate::ymd_hms(DateTime_X16)][
, .(DateTime_X16 = seq(from = DateTime_X16, by = 1/3, length.out = N)),
by = .(rn = seq(nrow(base_data)), ID)][
, (cols) := replicate(length(cols), round(runif(.N, -1, +1), 2), simplify = FALSE)][
, rn := NULL][]
head(X16)
ID DateTime_X16 x y z
1: A 2017-08-10 12:00:00.000 -0.38 0.33 0.76
2: A 2017-08-10 12:00:00.333 -0.48 -0.96 -0.36
3: A 2017-08-10 12:00:00.666 0.10 -0.18 -0.09
4: A 2017-08-10 12:00:01.000 -0.89 -0.17 -0.24
5: A 2017-08-10 12:00:01.333 -0.06 0.31 -0.22
6: A 2017-08-10 12:00:01.666 -0.03 0.73 -0.19
I can relate the two activity values by computing the activity from the X16 data with the same formula and time interval as the V13. The code below does this:
fmt <- "%Y-%m-%d %H:%M:%OS"  # timestamp format with fractional seconds
setDT(V13)[, DateTime_V13 := as.POSIXct(DateTime_V13, format=fmt, tz="UTC")][,
c("start", "end") := .(DateTime_V13-57, DateTime_V13)]
setDT(X16)[, DateTime_X16 := as.POSIXct(DateTime_X16, format=fmt, tz="UTC")]
n_min <- 128L
Comparison <-X16[V13, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
by = .EACHI, .(
DateTime = i.DateTime_V13,
.N,
Act.X16 = if (.N < n_min) NA_real_ else sum(sqrt(x ^ 2 + y ^ 2 + z ^ 2)) / .N
)][,
(1L:3L) := NULL][]
foo <- V13[,c("DateTime_V13","ID","Act.V13")]
Comparison<- cbind(foo,Comparison)
head(Comparison)
DateTime_V13 ID Act.V13 N Act.X16
1: 2017-08-10 12:01:54.486 A 0.8 171 0.9071006
2: 2017-08-10 12:11:13.457 A 1.7 171 0.9798638
3: 2017-08-10 12:18:36.651 A 2.5 171 0.9765528
4: 2017-08-10 12:22:46.651 A 1.3 171 0.9388695
5: 2017-08-10 12:27:21.651 A 0.6 171 0.9663221
6: 2017-08-10 12:33:05.986 A 1.2 171 0.9621084
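For reference, the Act.X16 value computed in the join is just the mean Euclidean norm of the three acceleration axes inside each window. A minimal standalone sketch of that formula (the helper name `mean_magnitude` is hypothetical, and the numbers are toy values, not device data):

```r
# Hypothetical helper: mean Euclidean norm of the acceleration vector,
# i.e. sum(sqrt(x^2 + y^2 + z^2)) / N as used in the join.
mean_magnitude <- function(x, y, z) {
  mean(sqrt(x^2 + y^2 + z^2))
}

# Toy input: the two samples have norms 5 and 0, so the mean is 2.5.
toy_activity <- mean_magnitude(x = c(3, 0), y = c(4, 0), z = c(0, 0))
toy_activity
```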
For methodological reasons, I need to shift the X16 timestamps within a 4-minute window (from 2 minutes back to 2 minutes forward), second by second, and then determine for which X16 time shift the linear relationship between the V13 and X16 activity values is strongest.
I would like to get a table like the one below, with the R-squared of a linear regression model for each X16 time shift (in seconds). Negative values mean that the time was subtracted from the real X16 time; positive values mean that it was added.
Table
DateTime_X16_delay R.Squared
1 -5 0.73
2 -4 0.70
3 -3 0.69
4 -2 0.71
5 -1 0.68
6 0 0.70
7 1 0.72
8 2 0.73
9 3 0.74
10 4 0.75
11 5 0.74
In my example, I first calculate the R-squared for my current data (this corresponds to a DateTime_X16_delay of 0 in the table above, since nothing is added to or subtracted from DateTime_X16):
model <- lm(Comparison$Act.V13 ~ Comparison$Act.X16)
summary(model)$r.squared
[1] 0.2928491 # Approximately 30% of the variation in `Y` is explained by `X`.
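Since this is a simple regression with one predictor and an intercept, the R-squared should equal the squared Pearson correlation, which may be a cheaper way to get the same number when repeating the calculation many times. A small check, using the Act.V13 values and the Act.X16 values from the table above rounded to two decimals (the variable names are only for illustration):

```r
# Act.V13 as in the example; Act.X16 rounded from the Comparison table.
act_v13 <- c(0.8, 1.7, 2.5, 1.3, 0.6, 1.2)
act_x16 <- c(0.91, 0.98, 0.98, 0.94, 0.97, 0.96)

r2_lm  <- summary(lm(act_v13 ~ act_x16))$r.squared  # from the fitted model
r2_cor <- cor(act_v13, act_x16)^2                   # squared correlation

# The two values agree up to floating-point error.
all.equal(r2_lm, r2_cor)
```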
Then I shift the X16 clock back by one second:
X16$DateTime_X16<- X16$DateTime_X16 -1
setDT(V13)[, DateTime_V13 := as.POSIXct(DateTime_V13, format=fmt, tz="UTC")][,
c("start", "end") := .(DateTime_V13, DateTime_V13+57)]
setDT(X16)[, DateTime_X16 := as.POSIXct(DateTime_X16, format=fmt, tz="UTC")]
n_min <- 128L
onebefore <-X16[V13, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
by = .EACHI, .(
DateTime = i.DateTime_V13,
.N,
Act.X16 = if (.N < n_min) NA_real_ else sum(sqrt(x ^ 2 + y ^ 2 + z ^ 2)) / .N
)][,
(1L:3L) := NULL][]
foo <- V13[,c("DateTime_V13","ID","Act.V13")]
Comparison<- cbind(foo,onebefore)
head(Comparison)
DateTime_V13 ID Act.V13 N Act.X16
1: 2017-08-10 12:01:54 A 0.8 171 0.9606891
2: 2017-08-10 12:11:13 A 1.7 171 0.9189191
3: 2017-08-10 12:18:36 A 2.5 171 0.9539324
4: 2017-08-10 12:22:46 A 1.3 171 0.9547121
5: 2017-08-10 12:27:21 A 0.6 171 0.9229827
6: 2017-08-10 12:33:05 A 1.2 171 0.9714373
Then I fit the linear regression again and get the R-squared:
model <- lm(Comparison$Act.V13 ~ Comparison$Act.X16)
summary(model)$r.squared
[1] 0.003973827
And so on, N times. In this particular example, I would like to shift the DateTime of the X16 from 5 seconds before (subtracting 5 seconds from the original X16 times) to 5 seconds after (adding 5 seconds to the original X16 times), in 1-second steps.
Does anyone know how to automate this R-squared calculation across the X16 time shifts? The range of shifts I need to cover is wide, and doing it manually would take a long time.
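To make the goal concrete, here is a rough sketch of the kind of loop I have in mind, not a finished solution. It assumes each V13 record covers the 57 seconds before its timestamp, wraps the join from above in a hypothetical helper `r2_for_delay()`, and uses small simulated tables so that it runs on its own.

```r
library(data.table)

# Hypothetical helper (not from any package): R-squared of Act.V13 ~ Act.X16
# after shifting the X16 clock by `delay` seconds.
# Assumption: each V13 record covers the 57 s BEFORE its timestamp.
r2_for_delay <- function(delay, V13, X16, window = 57, n_min = 128L) {
  # Work on copies so the caller's tables are not modified by reference.
  v13 <- copy(as.data.table(V13))
  x16 <- copy(as.data.table(X16))
  v13[, c("start", "end") := .(DateTime_V13 - window, DateTime_V13)]
  x16[, DateTime_X16 := DateTime_X16 + delay]

  # Non-equi join: one result row per V13 record, in V13 row order.
  act <- x16[v13, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
             by = .EACHI,
             .(Act.X16 = if (.N < n_min) NA_real_
                         else mean(sqrt(x^2 + y^2 + z^2)))]

  d <- data.frame(Act.V13 = v13$Act.V13, Act.X16 = act$Act.X16)
  if (sum(complete.cases(d)) < 3L) return(NA_real_)  # too few points to fit
  summary(lm(Act.V13 ~ Act.X16, data = d))$r.squared
}

# Simulated data in the same shape as the example (random, for illustration).
set.seed(1)
X16 <- data.table(
  ID = "A",
  DateTime_X16 = as.POSIXct("2017-08-10 12:00:00", tz = "UTC") +
    seq(0, by = 1 / 3, length.out = 10000)
)
X16[, c("x", "y", "z") := replicate(3, round(runif(.N, -1, 1), 2),
                                    simplify = FALSE)]
V13 <- data.table(
  DateTime_V13 = as.POSIXct(
    c("2017-08-10 12:01:54.487", "2017-08-10 12:11:13.457",
      "2017-08-10 12:18:36.652", "2017-08-10 12:22:46.652",
      "2017-08-10 12:27:21.652", "2017-08-10 12:33:05.987"),
    format = "%Y-%m-%d %H:%M:%OS", tz = "UTC"),
  ID = "A",
  Act.V13 = c(0.8, 1.7, 2.5, 1.3, 0.6, 1.2)
)

# One R-squared per shift, from -5 s to +5 s in 1 s steps.
delays <- -5:5
result <- data.table(
  DateTime_X16_delay = delays,
  R.Squared = vapply(delays, r2_for_delay, numeric(1), V13 = V13, X16 = X16)
)
result
```

For the full 4-minute range, `delays` would be `-120:120`, and `result[which.max(R.Squared)]` would pick out the shift with the highest R-squared. Copying X16 on every iteration is simple but wasteful; shifting the V13 window by `-delay` instead should give the same result without the copy.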