Channel: Active questions tagged r - Stack Overflow

How to write a for-loop that repeatedly shifts the 'DateTime' of one of two variables to find the linear relationship between them

I have two devices that record similar data in different ways: V13 and X16. The V13 measures the activity of an animal over a 57-second window that starts 57 seconds before the DateTime recorded in our dataset. The X16 also measures activity, but at a higher resolution: it records acceleration on three axes (X, Y and Z) every 333 milliseconds (that is, about 3 samples per second, or 3 Hz).
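As a quick sanity check on these numbers, the nominal count of X16 samples falling inside one V13 window can be worked out directly. This is only a sketch; the variable names `window_s` and `rate_hz` are my own and appear nowhere else in the code.

```r
# Nominal number of X16 samples per V13 window (names are illustrative only)
window_s <- 57   # V13 window length in seconds
rate_hz  <- 3    # X16 sampling rate: one sample every 1/3 second
samples_per_window <- window_s * rate_hz
samples_per_window
```

This matches the N of 171 samples per window that the joins further down report.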

Here I simulate data for both devices:

V13<- data.frame(DateTime_V13=c("2017-08-10 12:01:54.487","2017-08-10 12:11:13.457","2017-08-10 12:18:36.652","2017-08-10 12:22:46.652","2017-08-10 12:27:21.652","2017-08-10 12:33:05.987"),
                    ID=c("A","A","A","A","A","A"),
                    Act.V13=c(0.8,1.7,2.5,1.3,0.6,1.2))
V13$DateTime_V13<- as.POSIXct(V13$DateTime_V13, format="%Y-%m-%d %H:%M:%OS",tz="UTC")

V13

         DateTime_V13 ID Act.V13
1 2017-08-10 12:01:54  A     0.8
2 2017-08-10 12:11:13  A     1.7
3 2017-08-10 12:18:36  A     2.5
4 2017-08-10 12:22:46  A     1.3
5 2017-08-10 12:27:21  A     0.6
6 2017-08-10 12:33:05  A     1.2

library(data.table)
options("digits.secs" = 3)
cols <- c("x", "y", "z")
set.seed(100)

base_data <- fread("DateTime_X16, ID, N
2017-08-10 12:00:00.000, A, 10000")[
  , DateTime := lubridate::ymd_hms(DateTime_X16)]

X16 <- fread("DateTime_X16, ID, N
2017-08-10 12:00:00.000, A, 10000")[
  , DateTime_X16 := lubridate::ymd_hms(DateTime_X16)][
    , .(DateTime_X16 = seq(from = DateTime_X16, by = 1/3, length.out = N)), 
    by = .(rn = seq(nrow(base_data)), ID)][
      ,  (cols) := replicate(length(cols), round(runif(.N, -1, +1), 2), simplify = FALSE)][
        , rn := NULL][]

head(X16)

   ID            DateTime_X16     x     y     z
1:  A 2017-08-10 12:00:00.000 -0.38  0.33  0.76
2:  A 2017-08-10 12:00:00.333 -0.48 -0.96 -0.36
3:  A 2017-08-10 12:00:00.666  0.10 -0.18 -0.09
4:  A 2017-08-10 12:00:01.000 -0.89 -0.17 -0.24
5:  A 2017-08-10 12:00:01.333 -0.06  0.31 -0.22
6:  A 2017-08-10 12:00:01.666 -0.03  0.73 -0.19

I can relate the two activity values by calculating the activity from the X16 with the same formula and time interval as the V13. The code below does this:

fmt <- "%Y-%m-%d %H:%M:%OS"  # same timestamp format used when building V13 above
setDT(V13)[, DateTime_V13 := as.POSIXct(DateTime_V13, format=fmt, tz="UTC")][,
  c("start", "end") := .(DateTime_V13-57, DateTime_V13)]
setDT(X16)[, DateTime_X16 := as.POSIXct(DateTime_X16, format=fmt, tz="UTC")]

n_min <- 128L
Comparison <-X16[V13, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
                       by = .EACHI, .(
                         DateTime = i.DateTime_V13,
                         .N,  
                         Act.X16 = if (.N < n_min) NA_real_ else sum(sqrt(x ^ 2 + y ^ 2 + z ^ 2)) / .N
                       )][,
                          (1L:3L) := NULL][]

foo <- V13[,c("DateTime_V13","ID","Act.V13")]
Comparison<- cbind(foo,Comparison)

head(Comparison)


              DateTime_V13 ID Act.V13   N   Act.X16
1: 2017-08-10 12:01:54.486  A     0.8 171 0.9071006
2: 2017-08-10 12:11:13.457  A     1.7 171 0.9798638
3: 2017-08-10 12:18:36.651  A     2.5 171 0.9765528
4: 2017-08-10 12:22:46.651  A     1.3 171 0.9388695
5: 2017-08-10 12:27:21.651  A     0.6 171 0.9663221
6: 2017-08-10 12:33:05.986  A     1.2 171 0.9621084

For methodological reasons, I need to shift the X16 timestamps within a 4-minute window (up to 2 minutes forward and up to 2 minutes back), SECOND BY SECOND, and then determine for which X16 time shift the linear relationship between the V13 and X16 activity values is strongest.

I would like to get a table like the one below, giving the R-squared of a linear regression model for each time shift applied to the X16. Negative values mean that the time was subtracted from the real X16 time, and positive values mean that it was added.

Table

   DateTime_X16_delay R.Squared
1                  -5      0.73
2                  -4      0.70
3                  -3      0.69
4                  -2      0.71
5                  -1      0.68
6                   0      0.70
7                   1      0.72
8                   2      0.73
9                   3      0.74
10                  4      0.75 
11                  5      0.74

In my example

First I calculate the R-squared for my current data (in the table above, this corresponds to a DateTime_X16_delay of 0, since I am not adding anything to or subtracting anything from DateTime_X16):

model <- lm(Comparison$Act.V13 ~ Comparison$Act.X16)
summary(lm(Comparison$Act.V13 ~ Comparison$Act.X16))$r.squared
[1] 0.2928491 # Approximately 30% of the variation in `Y` is explained by `X`.

Then I shift the X16 time by subtracting one second:

X16$DateTime_X16<- X16$DateTime_X16 -1

setDT(V13)[, DateTime_V13 := as.POSIXct(DateTime_V13, format=fmt, tz="UTC")][,
                                                                             c("start", "end") := .(DateTime_V13, DateTime_V13+57)]
setDT(X16)[, DateTime_X16 := as.POSIXct(DateTime_X16, format=fmt, tz="UTC")]

n_min <- 128L

onebefore <-X16[V13, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
                      by = .EACHI, .(
                        DateTime = i.DateTime_V13,
                        .N,  
                        Act.X16 = if (.N < n_min) NA_real_ else sum(sqrt(x ^ 2 + y ^ 2 + z ^ 2)) / .N
                      )][,
                         (1L:3L) := NULL][]

foo <- V13[,c("DateTime_V13","ID","Act.V13")]
Comparison<- cbind(foo,onebefore)

head(Comparison)

          DateTime_V13 ID Act.V13   N   Act.X16
1: 2017-08-10 12:01:54  A     0.8 171 0.9606891
2: 2017-08-10 12:11:13  A     1.7 171 0.9189191
3: 2017-08-10 12:18:36  A     2.5 171 0.9539324
4: 2017-08-10 12:22:46  A     1.3 171 0.9547121
5: 2017-08-10 12:27:21  A     0.6 171 0.9229827
6: 2017-08-10 12:33:05  A     1.2 171 0.9714373

Then I fit the linear regression again and compute the R-squared.

model <- lm(Comparison$Act.V13 ~ Comparison$Act.X16)
summary(lm(Comparison$Act.V13 ~ Comparison$Act.X16))$r.squared
[1] 0.003973827 

I repeat this 'N' times. In this particular example, I would like to shift the 'DateTime' of the 'X16' from 5 seconds before (subtracting 5 seconds from the original X16 times) to 5 seconds after (adding 5 seconds to the original X16 times).

Does anyone know how to automate this calculation of the 'R-squared' across shifted 'X16' times? The range of shifts I need to cover is wide, so doing it manually would take a long time.
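To make the goal concrete, here is a minimal sketch of how the steps above might be wrapped in a function and looped over the delays. It is an outline under stated assumptions, not a verified solution for the real data: `r2_for_delay` is a hypothetical helper name, the data are re-simulated in a simplified form so the block runs on its own, and the window is taken as the 57 seconds ending at each V13 timestamp.

```r
library(data.table)

# Simplified re-simulation of both devices (assumption: same shape as the data above)
set.seed(100)
V13 <- data.table(
  DateTime_V13 = as.POSIXct(
    c("2017-08-10 12:01:54.487", "2017-08-10 12:11:13.457",
      "2017-08-10 12:18:36.652", "2017-08-10 12:22:46.652",
      "2017-08-10 12:27:21.652", "2017-08-10 12:33:05.987"),
    format = "%Y-%m-%d %H:%M:%OS", tz = "UTC"),
  Act.V13 = c(0.8, 1.7, 2.5, 1.3, 0.6, 1.2)
)
n  <- 10000L
t0 <- as.POSIXct("2017-08-10 12:00:00", tz = "UTC")
X16 <- data.table(
  DateTime_X16 = t0 + (seq_len(n) - 1) / 3,  # one sample every 1/3 second
  x = round(runif(n, -1, 1), 2),
  y = round(runif(n, -1, 1), 2),
  z = round(runif(n, -1, 1), 2)
)

# Hypothetical helper: R-squared of Act.V13 ~ Act.X16 after shifting the
# X16 clock by `delay` seconds (negative = subtract, positive = add).
r2_for_delay <- function(delay, V13, X16, window = 57, n_min = 128L) {
  v <- copy(V13)[, `:=`(start = DateTime_V13 - window, end = DateTime_V13)]
  shifted <- copy(X16)[, DateTime_X16 := DateTime_X16 + delay]
  # Non-equi join: one output row per V13 record, in V13 row order
  act <- shifted[v, on = .(DateTime_X16 >= start, DateTime_X16 <= end),
                 by = .EACHI,
                 .(Act.X16 = if (.N < n_min) NA_real_
                             else sum(sqrt(x^2 + y^2 + z^2)) / .N)]
  # Too few usable windows: no meaningful regression
  if (sum(!is.na(act$Act.X16)) < 3) return(NA_real_)
  summary(lm(v$Act.V13 ~ act$Act.X16))$r.squared
}

delays <- -5:5
result <- data.table(
  DateTime_X16_delay = delays,
  R.Squared = vapply(delays, r2_for_delay, numeric(1), V13 = V13, X16 = X16)
)
result
result[which.max(R.Squared)]  # delay with the strongest linear relationship
```

For the full 4-minute range, `delays` would become `-120:120`. Copying X16 on every iteration keeps the sketch close to the manual steps. Shifting the V13 `start` and `end` columns by `-delay` instead selects the same samples and avoids copying the larger table.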

