Analyzing temporally correlated data using 'date' as a fixed effect: do you convert it to a factor first or leave it as 'Date' class in R?

The short and sweet of the question is this: How do you deal with time as a parameter in a repeated-measures ANOVA in R?

I have repeated observations of the number of plants growing in pots, taken at 3-day, 7-day, and 14-day intervals from greenhouse data. Incorporating the time element is important because each species grows at a different rate: some germinate very quickly while others take months. The variation in survey frequency reflects those differences. I am looking at the effect of treatment (2 levels), species (10 levels), and date (n = 4). I control for pot-level variation by including pot as a random effect. This is all summed up by the following model formula:

Establishment ~ Date + Species + Treatment + (1|Pot)

I've built a quick simulation of my data and run it through lmer. My confusion stems from how to handle the date variable. Do I leave it in the 'Date' class, or should it be converted to a factor? Perhaps I have misunderstood the literature, but it seems that both can be used; I am hoping to go with whichever is most appropriate.

library(tidyverse)
library(lubridate)
library(lme4)

gmean <- c(.4,.5,.6,.65) #Simulate 4 means to represent date and treatment effects
sigma_g <- .1            #Standard deviation
reps <- 30               #Replicates
nspp <- 10               #Simulated number of species
ntrt <- 2                #Simulated number of treatments
n <- nspp*reps           #Total reps for simulation

sim_df <- tibble(est = rnorm(n, gmean, sigma_g),  # gmean is recycled across the n draws
       date = rep(seq(ymd('2018-04-07'), ymd('2018-05-03'), by = '1 week'), length.out = n),
       species = rep(1:nspp, each = n/nspp),
       trt = rep(c("control", "trt"), length.out = n),
       pot_id = paste0(species, "-", trt)) %>%
  mutate(across(species:trt, factor))  # convert species and trt to factors (replaces deprecated mutate_each/funs)

glimpse(sim_df)

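# Fit with ML (REML = FALSE) so the two fixed-effect structures can be compared by likelihood ratio
mm <- lmer(est ~ date + species + trt + (1|pot_id), data = sim_df, REML = FALSE)
mm.factored <- lmer(est ~ factor(date) + species + trt + (1|pot_id), data = sim_df, REML = FALSE)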

anova(mm,mm.factored)

Data: sim_df
Models:
mm: est ~ date + species + trt + (1 | pot_id)
mm.factored: est ~ factor(date) + species + trt + (1 | pot_id)
            Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)  
mm          14 -515.80 -463.94 271.90  -543.80                           
mm.factored 15 -518.39 -462.83 274.19  -548.39 4.5909      1    0.03214 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
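
To make the two options concrete, here is a minimal sketch (not part of my actual analysis) of how I understand each encoding to enter the fixed-effects design matrix. The toy data frame and the names `toy`, `X_date`, and `X_factor` are made up for illustration; the only function used is base R's `model.matrix`.

```r
# Hypothetical toy data: four weekly survey dates, mirroring the simulation above
dates <- seq(as.Date("2018-04-07"), by = "1 week", length.out = 4)
toy <- data.frame(date = rep(dates, times = 3))

# Date left as 'Date' class: treated as a number (days since 1970-01-01),
# so it contributes a single column, i.e. one linear slope over time
X_date <- model.matrix(~ date, data = toy)

# Date converted to a factor: one dummy column for each date after the first,
# i.e. a separate mean shift per survey date with no assumed trend
X_factor <- model.matrix(~ factor(date), data = toy)

ncol(X_date)    # 2: intercept + 1 slope
ncol(X_factor)  # 4: intercept + 3 dummies
```

If this reading is right, the factor version should cost two extra parameters with four dates. The anova table above shows a gap of only one Df, which I suspect is because `trt` alternates in lockstep with `date` in my simulated data, so one column is dropped as rank-deficient.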

My 'real' dataset has closer to 6000 observations.
