I have daily data covering more than 50 years. I need to generate a new dataset of averaged daily values. For each day, the average should include that calendar day's data from the past 10 years and the future 10 years.
Here is a reproducible example. I have the years 1998-2008, with 10 days in each of February and March. I need to compute new averaged columns for T1 and T2, named T1avg and T2avg respectively. In this example the daily average should include data from the past 4 and future 4 years. Exclude 1998-2001 and 2005-2008 from the averaged result, since those years do not have 4 years of data both before and after them.
For example, for Feb 28, 2002, I need to average the values of T1 and T2 for the days 02/28/1998, 02/28/1999, 02/28/2000, 02/28/2001, 02/28/2002, 02/28/2003, 02/28/2004, 02/28/2005, and 02/28/2006. For Feb 29, 2004, I would just average 02/29/2000, 02/29/2004, and 02/29/2008.
I tried sqldf. I am able to compute the daily average across all years, but I couldn't figure out how to compute the average conditioned on year, that is, using only rows whose year is between year - 4 and year + 4.
# Generate data
set.seed(1)  # fixed seed so the random T1/T2 values are reproducible
df <- data.frame(
  year  = rep(1998:2008, each = 20),
  # 10 days in Feb followed by 10 days in Mar, for each of the 11 years
  month = rep(rep(2:3, each = 10), times = 11),
  # Feb 19-28 in common years, Feb 20-29 in leap years (2000, 2004, 2008), then Mar 1-10
  day   = c(19:28, 1:10, 19:28, 1:10, 20:29, 1:10,
            19:28, 1:10, 19:28, 1:10, 19:28, 1:10,
            20:29, 1:10, 19:28, 1:10, 19:28, 1:10,
            19:28, 1:10, 20:29, 1:10),
  T1    = rnorm(220),
  T2    = rnorm(220)
)
##################### Average daily data########################
library(sqldf)
# Daily average over ALL years; this does not yet apply the +/- 4 year window
sqldf("
select
month,
day,
year,
T1,
T2,
avg(T1) as T1_avg
,avg(T2) as T2_avg
from df
group by
month, day
")