Channel: Active questions tagged r - Stack Overflow

Linear regression for time series in R, efficiently

customer  date       sales1 sales2
c1        2019-01-01   67   35
c1        2019-01-07   70   32
c1        2019-01-14   72   40
c2        2019-01-01   100  12
c2        2019-01-07   134  20
c2        2019-01-14   174  23

I convert the date column to a per-customer sequence number for forecasting, then fit one model per customer in a loop:

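library(dplyr)

# Number each customer's rows 1, 2, 3, ... in their existing order.
df <- df %>% group_by(customer) %>% mutate(dt_seq = row_number()) %>% ungroup()
n <- data.frame()
for (i in unique(df$customer)) {
  one <- df[df$customer == i, ]
  one$customer <- NULL
  # Fit a linear trend for this customer.
  model <- lm(sales1 ~ dt_seq, data = one)
  # The next 52 sequence numbers to predict.
  fin <- data.frame(dt_seq = seq(max(one$dt_seq) + 1, max(one$dt_seq) + 52, 1))
  pre <- as.data.frame(predict(model, fin))
  temp <- cbind(cbind(fin, pre), i)
  # Weekly dates after the last observed date (assumes `date` is of class Date).
  temp$dt <- seq(max(one$date) + 7, max(one$date) + 7 * 52, 7)
  colnames(temp) <- c("dt_seq", "sales1", "customer", "dt")
  # Growing the result with rbind() on every iteration.
  n <- rbind(n, temp)
}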

This takes a long time because I have data for many customers. Is there a way to run it in parallel, for example with the spark.lapply() function?

Edit: I would like to predict future values for the next 52 periods for each customer. The for loop is too slow with this many customers. Is there another way to fit a linear regression on grouped time series data and forecast the values?
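For illustration, here is a minimal sketch of one way the per-customer fit could be restructured. It assumes `date` is of class Date and the data is weekly. `forecast_one` is a hypothetical helper name, not part of any package. The data is split by customer once, each piece is fitted independently, and the results are bound together a single time instead of calling `rbind()` inside the loop.

```r
# Sample data from the question (assumes `date` is stored as a Date).
df <- data.frame(
  customer = rep(c("c1", "c2"), each = 3),
  date     = as.Date(rep(c("2019-01-01", "2019-01-07", "2019-01-14"), times = 2)),
  sales1   = c(67, 70, 72, 100, 134, 174),
  sales2   = c(35, 32, 40, 12, 20, 23),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fit sales1 ~ dt_seq for one customer and
# forecast the next `horizon` weekly periods.
forecast_one <- function(one, horizon = 52) {
  one <- one[order(one$date), ]
  one$dt_seq <- seq_len(nrow(one))
  model <- lm(sales1 ~ dt_seq, data = one)
  future <- data.frame(dt_seq = nrow(one) + seq_len(horizon))
  data.frame(
    customer = one$customer[1],
    dt_seq   = future$dt_seq,
    dt       = max(one$date) + 7 * seq_len(horizon),
    sales1   = unname(predict(model, newdata = future)),
    stringsAsFactors = FALSE
  )
}

# Split once, fit each customer independently, bind the results once.
pieces <- lapply(split(df, df$customer), forecast_one)
result <- do.call(rbind, pieces)
rownames(result) <- NULL
```

Because each customer is handled by an independent function call, the `lapply()` step is the natural place to try a parallel variant such as `parallel::mclapply()` or SparkR's `spark.lapply()`. Whether that actually helps depends on the number of customers and the cluster setup, so it would need to be measured.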

