I am using Prophet and Linear Regression in order to:
- Predict daily sales per store;
- Understand the effect size of my regressors (x variables).
I don’t necessarily want to stick to these modelling techniques.
The issue I'm facing is this: if I model each store separately, each model sees far fewer observations (and hence I lose degrees of freedom). If I instead aggregate all stores and model them at once, I expect the model will not fit any individual store well. Furthermore, if I aggregate the sales of these stores, the biggest stores will dominate the aggregate and carry the heaviest weight.
Ultimately, I need a prediction at store level. However, I would like to use all stores to determine the effect size of my external regressors.
My data consists of 4 years of daily sales from 100 stores. The additional regressor is the depth of discount (as a fraction, e.g. 0.10 = 10%). Below is an example of what my data looks like:
> head(data)
        Date Sales_EUR Store_ID Discount_depth
1 2017-01-01       101        1           0.10
2 2017-01-01       105        2           0.12
3 2017-01-01       104        3           0.11
4 2017-01-01       200        4           0.09
5 2017-01-01       170        5           0.10
6 2017-01-01       150        6           0.12
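To make the goal concrete, here is a minimal sketch of the kind of pooled model I have in mind: a single discount coefficient estimated from all stores together, plus a separate intercept per store so that predictions stay store-specific. It is written in Python with NumPy on synthetic data (not my real data); the store count, noise level, "true" slope and the `predict` helper are all made-up assumptions for illustration, and store-specific intercepts are just one possible formulation, not a settled choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (all numbers are assumptions).
n_stores, n_days = 5, 200
true_slope = 300.0                                   # EUR per unit of discount depth
store_level = rng.uniform(80, 200, size=n_stores)    # baseline sales per store

# Stores are indexed 0..n_stores-1 here (Store_ID minus 1 in the table above).
store_id = np.repeat(np.arange(n_stores), n_days)
discount = rng.uniform(0.05, 0.15, size=store_id.size)
sales = (store_level[store_id]
         + true_slope * discount
         + rng.normal(0, 5, size=store_id.size))

# Design matrix: one dummy column per store (store-specific intercepts)
# and one final column for the discount depth shared by all stores.
X = np.zeros((store_id.size, n_stores + 1))
X[np.arange(store_id.size), store_id] = 1.0
X[:, -1] = discount

# Ordinary least squares on the pooled data.
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
store_intercepts, shared_slope = coef[:-1], coef[-1]

def predict(store, discount_depth):
    """Store-level prediction using the pooled discount effect."""
    return store_intercepts[store] + shared_slope * discount_depth

print("shared discount effect:", shared_slope)
print("prediction for store 0 at 10% discount:", predict(0, 0.10))
```

In this sketch every store contributes to the estimate of the discount effect, while the forecast for each store is anchored to that store's own level. It ignores trend and seasonality and assumes the discount effect is identical in EUR across stores, which may well be unrealistic for stores of very different size.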
Does anyone have a solution or best practice for this issue?
Many thanks in advance.