I need to run a regression on panel data. My data contains prices of hotel room plans at two distinct observation points (a high-season date and a low-season date).
With this data loaded as mydata, I run:
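library(plm)
# Declare the panel structure: room plan is the individual index, season the time index
pdata <- pdata.frame(mydata, index = c("ID_roomplan", "season"), drop.index = FALSE)
# Fixed-effects (within) model; the variables are looked up in pdata via the formula
fixed <- plm(cost ~ deluxe + standard + dinner + cancell + capacity + size,
             data = pdata, model = "within")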
I get the following:
Warning message:
In pdata.frame(mydata, index = c("ID_roomplan", "season"), :
duplicate couples (id-time) in resulting pdata.frame
I understand that this happens because plm needs each id-time pair (i.e. ID_roomplan-season) to be unique. However, I do not know how to fix the data: the same room plan can appear at different prices within the same season because the options differ (e.g. cancellation is possible for one offer but not another, or dinner is included in one but not the other).
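To illustrate the problem, here is a minimal sketch of how the duplicated pairs can be located. The small mydata below is a made-up stand-in for my real data (the values are hypothetical); only the column names ID_roomplan, season, cancell and cost match my data.

```r
# Toy stand-in for mydata (hypothetical values, for illustration only)
mydata <- data.frame(
  ID_roomplan = c(1, 1, 1, 2, 2),
  season      = c("high", "high", "low", "high", "low"),
  cancell     = c(0, 1, 0, 0, 0),
  cost        = c(120, 135, 80, 200, 150)
)

# Count the rows for each (ID_roomplan, season) pair
pair_counts <- as.data.frame(table(mydata$ID_roomplan, mydata$season))
names(pair_counts) <- c("ID_roomplan", "season", "n")

# Pairs that occur more than once are the ones pdata.frame() warns about
dup_pairs <- pair_counts[pair_counts$n > 1, ]
print(dup_pairs)

# Full rows belonging to a duplicated pair, to see which options differ
key <- mydata[, c("ID_roomplan", "season")]
dup_rows <- mydata[duplicated(key) | duplicated(key, fromLast = TRUE), ]
print(dup_rows)
```

In this toy example, room plan 1 appears twice in the high season, once with and once without cancellation, which is the same pattern as in my real data.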
I feel that I have two options:
- create a new, finer ID that identifies each offer by its room plan together with its option dummies (e.g. dinner, cancell), and then run plm with that ID as the individual index, OR
- skip plm and fit separate lm regressions for the high season and the low season.
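For concreteness, this is roughly what I have in mind for the two options. It is only a sketch on made-up data: the values are hypothetical, ID_plan_opt is a name I invented for the new identifier, and I use only two of the option dummies to keep it short.

```r
# Toy stand-in for mydata (hypothetical values, for illustration only)
mydata <- data.frame(
  ID_roomplan = c(1, 1, 2, 2, 1, 1, 2, 2),
  season      = c(rep("high", 4), rep("low", 4)),
  dinner      = c(0, 1, 0, 1, 0, 1, 0, 1),
  cancell     = c(0, 0, 1, 1, 0, 0, 1, 1),
  cost        = c(100, 120, 150, 175, 70, 85, 110, 130)
)

# Option 1: build a finer ID from the room plan plus the option dummies
# (ID_plan_opt is a hypothetical column name)
mydata$ID_plan_opt <- paste(mydata$ID_roomplan, mydata$dinner,
                            mydata$cancell, sep = "_")

# Check that (ID_plan_opt, season) is now unique; 0 means no duplicates
n_dup <- anyDuplicated(mydata[, c("ID_plan_opt", "season")])
print(n_dup)

# Option 2: one ordinary lm per season, without plm
fit_high <- lm(cost ~ dinner + cancell,
               data = subset(mydata, season == "high"))
fit_low  <- lm(cost ~ dinner + cancell,
               data = subset(mydata, season == "low"))
print(coef(fit_high))
print(coef(fit_low))
```

For option 1, the new column would then be passed to pdata.frame() as the individual index in place of ID_roomplan.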
Which option would be best, and are there any other options?