I have a panel data set that contains missing values for several variables. I want to impute the missing data with the series mean, as is commonly done for panel data. I tried the following code, but I do not know how to tell R to do the calculation while taking the year and the id (country) into account.
The following code is my attempt to impute the missing values for one variable. **My goal is to do this for all variables.**
my_data$V1[is.na(my_data$V1)] <- mean(my_data$V1, na.rm = TRUE)
head(my_data)
year id V1 V2
2000 AA
2001 AA
2002 AA 2 2
2003 AA 3 3
2000 BB 4 4
2001 BB
2002 BB 3 3
2003 BB
2000 CC 2 2
2001 CC 3
2002 CC 3 3
2003 CC
2000 DD 4
2001 DD 2
2002 DD
2003 DD
How can I replace the missing values with a mean that is calculated within each group (by id or by year) rather than over the whole column?
The solution does not have to be based on the code above; if you know another method, please share it.
Thank you.
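One possible approach, sketched in base R. It assumes that "series mean" means the mean of each variable within each `id`; that is an interpretation, not something the question states outright. The toy data frame and the helper `impute_mean` are hypothetical and only serve the illustration, and the values are not the ones from the question.

```r
# Toy panel with made-up values (not the data from the question)
my_data <- data.frame(
  year = rep(2000:2003, times = 2),
  id   = rep(c("AA", "BB"), each = 4),
  V1   = c(NA, NA, 2, 3, 4, NA, 3, NA),
  V2   = c(NA, 1, NA, 3, 4, 4, NA, NA)
)

# Hypothetical helper: replace NAs in a numeric vector with the mean of
# its observed values. If every value is NA, the vector is returned
# unchanged, because the mean would be NaN.
impute_mean <- function(x) {
  if (all(is.na(x))) return(x)
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Every column except the panel identifiers is treated as a variable to impute
value_cols <- setdiff(names(my_data), c("year", "id"))

imputed <- my_data
for (col in value_cols) {
  # ave() applies impute_mean within each id and keeps the original row order
  imputed[[col]] <- ave(my_data[[col]], my_data$id, FUN = impute_mean)
}

imputed
```

To fill with the cross-sectional mean of each year instead, group by `my_data$year` in the `ave()` call. An `id` with no observed value for a variable keeps its `NA`s, so those cases would still need a separate decision.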