I have a panel dataset: panel and a dataset with a list of events: Events. For the panel dataset, an equal panelID shows that two observations belong together.
panelID = c(1:50)
year= c(2001:2010)
country = c("NLD", "GRC", "GBR")
n <- 2
library(data.table)
set.seed(123)
Panel <- data.table(panelID = rep(sample(panelID), each = n),
country = rep(sample(country, length(panelID), replace = T), each = n),
year = c(replicate(length(panelID), sample(year, n))),
some_NA = sample(0:5, 6),
some_NA_factor = sample(0:5, 6),
norm = round(runif(100)/10,2),
Income = round(rnorm(10,-5,5),2),
Happiness = sample(10,10),
Sex = round(rnorm(10,0.75,0.3),2),
Age = sample(100,100),
Educ = round(rnorm(10,0.75,0.3),2))
Panel[, uniqueID := .I] # Creates a unique ID
Panel[Panel == 0] <- NA
Events <- fread(
"Event_Type country year
A NLD 2005
C NLD 2004
A GBR 2006
B GBR 2003
A GRC 2002
D GRC 2007",
header = TRUE)
================================================================================ EDIT:
Events <- fread(
"Event_Type country year
A NLD 2005
A NLD 2004
A GBR 2006
A GBR 2003
A GRC 2002
A GRC 2007",
header = TRUE)
EDITED DESIRED OUTCOME:
panleID country year 2002 2003 2004 2005 2006 2007
1 NLD 2002 NA NA 1 1 NA NA
1 NLD 2006 NA NA 1 1 NA NA
=======================================================================
I would like the value in the column Event_Type to be added to Panel, if the year of the event is between two panel observation (and in the same country).
As an example, let's take the following panel observation:
panleID country year
1 NLD 2002
1 NLD 2006
Panel will get 4 extra columns A to D. The column A, will get a 1 in the column if the event in country NLD in the year 2005 (first line Events, takes place in one of or between the two years. Since it does this results in the following:
panleID country year A B C D
1 NLD 2002 1 NA NA NA
1 NLD 2006 1 NA NA NA
I know that merging with the same year goes as follows:
merge(Panel, dcast(Events, iso + country ~ Event_Type),
by = c("country", "year"))
But how should I make the merge if I want the values to be equal to or in between the two panelID years?