I have panel data covering several years and several countries. From a certain year on, individuals were treated at the age of 18, and from then on everyone who turned 18 was treated.
E.g., in 2000 the first 18-year-olds were treated; in 2002 those same individuals are 20 years old:
age\year 1999 2000 2001 2002
18 z x x x
19 z z x x
20 z z z x
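The pattern in the table can be written as one rule: a cell is treated when that cohort turned 18 in 2000 or later, i.e. `year - (age - 18) >= 2000`. A minimal sketch that rebuilds the table from that rule (the names `ages`, `years`, `treated` and `tab` are only for illustration and are not part of my real data):

```r
# Ages and years shown in the table above
ages  <- 18:20
years <- 1999:2002

# A cell is treated ("x") if the cohort turned 18 in 2000 or later,
# i.e. year - (age - 18) >= 2000; otherwise it is untreated ("z").
treated <- outer(ages, years, function(a, y) y - (a - 18) >= 2000)

# Turn the logical matrix into the x/z layout used above
tab <- ifelse(treated, "x", "z")
dimnames(tab) <- list(age = ages, year = years)
print(tab, quote = FALSE)
```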
So I want to compare the individuals that were treated (x) with the ones that were not treated (z).
I was able to compare all x with all z with this code:
# dummy = 1 for the age/year combinations marked x above, 0 otherwise
data$dummy <- ifelse((data$age <= 18 & data$year == 2000) |
                     (data$age <= 19 & data$year == 2001) |
                     (data$age <= 20 & data$year == 2002), 1, 0)
fit <- lm(y ~ dummy, data = data)
summary(fit)
But I want to compare the treated 18-year-olds (x) only with the untreated 18-year-olds (z), and likewise for the other ages. I tried this with:
data$age18 <- (data$age <= 18)
data$year2000 = ifelse(data$year >= 2000, 1,0)
data$age19 <- (data$age <= 19 & data$age > 18)
data$year2001 = ifelse(data$year >= 2001, 1,0)
data$age20 <- (data$age <= 20 & data$age > 19)
data$year2002 = ifelse(data$year >= 2002, 1,0)
df <- lm(y ~ age18:year2000 + age19:year2001 + age20:year2002, data = data)
summary(df)
But in the output I get weird coefficients:
(Intercept)
age18FALSE:year2000
age18TRUE:year2000
age19FALSE:year2001
age19TRUE:year2001
age20FALSE:year2002
age20TRUE:year2002
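As far as I can tell, the `TRUE`/`FALSE` suffixes come from `age18`, `age19` and `age20` being logical: R treats a logical in a formula like a two-level factor, and because the interaction is entered without main effects, it gets one column per level. A minimal sketch on made-up data (the data frame `toy` and the column `age18num` are purely illustrative) that reproduces the names and shows what changes with a numeric 0/1 indicator:

```r
# Toy data, purely illustrative (not my real panel)
toy <- expand.grid(age = 18:20, year = 1999:2002)
toy$year2000 <- ifelse(toy$year >= 2000, 1, 0)

# Logical age indicator, as in my code above:
# the interaction gets one column per level (FALSE and TRUE)
toy$age18 <- toy$age <= 18
names_logical <- colnames(model.matrix(~ age18:year2000, data = toy))
print(names_logical)

# Numeric 0/1 age indicator: a single interaction column
toy$age18num <- as.numeric(toy$age <= 18)
names_numeric <- colnames(model.matrix(~ age18num:year2000, data = toy))
print(names_numeric)
```

This only explains where the coefficient names come from; it does not by itself give the within-age comparison I am looking for.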
Is there another way to compare subgroups within an age group? Thank you!