I am trying to build a multiple linear regression model:
lm1 <- lm(outcome ~ sex + age + education + cohort, data=data.in)
Among the independent variables, sex, education, and cohort are all categorical. Should I wrap these variables in factor() in the model? I also need the adjusted (estimated) mean of the outcome for males and for females separately, so I created two datasets split by sex:
data.in.m <- data.in %>% filter(sex==1)
data.in.f <- data.in %>% filter(sex==2)
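On the factor() question, here is a minimal sketch of how the two codings differ, using only base R. The data frame below is simulated as a stand-in for data.in, and the names data.fac, lm_numeric, and lm_factor are made up for illustration. It assumes the categorical columns are stored as integer codes, as sex==1 and sex==2 suggest.

```r
# Simulated stand-in for data.in; every value here is invented for illustration.
set.seed(1)
n <- 200
data.in <- data.frame(
  sex       = sample(1:2, n, replace = TRUE),
  age       = round(runif(n, 20, 80)),
  education = sample(1:3, n, replace = TRUE),
  cohort    = sample(1:4, n, replace = TRUE)
)
data.in$outcome <- 10 + 2 * (data.in$sex == 2) + 0.1 * data.in$age + rnorm(n)

# Integer codes left as numeric: each variable gets a single slope,
# which imposes a linear trend across the codes.
lm_numeric <- lm(outcome ~ sex + age + education + cohort, data = data.in)

# Converted to factors: each level beyond the first gets its own coefficient.
cat_vars <- c("sex", "education", "cohort")
data.fac <- data.in
data.fac[cat_vars] <- lapply(data.fac[cat_vars], factor)
lm_factor <- lm(outcome ~ sex + age + education + cohort, data = data.fac)

length(coef(lm_numeric))  # intercept + one slope per variable
length(coef(lm_factor))   # intercept + sex + age + 2 education + 3 cohort terms
```

For a two-level variable such as sex, the fitted values come out the same under either coding; the difference matters for education and cohort, which have more than two levels.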
Then I tried to use
mean(fitted(lm1,data.in.m))
or
mean(predict(lm1,data.in.m))
to get the adjusted mean outcomes in males (from the original dataset).
I am not sure whether I should use fitted() or predict(), since the two functions give different values here.
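A minimal reproducible sketch of the comparison, on simulated data standing in for data.in (the values are invented, and base subset() is used in place of dplyr so the snippet has no dependencies). As far as I can tell, fitted() has no newdata argument, so the second argument is silently ignored and it returns fitted values for every row used in the fit, whereas predict() evaluates the model on the rows passed as newdata.

```r
# Simulated stand-in for data.in; every value here is invented for illustration.
set.seed(1)
n <- 200
data.in <- data.frame(
  sex       = factor(sample(1:2, n, replace = TRUE)),
  age       = round(runif(n, 20, 80)),
  education = factor(sample(1:3, n, replace = TRUE)),
  cohort    = factor(sample(1:4, n, replace = TRUE))
)
data.in$outcome <- 10 + 2 * (data.in$sex == 2) + 0.1 * data.in$age + rnorm(n)

lm1 <- lm(outcome ~ sex + age + education + cohort, data = data.in)
data.in.m <- subset(data.in, sex == 1)

# The second argument falls into `...` and is ignored, so this is the
# mean fitted value over ALL rows, not just the males.
fitted_mean <- mean(fitted(lm1, data.in.m))
all.equal(fitted_mean, mean(fitted(lm1)))

# predict() uses newdata, so this averages the predictions over males only.
predict_mean <- mean(predict(lm1, newdata = data.in.m))
all.equal(predict_mean, mean(fitted(lm1)[data.in$sex == 1]))
```

If that reading is right, the two values differ because they average over different sets of rows, and the predict() version is the one restricted to males.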