I have a lot of survey data in which respondents were asked many multiple-choice questions and could select more than one answer per question. The survey software coded each question as several variables, one per answer option, each holding either the answer text or NA. NA isn't really appropriate here: unless the respondent skipped the question entirely, not selecting an option really means "no." I want to recode all questions of this type so I can analyze the data. If the respondent skipped the question, the NAs should stand, but if they selected at least one of the options, the remaining NAs should become "no". Example below:
library(tidyverse)
df <- tibble(
  SC_1 = c("yes", "yes", NA, "yes", "yes", NA, "yes", "yes", NA, "yes"),
  SC_2 = c("yes", NA, NA, NA, "yes", "yes", NA, "yes", NA, "yes"),
  RF_1 = c("gas", "gas", NA, "gas", "gas", NA, "gas", "gas", NA, "gas"),
  RF_2 = c("electricity", NA, NA, NA, "electricity", "electricity", NA, "yes", NA, "electricity")
)
I could do this one variable at a time, for example for SC_1:
# Keep NA when the whole SC question was skipped; otherwise turn NA into "no"
df %>%
  mutate(SC_1_recode = ifelse(is.na(SC_1) & is.na(SC_2), SC_1,
                              ifelse(is.na(SC_1), "no", SC_1)))
But that seems cumbersome, given that I have dozens of this kind of question and they all have this problem.
Any ideas? I've been trying out mutate_if(), but haven't gotten anywhere.
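To make the goal concrete, here is a rough sketch of the kind of generalization I'm after, written as a plain loop rather than with mutate_if(). It assumes every question's columns share a prefix followed by an underscore and a number (as in SC_1, SC_2); the names `prefixes`, `answered`, and `df_recoded` are made up for illustration, and this is not necessarily the most idiomatic tidyverse way to do it.

```r
library(tibble)

df <- tibble(
  SC_1 = c("yes", "yes", NA, "yes", "yes", NA, "yes", "yes", NA, "yes"),
  SC_2 = c("yes", NA, NA, NA, "yes", "yes", NA, "yes", NA, "yes"),
  RF_1 = c("gas", "gas", NA, "gas", "gas", NA, "gas", "gas", NA, "gas"),
  RF_2 = c("electricity", NA, NA, NA, "electricity", "electricity", NA, "yes", NA, "electricity")
)

# Assumption: question prefixes can be derived by stripping a trailing "_<number>"
prefixes <- unique(sub("_[0-9]+$", "", names(df)))

df_recoded <- df
for (prefix in prefixes) {
  # All columns belonging to this question
  cols <- names(df_recoded)[startsWith(names(df_recoded), paste0(prefix, "_"))]

  # TRUE for rows where the respondent picked at least one option of this question
  answered <- rowSums(!is.na(df_recoded[cols])) > 0

  for (col in cols) {
    x <- df_recoded[[col]]
    # NA becomes "no" only if the question was answered; skipped rows keep NA
    df_recoded[[col]] <- ifelse(is.na(x) & answered, "no", x)
  }
}

df_recoded
```

This overwrites the original columns in a copy of the data instead of creating `_recode` columns, which would need an extra renaming step.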