Channel: Active questions tagged r - Stack Overflow

Counting rows where two columns were both NA, then excluding

I am trying to:

First, count the rows where the response in two particular columns is NA in both, so I can check that the second step deletes the expected number of rows.

Second, write the code to remove the rows where the response is 2 in both columns or NA in both columns (rows can be kept where either column is 1).

STEP 1 - COUNT: I can count the NAs in each column individually:

sum(is.na(data$columnname1))
sum(is.na(data$columnname2))

But what I need to know is how many rows are NA in both columns. I have tried this and other equally incorrect variations, with no luck:

sum[(is.na(data$columnname1)) & (is.na(columnname2))]
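For reference, the attempt above indexes `sum` with square brackets instead of calling it, and the second column is missing its `data$` prefix. A minimal sketch of the count, assuming a small made-up data frame in place of the real `data` (the values here are hypothetical):

```r
# Hypothetical toy data standing in for the real `data`
data <- data.frame(
  columnname1 = c(1, 2, NA, NA, 2, 1),
  columnname2 = c(2, 2, NA, 1, NA, NA)
)

# TRUE only for rows where BOTH columns are NA
both_na <- is.na(data$columnname1) & is.na(data$columnname2)

# sum() treats TRUE as 1, so this is the number of such rows
n_both_na <- sum(both_na)
n_both_na  # 1 for this toy data (only the third row)
```

The key point is that `&` works element by element on the two logical vectors, so the result has one TRUE/FALSE per row before it is summed.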

STEP 2 - DELETE: I've tried a few options:

data = data[data$columnname1 == 1 | data$columnname2 == 1,]

but this doesn't deal with the NAs
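The reason the NAs survive is that a comparison with NA is itself NA, and a row index of NA returns an all-NA row instead of dropping it. A small sketch of that behaviour, again on hypothetical toy data, with `which()` as one possible way around it:

```r
# Hypothetical toy data standing in for the real `data`
data <- data.frame(
  columnname1 = c(1, 2, NA, NA, 2, 1),
  columnname2 = c(2, 2, NA, 1, NA, NA)
)

# NA == 1 is NA; NA | TRUE is TRUE, but NA | FALSE and NA | NA stay NA
keep <- data$columnname1 == 1 | data$columnname2 == 1
keep  # TRUE FALSE NA TRUE NA TRUE

# Indexing with NA produces placeholder rows filled with NA
with_na_rows <- data[keep, ]

# which() returns only the positions that are TRUE, so NA is dropped
clean <- data[which(keep), ]
```

With this toy data `with_na_rows` has five rows (two of them all NA), while `clean` has the three rows where at least one column is 1.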

I also tried this:

data <- data[is.na(data$columnname1 & data$columnname2) == FALSE, (data$columnname1 == 1 | data$columnname2 == 1, )]

but I am getting error messages about incorrect tokens before I can even try to run it.
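Two things seem to go wrong in that attempt. The parser error most likely comes from the extra parentheses and the comma inside them: the brackets take one row condition, a comma, then an optional column selection. Separately, `is.na(x & y)` does not test for "both NA"; with values of 1 and 2 it is TRUE whenever either value is missing. A sketch with made-up vectors (`a`, `b` and `df` are hypothetical names):

```r
# Hypothetical values: 1/2 responses with some missing
a <- c(1, NA, NA, 2)
b <- c(NA, 2, NA, 2)

# NA & TRUE is NA, so this flags rows where EITHER value is missing
either_missing <- is.na(a & b)       # TRUE TRUE TRUE FALSE

# Testing each column first flags only rows where BOTH are missing
both_missing <- is.na(a) & is.na(b)  # FALSE FALSE TRUE FALSE

# Valid bracket structure: one logical vector, then the comma
df <- data.frame(columnname1 = a, columnname2 = b)
not_both_missing <- df[!both_missing, ]
```

Building each condition as its own named vector before indexing also makes it easier to see which one is misbehaving.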

Would this work to remove the rows where both are NA?

data <- data[complete.cases(data[, c("columnname1", "columnname2")]),] 
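As far as I understand `complete.cases()`, it keeps only rows with no NA in any of the selected columns, so it would also drop rows where just one of the two is NA. A sketch comparing it with a "both NA" filter, on hypothetical toy data (`cols` is a helper name introduced here):

```r
# Hypothetical toy data standing in for the real `data`
data <- data.frame(
  columnname1 = c(1, 2, NA, NA, 2, 1),
  columnname2 = c(2, 2, NA, 1, NA, NA)
)
cols <- c("columnname1", "columnname2")

# Drops every row that has an NA in EITHER column
no_na_at_all <- data[complete.cases(data[, cols]), ]

# Drops only rows where ALL selected columns are NA
n_missing <- rowSums(is.na(data[, cols]))
not_all_na <- data[n_missing < length(cols), ]
```

On this toy data `no_na_at_all` keeps two rows, whereas `not_all_na` keeps five, so the two approaches are not interchangeable.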

And if so, would I need to run this first?

data = data[data$columnname1 == 1 | data$columnname2 == 1,]
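One way to avoid worrying about the order is to build the removal condition explicitly and apply it in a single step. The sketch below assumes hypothetical toy data and uses `%in%`, which returns FALSE instead of NA for missing values. It also shows the stricter "keep only if either is 1" reading, since the two differ for rows such as 2 and NA.

```r
# Hypothetical toy data standing in for the real `data`
data <- data.frame(
  columnname1 = c(1, 2, NA, NA, 2, 1),
  columnname2 = c(2, 2, NA, 1, NA, NA)
)
a <- data$columnname1
b <- data$columnname2

both_na  <- is.na(a) & is.na(b)     # NA in both columns
both_two <- a %in% 2 & b %in% 2     # 2 in both columns; NA counts as FALSE

# Reading 1: remove only the "both 2" and "both NA" rows
kept <- data[!(both_na | both_two), ]

# Reading 2 (stricter): keep a row only if either column is 1
kept_strict <- data[a %in% 1 | b %in% 1, ]

# Check the number of deleted rows against the separate counts
n_removed <- nrow(data) - nrow(kept)
```

Here `n_removed` equals `sum(both_na) + sum(both_two)`, which is the kind of check described in the first step of the question.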
