Channel: Active questions tagged r - Stack Overflow

dplyr: unclear usage of any_vars and all_vars predicates when filtering rows over multiple columns using a negated str_detect


I am building a network:

from <- c("America, port unspecified", "Boston", "Chicago", "America, port unspecified")
to <-  c("Europe, port unspecified", "Nantes", "Le Havre", "Lisbonn")

dataset <- data.frame(from, to)

library(dplyr)
library(stringr)  # str_detect() comes from stringr, not dplyr

I want to subset my dataset, keeping only the rows that do NOT contain unspecified ports:

     from       to
     Boston     Nantes
     Chicago    Le Havre

I tried this: in the code below I search for the string “port unspecified” across all columns. I want to keep only the rows where the string “port unspecified” is NOT present in ANY of the variables.

dataset2 <- dataset %>%
              filter_all(any_vars(!str_detect(., "port unspecified")))

Result:

     from                         to
     Boston                       Nantes
     Chicago                      Le Havre
     America, port unspecified    Lisbonn

I tried the code below with success:

dataset3 <- dataset %>%
    filter_all(all_vars(!str_detect(., "port unspecified")))

Result:

     from       to
     Boston     Nantes
     Chicago    Le Havre

Why does all_vars give me the expected result while any_vars does not?
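
To make the difference concrete, here is a minimal base R sketch of what the two predicates appear to compute row by row, assuming the same dataset as above. The names clean, keep_any and keep_all are illustrative only and are not part of dplyr, and grepl() stands in for str_detect(). The idea it shows: any_vars keeps a row as soon as the negated test is TRUE for at least one column, whereas "the string is not present in any column" is logically the same as "the string is absent from all columns", which is what all_vars expresses.

```r
from <- c("America, port unspecified", "Boston", "Chicago", "America, port unspecified")
to   <- c("Europe, port unspecified", "Nantes", "Le Havre", "Lisbonn")
dataset <- data.frame(from, to)

# One logical column per variable: TRUE where the cell does NOT contain the string.
# (Illustrative stand-in for !str_detect(., "port unspecified").)
clean <- sapply(dataset, function(col) !grepl("port unspecified", col, fixed = TRUE))

# any_vars-style reduction: at least one column passes the negated test.
keep_any <- rowSums(clean) > 0             # FALSE TRUE TRUE TRUE

# all_vars-style reduction: every column passes the negated test.
keep_all <- rowSums(clean) == ncol(clean)  # FALSE TRUE TRUE FALSE

dataset[keep_any, ]  # rows 2, 3 and 4: row 4 is kept because "Lisbonn" has no match
dataset[keep_all, ]  # rows 2 and 3 only
```

Row 4 is the one that separates the two cases: its "to" value passes the negated test, so the any-style condition keeps it even though its "from" value contains the string. In recent dplyr versions the same two conditions can reportedly be written with if_any() and if_all() inside filter(), but that is outside what the code above tests.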

