Channel: Active questions tagged r - Stack Overflow

Mutating multiple columns dynamically while conditioning on specific rows

I know there are several similar questions around here, but none of them seems to address the precise issue I'm having.

set.seed(4)
df = data.frame(
  Key = c("A", "B", "A", "D", "A"),
  Val1 = rnorm(5),
  Val2 = runif(5),
  Val3 = 1:5
)

I want to zero out the value columns for the rows where Key == "A". The column names are selected with grep:

cols = grep("Val", names(df), value = TRUE)

Normally, I would use data.table for this:

library(data.table)
df = as.data.table(df)
df[Key == "A", (cols) := 0]
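For comparison, the same conditional assignment can be sketched in base R with no packages. This is only an illustration of the operation, not the dplyr solution asked for; the name df_base is made up so that it does not clash with df above.

```r
# Rebuild the example data under a separate, illustrative name.
set.seed(4)
df_base <- data.frame(
  Key = c("A", "B", "A", "D", "A"),
  Val1 = rnorm(5),
  Val2 = runif(5),
  Val3 = 1:5
)
cols <- grep("Val", names(df_base), value = TRUE)

# Logical row index plus character column index: only the matching cells are overwritten.
df_base[df_base$Key == "A", cols] <- 0
```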

And the desired output is like this:

  Key      Val1       Val2 Val3
1   A  0.000000 0.00000000    0
2   B -1.383814 0.55925762    2
3   A  0.000000 0.00000000    0
4   D  1.437151 0.05632773    4
5   A  0.000000 0.00000000    0

However, this time I need to use dplyr, as I am working on a team project where everyone uses it. The data above is only illustrative; my real data has more than 5 million rows and 16 value columns to update. The only solution I could come up with uses mutate_at:

df %>% mutate_at(.vars = vars(cols), .funs = function(x) ifelse(df$Key == "A", 0, x))

However, this seems to be extremely slow on my real data. I was hoping to find a solution which is more elegant and, more importantly, faster.
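One direction that might be worth trying, offered as a sketch rather than a benchmarked answer: assuming dplyr 1.0.0 or later, across() can apply base R's replace() to each selected column, so that 0 is assigned only at the matching positions instead of going through ifelse(). The name res is illustrative, and whether this is actually faster on the real data is an assumption that would need measuring.

```r
library(dplyr)

set.seed(4)
df <- data.frame(
  Key = c("A", "B", "A", "D", "A"),
  Val1 = rnorm(5),
  Val2 = runif(5),
  Val3 = 1:5
)
cols <- grep("Val", names(df), value = TRUE)

# all_of() selects the columns named in the character vector `cols`;
# replace() overwrites only the positions where Key == "A".
res <- df %>%
  mutate(across(all_of(cols), ~ replace(.x, Key == "A", 0)))
```

Note that Val3 comes back as double rather than integer, because the replacement value 0 is a double; using 0L would not suit the other columns.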

I have tried many combinations using map, unquoting with !!, get, and := (which, annoyingly, can get masked by the := in data.table), but I think my understanding of how these work is simply not deep enough to construct a valid solution.

