Channel: Active questions tagged r - Stack Overflow

Add summarize variable in multiple statements using dplyr?

In dplyr, group_by has an add parameter; when it is TRUE, the new grouping variables are added to the existing groups instead of replacing them. For example:

library(dplyr)

data <- data.frame(a=c('a','b','c'), b=c(1,2,3), c=c(4,5,6))
data <- data %>% group_by(a, add=TRUE)
data <- data %>% group_by(b, add=TRUE)
data %>% summarize(sum_c = sum(c))

Output:

  a         b sum_c
1 a         1     4
2 b         2     5
3 c         3     6
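
For readers on dplyr 1.0.0 or later, where this argument is spelled .add and the older add spelling is deprecated, the same incremental grouping might look like the sketch below. The names grouped and result are illustrative and not part of the original example.

```r
library(dplyr)

data <- data.frame(a = c("a", "b", "c"), b = c(1, 2, 3), c = c(4, 5, 6))

# Add grouping variables one statement at a time; `.add = TRUE` keeps
# the groups that are already set instead of replacing them.
grouped <- data %>% group_by(a, .add = TRUE)
grouped <- grouped %>% group_by(b, .add = TRUE)

# Summarize over the accumulated groups (a, b) and drop the grouping afterwards.
result <- grouped %>% summarize(sum_c = sum(c), .groups = "drop")
```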

Is there an analogous way to add summary variables to a summarize statement one at a time? I have some complicated conditionals (with dbplyr) where, if x is TRUE, I want to add the variable x_v to the summary.

I found several related Stack Overflow questions, but none that covers this case.

EDIT: Here is a concrete example, simplified from the real code (which has more than two conditions).

summarize_num <- TRUE
summarize_num_distinct <- FALSE

data <- data.frame(val=c(1,2,2))

# One branch per combination of flags; n_distinct() needs the column to count.
if (summarize_num && summarize_num_distinct) {
  summ <- data %>% summarize(n=n(), n_unique=n_distinct(val))
} else if (summarize_num) {
  summ <- data %>% summarize(n=n())
} else if (summarize_num_distinct) {
  summ <- data %>% summarize(n_unique=n_distinct(val))
}

Depending on the conditions (summarize_num and summarize_num_distinct here), the resulting summary (summ here) has different columns.

As the number of conditions goes up, the number of branches grows exponentially (roughly 2^n branches for n conditions). However, the conditions are independent, so I'd like to add each summary variable independently as well.

I'm using dbplyr, so I have to do it in a way that it can get translated into SQL.
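
One possible way to express this, sketched under assumptions: collect the summary expressions in a named list, one independent if per condition, and splice the list into a single summarize call with rlang's !!! operator. The list name summaries is hypothetical. The sketch runs on a local data frame; because the expressions stay unevaluated until summarize sees them, the same pattern may also translate through dbplyr, but that has not been checked against any particular database backend here.

```r
library(dplyr)
library(rlang)

summarize_num <- TRUE
summarize_num_distinct <- FALSE

data <- data.frame(val = c(1, 2, 2))

# Hypothetical accumulator: a named list of unevaluated summary expressions.
summaries <- list()

# Each condition contributes its own column, independently of the others.
if (summarize_num) {
  summaries$n <- expr(n())
}
if (summarize_num_distinct) {
  summaries$n_unique <- expr(n_distinct(val))
}

# Splice the named expressions into one summarize() call.
summ <- data %>% summarize(!!!summaries)
```

With this shape, each new condition costs one extra if block rather than doubling the number of branches.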
