I have a data.table like so:
library(data.table)
dt = data.table(id_1 = rep(1:3, 5),
                id_2 = sort(rep(c('A', 'B', 'C'), 5)),
                value_1 = rnorm(15, 1, 1),
                value_2 = rpois(15, 1))
I would like to create a function that groups the table by columns given in one parameter and applies an aggregation (let's say sum) to several other columns given in a second parameter. Finally, I'd like to pass the names of the new columns as a third parameter. My problem is that I don't know how to set the output column names from a character vector when I am not using assignment by reference (:=).
The following two approaches both produce exactly the result I want; I just don't like how they get there:
Approach 1: use assignment by reference to add the sums as new columns, then keep only one record per group (dropping the original columns).
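dt_aggregator_1 <- function(data,
                            group_cols = c('id_1', 'id_2'),
                            new_names = c('sum_value_1', 'sum_value_2'),
                            value_cols = c('value_1', 'value_2')){
  # copy() so that := below does not add columns to the caller's table
  data_out = copy(data)
  # add the group sums as new columns, repeated on every row of the group
  data_out[, (new_names) := lapply(.SD, sum), by = group_cols, .SDcols = value_cols]
  # collapse to one row per group; max() just picks the repeated value
  data_out[, lapply(.SD, max), by = group_cols, .SDcols = new_names]
}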
Approach 2: rename the columns after grouping. I assume this is the much better approach.
dt_aggregator_2 <- function(data,
                            group_cols = c('id_1', 'id_2'),
                            new_names = c('sum_value_1', 'sum_value_2'),
                            value_cols = c('value_1', 'value_2')){
  data_out = data[, lapply(.SD, sum), by = group_cols, .SDcols = value_cols]
  setnames(data_out, value_cols, new_names)
  data_out[]
}
My question is: in approach 2, can I somehow set the names while performing the grouping operation, so that it becomes one line of code instead of two? :)
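For reference, one candidate is to wrap the list returned in j in setNames(), so the names are attached before data.table builds the result. This is only a minimal sketch: dt_aggregator_3 is a hypothetical name, set.seed(1) is there purely for reproducibility, and it assumes new_names is given in the same order as value_cols. Whether it performs as well as approach 2 on large tables is not tested here, since the extra wrapper may keep data.table from applying its internal grouping optimizations.

```r
library(data.table)

set.seed(1)  # only so the example is reproducible
dt <- data.table(id_1 = rep(1:3, 5),
                 id_2 = sort(rep(c("A", "B", "C"), 5)),
                 value_1 = rnorm(15, 1, 1),
                 value_2 = rpois(15, 1))

# Hypothetical variant of approach 2: name the columns inside j.
# Assumes new_names lines up one-to-one with value_cols.
dt_aggregator_3 <- function(data,
                            group_cols = c("id_1", "id_2"),
                            new_names = c("sum_value_1", "sum_value_2"),
                            value_cols = c("value_1", "value_2")) {
  data[, setNames(lapply(.SD, sum), new_names),
       by = group_cols, .SDcols = value_cols]
}

res <- dt_aggregator_3(dt)
names(res)
```

The input table is left untouched, because nothing is assigned by reference.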