Channel: Active questions tagged r - Stack Overflow

r - Calculate mean and sum values grouped by the first row


I have a data frame and would like to calculate the mean of all x values and the sum of all y values, grouped by the labels in the first row of the data frame.

The data frame to be summarised

The following link is the result I want. The expected result

Here are the data.

dt=structure(list(year = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("1980", 
    "1981", "1982", "1985", "group"), class = "factor"), x1 = structure(c(4L, 
    1L, 3L, 2L, 1L), .Label = c("1", "2", "4", "A"), class = "factor"), 
        y1 = structure(c(4L, 1L, 3L, 2L, 2L), .Label = c("1", "3", 
        "5", "A"), class = "factor"), x2 = structure(c(5L, 1L, 4L, 
        3L, 2L), .Label = c("2", "4", "5", "6", "A"), class = "factor"), 
        y2 = structure(c(4L, 1L, 3L, 3L, 2L), .Label = c("3", "5", 
        "7", "A"), class = "factor"), x3 = structure(c(4L, 1L, 3L, 
        2L, 1L), .Label = c("4", "6", "8", "B"), class = "factor"), 
        y3 = structure(c(4L, 1L, 3L, 2L, 1L), .Label = c("3", "5", 
        "6", "B"), class = "factor"), x4 = structure(c(4L, 1L, 3L, 
        2L, 3L), .Label = c("2", "4", "5", "C"), class = "factor"), 
        y4 = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("3", "4", 
        "5", "6", "C"), class = "factor"), x5 = structure(c(5L, 2L, 
        1L, 3L, 4L), .Label = c("3", "4", "6", "7", "C"), class = "factor"), 
        y5 = structure(c(4L, 2L, 1L, 3L, 2L), .Label = c("2", "5", 
        "8", "C"), class = "factor")), class = "data.frame", row.names = c(NA, 
    -5L))

And the expected result:

result_expected <- structure(list(year = c(1980L, 1981L, 1982L, 1985L), A_x_mean = c(1.5, 
5, 3.5, 2.5), A_y_sum = c(4L, 12L, 10L, 8L), B_x_mean = c(4L, 
8L, 6L, 4L), B_y_sum = c(3L, 6L, 5L, 3L), C_x_mean = 3:6, C_y_sum = c(8L, 
6L, 13L, 11L)), class = "data.frame", row.names = c(NA, -4L))

I have searched for keywords on Google and Stack Overflow but found no suitable answers. My current thinking is to find the unique groups A, B, C in the first row.

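library(tidyverse)
# Distinct group labels (A, B, C) taken from the first row, excluding the year column
group_variables <- dt %>% slice(1) %>% select(-year) %>% gather(key, value) %>% distinct(value) %>% arrange(value)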

then loop over the rows of group_variables with a for loop

for (i in group_variables$value) {......}
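The loop idea above could be sketched in base R roughly as follows. This is only a sketch under stated assumptions, not a definitive answer: the helper name `summarise_by_group_loop` is hypothetical, and it assumes the first column is called `year`, the first row holds the group labels, and measurement columns start with `x` or `y`.

```r
# Hypothetical helper (not from the original post): loop over the group
# labels found in the first row and build one mean/sum column pair per group.
summarise_by_group_loop <- function(dt) {
  # Convert factors to character first so factor codes are not read as values.
  chr <- as.data.frame(lapply(dt, as.character), stringsAsFactors = FALSE)

  # The first row maps each measurement column to its group label.
  measure_cols <- setdiff(names(chr), "year")
  labels <- unlist(chr[1, measure_cols])

  # The remaining rows hold the yearly values; make them numeric.
  values <- chr[-1, , drop = FALSE]
  values[measure_cols] <- lapply(values[measure_cols], as.numeric)

  out <- data.frame(year = as.integer(values$year))
  for (g in sort(unique(labels))) {
    cols <- measure_cols[labels == g]
    x_cols <- cols[startsWith(cols, "x")]
    y_cols <- cols[startsWith(cols, "y")]
    # Row-wise mean of the x columns and row-wise sum of the y columns.
    out[[paste0(g, "_x_mean")]] <- unname(rowMeans(values[x_cols], na.rm = TRUE))
    out[[paste0(g, "_y_sum")]] <- unname(rowSums(values[y_cols], na.rm = TRUE))
  }
  out
}
```

Calling `summarise_by_group_loop(dt)` on the data above should give the same values as `result_expected`, although the summary columns come back as doubles rather than integers.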

Or can I change the structure of the data frame with gather and spread from tidyr and then use dplyr, something like the following code?

dt_new %>%
  group_by(group) %>%
  summarise(mean = mean(x, na.rm = TRUE),
            sum = sum(y, na.rm = TRUE))
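One way the reshaping idea might look in practice is sketched below with `gather`, `unite`, and `spread` from tidyr plus dplyr. It is an illustration under assumptions rather than the definitive approach: the helper name `summarise_by_group_tidy` is hypothetical, the first column is assumed to be `year`, and each measurement column name is assumed to start with `x` or `y`.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not from the original post): reshape to long form,
# attach the group label of each column, summarise, and reshape back to wide.
summarise_by_group_tidy <- function(dt) {
  # Work on character data so factor levels do not get in the way.
  chr <- dt %>% mutate_all(as.character)

  # Lookup table: column name -> group label taken from the first row.
  groups <- chr %>%
    slice(1) %>%
    select(-year) %>%
    gather(key = "column", value = "group")

  chr %>%
    slice(-1) %>%                                   # drop the label row
    gather(key = "column", value = "value", -year) %>%
    left_join(groups, by = "column") %>%
    mutate(
      year = as.integer(year),
      value = as.numeric(value),
      variable = substr(column, 1, 1)               # "x" or "y"
    ) %>%
    group_by(year, group) %>%
    summarise(
      x_mean = mean(value[variable == "x"], na.rm = TRUE),
      y_sum = sum(value[variable == "y"], na.rm = TRUE)
    ) %>%
    ungroup() %>%
    gather(key = "stat", value = "result", x_mean, y_sum) %>%
    unite("name", group, stat) %>%                  # e.g. "A_x_mean"
    spread(name, result)
}
```

`summarise_by_group_tidy(dt)` returns one row per year with columns named like `A_x_mean` and `A_y_sum`, which can then be compared against `result_expected`.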

Any information is appreciated!

