Channel: Active questions tagged r - Stack Overflow

Aggregate in R with date column but by the identifier column

I want to aggregate (i.e., summarize) my data by an ID variable. However, the date column contains only NAs afterwards, I think because it is stored as a "Date".

I would like to preserve the dates as they are.

Data (first 10 observations):

          TUCASEID AGE MALE BLACK YEAR DATASET INTERVIEW_DAY INTERVIEW_DATE
1   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
2   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
3   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
4   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
5   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
6   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
7   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
8   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
9   2.00301e+13  60    1     1 2003    2003             5      03Jan2003
10  2.00301e+13  41    0     0 2003    2003             6      04Jan2003

Then, I summarize it with aggregate:

timeuse_2003_mean <- aggregate(timeuse_2003[,c("AGE","MALE","BLACK","YEAR","DATASET","INTERVIEW_DAY","INTERVIEW_DATE")],
      by=list(timeuse_2003$TUCASEID),mean)

Here is the output:

  TUCASEID         AGE MALE BLACK YEAR DATASET INTERVIEW_DAY INTERVIEW_DATE
1   2.0030100e+13  60    1     1 2003    2003             5             NA
2   2.0030100e+13  41    0     0 2003    2003             6             NA
3   2.0030100e+13  26    0     0 2003    2003             6             NA
4   2.0030100e+13  36    0     1 2003    2003             4             NA
5   2.0030100e+13  51    1     0 2003    2003             4             NA
6   2.0030100e+13  32    0     0 2003    2003             4             NA
7   2.0030100e+13  44    0     0 2003    2003             1             NA
8   2.0030100e+13  21    0     0 2003    2003             2             NA
9   2.0030100e+13  33    0     0 2003    2003             6             NA
10  2.0030100e+13  39    0     1 2003    2003             4             NA

I get a warning message, probably because the date column was converted with as.Date, but I need to keep it in that format, and I also need the dates to be "summarized" by aggregate.

Thank you in advance.
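
One possible direction, sketched under assumptions rather than confirmed against the real data: the printed values such as 03Jan2003 suggest that INTERVIEW_DATE may actually be stored as text or a factor rather than as class "Date", in which case mean() returns NA with a warning. The toy data frame below is made up for illustration, and the format string "%d%b%Y" and an English locale for month abbreviations are assumptions. Once the column is a real Date, mean() has a Date method, and recent versions of aggregate keep the class of the per-group results.

```r
# Toy data modelled on the question; the values are invented for illustration.
timeuse_2003 <- data.frame(
  TUCASEID       = c(1, 1, 1, 2, 2),
  AGE            = c(60, 60, 60, 41, 41),
  INTERVIEW_DAY  = c(5, 5, 5, 6, 6),
  INTERVIEW_DATE = c("03Jan2003", "03Jan2003", "03Jan2003",
                     "04Jan2003", "04Jan2003"),
  stringsAsFactors = FALSE
)

# Assumption: the column holds text like "03Jan2003", so parse it first.
# as.character() also covers the case where the column is a factor.
# %b matches abbreviated month names in the current locale (English assumed).
timeuse_2003$INTERVIEW_DATE <- as.Date(
  as.character(timeuse_2003$INTERVIEW_DATE),
  format = "%d%b%Y"
)

# Same call shape as in the question: mean of every column per id.
# mean() of a Date vector is itself a Date.
cols <- c("AGE", "INTERVIEW_DAY", "INTERVIEW_DATE")
timeuse_2003_mean <- aggregate(
  timeuse_2003[, cols],
  by  = list(TUCASEID = timeuse_2003$TUCASEID),
  FUN = mean
)

print(timeuse_2003_mean)
print(class(timeuse_2003_mean$INTERVIEW_DATE))
```

If every row of an id shares the same interview date, as in the sample shown, the mean is just that date. If dates can differ within an id, a function such as min or max may be a more meaningful summary than mean.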

