Channel: Active questions tagged r - Stack Overflow

Stata collapse (sum) in R


I'm trying to translate Stata's collapse (sum) varlist into R.

The problem seems to be with the (sum), because some of the values I get differ from the original Stata output. The only good news is that I get exactly the same number of observations as in Stata.

My attempt in R:

timeuse_2003_sum <- aggregate(timeuse_2003[,c("CHILD_CARE_BASIC","CHILD_CARE_TEACH","CHILD_CARE_PLAY",
                                              "CIVIC",
                                              "EATING","SLEEPING","PERSONAL_CARE","SELF_CARE",
                                              "OWN_MEDICAL_CARE","OTHER_CARE",
                                              "MEALS", "HOUSEWORK","HOME_CAR_MAINTENANCE","HOME_OTHER",
                                              "OBTAINING_GOODS","OBTAINING_SERVICES_ALT",
                                              "WORK_TRAVEL","WORK_RELATED","WORK_CORE",
                                              "WORK_UNEMP","WORK_ACTIVITIES","PET","GARDEN",
                                              "HOMEOWN_PRE","PEOPLE_TIME_AT_WORK",
                                              "EATING_TIME_AT_WORK",
                                              "EDUCATION",
                                              "EXERCISE_SPORTS","TV","SOCIALIZING","READING",
                                              "ENT_NOT_TV","HOBBIES",
                                              "GARDEN_PET","EMAIL","LEISURE_0",
                                              "OTHER")],
                              by=list(timeuse_2003$TUCASEID),sum_col)
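
For reference, here is a minimal sketch of the behaviour I am trying to reproduce, on invented toy data. As far as I understand, Stata's collapse (sum) skips missing values, so a group whose values are all missing gets a sum of 0. The helper sum_na_rm is a hypothetical stand-in for sum_col, whose definition is not shown above; the toy values are made up and only the column names come from my data.

```r
# Toy data: column names mirror the real data set, values are invented.
toy <- data.frame(
  TUCASEID = c(1, 1, 2, 2, 3),
  SLEEPING = c(600L, 270L, NA, 180L, NA),
  OTHER    = c(NA, NA, 5L, NA, NA)
)

# Hypothetical stand-in for sum_col: a sum that skips NA,
# so a group containing only NA values yields 0 rather than NA.
sum_na_rm <- function(x) sum(x, na.rm = TRUE)

# One row per case ID, one summed column per activity.
toy_sum <- aggregate(toy[, c("SLEEPING", "OTHER")],
                     by = list(TUCASEID = toy$TUCASEID),
                     FUN = sum_na_rm)
print(toy_sum)
```

Naming the grouping element (TUCASEID = ...) keeps the ID column from being called Group.1 in the result.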

My data:

> str(timeuse_2003)
'data.frame':   412611 obs. of  103 variables:
 $ TUCASEID              :Class 'labelled' num  2e+13 2e+13 2e+13 2e+13 2e+13 ...
   .. .. LABEL: ATUS Case ID (14-digit identifier) 
 $ TULINENO              :Class 'labelled' int  1 1 1 1 1 1 1 1 1 1 ...
   .. .. LABEL: ATUS person line number 
 $ GESTFIPS              :Class 'labelled' int  6 6 6 6 6 6 6 6 6 6 ...
   .. .. LABEL: Federal Processing Information Standards (FIPS) state code 
 $ GEREG                 :Class 'labelled' int  4 4 4 4 4 4 4 4 4 4 ...
   .. .. LABEL: Region 
   .. .. VALUE LABELS [1:7]: -3=Refused, -2=Don't Know, -1=Blank, 1=Northeast, 2=Midwest (formerly North Central), 3=South, 4=West 
 $ PEEDUCA               :Class 'labelled' int  44 44 44 44 44 44 44 44 44 40 ...
   .. .. LABEL: Edited: what is the highest level of school you have completed or the highest degree you have received? 
   .. .. VALUE LABELS [1:19]: -3=Refused, -2=Don't Know, -1=Blank, 31=Less than 1st grade, 32=1st, 2nd, 3rd, or 4th grade, 33=5th or 6th grade, 34=7th or 8th grade, 35=9th grade, 36=10th grade, 37=11th grade, 38=12th grade - no diploma, 39=High school graduate - diploma or equivalent (GED), 40=Some college but no degree, 41=Associate degree - occupational/vocational, 42=Associate degree - academic program, 43=Bachelor's degree (BA, AB, BS, etc.), 44=Master's degree (MA, MS, MEng, MEd, MSW, etc.), 45=Professional school degree (MD, DDS, DVM, etc.), 46=Doctoral degree (PhD, EdD, etc.) 
 $ PTDTRACE              :Class 'labelled' int  2 2 2 2 2 2 2 2 2 1 ...
   .. .. LABEL: Race (topcoded) 
   .. .. VALUE LABELS [1:24]: -3=Refused, -2=Don't Know, -1=Blank, 1=White only, 2=Black only, 3=American Indian, Alaskan Native only, 4=Asian only, 5=Hawaiian/Pacific Islander only, 6=White-Black, 7=White-American Indian ... 12=Black-Hawaiian, 13=American Indian-Asian, 14=Asian-Hawaiian, 15=White-Black-American Indian, 16=White-Black-Asian, 17=White-American Indian-Asian, 18=White-Asian-Hawaiian, 19=White-Black-American Indian-Asian, 20=2 or 3 races, 21=4 or 5 races 
 $ TESEX                 :Class 'labelled' int  1 1 1 1 1 1 1 1 1 2 ...
   .. .. LABEL: Edited: sex 
   .. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Male, 2=Female 
 $ TEAGE                 :Class 'labelled' int  60 60 60 60 60 60 60 60 60 41 ...
   .. .. LABEL: Edited: age 
 $ TUYEAR                :Class 'labelled' int  2003 2003 2003 2003 2003 2003 2003 2003 2003 2003 ...
   .. .. LABEL: Year of diary day (year of day about which respondent was interviewed) 
 $ TUMONTH               :Class 'labelled' int  1 1 1 1 1 1 1 1 1 1 ...
   .. .. LABEL: Month of diary day (month of day about which ATUS respondent was interviewed) 
 $ TUDIARYDATE           :Class 'labelled' int  20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030104 ...
   .. .. LABEL: Date of diary day (date about which the respondent was interviewed) 
 $ TUDIARYDAY            :Class 'labelled' int  6 6 6 6 6 6 6 6 6 7 ...
   .. .. LABEL: Day of the week of diary day (day of the week about which the respondent was interviewed) 
   .. .. VALUE LABELS [1:10]: -3=Refused, -2=Don't Know, -1=Blank, 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, 7=Saturday 
 $ TESPEMPNOT            :Class 'labelled' int  2 2 2 2 2 2 2 2 2 1 ...
   .. .. LABEL: Edited: employment status of spouse or unmarried partner 
   .. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Employed, 2=Not employed 
 $ TRYHHCHILD            :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 0 ...
   .. .. LABEL: Age of youngest household child < 18 
 $ TRSPPRES              :Class 'labelled' int  1 1 1 1 1 1 1 1 1 1 ...
   .. .. LABEL: Presence of the respondent's spouse or unmarried partner in the household 
   .. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Spouse present, 2=Unmarried partner present, 3=No spouse or unmarried partner present 
 $ TUABSOT               :Class 'labelled' int  1 1 1 1 1 1 1 1 1 -1 ...
   .. .. LABEL: In the last seven days, did you have a job either full or part time? 
   .. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work 
 $ TUDIS                 :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: Last time we spoke to someone in this household you were reported to have a disability. Does your disability prevent you from do 
   .. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Did not have a disability last time 
 $ TULAY                 :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: During the last seven days were you on layoff from your job? 
   .. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work 
 $ TUFWK                 :Class 'labelled' int  2 2 2 2 2 2 2 2 2 1 ...
   .. .. LABEL: In the last seven days did you do any work for pay or profit? 
   .. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work 
 $ TURETOT               :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: The last time we spoke to someone in this household you were reported to be retired. Are you still retired? 
   .. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Was not retired last time 
 $ TULK                  :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: Have you been doing anything to find work during the last four weeks? 
   .. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work 
 $ TEHRUSLT              :Class 'labelled' int  30 30 30 30 30 30 30 30 30 30 ...
   .. .. LABEL: Edited: total hours usually worked per week (sum of TEHRUSL1 and TEHRUSL2) 
 $ TESCHENR              :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 2 ...
   .. .. LABEL: Edited: are you enrolled in high school, college, or university? 
   .. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No 
 $ TESCHFT               :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: Edited: are you enrolled as a full-time or part-time student? 
   .. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Full time, 2=Part time 
 $ TELFS                 :Class 'labelled' int  2 2 2 2 2 2 2 2 2 1 ...
   .. .. LABEL: Edited: labor force status 
   .. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Employed - at work, 2=Employed - absent, 3=Unemployed - on layoff, 4=Unemployed - looking, 5=Not in labor force 
 $ TUFINLWGT             :Class 'labelled' num  3958080 3958080 3958080 3958080 3958080 ...
   .. .. LABEL: ATUS final weight 
 $ TU06FWGT              :Class 'labelled' num  8155463 8155463 8155463 8155463 8155463 ...
   .. .. LABEL: ATUS final weight based on 2003 weighting methodology 
 $ TRERNWA               :Class 'labelled' int  66000 66000 66000 66000 66000 66000 66000 66000 66000 20000 ...
   .. .. LABEL: Weekly earnings (2 implied decimals) 
 $ TRERNHLY              :Class 'labelled' num  2200 2200 2200 2200 2200 2200 2200 2200 2200 -1 ...
   .. .. LABEL: Hourly earnings (2 implied decimals) 
 $ TESPUHRS              :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 50 ...
   .. .. LABEL: Edited: usual hours of work of spouse or unmarried partner 
 $ TRDPFTPT              :Class 'labelled' int  2 2 2 2 2 2 2 2 2 2 ...
   .. .. LABEL: Full time or part time employment status of respondent 
   .. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Full time, 2=Part time 
 $ TERET1                :Class 'labelled' int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
   .. .. LABEL: Edited: do you currently want a job, either full or part time? 
   .. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes or maybe/it depends, 2=No, 3=Has a job 
 $ TRCHILDNUM            :Class 'labelled' int  0 0 0 0 0 0 0 0 0 2 ...
   .. .. LABEL: Number of household children < 18 
 $ TUACTDUR24            :Class 'labelled' int  60 30 600 150 5 175 270 10 140 180 ...
   .. .. LABEL: Duration of activity in minutes (last activity truncated at 4:00 a.m.) 
 $ TUTIER1CODE           :Class 'labelled' int  13 1 1 12 11 12 1 1 13 1 ...
   .. .. LABEL: Lexicon Tier 1: 1st and 2nd digits of 6-digit activity code 
 $ TUTIER2CODE           :Class 'labelled' int  1 2 1 3 1 3 1 2 1 1 ...
   .. .. LABEL: Lexicon Tier 2: 3rd and 4th digits of 6-digit activity code 
 $ TUTIER3CODE           :Class 'labelled' int  24 1 1 3 1 3 1 1 24 1 ...
   .. .. LABEL: Lexicon Tier 3: 5th and 6th digits of 6-digit activity code 
 $ TUACTDUR              :Class 'labelled' int  60 30 600 150 5 175 270 10 170 180 ...
   .. .. LABEL: Duration of activity in minutes (last activity not truncated at 4:00 a.m.) 
 $ DAY_DATE              :Class 'labelled' int  3 3 3 3 3 3 3 3 3 4 ...
   .. .. LABEL: Date of diary day (date about which the respondent was interviewed) 
 $ INTERVIEW_DATE        : Date, format: "2003-01-03" "2003-01-03" "2003-01-03" "2003-01-03" ...
 $ AGE                   :Class 'labelled' int  60 60 60 60 60 60 60 60 60 41 ...
   .. .. LABEL: Edited: age 
 $ MALE                  : num  1 1 1 1 1 1 1 1 1 0 ...
 $ BLACK                 : num  1 1 1 1 1 1 1 1 1 0 ...
 $ MARRIED               : num  1 1 1 1 1 1 1 1 1 1 ...
 $ NUM_CHILD             :Class 'labelled' num  0 0 0 0 0 0 0 0 0 2 ...
   .. .. LABEL: Number of household children < 18 
 $ HV_CHILD              : num  0 0 0 0 0 0 0 0 0 1 ...
 $ AGE_YOUNGEST          :Class 'labelled' int  NA NA NA NA NA NA NA NA NA 0 ...
   .. .. LABEL: Age of youngest household child < 18 
 $ CHILD_4               : num  0 0 0 0 0 0 0 0 0 1 ...
 $ CHILD_5               : num  0 0 0 0 0 0 0 0 0 1 ...
 $ SPOUSE_EMP            : num  0 0 0 0 0 0 0 0 0 1 ...
 $ SPOUSE_WORKHOURS      :Class 'labelled' int  NA NA NA NA NA NA NA NA NA 50 ...
   .. .. LABEL: Edited: usual hours of work of spouse or unmarried partner 
 $ GRADE                 : num  17 17 17 17 17 17 17 17 17 13 ...
 $ WORKING               : num  1 1 1 1 1 1 1 1 1 1 ...
 $ UNEMP                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ DISABLED              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ STUDENT_BROAD         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ RETIRED               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ STUDENT               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ HOMEMAKER             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ WORK_PART             : num  1 1 1 1 1 1 1 1 1 1 ...
 $ HH_INCOME_03          :Class 'labelled' num  660 660 660 660 660 660 660 660 660 200 ...
   .. .. LABEL: Weekly earnings 
 $ WAGE_03               : num  22 22 22 22 22 ...
 $ WAGE_03_ALT           :Class 'labelled' num  22 22 22 22 22 22 22 22 22 NA ...
   .. .. LABEL: Hourly earnings (2 implied decimals) 
 $ YEAR                  : num  2003 2003 2003 2003 2003 ...
 $ DATASET               : num  2003 2003 2003 2003 2003 ...
 $ INTERVIEW_DAY         :Class 'labelled' num  5 5 5 5 5 5 5 5 5 6 ...
   .. .. LABEL: Day of the week of diary day (day of the week about which the respondent was interviewed) 
   .. .. VALUE LABELS [1:10]: -3=Refused, -2=Don't Know, -1=Blank, 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, 7=Saturday 
 $ CHILD_CARE_BASIC      : int  NA NA NA NA NA NA NA NA NA NA ...
 $ CHILD_CARE_TEACH      : int  NA NA NA NA NA NA NA NA NA NA ...
 $ CHILD_CARE_PLAY       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ EATING                : int  NA NA NA NA 5 NA NA NA NA NA ...
 $ SLEEPING              : int  NA NA 600 NA NA NA 270 NA NA 180 ...
 $ PERSONAL_CARE         : int  NA 30 NA NA NA NA NA 10 NA NA ...
 $ SELF_CARE             : int  NA NA NA NA NA NA NA NA NA NA ...
 $ OWN_MEDICAL_CARE      : int  NA NA NA NA NA NA NA NA NA NA ...
 $ OTHER_CARE            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ MEALS                 : int  NA NA NA NA NA NA NA NA NA NA ...
 $ HOUSEWORK             : int  NA NA NA NA NA NA NA NA NA NA ...
 $ HOME_CAR_MAINTENANCE  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ HOMEOWN_PRE           : int  NA NA NA NA NA NA NA NA NA NA ...
 $ HOME_OTHER            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ GARDEN_PET            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ GARDEN                : int  NA NA NA NA NA NA NA NA NA NA ...
 $ PET                   : int  NA NA NA NA NA NA NA NA NA NA ...
 $ OBTAINING_GOODS       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ OBTAINING_SERVICES_ALT: int  NA NA NA NA NA NA NA NA NA NA ...
 $ WORK_TRAVEL           : int  NA NA NA NA NA NA NA NA NA NA ...
 $ WORK_RELATED          : int  NA NA NA NA NA NA NA NA NA NA ...
 $ WORK_CORE             : int  NA NA NA NA NA NA NA NA NA NA ...
 $ WORK_UNEMP            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ PEOPLE_TIME_AT_WORK   : int  NA NA NA NA NA NA NA NA NA NA ...
 $ EATING_TIME_AT_WORK   : int  NA NA NA NA NA NA NA NA NA NA ...
 $ WORK_ACTIVITIES       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ EDUCATION             : int  NA NA NA NA NA NA NA NA NA NA ...
 $ CIVIC                 : int  NA NA NA NA NA NA NA NA NA NA ...
 $ EXERCISE_SPORTS       : int  60 NA NA NA NA NA NA NA 140 NA ...
 $ TV                    : int  NA NA NA 150 NA 175 NA NA NA NA ...
 $ EMAIL                 : int  NA NA NA NA NA NA NA NA NA NA ...
 $ SOCIALIZING           : int  NA NA NA NA NA NA NA NA NA NA ...
 $ READING               : int  NA NA NA NA NA NA NA NA NA NA ...
  [list output truncated]

For example, the last value in OTHER is 42, but it should be 0. The same thing happens with many other variables.
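
One way to narrow down a mismatch like this is to pull the raw rows for a single case ID and sum one column by hand, then compare that with the aggregated value and with Stata. The sketch below does this on invented toy data; inspect_case is a hypothetical helper and not part of my original code, and on the real data it would take timeuse_2003 instead of toy.

```r
# Invented toy rows standing in for timeuse_2003.
toy <- data.frame(
  TUCASEID = c(1, 1, 2, 2),
  OTHER    = c(NA, NA, 40L, 2L)
)

# Aggregated sums per case ID, skipping NA values.
toy_sum <- aggregate(toy["OTHER"],
                     by = list(TUCASEID = toy$TUCASEID),
                     FUN = function(x) sum(x, na.rm = TRUE))

# Hypothetical helper: return the raw rows for one case ID
# together with the sum of one column computed by hand.
inspect_case <- function(data, id, column) {
  rows <- data[data$TUCASEID == id, c("TUCASEID", column)]
  list(rows = rows, by_hand = sum(rows[[column]], na.rm = TRUE))
}

case_1 <- inspect_case(toy, 1, "OTHER")
case_2 <- inspect_case(toy, 2, "OTHER")
print(case_2$rows)     # the raw rows that feed the aggregated value
print(case_2$by_hand)  # compare with toy_sum and with the Stata output
```

If the by-hand sum agrees with the aggregated value but not with Stata, the difference is more likely in the input rows than in the summing step.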

Thank you in advance.

