I'm trying to translate the Stata command collapse (sum) varlist into R.
The problem seems to be with the (sum) part, because some of my values differ from the original Stata output. The only good news is that I get exactly the same number of observations as in Stata.
My attempt in R:
# sum_col is not a base R function; its definition is not shown here
timeuse_2003_sum <- aggregate(
  timeuse_2003[, c("CHILD_CARE_BASIC", "CHILD_CARE_TEACH", "CHILD_CARE_PLAY",
                   "CIVIC",
                   "EATING", "SLEEPING", "PERSONAL_CARE", "SELF_CARE",
                   "OWN_MEDICAL_CARE", "OTHER_CARE",
                   "MEALS", "HOUSEWORK", "HOME_CAR_MAINTENANCE", "HOME_OTHER",
                   "OBTAINING_GOODS", "OBTAINING_SERVICES_ALT",
                   "WORK_TRAVEL", "WORK_RELATED", "WORK_CORE",
                   "WORK_UNEMP", "WORK_ACTIVITIES", "PET", "GARDEN",
                   "HOMEOWN_PRE", "PEOPLE_TIME_AT_WORK",
                   "EATING_TIME_AT_WORK",
                   "EDUCATION",
                   "EXERCISE_SPORTS", "TV", "SOCIALIZING", "READING",
                   "ENT_NOT_TV", "HOBBIES",
                   "GARDEN_PET", "EMAIL", "LEISURE_0",
                   "OTHER")],
  by = list(timeuse_2003$TUCASEID),
  FUN = sum_col
)
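One thing that may be worth checking is how missing values are handled. As far as I know, Stata's collapse (sum) treats missing values as zero, so a group whose values are all missing ends up with 0, whereas R's sum() returns NA unless na.rm = TRUE is passed. The sketch below uses a small made-up data frame in place of timeuse_2003, and sum_na_zero is a hypothetical helper name, not the sum_col used in the call above.

```r
# Toy data standing in for timeuse_2003 (values are invented for illustration)
toy <- data.frame(
  TUCASEID = c(1, 1, 2, 2),
  TV       = c(150L, 175L, NA, NA),
  OTHER    = c(NA, NA, NA, NA_integer_)
)

# Hypothetical helper: sum that ignores NAs, so an all-NA group gives 0,
# which is the behaviour assumed here for Stata's collapse (sum)
sum_na_zero <- function(x) sum(x, na.rm = TRUE)

toy_sum <- aggregate(
  toy[, c("TV", "OTHER")],
  by  = list(TUCASEID = toy$TUCASEID),
  FUN = sum_na_zero
)
print(toy_sum)
```

If the real sum_col does something different with NAs (or with the 'labelled' class), that could be one source of the mismatch, but that is only a guess without seeing its definition.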
My data:
> str(timeuse_2003)
'data.frame': 412611 obs. of 103 variables:
$ TUCASEID :Class 'labelled' num 2e+13 2e+13 2e+13 2e+13 2e+13 ...
.. .. LABEL: ATUS Case ID (14-digit identifier)
$ TULINENO :Class 'labelled' int 1 1 1 1 1 1 1 1 1 1 ...
.. .. LABEL: ATUS person line number
$ GESTFIPS :Class 'labelled' int 6 6 6 6 6 6 6 6 6 6 ...
.. .. LABEL: Federal Processing Information Standards (FIPS) state code
$ GEREG :Class 'labelled' int 4 4 4 4 4 4 4 4 4 4 ...
.. .. LABEL: Region
.. .. VALUE LABELS [1:7]: -3=Refused, -2=Don't Know, -1=Blank, 1=Northeast, 2=Midwest (formerly North Central), 3=South, 4=West
$ PEEDUCA :Class 'labelled' int 44 44 44 44 44 44 44 44 44 40 ...
.. .. LABEL: Edited: what is the highest level of school you have completed or the highest degree you have received?
.. .. VALUE LABELS [1:19]: -3=Refused, -2=Don't Know, -1=Blank, 31=Less than 1st grade, 32=1st, 2nd, 3rd, or 4th grade, 33=5th or 6th grade, 34=7th or 8th grade, 35=9th grade, 36=10th grade, 37=11th grade, 38=12th grade - no diploma, 39=High school graduate - diploma or equivalent (GED), 40=Some college but no degree, 41=Associate degree - occupational/vocational, 42=Associate degree - academic program, 43=Bachelor's degree (BA, AB, BS, etc.), 44=Master's degree (MA, MS, MEng, MEd, MSW, etc.), 45=Professional school degree (MD, DDS, DVM, etc.), 46=Doctoral degree (PhD, EdD, etc.)
$ PTDTRACE :Class 'labelled' int 2 2 2 2 2 2 2 2 2 1 ...
.. .. LABEL: Race (topcoded)
.. .. VALUE LABELS [1:24]: -3=Refused, -2=Don't Know, -1=Blank, 1=White only, 2=Black only, 3=American Indian, Alaskan Native only, 4=Asian only, 5=Hawaiian/Pacific Islander only, 6=White-Black, 7=White-American Indian ... 12=Black-Hawaiian, 13=American Indian-Asian, 14=Asian-Hawaiian, 15=White-Black-American Indian, 16=White-Black-Asian, 17=White-American Indian-Asian, 18=White-Asian-Hawaiian, 19=White-Black-American Indian-Asian, 20=2 or 3 races, 21=4 or 5 races
$ TESEX :Class 'labelled' int 1 1 1 1 1 1 1 1 1 2 ...
.. .. LABEL: Edited: sex
.. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Male, 2=Female
$ TEAGE :Class 'labelled' int 60 60 60 60 60 60 60 60 60 41 ...
.. .. LABEL: Edited: age
$ TUYEAR :Class 'labelled' int 2003 2003 2003 2003 2003 2003 2003 2003 2003 2003 ...
.. .. LABEL: Year of diary day (year of day about which respondent was interviewed)
$ TUMONTH :Class 'labelled' int 1 1 1 1 1 1 1 1 1 1 ...
.. .. LABEL: Month of diary day (month of day about which ATUS respondent was interviewed)
$ TUDIARYDATE :Class 'labelled' int 20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030103 20030104 ...
.. .. LABEL: Date of diary day (date about which the respondent was interviewed)
$ TUDIARYDAY :Class 'labelled' int 6 6 6 6 6 6 6 6 6 7 ...
.. .. LABEL: Day of the week of diary day (day of the week about which the respondent was interviewed)
.. .. VALUE LABELS [1:10]: -3=Refused, -2=Don't Know, -1=Blank, 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, 7=Saturday
$ TESPEMPNOT :Class 'labelled' int 2 2 2 2 2 2 2 2 2 1 ...
.. .. LABEL: Edited: employment status of spouse or unmarried partner
.. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Employed, 2=Not employed
$ TRYHHCHILD :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 0 ...
.. .. LABEL: Age of youngest household child < 18
$ TRSPPRES :Class 'labelled' int 1 1 1 1 1 1 1 1 1 1 ...
.. .. LABEL: Presence of the respondent's spouse or unmarried partner in the household
.. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Spouse present, 2=Unmarried partner present, 3=No spouse or unmarried partner present
$ TUABSOT :Class 'labelled' int 1 1 1 1 1 1 1 1 1 -1 ...
.. .. LABEL: In the last seven days, did you have a job either full or part time?
.. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work
$ TUDIS :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: Last time we spoke to someone in this household you were reported to have a disability. Does your disability prevent you from do
.. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Did not have a disability last time
$ TULAY :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: During the last seven days were you on layoff from your job?
.. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work
$ TUFWK :Class 'labelled' int 2 2 2 2 2 2 2 2 2 1 ...
.. .. LABEL: In the last seven days did you do any work for pay or profit?
.. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work
$ TURETOT :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: The last time we spoke to someone in this household you were reported to be retired. Are you still retired?
.. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Was not retired last time
$ TULK :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: Have you been doing anything to find work during the last four weeks?
.. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No, 3=Retired, 4=Disabled, 5=Unable to work
$ TEHRUSLT :Class 'labelled' int 30 30 30 30 30 30 30 30 30 30 ...
.. .. LABEL: Edited: total hours usually worked per week (sum of TEHRUSL1 and TEHRUSL2)
$ TESCHENR :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 2 ...
.. .. LABEL: Edited: are you enrolled in high school, college, or university?
.. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes, 2=No
$ TESCHFT :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: Edited: are you enrolled as a full-time or part-time student?
.. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Full time, 2=Part time
$ TELFS :Class 'labelled' int 2 2 2 2 2 2 2 2 2 1 ...
.. .. LABEL: Edited: labor force status
.. .. VALUE LABELS [1:8]: -3=Refused, -2=Don't Know, -1=Blank, 1=Employed - at work, 2=Employed - absent, 3=Unemployed - on layoff, 4=Unemployed - looking, 5=Not in labor force
$ TUFINLWGT :Class 'labelled' num 3958080 3958080 3958080 3958080 3958080 ...
.. .. LABEL: ATUS final weight
$ TU06FWGT :Class 'labelled' num 8155463 8155463 8155463 8155463 8155463 ...
.. .. LABEL: ATUS final weight based on 2003 weighting methodology
$ TRERNWA :Class 'labelled' int 66000 66000 66000 66000 66000 66000 66000 66000 66000 20000 ...
.. .. LABEL: Weekly earnings (2 implied decimals)
$ TRERNHLY :Class 'labelled' num 2200 2200 2200 2200 2200 2200 2200 2200 2200 -1 ...
.. .. LABEL: Hourly earnings (2 implied decimals)
$ TESPUHRS :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 50 ...
.. .. LABEL: Edited: usual hours of work of spouse or unmarried partner
$ TRDPFTPT :Class 'labelled' int 2 2 2 2 2 2 2 2 2 2 ...
.. .. LABEL: Full time or part time employment status of respondent
.. .. VALUE LABELS [1:5]: -3=Refused, -2=Don't Know, -1=Blank, 1=Full time, 2=Part time
$ TERET1 :Class 'labelled' int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
.. .. LABEL: Edited: do you currently want a job, either full or part time?
.. .. VALUE LABELS [1:6]: -3=Refused, -2=Don't Know, -1=Blank, 1=Yes or maybe/it depends, 2=No, 3=Has a job
$ TRCHILDNUM :Class 'labelled' int 0 0 0 0 0 0 0 0 0 2 ...
.. .. LABEL: Number of household children < 18
$ TUACTDUR24 :Class 'labelled' int 60 30 600 150 5 175 270 10 140 180 ...
.. .. LABEL: Duration of activity in minutes (last activity truncated at 4:00 a.m.)
$ TUTIER1CODE :Class 'labelled' int 13 1 1 12 11 12 1 1 13 1 ...
.. .. LABEL: Lexicon Tier 1: 1st and 2nd digits of 6-digit activity code
$ TUTIER2CODE :Class 'labelled' int 1 2 1 3 1 3 1 2 1 1 ...
.. .. LABEL: Lexicon Tier 2: 3rd and 4th digits of 6-digit activity code
$ TUTIER3CODE :Class 'labelled' int 24 1 1 3 1 3 1 1 24 1 ...
.. .. LABEL: Lexicon Tier 3: 5th and 6th digits of 6-digit activity code
$ TUACTDUR :Class 'labelled' int 60 30 600 150 5 175 270 10 170 180 ...
.. .. LABEL: Duration of activity in minutes (last activity not truncated at 4:00 a.m.)
$ DAY_DATE :Class 'labelled' int 3 3 3 3 3 3 3 3 3 4 ...
.. .. LABEL: Date of diary day (date about which the respondent was interviewed)
$ INTERVIEW_DATE : Date, format: "2003-01-03" "2003-01-03" "2003-01-03" "2003-01-03" ...
$ AGE :Class 'labelled' int 60 60 60 60 60 60 60 60 60 41 ...
.. .. LABEL: Edited: age
$ MALE : num 1 1 1 1 1 1 1 1 1 0 ...
$ BLACK : num 1 1 1 1 1 1 1 1 1 0 ...
$ MARRIED : num 1 1 1 1 1 1 1 1 1 1 ...
$ NUM_CHILD :Class 'labelled' num 0 0 0 0 0 0 0 0 0 2 ...
.. .. LABEL: Number of household children < 18
$ HV_CHILD : num 0 0 0 0 0 0 0 0 0 1 ...
$ AGE_YOUNGEST :Class 'labelled' int NA NA NA NA NA NA NA NA NA 0 ...
.. .. LABEL: Age of youngest household child < 18
$ CHILD_4 : num 0 0 0 0 0 0 0 0 0 1 ...
$ CHILD_5 : num 0 0 0 0 0 0 0 0 0 1 ...
$ SPOUSE_EMP : num 0 0 0 0 0 0 0 0 0 1 ...
$ SPOUSE_WORKHOURS :Class 'labelled' int NA NA NA NA NA NA NA NA NA 50 ...
.. .. LABEL: Edited: usual hours of work of spouse or unmarried partner
$ GRADE : num 17 17 17 17 17 17 17 17 17 13 ...
$ WORKING : num 1 1 1 1 1 1 1 1 1 1 ...
$ UNEMP : num 0 0 0 0 0 0 0 0 0 0 ...
$ DISABLED : num 0 0 0 0 0 0 0 0 0 0 ...
$ STUDENT_BROAD : num 0 0 0 0 0 0 0 0 0 0 ...
$ RETIRED : num 0 0 0 0 0 0 0 0 0 0 ...
$ STUDENT : num 0 0 0 0 0 0 0 0 0 0 ...
$ HOMEMAKER : num 0 0 0 0 0 0 0 0 0 0 ...
$ WORK_PART : num 1 1 1 1 1 1 1 1 1 1 ...
$ HH_INCOME_03 :Class 'labelled' num 660 660 660 660 660 660 660 660 660 200 ...
.. .. LABEL: Weekly earnings
$ WAGE_03 : num 22 22 22 22 22 ...
$ WAGE_03_ALT :Class 'labelled' num 22 22 22 22 22 22 22 22 22 NA ...
.. .. LABEL: Hourly earnings (2 implied decimals)
$ YEAR : num 2003 2003 2003 2003 2003 ...
$ DATASET : num 2003 2003 2003 2003 2003 ...
$ INTERVIEW_DAY :Class 'labelled' num 5 5 5 5 5 5 5 5 5 6 ...
.. .. LABEL: Day of the week of diary day (day of the week about which the respondent was interviewed)
.. .. VALUE LABELS [1:10]: -3=Refused, -2=Don't Know, -1=Blank, 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, 7=Saturday
$ CHILD_CARE_BASIC : int NA NA NA NA NA NA NA NA NA NA ...
$ CHILD_CARE_TEACH : int NA NA NA NA NA NA NA NA NA NA ...
$ CHILD_CARE_PLAY : int NA NA NA NA NA NA NA NA NA NA ...
$ EATING : int NA NA NA NA 5 NA NA NA NA NA ...
$ SLEEPING : int NA NA 600 NA NA NA 270 NA NA 180 ...
$ PERSONAL_CARE : int NA 30 NA NA NA NA NA 10 NA NA ...
$ SELF_CARE : int NA NA NA NA NA NA NA NA NA NA ...
$ OWN_MEDICAL_CARE : int NA NA NA NA NA NA NA NA NA NA ...
$ OTHER_CARE : int NA NA NA NA NA NA NA NA NA NA ...
$ MEALS : int NA NA NA NA NA NA NA NA NA NA ...
$ HOUSEWORK : int NA NA NA NA NA NA NA NA NA NA ...
$ HOME_CAR_MAINTENANCE : int NA NA NA NA NA NA NA NA NA NA ...
$ HOMEOWN_PRE : int NA NA NA NA NA NA NA NA NA NA ...
$ HOME_OTHER : int NA NA NA NA NA NA NA NA NA NA ...
$ GARDEN_PET : int NA NA NA NA NA NA NA NA NA NA ...
$ GARDEN : int NA NA NA NA NA NA NA NA NA NA ...
$ PET : int NA NA NA NA NA NA NA NA NA NA ...
$ OBTAINING_GOODS : int NA NA NA NA NA NA NA NA NA NA ...
$ OBTAINING_SERVICES_ALT: int NA NA NA NA NA NA NA NA NA NA ...
$ WORK_TRAVEL : int NA NA NA NA NA NA NA NA NA NA ...
$ WORK_RELATED : int NA NA NA NA NA NA NA NA NA NA ...
$ WORK_CORE : int NA NA NA NA NA NA NA NA NA NA ...
$ WORK_UNEMP : int NA NA NA NA NA NA NA NA NA NA ...
$ PEOPLE_TIME_AT_WORK : int NA NA NA NA NA NA NA NA NA NA ...
$ EATING_TIME_AT_WORK : int NA NA NA NA NA NA NA NA NA NA ...
$ WORK_ACTIVITIES : int NA NA NA NA NA NA NA NA NA NA ...
$ EDUCATION : int NA NA NA NA NA NA NA NA NA NA ...
$ CIVIC : int NA NA NA NA NA NA NA NA NA NA ...
$ EXERCISE_SPORTS : int 60 NA NA NA NA NA NA NA 140 NA ...
$ TV : int NA NA NA 150 NA 175 NA NA NA NA ...
$ EMAIL : int NA NA NA NA NA NA NA NA NA NA ...
$ SOCIALIZING : int NA NA NA NA NA NA NA NA NA NA ...
$ READING : int NA NA NA NA NA NA NA NA NA NA ...
[list output truncated]
The last value in OTHER, for example (the 42), should be 0, and the same thing happens with many other variables.
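To see where an unexpected total such as the 42 comes from, it may help to list the raw rows that contribute to it for a single case. This is only a sketch on invented data: toy stands in for timeuse_2003, and case_id is a hypothetical placeholder for the TUCASEID whose OTHER total looks wrong.

```r
# Toy stand-in for timeuse_2003; the values are invented
toy <- data.frame(
  TUCASEID   = c(1, 1, 1, 2),
  TUACTDUR24 = c(30L, 12L, 600L, 45L),
  OTHER      = c(30L, 12L, NA, NA)
)

# Hypothetical: the case whose OTHER total looks wrong
case_id <- 1

# Keep only the rows of that case where OTHER is not missing
contrib <- toy[toy$TUCASEID == case_id & !is.na(toy$OTHER),
               c("TUCASEID", "TUACTDUR24", "OTHER")]
print(contrib)

# Total contributed by those rows
total_other <- sum(contrib$OTHER)
print(total_other)
```

If rows show up here that Stata does not count toward OTHER, the difference would lie in how the OTHER column was built rather than in the aggregation step.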
Thank you in advance.