Channel: Active questions tagged r - Stack Overflow

Checking the change in factor value for panel observations


I have a panel data set which looks as follows:

library(plm)
library(Hmisc)
library(data.table)
set.seed(1)
DT <- data.table(panelID = sample(50,50),                              # Creates a panel ID
                 Country = c(rep("Albania",30), rep("Belarus",50), rep("Chilipepper",20)),
                 some_NA = sample(0:5, 6),
                 some_NA_factor = sample(0:5, 6),
                 Group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20)),
                 Time = rep(seq(as.Date("2010-01-03"), length=20, by="1 month") - 1, 5),
                 norm = round(runif(100)/10, 2),
                 Income = round(rnorm(10,-5,5), 2),
                 Happiness = sample(10,10),
                 Sex = round(rnorm(10,0.75,0.3), 2),
                 Age = sample(100,100),
                 Educ = round(rnorm(10,0.75,0.3), 2))
DT[, uniqueID := .I]   # Creates a unique ID
DT[DT == 0] <- NA      # Replaces all 0 values with NA, see https://stackoverflow.com/questions/11036989/replace-all-0-values-to-na
DT$some_NA_factor <- factor(DT$some_NA_factor)
DTp <- plm::pdata.frame(DT, index = c("panelID", "Time"))

I want to evaluate, for each panel observation, whether some_NA_factor or, for example, Country changes from one time period to the next (a 1 for a change and a 0 for no change). I would like to write something like:

setDT(DT)[, difference := c(-1,1)*diff(some_NA_factor), by=panelID]

But I don't know how to write this for factors. As expected, applying it to the data.table gives:

Warning messages:
1: In Ops.factor(c(-1, 1), diff(weight)) : ‘*’ not meaningful for factors
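
For context, arithmetic operators such as `*` and `-` are not defined for factors, which is what the warning above reports. One possible way around this, sketched here under assumptions, is to compare each value with its lagged value instead of differencing. The table `toy` and its columns `id`, `time`, `state` and `changed` are hypothetical names for illustration, not part of the data above.

```r
library(data.table)

# Toy panel (hypothetical data, not the DT from the question):
# two units observed over three periods, with a factor column.
toy <- data.table(
  id    = c(1, 1, 1, 2, 2, 2),
  time  = c(1, 2, 3, 1, 2, 3),
  state = factor(c("a", "a", "b", "c", "c", "c"))
)

# Sort so that "previous row" means "previous period" within each unit.
setorder(toy, id, time)

# Compare each value with its lag as character strings. The first period of
# each unit has no predecessor, so its flag is NA rather than 0 or 1.
toy[, changed := as.integer(as.character(state) != shift(as.character(state))),
    by = id]

print(toy)
```

The flag has the same length as the group, so no recycling is needed. If the factor itself contains NA values, the comparison also yields NA for the affected periods, and whether those should count as a change is a separate decision.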

If I apply the same thing to the pdata.frame, I get:

setDT(DTp)[, difference := c(-1,1)*diff(some_NA_factor), by=panelID]
Error in alloc.col(x) : 
  Internal error: length of names (14) is not length of dt (13)

Additionally, when I apply this to my actual data, I get the following error:

Supplied 107438 items to be assigned to group 1 of size 2 in column 'difference'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

I am not sure why that happens, and I cannot seem to reproduce it in the example.
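
The message above says that the right-hand side of `:=` must have length 1 or match the group size. The cause in the actual data cannot be determined from what is shown here, but one general point may be relevant: `diff()` returns one element fewer than its input, so its result only fits a group after padding. The sketch below illustrates this on hypothetical data (`toy`, `id`, `value` are made-up names) and is not a diagnosis of the real data set.

```r
library(data.table)

x <- c(3, 5, 4, 4)          # hypothetical values for one panel unit

length(diff(x))             # one element fewer than x
padded <- c(NA, diff(x))    # pad the first period so the lengths match
stopifnot(length(padded) == length(x))

# Per group, the padded version can be assigned with := without recycling.
toy <- data.table(id = c(1, 1, 1, 2, 2), value = c(3, 5, 4, 7, 7))
toy[, difference := c(NA, diff(value)), by = id]

print(toy)
```

Padding with a leading NA marks the first period of each unit as having no previous value to compare against.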

Any ideas?

