Channel: Active questions tagged r - Stack Overflow

apply hpfilter to grouped variables with NAs using dplyr

I am trying to apply hpfilter (from the mFilter package) to one of the variables in my dataset, which has a panel structure (id + year), and then add the filtered series to the dataset. This works fine as long as the variable has no NAs, but it yields an error as soon as one of the ids has missing values. The reason is that hpfilter does not handle NAs: given a series that contains NAs, it returns only NAs.

Here's a reproducible example:

df1  <- read.table(text="country   year   X1  X2    W
                   A         1990   10  20    40
                   A         1991   12  15    NA
                   A         1992   14  17    41
                   A         1993   17  NA    44
                   B         1990   20  NA    45
                   B         1991   NA  13    61
                   B         1992   12  12    67
                   B         1993   14  10    68
                   C         1990   10  20    70
                   C         1991   11  14    50
                   C         1992   12  15    NA
                   C         1993   14  16    NA
                   D         1990   20  17    80
                   D         1991   16  20    91
                   D         1992   15  21    70 
                   D         1993   14  22    69
                   ", header=TRUE, stringsAsFactors=FALSE)

My approach was to use dplyr's group_by function to apply hpfilter by country to the variable X1:

library(dplyr)    # provides %>%, group_by() and mutate()
library(mFilter)
library(plm)

# Organizing the Data as a Panel
df1 <- pdata.frame(df1, index = c("country","year"))

# Apply hpfilter to X1 and add trend to the sample 
df1 <- df1 %>% group_by(country) %>% mutate(X1_trend = mFilter::hpfilter(na.exclude(X1), type = "lambda", freq = 6.25)$trend)

However, this yields the following error:

Error in `[[<-.data.frame`(`*tmp*`, col, value = c(11.1695436493374, 12.7688604220353,  : 
  replacement has 15 rows, data has 16

The error occurs because na.exclude() drops the NAs before the series is passed to hpfilter, so the filtered series is shorter than the data it is assigned back to.
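To illustrate the length mismatch in isolation, here is a minimal base R sketch. The vector `x` is made up for the illustration (it mirrors X1 for country B) and is not taken from the real pipeline.

```r
# Made-up series with one missing value (same values as X1 for country B)
x <- c(20, NA, 12, 14)

# na.exclude() drops the NA, so the result is one element shorter
x_clean <- na.exclude(x)

length(x)        # 4
length(x_clean)  # 3
```

Anything computed from `x_clean` has three elements, so it cannot be assigned back to a four-row group without first being realigned to the original positions.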

Since I have a large dataset with many countries, I am looking for a workaround that skips the NAs when passing the series to hpfilter but keeps the corresponding rows in the dataset, so that the trend is simply NA wherever X1 is NA. Thank you!
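One possible direction, as a sketch rather than a definitive solution: filter only the non-missing values within each group and write the trend back to their original positions. The helper `hp_trend_na` and its `min_obs` argument are hypothetical names introduced here, not part of mFilter. The sketch assumes a plain data frame (it skips the `pdata.frame` step), and `min_obs = 4` is a conservative guess at the shortest series hpfilter will accept, so groups with fewer observations get an all-NA trend.

```r
library(dplyr)
library(mFilter)

df1 <- read.table(text = "country year X1 X2 W
A 1990 10 20 40
A 1991 12 15 NA
A 1992 14 17 41
A 1993 17 NA 44
B 1990 20 NA 45
B 1991 NA 13 61
B 1992 12 12 67
B 1993 14 10 68
C 1990 10 20 70
C 1991 11 14 50
C 1992 12 15 NA
C 1993 14 16 NA
D 1990 20 17 80
D 1991 16 20 91
D 1992 15 21 70
D 1993 14 22 69", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper (not part of mFilter): HP-filter the non-missing
# values only and return a vector of the same length as the input,
# with NA wherever the input was NA.
hp_trend_na <- function(x, freq = 6.25, min_obs = 4) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)
  # Assumption: series shorter than min_obs are left as NA, not filtered
  if (sum(ok) >= min_obs) {
    fit <- mFilter::hpfilter(x[ok], freq = freq, type = "lambda")
    out[ok] <- as.numeric(fit$trend)
  }
  out
}

df1 <- df1 %>%
  group_by(country) %>%
  mutate(X1_trend = hp_trend_na(X1)) %>%
  ungroup()
```

Because the helper always returns a vector as long as its input, mutate() no longer sees a length mismatch. One caveat: dropping NAs from the middle of a series makes the filter treat the neighbouring observations as adjacent years, which may or may not be acceptable for a given application.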

