Channel: Active questions tagged r - Stack Overflow

Easiest way to create indicator variables for changes in time series in R


I have a 14 million row dataset of products, tariff rates, trade volumes, and year-month combinations in the following format:

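# c() coerces mixed values to character, so every cell of the matrix is a string
df <- as.data.frame(matrix(c(1220, "2013-1", 10011900, 29307, .1,
                   1220, "2013-2", 10011900, 28202, .1,
                   1220, "2013-3", 10011900, 22383, .15,
                   1220, "2013-4", 10011900, 21303, .15,
                   1220, "2013-5", 10011900, 21201, .15,
                   1220, "2013-1", 10019900, 9960, .12,
                   1220, "2013-2", 10019900, 10043, .12,
                   1220, "2013-3", 10019900, 11001, .1,
                   1220, "2013-4", 10019900, 10997, .1,
                   1220, "2013-5", 10019900, 12038, .1),
                 ncol = 5, byrow = TRUE))
colnames(df) <- c("country", "date", "product", "value", "rate")
# Convert the numeric columns back from character (or factor on older R versions)
num_cols <- c("country", "product", "value", "rate")
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(as.character(x)))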

I'm trying to add a column that I can use to create a set of indicator variables marking how many months before or after a change in the tariff rate each observation falls. For the data above, the result would look like this:

df_transformed <- as.data.frame(matrix(c(1220, "2013-1", 10011900, 29307, .1, -2,
                                         1220, "2013-2", 10011900, 28202, .1, -1,
                                         1220, "2013-3", 10011900, 22383, .15, 0,
                                         1220, "2013-4", 10011900, 21303, .15, 1,
                                         1220, "2013-5", 10011900, 21201, .15, 2,
                                         1220, "2013-1", 10019900, 9960, .12, -2,
                                         1220, "2013-2", 10019900, 10043, .12, -1,
                                         1220, "2013-3", 10019900, 11001, .1, 0,
                                         1220, "2013-4", 10019900, 10997, .1, 1,
                                         1220, "2013-5", 10019900, 12038, .1, 2),
                                       ncol = 6, byrow = TRUE))
colnames(df_transformed) <- c("country", "date", "product", "value", "rate", "months_since_change")

I'm not sure how best to detect where the rate column changes within each product series and create the new column based on that.

Thanks for the help!
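
For reference, here is a minimal base R sketch of one way the target column might be built. It is an illustration under stated assumptions, not a definitive answer. It rebuilds the toy data with explicit column types, converts each "YYYY-M" string to a running month count, and measures every row against the first rate change within its country-product series. The helper name months_since_first_change and the month_index column are hypothetical. How to treat a series with several rate changes is not specified above, so this sketch only uses the first one and returns NA for series that never change.

```r
# Toy data mirroring the question's columns, with types set explicitly.
df <- data.frame(
  country = 1220,
  date    = rep(paste0("2013-", 1:5), times = 2),
  product = rep(c(10011900, 10019900), each = 5),
  value   = c(29307, 28202, 22383, 21303, 21201,
              9960, 10043, 11001, 10997, 12038),
  rate    = c(.1, .1, .15, .15, .15, .12, .12, .1, .1, .1),
  stringsAsFactors = FALSE
)

# Turn "YYYY-M" into a running month count so that time differences are
# plain integer subtraction (month_index is a name made up for this sketch).
parts <- strsplit(df$date, "-", fixed = TRUE)
df$month_index <- sapply(parts, function(p) {
  as.integer(p[1]) * 12L + as.integer(p[2])
})

# Put each country-product series in time order.
df <- df[order(df$country, df$product, df$month_index), ]

# Hypothetical helper: months relative to the FIRST rate change in one series.
# Assumption: a series with no change gets NA; later changes are ignored.
months_since_first_change <- function(rate, month_index) {
  changed <- which(diff(rate) != 0)
  if (length(changed) == 0) {
    return(rep(NA_integer_, length(rate)))
  }
  month_index - month_index[changed[1] + 1L]
}

# Apply the helper within each country-product group using row positions.
grp <- interaction(df$country, df$product, drop = TRUE)
df$months_since_change <- ave(seq_len(nrow(df)), grp, FUN = function(i) {
  months_since_first_change(df$rate[i], df$month_index[i])
})
```

On 14 million rows a grouped operation in a package such as data.table or dplyr would probably be faster than ave(), but the per-group logic would stay the same.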

