Channel: Active questions tagged r - Stack Overflow

Conditional calculation based on another column's lagged values


Newbie: I have a dataset in which I want to calculate the year-over-year (y-o-y) growth of each company's at value. The dataset contains approx. 1000 companies, each listed on a public stock exchange for a different number of years. The data looks like this:

#      gvkey fyear at     company name
#22    17436 2010  59393  BASF SE
#23    17436 2011  61175  BASF SE
#24    17436 2012  64327  BASF SE
       ...
#30    17436 2018  86556  BASF SE
#31    17828 1989  62737  DAIMLER AG
#32    17828 1990  67339  DAIMLER AG
#33    17828 1991  75714  DAIMLER AG
       ...
#60    17828 2018  281619 DAIMLER AG

I would like to create a new column growth that holds the percentage increase of at from one year to the next within each company, e.g. for BASF SE (gvkey 17436) from 2010 to 2011, from 2011 to 2012, and so on. In row #31 the condition should prevent the increase from being calculated against a value that belongs to BASF; the result there should be NA instead. The next value in the new column growth, in row #32, would then be the percentage increase of DAIMLER (gvkey 17828) from 62737 to 67339.

So far I tried:

if TA$gvkey == lag(TA$gvkey) {mutate(TA, growth = (at - lag(at))/lag(at))} else {NULL}

Basically, I tried to make the calculation conditional on a change in the gvkey identifier, as this makes the most sense to me. I believe there is a nicer way, maybe running a loop until the gvkey changes and then continuing with the next set of values, but I simply don't know how to code that.
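
For reference, the attempt above cannot run as written: R's if needs its condition in parentheses and expects a single TRUE/FALSE, not a whole column comparison. One possible way to get the described result, sketched here with dplyr, is to group by gvkey so that lag() restarts at each company and the first row of every company becomes NA. The toy data frame below only copies a few rows from the excerpt; the real TA is assumed to have the same gvkey, fyear and at columns. The sketch also assumes each company has one row per consecutive year, since lag() looks at the previous row, not the previous calendar year.

```r
library(dplyr)

# Toy subset of the rows shown in the question (assumption: the real TA
# has the same gvkey, fyear and at columns)
TA <- data.frame(
  gvkey = c(17436, 17436, 17436, 17828, 17828, 17828),
  fyear = c(2010, 2011, 2012, 1989, 1990, 1991),
  at    = c(59393, 61175, 64327, 62737, 67339, 75714)
)

TA <- TA %>%
  arrange(gvkey, fyear) %>%   # make sure years are in order within each company
  group_by(gvkey) %>%         # lag() now never reaches across companies
  mutate(growth = (at - dplyr::lag(at)) / dplyr::lag(at)) %>%
  ungroup()

print(TA)
```

With grouping in place no explicit if or loop is needed: the first year of each gvkey has no previous value inside its group, so growth is NA there. Multiply growth by 100 if a percentage rather than a fraction is wanted.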

I am very new to R and quite lost. I would appreciate any help! Thank you, guys :)

