I have a data frame that tracks whenever a project's record was edited, the time of the edit, what field was edited, the old value, and the new value. Columns OrderDate
, Probability
, and Total
display what the value of those fields are today:
df.raw <- data.frame(project=rep(c('A','B'), each=4),
createDate=as.Date(rep(c('2015-01-01','2017-05-01'), each=4)),
editDate=as.Date(c('2018-06-01','2019-04-01','2019-05-01','2019-06-01', '2018-10-01','2018-11-01','2018-11-15','2019-01-01')),
editField=c('OrderDate', 'OrderDate','Probability','Probability', 'Total','Total', 'Probability','Total'),
oldValue=c('2018-06-01','2019-05-01',20,30,500,550,70,400),
newValue=c('2019-05-01','2019-06-01',30,50,550,400,30,450),
OrderDate=as.Date(rep(c('2019-06-01','2019-01-01'), each=4)),
Probability=rep(c(50,70), each=4),
Total=rep(c(10,450), each=4))
project createDate editDate editField oldValue newValue OrderDate Probability Total
1 A 2015-01-01 2018-06-01 OrderDate 2018-06-01 2019-05-01 2019-06-01 50 10
2 A 2015-01-01 2019-04-01 OrderDate 2019-05-01 2019-06-01 2019-06-01 50 10
3 A 2015-01-01 2019-05-01 Probability 20 30 2019-06-01 50 10
4 A 2015-01-01 2019-06-01 Probability 30 50 2019-06-01 50 10
5 B 2017-05-01 2018-10-01 Total 500 550 2019-01-01 70 450
6 B 2017-05-01 2018-11-01 Total 550 400 2019-01-01 70 450
7 B 2017-05-01 2018-11-15 Probability 70 30 2019-01-01 70 450
8 B 2017-05-01 2019-01-01 Total 400 450 2019-01-01 70 450
I would like to transform this data frame so that:
- There's an additional row for project creation.
- Each row displays what the project's OrderDate, Probability, and Total was at the time of the creation or edit.
- If a field was never edited, the field will always be equal to the final
OrderDate
,Probability
, orTotal
value for that project.
The end result would look something like this:
df.reshaped <- data.frame(project=rep(c('A','B'), each=5),
editDate=as.Date(c('2015-01-01','2018-06-01','2019-04-01','2019-05-01','2019-06-01', '2017-05-01', '2018-10-01','2018-11-01','2018-11-15','2019-01-01')),
editField=c('Created','OrderDate', 'OrderDate','Probability','Probability','Created', 'Total','Total', 'Probability','Total'),
OrderDateAtEdit=as.Date(c('2018-06-01','2019-05-01','2019-06-01','2019-06-01','2019-06-01',rep('2019-01-01', 5))),
ProbabilityAtEdit=c(20,20,20,30,50,70,70,70,30,30),
TotalAtEdit=c(10,10,10,10,10,500,550,400,400,450))
project editDate editField OrderDateAtEdit ProbabilityAtEdit TotalAtEdit
1 A 2015-01-01 Created 2018-06-01 20 10
2 A 2018-06-01 OrderDate 2019-05-01 20 10
3 A 2019-04-01 OrderDate 2019-06-01 20 10
4 A 2019-05-01 Probability 2019-06-01 30 10
5 A 2019-06-01 Probability 2019-06-01 50 10
6 B 2017-05-01 Created 2019-01-01 70 500
7 B 2018-10-01 Total 2019-01-01 70 550
8 B 2018-11-01 Total 2019-01-01 70 400
9 B 2018-11-15 Probability 2019-01-01 30 400
10 B 2019-01-01 Total 2019-01-01 30 450
I'm not sure where to begin with this, any help would be greatly appreciated! Thanks.