Quantcast
Viewing all articles
Browse latest Browse all 206503

Filling gaps before, between and after spells (date ranges) in R

Imagine that you have a data frame of three study participants with their dates of entry (doe) and exit (dox):

> participants <- data.frame(id = 1:3, 
+            doe = c("1990/01/04","1988/05/02","2001/06/04"), 
+            dox = c("2020/01/02","1999/03/01","2011/05/06"))
> participants
  id        doe        dox
1  1 1990/01/04 2020/01/02
2  2 1988/05/02 1999/03/01
3  3 2001/06/04 2011/05/06

All three of them have been into foster care during different time periods between the doe and dox:

> placement_dates<-data.frame(
+           id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L),
+           placement_start = c("1993/04/21","1994/12/04","1994/12/19",
+                  "2000/01/14","2003/11/22","2015/03/03","1993/04/21",
+                  "1993/05/13","1998/02/19","2005/01/31"),
+           placement_end = c("1993/06/01","1994/12/06","1995/05/02",
+                  "2002/12/04","2005/06/01","2019/02/08","1993/05/12",
+                  "1996/01/30","1998/02/28","2006/08/19")
+ )
> placement_dates
   id placement_start placement_end
1   1      1993/04/21    1993/06/01
2   1      1994/12/04    1994/12/06
3   1      1994/12/19    1995/05/02
4   1      2000/01/14    2002/12/04
5   1      2003/11/22    2005/06/01
6   1      2015/03/03    2019/02/08
7   2      1993/04/21    1993/05/12
8   2      1993/05/13    1996/01/30
9   2      1998/02/19    1998/02/28
10  3      2005/01/31    2006/08/19

I am now interested in combining the participants and placement_dates data frames to generate spells of not being placed into foster care (spell_type = A) and being placed in foster care (spell_type = B). My desired output is therefore:

> desired_df <- data.frame(
+   id = c(1L,1L,1L,1L,1L,1L,1L,1L,
+          1L,1L,1L,1L,1L,2L,2L,2L,2L,2L,2L,3L,3L,3L),
+   spell_start = c("1990/01/04","1993/04/21",
+          "1993/06/02","1994/12/04","1994/12/07","1994/12/19",
+          "1995/05/03","2000/01/14","2002/12/05","2003/11/22",
+          "2005/06/02","2015/03/03","2019/02/09","1988/05/02",
+          "1993/04/22","1993/05/13","1996/01/31","1998/02/19",
+          "1998/03/01","2001/06/04","2005/01/31","2006/08/20"),
+   spell_end = c("1993/04/20","1993/06/01",
+          "1994/12/03","1994/12/06","1994/12/18","1995/05/02",
+          "2000/01/13","2002/12/04","2003/11/21","2005/06/01",
+          "2015/03/02","2019/02/08","2020/01/02","1993/04/21",
+          "1993/05/12","1996/01/30","1998/02/18","1998/02/28",
+          "1999/03/01","2005/01/30","2006/08/19","2011/05/06"),
+   spell_type = c("A","B","A","B","A","B",
+          "A","B","A","B","A","B","A","A","B","B","A","B",
+          "A","A","B","A")
+ )
> desired_df
   id spell_start  spell_end spell_type
1   1  1990/01/04 1993/04/20          A
2   1  1993/04/21 1993/06/01          B
3   1  1993/06/02 1994/12/03          A
4   1  1994/12/04 1994/12/06          B
5   1  1994/12/07 1994/12/18          A
6   1  1994/12/19 1995/05/02          B
7   1  1995/05/03 2000/01/13          A
8   1  2000/01/14 2002/12/04          B
9   1  2002/12/05 2003/11/21          A
10  1  2003/11/22 2005/06/01          B
11  1  2005/06/02 2015/03/02          A
12  1  2015/03/03 2019/02/08          B
13  1  2019/02/09 2020/01/02          A
14  2  1988/05/02 1993/04/21          A
15  2  1993/04/22 1993/05/12          B
16  2  1993/05/13 1996/01/30          B
17  2  1996/01/31 1998/02/18          A
18  2  1998/02/19 1998/02/28          B
19  2  1998/03/01 1999/03/01          A
20  3  2001/06/04 2005/01/30          A
21  3  2005/01/31 2006/08/19          B
22  3  2006/08/20 2011/05/06          A

My actual dataset has around 300k foster care placements with varying types of arrangements. The spell_type variable is therefore more complex but I need to get some ideas to start. I have looked into the complete function in the tidyr package but I haven't been able to implement it for my specific problem.


Viewing all articles
Browse latest Browse all 206503

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>