My task is to calculate, for each student, the ratio of academic credits earned to academic credits expected (assuming no failed courses) during the time period. Since I have only limited access to the university databases, I have to create a reference table for each class that shows how many credits the students should have attained by the current date.
My reference table will look something like this (I haven't yet gathered data from more than one class though):
   Cred Sumb weeks  startdate ExpCred Start_date_points End_date_points
1  15.0    0    10 2018-09-02     0.0        2018-09-02      2018-12-01
2  15.0    0    10 2018-09-02    15.0        2018-12-02      2019-02-09
3  15.0    0    10 2018-09-02    30.0        2019-02-10      2019-04-20
4  15.0    0    10 2018-09-02    45.0        2019-04-21      2019-06-29
5   7.5    1     5 2018-09-02    60.0        2019-06-30      2019-10-26
6   7.5    1     5 2018-09-02    67.5        2019-10-27      2019-11-30
7  15.0    1    10 2018-09-02    75.0        2019-12-01      2020-02-08
8   7.5    1     5 2018-09-02    90.0        2020-02-09      2020-03-14
9   7.5    1     5 2018-09-02    97.5        2020-03-15      2020-04-18
10 15.0    1    10 2018-09-02   105.0        2020-04-19      2020-06-27
11 15.0    2    10 2018-09-02   120.0        2020-06-28      2020-11-28
12 15.0    2    10 2018-09-02   135.0        2020-11-29      2021-02-06
13 30.0    2    20 2018-09-02   150.0        2021-02-07      2021-06-26
14  0.0    2     0 2018-09-02   180.0        2021-06-27      2021-06-26
15 15.0    0    10 2019-09-03     0.0        2019-09-03      2019-12-02
16 15.0    0    10 2019-09-03    15.0        2019-12-03      2020-02-10
17 15.0    0    10 2019-09-03    30.0        2020-02-11      2020-04-20
18 15.0    0    10 2019-09-03    45.0        2020-04-21      2020-06-29
19  7.5    1     5 2019-09-03    60.0        2020-06-30      2020-10-26
20  7.5    1     5 2019-09-03    67.5        2020-10-27      2020-11-30
21 15.0    1    10 2019-09-03    75.0        2020-12-01      2021-02-08
22  7.5    1     5 2019-09-03    90.0        2021-02-09      2021-03-15
23  7.5    1     5 2019-09-03    97.5        2021-03-16      2021-04-19
24 15.0    1    10 2019-09-03   105.0        2021-04-20      2021-06-28
25 15.0    2    10 2019-09-03   120.0        2021-06-29      2021-11-29
26 15.0    2    10 2019-09-03   135.0        2021-11-30      2022-02-07
27 30.0    2    20 2019-09-03   150.0        2022-02-08      2022-06-27
28  0.0    2     0 2019-09-03   180.0        2022-06-28      2022-06-27
When I try to use my reference table in calculations, however, I run into problems. I write:
fulldata <- fulldata %>%
  mutate(PERC_CREDIT = CREDITS / ifelse(
    UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd & Sys.Date() >= ekon_program$finished_date,
    180,
    ifelse(UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd &
             Sys.Date() >= ekon_program$start_date_points &
             Sys.Date() <= ekon_program$end_date_points,
           ekon_program$points_expected, 100000)))
And get the following warning messages:
Warning messages:
1: In UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd :
longer object length is not a multiple of shorter object length
2: In UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd & Sys.Date() >= :
longer object length is not a multiple of shorter object length
3: In UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd :
longer object length is not a multiple of shorter object length
4: In UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd & Sys.Date() >= :
longer object length is not a multiple of shorter object length
5: In UTBILDNINGSTILLFALLE_STARTDATUM == ekon_program$sd & Sys.Date() >= :
longer object length is not a multiple of shorter object length
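As far as I can tell, the warning itself is not specific to mutate(). It seems to appear whenever == compares two vectors whose lengths are not multiples of each other, because R pairs the elements by position and recycles the shorter vector rather than looking anything up. A minimal sketch with made-up vectors (the lengths 6 and 4 are assumptions, not the real table sizes):

```r
# Hypothetical stand-ins: 6 rows in the main table, 4 rows in the reference table.
main_start <- rep("2018-09-03", 6)
ref_start  <- rep("2018-09-03", 4)

# `==` compares position by position and recycles the shorter vector.
# 6 is not a multiple of 4, so R raises a warning; here it is caught
# and its message is stored instead of the comparison result.
res <- tryCatch(
  main_start == ref_start,
  warning = function(w) conditionMessage(w)
)
print(res)
```

Even when the lengths happen to line up and no warning is shown, the comparison is still row 1 against row 1, row 2 against row 2, and so on, which is not a lookup in the reference table.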
What I want to do is choose a denominator based on the information in my reference table. First and foremost, I want to restrict the calculation to the reference rows that have the same program start date as the current observation in the main table.
If the current Sys.Date() is past the calculated end date for the program (starting at that specified date) in the reference table, I want the denominator to be set to 180.
If Sys.Date() is earlier than that, I want the denominator to be the value of points_expected from the reference row (with the same program start date as the current observation in the main table) where Sys.Date() falls between start_date_points and end_date_points.
How can I make this happen and what am I doing wrong?
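To make the intended rule concrete, it can be sketched in base R as a lookup for one start date at a time. This is only an illustration under assumptions: expected_credits() is a hypothetical helper, the reference rows are the first three periods from the excerpt further down, and the date is fixed so the result is reproducible.

```r
# Hypothetical helper: expected credits for one program start date as of `today`.
# `ref` is assumed to have the columns shown in the reference-table excerpt.
expected_credits <- function(start, ref, today = Sys.Date()) {
  rows <- ref[ref$sd == start, , drop = FALSE]   # same program start only
  if (nrow(rows) == 0) return(NA_real_)          # no matching class
  if (today >= rows$finished_date[1]) return(180)
  hit <- rows$points_expected[today >= rows$start_date_points &
                                today <= rows$end_date_points]
  if (length(hit) == 0) NA_real_ else hit[1]
}

# First three periods of the 2018 class, copied from the excerpt.
ref <- data.frame(
  sd                = rep("2018-09-03", 3),
  points_expected   = c(0, 15, 30),
  start_date_points = as.Date(c("2018-09-02", "2018-12-02", "2019-02-10")),
  end_date_points   = as.Date(c("2018-12-01", "2019-02-09", "2019-04-20")),
  finished_date     = as.Date(rep("2021-06-27", 3)),
  stringsAsFactors  = FALSE
)

main <- data.frame(
  UTBILDNINGSTILLFALLE_STARTDATUM = rep("2018-09-03", 3),
  CREDITS = c(30, 0, 22.5),
  stringsAsFactors = FALSE
)

today <- as.Date("2019-03-01")  # assumed date, inside the third period

# One lookup per row of the main table, instead of comparing whole columns.
main$EXPECTED <- vapply(main$UTBILDNINGSTILLFALLE_STARTDATUM, expected_credits,
                        numeric(1), ref = ref, today = today, USE.NAMES = FALSE)
main$PERC_CREDIT <- main$CREDITS / main$EXPECTED
print(main)
```

Note that during the first period points_expected is 0, so the ratio would be Inf or NaN there; how that case should be handled is left open in this sketch.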
Excerpt from the main table:
structure(list(UTBILDNINGSTILLFALLE_STARTDATUM = c("2018-09-03",
"2018-09-03", "2018-09-03", "2018-09-03", "2018-09-03", "2018-09-03"
), CREDITS = c(30, 0, 22.5, 9.5, 0, 54)), row.names = c(NA, 6L
), class = "data.frame")
Excerpt from the reference table:
structure(list(sd = c("2018-09-03", "2018-09-03", "2018-09-03",
"2018-09-03", "2018-09-03", "2018-09-03"), points_ekon = c(15,
15, 15, 15, 7.5, 7.5), summer_break_ekon = c(0, 0, 0, 0, 1, 1
), weeks_course = c(10, 10, 10, 10, 5, 5), points_expected = c(0,
15, 30, 45, 60, 67.5), order = 1:6, start_date = structure(c(17776,
17776, 17776, 17776, 17776, 17776), class = "Date"), start_date_points = structure(c(17776,
17867, 17937, 18007, 18077, 18196), class = "Date"), end_date_points = structure(c(17866,
17936, 18006, 18076, 18195, 18230), class = "Date"), finished_date = structure(c(18805,
18805, 18805, 18805, 18805, 18805), class = "Date")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
start_date = structure(17776, class = "Date"), .rows = list(
1:6)), row.names = c(NA, -1L), class = c("tbl_df", "tbl",
"data.frame"), .drop = TRUE))