I would like to modify the answer to the question here, or find a new solution, so that it adds another column showing the second-largest consecutive run of "0". My sample data and code are below. The function operates on the month columns, and the "Second largest run"
column is what I hope to add. I am working with a large dataset, so the more efficient the better. Any ideas are appreciated, thanks.
Sample data:
test <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9), V1 = c("A",
"B", "A", "B", "B", "A", "A", "B", "B"), V2 = c(21, 233, 185,
85, 208, 112, 238, 66, 38), V3 = c(149, 250, 218, 104, 62, 19,
175, 168, 28), Jan = c(10, 20, 10, 12, 76, 28, 137, 162, 101),
Feb = c(20, 25, 15, 0, 89, 0, 152, 177, 119), March = c(0,
28, 20, 14, 108, 0, 165, 194, 132), April = c(0, 34, 25,
16, 125, 71, 181, 208, 149), May = c(25, 0, 30, 22, 135,
0, 191, 224, 169), June = c(29, 0, 35, 24, 145, 0, 205, 244,
187), July = c(34, 0, 40, 28, 163, 0, 217, 256, 207), August = c(37,
0, 45, 29, 173, 0, 228, 276, 221), Sep = c(0, 39, 50, 31,
193, 0, 239, 308, 236), Oct = c(0, 48, 55, 35, 210, 163,
252, 0, 247), Nov = c(48, 55, 60, 40, 221, 183, 272, 0, 264
), Dec = c(50, 60, 65, 45, 239, 195, 289, 0, 277), `Second largest run` = c(1,
NA, NA, NA, NA, 2, NA, NA, NA), result = c(2, 4, -Inf, 1,
-Inf, 5, -Inf, 3, -Inf)), row.names = c(NA, -9L), class = c("tbl_df",
"tbl", "data.frame"))
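Code:
# Length of the longest consecutive run of `val` in x.
# max() returns -Inf (with a warning) when `val` never occurs.
most_consecutive_val <- function(x, val = 0) {
  with(rle(x), max(lengths[values == val]))
}
# Columns 5:16 are the month columns (Jan to Dec)
test$result <- apply(test[, 5:16], MARGIN = 1, most_consecutive_val)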
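
To make the desired output concrete, here is a minimal sketch of one possible approach. It is not taken from the linked answer. The helper name `nth_longest_run` and its `n` argument are hypothetical. It assumes "second largest" means the second entry after sorting the run lengths in decreasing order, so tied runs count separately. Under that assumption, row 1 of the sample data (two runs of length 2) would give 2 rather than the 1 shown in the "Second largest run" column, so the tie handling may need adjusting.

```r
# Hypothetical helper: length of the n-th longest run of `val` in x,
# or NA when x contains fewer than n runs of `val`.
nth_longest_run <- function(x, val = 0, n = 2) {
  r <- rle(x)
  # Lengths of the runs of `val`, longest first (ties are kept as separate runs)
  runs <- sort(r$lengths[r$values == val], decreasing = TRUE)
  if (length(runs) >= n) runs[n] else NA_integer_
}

# Month values of row 6 (ID 6) from the sample data
row6 <- c(28, 0, 0, 71, 0, 0, 0, 0, 0, 163, 183, 195)
nth_longest_run(row6)         # 2
nth_longest_run(row6, n = 1)  # 5

# Possible row-wise use, assuming columns 5:16 of `test` are the months:
# test$second <- apply(test[, 5:16], MARGIN = 1, nth_longest_run)
```

Returning NA instead of -Inf when there are too few runs matches the NA entries in the sample column and avoids the warning from `max()` on an empty vector. Row-wise `apply()` may still be slow on very large data.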