I have a vector of several regexes. They are so short and so different that it is not worth trying to make a single regex that captures all of them at once.
I also have a data frame with two columns: one contains strings and the other an ID, with multiple strings per ID.
What I need is to find out, for each ID, whether at least one of the associated strings matches at least one of the regexes.
MWE:
icd10.autol.tr.regex <- c("C91\\.00", "C92\\.00", "D61\\.\\d{1,2}", "C91\\.10", "C92\\.10",
                          "Q82\\.8", "D76\\.1", "C81\\.\\d{1,2}", "E76\\.0", "C90\\.00",
                          "C94\\.60", "C85\\.9", "Q78\\.2", "D59\\.5", "D57\\.1",
                          "D56\\.\\d{1,2}", "D82\\.\\d{1,2}", "C86\\.4", "C93\\.3\\d",
                          "C91\\.6\\d")
codes.to.check <- data.frame(code = c("E85.3", "C90.00", "Z45.20", "N08.4", "Z29.21",
                                      "Z52.01", "C79.3", "Z45.20", "F05.9", "B99", "A04.7",
                                      "R63.3"),
                             id = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2))
Here, I want the result to look like:
ID  result
1   TRUE   #because we matched C90.00
2   FALSE  #no match
If I had the list of possible codes as plain strings, I would have used %in%. For regexes, I tried using str_extract from stringr, but it doesn't seem to accept a vector for the searched pattern. I guess I could write nested loops around str_extract, but that feels inefficient. Is there a more idiomatic way?
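To make the goal concrete, here is a minimal sketch of one possible loop-free approach in base R. It assumes that joining the patterns programmatically with `|` is acceptable (as opposed to hand-writing a single regex) and that unanchored matching is what is wanted. The column name `hit` and the object names `combined` and `result` are made up for illustration.

```r
# Patterns and data from the MWE above
icd10.autol.tr.regex <- c("C91\\.00", "C92\\.00", "D61\\.\\d{1,2}", "C91\\.10", "C92\\.10",
                          "Q82\\.8", "D76\\.1", "C81\\.\\d{1,2}", "E76\\.0", "C90\\.00",
                          "C94\\.60", "C85\\.9", "Q78\\.2", "D59\\.5", "D57\\.1",
                          "D56\\.\\d{1,2}", "D82\\.\\d{1,2}", "C86\\.4", "C93\\.3\\d",
                          "C91\\.6\\d")
codes.to.check <- data.frame(code = c("E85.3", "C90.00", "Z45.20", "N08.4", "Z29.21",
                                      "Z52.01", "C79.3", "Z45.20", "F05.9", "B99", "A04.7",
                                      "R63.3"),
                             id = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2))

# Join all patterns into one alternation, built by code rather than by hand
combined <- paste(icd10.autol.tr.regex, collapse = "|")

# One logical per row: does this code match any of the patterns?
# (unanchored, so a pattern may also match inside a longer string)
codes.to.check$hit <- grepl(combined, codes.to.check$code, perl = TRUE)

# Collapse to one logical per id: TRUE if any of its codes matched
result <- aggregate(hit ~ id, data = codes.to.check, FUN = any)
names(result) <- c("ID", "result")
print(result)
```

If partial matches inside longer codes are not wanted, the combined pattern could be wrapped as `paste0("^(", combined, ")$")` before calling `grepl`; whether that is appropriate depends on how the codes are formatted.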