What is the fastest way to check a full column in data.table R if each value appears in a different data.table column?
Example problem:
Create example big data:
dt1 <- data.table(dt1row=c(1:1000000),code=sapply(c(1:1000000),FUN=function(x){paste(sample(letters,5), collapse="")}))
dt2 <- data.table(dt2row=c(1:500000),code=sapply(c(1:500000),FUN=function(x){paste(sample(letters,5), collapse="")}))
Slow function I'd like to replace (but works):
#SLOW ON BIG DATA!
dt1$in_dt2 <- sapply(c(1:nrow(dt1)),FUN=function(x){dt1$code[x] %in% dt2$code})