Channel: Active questions tagged r - Stack Overflow

How to parallelize gsub currently in a for loop in R?

I'm fairly new to running parallel processes in R because most of the data I work with just isn't that large. However, I am now working with a larger data set in which I am attempting to 'find and replace' a set of about 2000 names across 9000 survey comments. I've created a for loop using gsub that gets the job done, but it takes quite a long time:

completed <- 0

for (name in names) {
  text_df$text <- sapply(text_df$text, gsub,
                         pattern = paste0("(?<=\\W|^)", name, "(?=\\W|$)"),
                         replacement = "RemovedLeader",
                         ignore.case = TRUE, perl = TRUE)
  completed <- completed + 1
  print(paste0("Completed ", completed, " out of ", length(names)))
}

From what I understand, this should be a fairly simple process to run in parallel, yet I'm having a bit of trouble. I've tried running this using parSapply, but I'm having a hard time rewriting the gsub call (which is currently inside an sapply within the for loop) so that it works outside of the for loop. Thanks for the help.
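
One possible way to restructure this, offered as a minimal sketch rather than a definitive answer: move the loop over names into a helper function and parallelize over chunks of comments instead of over names, since each comment can be scrubbed independently. The names `scrub_chunk`, `names_vec`, `n_workers`, and the toy data below are hypothetical stand-ins, not taken from the question. The sketch relies on `gsub` already being vectorized over its input, so the inner `sapply` is dropped.

```r
library(parallel)

# Hypothetical stand-ins for the objects described in the question.
names_vec <- c("Alice Smith", "Bob")
text_df <- data.frame(
  text = c("I think Alice Smith is great.",
           "bob never listens",
           "Bobby is fine",
           "No names here"),
  stringsAsFactors = FALSE
)

# Replace every name in one chunk of comments.
# gsub is vectorized over the character vector `chunk`.
scrub_chunk <- function(chunk, names_vec) {
  for (name in names_vec) {
    pattern <- paste0("(?<=\\W|^)", name, "(?=\\W|$)")
    chunk <- gsub(pattern, "RemovedLeader", chunk,
                  ignore.case = TRUE, perl = TRUE)
  }
  chunk
}

n_workers <- 2  # assumption: adjust to the cores available
cl <- makeCluster(n_workers)

# Split the comments into one contiguous chunk per worker.
idx <- splitIndices(nrow(text_df), n_workers)
text_chunks <- lapply(idx, function(i) text_df$text[i])

# Each worker runs the full name loop on its own chunk.
result <- parLapply(cl, text_chunks, scrub_chunk, names_vec = names_vec)
stopCluster(cl)

# Chunks are contiguous and in order, so unlist restores the original order.
text_df$text <- unlist(result, use.names = FALSE)
```

Note that this assumes the names contain no regex metacharacters; if they might, they would need escaping before being pasted into the pattern. Dropping the inner `sapply` may already shorten the run time noticeably before any parallelism is added.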

