How do I get my loop over pdf_text to read all the files, not only the last one?

December 29, 2019, 3:45 am

I have a series of 475 PDF files that I need to convert to text. I have written the following code to do that:

files <- list.files(pattern = "pdf$")

for (i in 1:length(files)) {
  print(i)
  files_pdfs <- pdf_text(files[i]) %>%
    tibble(txt = .) %>%
    unnest_tokens(word, txt)
}

It appears to execute successfully, but when I inspect the output, it clearly contains only the text from the final file. I tried breaking the corpus of PDFs up into smaller batches and I still get the same problem: always just the text from the final file. I'm sure it's a basic error in my code, but I can't figure it out. Any ideas?

Thanks for your help!
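
One likely cause, going only by the loop as posted: files_pdfs is assigned afresh on every pass, so each file's result replaces the previous one and only the last file survives. A minimal sketch of one way around that is to store each file's tokens in a list and stack them at the end. The helper name tokenize_pdfs, its reader argument, and the added file and page columns are illustrative assumptions, not part of the original code.

```r
library(tibble)    # tibble()
library(tidytext)  # unnest_tokens()

# Hypothetical helper (not from the original post). `reader` defaults to
# pdftools::pdf_text, which returns one string per page; any function with
# the same shape of output can be swapped in.
tokenize_pdfs <- function(files, reader = pdftools::pdf_text) {
  results <- vector("list", length(files))  # one slot per file
  for (i in seq_along(files)) {             # seq_along() is safe when files is empty
    pages <- reader(files[i])
    # Store this file's tokens in its own slot instead of overwriting
    results[[i]] <- unnest_tokens(
      tibble(file = files[i], page = seq_along(pages), txt = pages),
      word, txt
    )
  }
  dplyr::bind_rows(results)                 # stack the per-file results
}

files <- list.files(pattern = "pdf$")
all_words <- tokenize_pdfs(files)
```

Keeping the file name as a column means the words can still be traced back to their source document after everything is combined.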
