I have a set of 475 PDF files that I need to convert to text. I wrote the following code to do that:
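library(pdftools)  # pdf_text()
library(dplyr)     # tibble(), %>%
library(tidytext)  # unnest_tokens()
files <- list.files(pattern = "pdf$")
for (i in seq_along(files)) {
  print(i)
  files_pdfs <- pdf_text(files[i]) %>% tibble(txt = .) %>% unnest_tokens(word, txt)
}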
The loop appears to run successfully, but when I inspect files_pdfs it clearly contains only the text from the final file. I tried splitting the corpus of PDFs into smaller batches and I still get the same problem: always just the text from the last file in the batch. I'm sure it's a basic error in my code, but I can't figure it out. Any ideas?
Thanks for your help!
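
One likely explanation for the symptom described above is that files_pdfs is reassigned on every pass through the loop, so only the last file's tokens survive. Below is a minimal sketch of one way to accumulate the results instead. It assumes pdf_text() comes from pdftools and unnest_tokens() from tidytext. The helper names tokenize_pdf and tokenize_all, the reader argument, and the added file column are hypothetical, introduced only for illustration.

```r
library(pdftools)  # pdf_text()
library(dplyr)     # tibble(), bind_rows(), %>%
library(tidytext)  # unnest_tokens()

# Hypothetical helper: tokenize one PDF and tag every row with its source file.
# `reader` defaults to pdftools::pdf_text; it is a parameter only so the
# function can be tried out without real PDFs.
tokenize_pdf <- function(path, reader = pdftools::pdf_text) {
  tibble(file = path, txt = reader(path)) %>%
    unnest_tokens(word, txt)
}

# Hypothetical helper: build one tibble per file, then stack them all,
# rather than overwriting a single variable on each iteration.
tokenize_all <- function(paths, reader = pdftools::pdf_text) {
  bind_rows(lapply(paths, tokenize_pdf, reader = reader))
}

files <- list.files(pattern = "pdf$")
files_pdfs <- tokenize_all(files)
```

Because each row carries a file column, count(files_pdfs, file) gives a quick check that every PDF contributed tokens.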