Channel: Active questions tagged r - Stack Overflow

How do I get my loop over pdf_text to read all the files, not just the last one?


I have a series of 475 files that I need to convert to text. I have written the following code to do that:

files <- list.files(pattern = "pdf$")

for (i in 1:length(files)) {
  print(i)
  files_pdfs <- pdf_text(files[i]) %>%
    tibble(txt = .) %>%
    unnest_tokens(word, txt)
}

It appears to execute successfully but when I inspect the output, it has clearly only read the text from the final file. I tried breaking the corpus of PDFs up into smaller segments and I still get the same problem - always just the text from the final file. I'm sure it's a basic error in my code but I can't figure it out. Any ideas?

Thanks for your help!
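
A likely cause, judging only from the code shown: files_pdfs is assigned afresh on every pass through the loop, so each file's tokens replace the previous file's and only the last one survives. The sketch below is one possible fix, not a tested solution for this particular corpus. It assumes the functions come from pdftools, dplyr and tidytext (the question does not show its library() calls), and the names tokens_by_file and the extra file column are made up for illustration.

```r
library(pdftools)   # pdf_text()
library(dplyr)      # %>%, tibble(), mutate(), bind_rows()
library(tidytext)   # unnest_tokens()

files <- list.files(pattern = "pdf$")

# One list slot per file, so no iteration overwrites another.
# (tokens_by_file is a hypothetical name, not from the question.)
tokens_by_file <- vector("list", length(files))

# seq_along() also behaves correctly when no files are found.
for (i in seq_along(files)) {
  tokens_by_file[[i]] <- pdf_text(files[i]) %>%
    tibble(txt = .) %>%
    unnest_tokens(word, txt) %>%
    mutate(file = files[i])   # assumed extra column recording the source file
}

# Stack the per-file tibbles into a single data frame.
files_pdfs <- bind_rows(tokens_by_file)
```

With this version, files_pdfs should hold one row per word across all files, and the file column makes it possible to check that every PDF contributed rows, for example with count(files_pdfs, file).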

