I am trying to extract tables from PDF files and store them as data frames. Here is my code:
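```r
# Option 1: pdftools, pdf_text() returns one character string per page
library(pdftools)
f <- file.path("........")
text <- pdf_text(f)

# Option 2: tabulizer is loaded here, but pdf_data() is actually a pdftools
# function; it returns one data frame of positioned text boxes per page
library(tabulizer)
d <- pdf_data(f)
```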
Both options return long rows of unstructured, messy data. Is there another way to extract this kind of data from PDF files, or do I have to clean and tidy the output myself? You can find the file here: statement.pdf
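
To make the "clean and tidy it myself" route concrete, here is a rough sketch of what I have in mind. Each page returned by `pdf_data()` is a data frame with `x`, `y` and `text` columns, so the words could in principle be regrouped into rows by their `y` coordinate. The toy data, the helper name `words_to_rows` and the tolerance `tol` are assumptions for illustration only; none of them come from statement.pdf.

```r
# Toy stand-in for one page of pdf_data() output (values are made up).
page <- data.frame(
  x    = c(10, 80, 150, 10, 80, 150),
  y    = c(100, 100, 101, 120, 120, 120),
  text = c("Date", "Description", "Amount", "01/02", "Coffee", "3.50"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: words whose y values differ by at most `tol`
# are treated as one table row, then sorted left to right by x.
words_to_rows <- function(page, tol = 2) {
  page <- page[order(page$y, page$x), ]
  # Start a new row whenever the vertical gap to the previous word exceeds tol.
  row_id <- cumsum(c(TRUE, diff(page$y) > tol))
  rows <- split(page, row_id)
  unname(lapply(rows, function(r) r$text[order(r$x)]))
}

rows <- words_to_rows(page)

# Treat the first row as the header and bind the rest into a data frame.
table_df <- as.data.frame(do.call(rbind, rows[-1]), stringsAsFactors = FALSE)
names(table_df) <- rows[[1]]
```

This only works when every row has the same number of words, which is unlikely for a real statement with multi-word descriptions, so it is a starting point rather than a solution.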