I get the following error when trying to inspect the TermDocumentMatrix after performing lemmatization in R: no applicable method for 'meta' applied to an object of class "character"
I tried the PlainTextDocument function to solve this, but it removes the metadata from the corpus, which results in the following error: Error in `[.simple_triplet_matrix`(x, terms, docs) : Repeated indices currently not allowed.
This is my code:
library("tm")
corp9 <- Corpus(URISource(files),
                readerControl = list(reader = readPDF))
corp9 <- tm_map(corp9, removePunctuation, ucp = TRUE)
corp9 <- tm_map(corp9, removeNumbers)
corp9 <- tm_map(corp9, content_transformer(tolower))
corp9 <- tm_map(corp9, removeWords, stopwords("en"))
corp9 <- tm_map(corp9, stripWhitespace)
library("textstem")
corp9 <- tm_map(corp9, lemmatize_strings)
corp9 <- tm_map(corp9, PlainTextDocument)
corp.tdm9 <- TermDocumentMatrix(corp9)
inspect(corp.tdm9)
Would be glad if someone could help me! :)
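One possible direction, sketched under assumptions rather than verified on the PDFs above: the 'meta' error is consistent with tm_map being given a function that returns a bare character vector, so the corpus elements stop being text documents. Wrapping lemmatize_strings in content_transformer(), the same way tolower is already wrapped, should keep each element a document with its metadata, which would make the PlainTextDocument step unnecessary. The toy texts and the names texts, corp and tdm below are made up for illustration; a VCorpus built from a VectorSource stands in for the PDF corpus.

```r
library("tm")
library("textstem")

# Hypothetical in-memory texts standing in for the PDF files.
texts <- c("The cats were running quickly.",
           "Dogs ran and cats are sleeping.")
corp <- VCorpus(VectorSource(texts))

# Same cleaning steps as in the question.
corp <- tm_map(corp, removePunctuation, ucp = TRUE)
corp <- tm_map(corp, removeNumbers)
corp <- tm_map(corp, content_transformer(tolower))
corp <- tm_map(corp, removeWords, stopwords("en"))
corp <- tm_map(corp, stripWhitespace)

# lemmatize_strings is a plain character function, so wrap it:
# content_transformer() changes only the content and returns the document.
corp <- tm_map(corp, content_transformer(lemmatize_strings))

# No PlainTextDocument step here, so the document metadata is left alone.
tdm <- TermDocumentMatrix(corp)
inspect(tdm)
```

If this behaves the same way on the PDF corpus, the only changes to the original code would be the content_transformer() wrapper and dropping the PlainTextDocument line.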