Channel: Active questions tagged r - Stack Overflow

How can I make large additions to textstem's lexicon in R?

I have a large body of free-text survey comments that I'm trying to analyze. I used the textstem package to lemmatize them, but after looking at the unique tokens it produced I'd like to make further adjustments. For example, it mapped "abuses", "abused", and "abusing" to the lemma "abuse" but left "abusive" untouched. I'd like to map that to "abuse" as well.

I found this post, which describes how to add to the lexicon one entry at a time, for example:

lemmas <- lexicon::hash_lemmas[token=="abusive",lemma:="abuse"]
lemmatize_strings(words, dictionary = lemmas)

but in my case I'll have a dataframe with several hundred token/lemma pairs. How can I quickly add them all to lexicon::hash_lemmas?
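
For illustration, one possible approach is to treat the lexicon as an ordinary two-column table, drop any stock rows whose tokens are being overridden, and bind the custom pairs on top. This is only a sketch under assumptions: `custom` is a hypothetical stand-in for the real data frame of token/lemma pairs, its columns are assumed to be named `token` and `lemma` to match `lexicon::hash_lemmas`, and tokens are lower-cased on the assumption that the stock lexicon stores them that way.

```r
library(textstem)

# Hypothetical stand-in for the real data frame of several hundred pairs
custom <- data.frame(
  token = c("abusive", "Abusively"),
  lemma = c("abuse", "abuse"),
  stringsAsFactors = FALSE
)

# Assumption: the stock lexicon stores lower-case tokens, so normalise ours
custom$token <- tolower(custom$token)
# Keep only the first pair for any token listed more than once
custom <- custom[!duplicated(custom$token), c("token", "lemma")]

# Work on a plain data.frame copy so the packaged table is left unchanged
base <- as.data.frame(lexicon::hash_lemmas, stringsAsFactors = FALSE)

# Drop stock rows that the custom pairs override, then put the custom rows first
base <- base[!base$token %in% custom$token, c("token", "lemma")]
lemmas <- rbind(custom, base)

# Pass the combined table as the dictionary (first column token, second lemma)
words <- c("abusive comments", "he abuses the system")
lemmatize_strings(words, dictionary = lemmas)
```

Building a separate `lemmas` table rather than editing `lexicon::hash_lemmas` in place keeps the stock lexicon available for comparison, and the same `lemmas` object can be reused for every later call.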

