Channel: Active questions tagged r - Stack Overflow

Remove space after lemmatization


I lemmatized a character vector. The problem is that lemmatization inserts spaces around the hyphen in hyphenated words (e.g. short-term becomes short - term). My character vector is full of such words, so I would like a way to undo this distortion.

Let me take an example:

text <- c("Stackoverflow is a great website where you can find great and very skilled people who are so kind to solve your coding problems. In the short-term is a very good thing because you can speed up your research, in the long-term is better if you learn how to code on your own. Let me add more non-sense to make my point. The growth-friendly composition of public finance is a good thing.")

library(textstem)  # assumption: lemmatize_strings() comes from the textstem package
ch_vector <- lemmatize_strings(text)

As described above, the output is:

"Stackoverflow be a great website where you can find great and very skill people who be so kind to solve your code problem. In the **short - term** be a very good thing because you can speed up your research, in the **long - term** be good if you learn how to code on your own. Let me add much **non - sense** to make my point. The **growth - friendly** composition of public finance be a good thing."

Instead I want this:

"Stackoverflow be a great website where you can find great and very skill people who be so kind to solve your code problem. In the **short-term** be a very good thing because you can speed up your research, in the **long-term** be good if you learn how to code on your own. Let me add much **non-sense** to make my point. The **growth-friendly** composition of public finance be a good thing."

So far, I have fixed each word of interest individually, like this:

ch <- sub(pattern = "growth - friendly", replacement = "growth-friendly", x = ch_vector, fixed = TRUE)

But this is time-consuming and inefficient, and it does not always work (it depends on capitalization, etc.).

Can you suggest a better way to do it?

Thanks a lot
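
One possible way to generalize the per-word replacement shown above is a single regular expression that collapses " - " into "-" whenever it sits between two word characters. This is only a minimal sketch: the sample string is a shortened copy of the output quoted in the question, the name fixed_vector is made up for illustration, and the pattern assumes the text never uses a spaced dash as ordinary punctuation between words (such a dash would be joined as well).

```r
# Shortened sample of the lemmatizer output quoted in the question
ch_vector <- "In the short - term be a very good thing, in the long - term be good. Let me add much non - sense. The growth - friendly composition of public finance be a good thing."

# Replace " - " with "-" only when a word character stands on both sides.
# perl = TRUE enables the lookbehind (?<=...) and lookahead (?=...),
# so the neighbouring characters are checked but not consumed.
fixed_vector <- gsub("(?<=\\w) - (?=\\w)", "-", ch_vector, perl = TRUE)

print(fixed_vector)
```

Because the pattern matches on character classes rather than on specific words, it does not depend on capitalization, and gsub() handles every element of a longer character vector in one call.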

