Channel: Active questions tagged r - Stack Overflow

Regex to remove everything but emojis from a string in R?


I have a large .xlsx file containing tweets with emojis. I am working on a personal project where I want to build a network graph from the extracted emojis. For example, suppose one of the columns contains this:

Christian✝️, Husband👫, Father👨‍👩‍👦‍👦, Former TV 📺Meteorologist🌪, GOP🐘, LTC 🔫, Dolfan🐬, since ‘75, Yanks Fan⚾️ & UCONN Alum🏀 Go Whalers🐋!

How would I get only this as output?

✝️👫👨‍👩‍👦‍👦📺🌪🐘🔫🐬⚾️🏀🐋

I have searched thoroughly, on Stack Overflow and elsewhere on the internet, but I couldn't find anything. I am a beginner in R and would greatly appreciate it if you could either point me in the right direction or give me a solution to this problem. Thanks in advance.

EDIT 1: When I read the file normally, I get the Unicode code points (in UTF-8 format), but I don't know how to turn them back into emojis. There are dictionaries online, but they only give me the names of some of these emojis, and they are very outdated.

EDIT 2: There is a solution that works on Linux, but if anybody has a solution, hint, or direction for getting this to work on Windows, I would really appreciate it.
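
For reference, here is a minimal sketch of the kind of extraction described above. It is one possible approach, not a confirmed solution: it assumes the stringi package is installed and that its ICU build recognises the `Extended_Pictographic` and `Emoji_Modifier` Unicode properties. The helper name `extract_emojis` is hypothetical. The sample string is written with `\U`/`\u` escapes so that the script does not depend on the encoding of the source file.

```r
library(stringi)

# Hypothetical helper (not from the original post): keep only emoji sequences.
extract_emojis <- function(x) {
  # One pictographic code point, optionally followed by skin-tone modifiers
  # or variation selector-16 (U+FE0F), optionally chained to further
  # pictographs with zero-width joiners (U+200D), as in family emojis.
  emoji_pattern <- paste0(
    "\\p{Extended_Pictographic}(?:\\p{Emoji_Modifier}|\\uFE0F)*",
    "(?:\\u200D\\p{Extended_Pictographic}(?:\\p{Emoji_Modifier}|\\uFE0F)*)*"
  )
  matches <- stri_extract_all_regex(x, emoji_pattern)
  # Collapse the matches of each input element into a single string;
  # elements without any match (or NA inputs) become "".
  vapply(
    matches,
    function(m) if (all(is.na(m))) "" else paste(m, collapse = ""),
    character(1)
  )
}

# Shortened version of the example tweet, written with escapes.
sample_text <- "Former TV \U0001F4FAMeteorologist\U0001F32A, GOP\U0001F418, since 75"
emojis_only <- extract_emojis(sample_text)
print(emojis_only)
```

Because the function is vectorised, it could be applied to a whole column, for example `extract_emojis(tweets$text)`, where `tweets` and `text` are placeholder names. Note that `Extended_Pictographic` also covers a few symbols such as © and ®, so the pattern may need narrowing depending on the data.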

