Channel: Active questions tagged r - Stack Overflow

Trying to plot word count on a geom_col. Can't seem to figure it out

I'm trying to plot word counts with geom_col.

library(dplyr)
library(ggplot2)
library(tidytext)
library(readr)  # provides read_csv()
df <- read_csv("C:/Data/Data.csv")

df %>% 
  count(word, sort=TRUE)
top_n(df, 15) %>%
  ggplot(df,mapping = aes(x = word, y = n)) +
  geom_col(fill="royalblue") +
  labs(x="Top Unique Words", y="Word Count")

The CSV file contains two columns of data:

users,word
user_gffast2,stop
user_gffast2,the
user_gffast2,along
user_gffast3,rain
user_gffast3,a
user_gffast3,the
user_gffast3,course
user_gffast4,stop
user_gffast4,the
user_gffast4,I
...etc.

The part I think I'm having trouble with is

ggplot(df,mapping = aes(x = word, y = n))

The output looks like:

# A tibble: 912 x 2
   word      n
   <chr> <int>
 1 the     244
 2 I        96
 3 and      90
 4 a        76
 5 from     72
 6 is       70
 7 to       68
 8 i        60
 9 this     55
10 for      50
# ... with 902 more rows
> top_n(df, 15) %>%
+   ggplot(df,mapping = aes(x = word, y = n)) +
+   geom_col(fill="royalblue") +
+   labs(x="Top Unique Words", y="Word Count")
Selecting by word
Don't know how to automatically pick scale for object of type gg/ggplot. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (25): y
>

Can I map y to n, or would ggplot treat n as a plain variable rather than a column? I need to count up all the words, take the top 15, and draw a geom_col with the top words on the x axis and their total counts on the y axis.
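
For reference, here is a minimal sketch of the pipeline described above (count, keep the top 15, plot). It is not a confirmed fix. The small data frame is a made-up stand-in for the CSV, and top_words is a hypothetical name. The likely problems in the code above are that the result of count() is never saved or piped onward, so top_n(df, 15) runs on the raw data, which has no n column, and that df is passed to ggplot() again even though the pipe already supplies the data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the CSV (same two columns: users, word).
df <- data.frame(
  users = c("user_a", "user_a", "user_b", "user_b", "user_c"),
  word  = c("stop", "the", "the", "rain", "the"),
  stringsAsFactors = FALSE
)

# Keep the counted result: count() adds a column n holding the count per word.
top_words <- df %>%
  count(word, sort = TRUE) %>%
  top_n(15, n)   # rank by n explicitly; ties can return more than 15 rows

# Pass the data once. Inside aes(), n refers to the column of top_words.
p <- ggplot(top_words, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "royalblue") +
  labs(x = "Top Unique Words", y = "Word Count")
p
```

Naming n as the second argument to top_n() avoids the "Selecting by word" message, which appears when top_n() has to guess the ranking column. reorder(word, n) is optional and only sorts the bars by count.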

