I'm trying to plot word counts with geom_col.
library(dplyr)
library(ggplot2)
library(readr)     # provides read_csv()
library(tidytext)
df <- read_csv("C:/Data/Data.csv")
df %>%
  count(word, sort = TRUE)
top_n(df, 15) %>%
  ggplot(df, mapping = aes(x = word, y = n)) +
  geom_col(fill = "royalblue") +
  labs(x = "Top Unique Words", y = "Word Count")
The CSV file contains two columns of data:
users,word
user_gffast2,stop
user_gffast2,the
user_gffast2,along
user_gffast3,rain
user_gffast3,a
user_gffast3,the
user_gffast3,course
user_gffast4,stop
user_gffast4,the
user_gffast4,I
...etc.
The part I think I'm having trouble with is:
ggplot(df, mapping = aes(x = word, y = n))
The output looks like:
# A tibble: 912 x 2
   word      n
   <chr> <int>
 1 the     244
 2 I        96
 3 and      90
 4 a        76
 5 from     72
 6 is       70
 7 to       68
 8 i        60
 9 this     55
10 for      50
# ... with 902 more rows
> top_n(df, 15) %>%
+ ggplot(df,mapping = aes(x = word, y = n)) +
+ geom_col(fill="royalblue") +
+ labs(x="Top Unique Words", y="Word Count")
Selecting by word
Don't know how to automatically pick scale for object of type gg/ggplot. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (25): y
>
Can I map y to n, or would ggplot treat n as a plain variable rather than a column of the data? I need to count how often each word occurs, keep the top 15 words, and draw a geom_col with the top words on the x axis and their counts on the y axis.
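For reference, here is a minimal sketch of the pipeline described above (count, keep the top 15, plot), written as one chain so that the n column created by count() is still in the data when it reaches ggplot(). It assumes the data has a word column as in the CSV excerpt. The small data frame is a stand-in built from the ten sample rows shown earlier, and the name top_words is only illustrative. head(15) is used instead of top_n() here because it keeps exactly 15 rows after sorting, whereas top_n() keeps ties.

```r
library(dplyr)
library(ggplot2)

# Stand-in for the CSV: the ten sample rows shown in the question.
df <- data.frame(
  users = c("user_gffast2", "user_gffast2", "user_gffast2",
            "user_gffast3", "user_gffast3", "user_gffast3", "user_gffast3",
            "user_gffast4", "user_gffast4", "user_gffast4"),
  word  = c("stop", "the", "along",
            "rain", "a", "the", "course",
            "stop", "the", "I"),
  stringsAsFactors = FALSE
)

# Count each word, sort by count, and keep at most the first 15 rows.
# The result is saved so that the column n exists in the plotted data.
top_words <- df %>%
  count(word, sort = TRUE) %>%
  head(15)

# Pass the data only once: it is the first argument of ggplot().
# reorder() sorts the bars by descending count instead of alphabetically.
p <- ggplot(top_words, aes(x = reorder(word, -n), y = n)) +
  geom_col(fill = "royalblue") +
  labs(x = "Top Unique Words", y = "Word Count")

print(p)
```

With the real file, the data.frame() call would be replaced by the read_csv() line from the question. The rest should work unchanged as long as the column is still named word.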