I have a nested tibble/dataframe that looks like this toy dataset
(EDITED SINCE ORIGINAL POSTING):
library(tibble)
library(ggplot2)
library(dplyr)
library(tidyr)
library(purrr)
id <- c(1,2)
set.seed(123)
sess <- c(0,0,1,1,1,2,3,3,3,3,4,5,5,5,5,6)
meas <- c(rnorm(16, mean=3, sd=2))
meas2 <- c(rnorm(16, mean=2, sd=1))
test.tb <- tibble(id=id,
                  sess=list(sess),
                  spd=c(list(meas), list(meas2)))
test2.tb <- test.tb %>% mutate(
  max = unlist(map(spd, ~ round(max(., na.rm=TRUE),2))),
  mean = map2(sess, spd, calcXrun, mean))
Here calcXrun is a function I've clunkily defined myself (it has to be defined before the mutate() call above is run; suggestions to improve it are very welcome!):
calcXrun <- function(sessVec, otherVec, FUN=max) {
  df <- data.frame(sess=sessVec,meas=otherVec)
  ## remove run 0
  df <- df[df$sess!=0,]
  ## apply FUN to meas within each run
  calckedXrun <- aggregate(meas~sess, df, FUN,
                           na.rm=TRUE, na.action="na.pass")
  calckedXrun$meas <- round(calckedXrun$meas,2)
  ## name the result column after the function that was passed in
  names(calckedXrun)[names(calckedXrun) == "meas"] <-
    deparse(substitute(FUN))
  return(calckedXrun)
}
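For comparison, here is a minimal sketch of the same per-run summary written with dplyr instead of aggregate(). The name calcXrun2 and its label argument are hypothetical (they are not part of the code above); passing the column name explicitly is just one way to avoid relying on deparse(substitute(FUN)).

```r
library(dplyr)
library(tibble)

# Hypothetical alternative to calcXrun: apply FUN to otherVec within each run.
# `label` (an assumed extra argument) names the resulting summary column.
calcXrun2 <- function(sessVec, otherVec, FUN = max, label = "value") {
  tibble(sess = sessVec, meas = otherVec) %>%
    filter(sess != 0) %>%                      # remove run 0
    group_by(sess) %>%
    summarise(meas = round(FUN(meas, na.rm = TRUE), 2),
              .groups = "drop") %>%
    rename(!!label := meas)                    # rename the summary column
}

# Small check on made-up numbers: run 0 is dropped, runs 1 and 2 are averaged
calcXrun2(c(0, 1, 1, 2), c(10, 1, 3, 5), FUN = mean, label = "mean")
```

Under these assumptions it could be dropped into the mutate() call as map2(sess, spd, calcXrun2, FUN = mean, label = "mean").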
This gives me a tibble that looks like this:
> test2.tb
# A tibble: 2 x 5
     id sess       spd          max mean
  <dbl> <list>     <list>     <dbl> <list>
1     1 <dbl [16]> <dbl [16]>  6.57 <df[,2] [6 × 2]>
2     2 <dbl [16]> <dbl [16]>  3.25 <df[,2] [6 × 2]>
where my mean list-column, printed on its own, is
> test2.tb$mean
[[1]]
  sess mean
1    1 4.17
2    2 6.43
3    3 2.03
4    4 5.45
5    5 3.16
6    6 6.57
[[2]]
  sess mean
1    1 1.72
2    2 1.78
3    3 0.98
4    4 2.84
5    5 2.17
6    6 1.709
I would like to print the dataframe nested in the mean column of my tibble in the appropriate facet panel in ggplot. I have a clunky way of doing it that I'd like to improve:
meanspd <- test2.tb %>% unnest(id,mean)
test2.tb %>% unnest(c(sess,spd)) %>% filter(sess > 0) %>%
  ggplot() +
  geom_freqpoly(aes(x=spd,
                    group=factor(sess),
                    color=factor(sess)),
                alpha=0.8) +
  scale_color_brewer(palette="Spectral") +
  theme_bw() +
  labs(color="run") +
  facet_grid(~ id) +
  ## max label: uses the plot data, so it is drawn once per row
  geom_text(x=5, y=2, inherit.aes=FALSE,
            aes(label=paste("max spd",max)),
            show.legend = FALSE) +
  ## per-run mean labels: sess1 and mean come from the unnested meanspd
  geom_text(data=meanspd, inherit.aes=FALSE,
            aes(label=paste("sess",sess1, " mean ",mean),
                x=5, y=1.7-0.1*sess1),
            show.legend = FALSE)
I also see that the geom_text for the non-list-column max gets overplotted (I assume as many times as I have sess levels). Is there a way to get at the levels of sess, so that I can pick out specific rows of my nested dataframe and/or limit geom_text to printing max only once?
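One direction that might address both points is to give each geom_text its own small data frame: one row per id for the max, and one row per id and run for the means. The sketch below rebuilds the toy data so that it runs on its own; the names toy, long, max_lab and mean_lab are made up for illustration, and the label positions are simply copied from my code above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)
library(ggplot2)

# Rebuild the toy data from the top of the post
set.seed(123)
sess <- c(0,0,1,1,1,2,3,3,3,3,4,5,5,5,5,6)
toy <- tibble(id = c(1, 2),
              sess = list(sess),
              spd = list(rnorm(16, mean = 3, sd = 2),
                         rnorm(16, mean = 2, sd = 1)))

# Long format with run 0 dropped: one row per measurement
long <- toy %>% unnest(c(sess, spd)) %>% filter(sess > 0)

# One row per id, so the max label is drawn once per facet
max_lab <- toy %>%
  transmute(id, max = map_dbl(spd, ~ round(max(.x, na.rm = TRUE), 2)))

# One row per id and run, so each run gets exactly one mean label
mean_lab <- long %>%
  group_by(id, sess) %>%
  summarise(mean = round(mean(spd, na.rm = TRUE), 2), .groups = "drop")

p <- ggplot(long) +
  geom_freqpoly(aes(x = spd, group = factor(sess), color = factor(sess)),
                alpha = 0.8) +
  scale_color_brewer(palette = "Spectral") +
  theme_bw() +
  labs(color = "run") +
  facet_grid(~ id) +
  geom_text(data = max_lab, inherit.aes = FALSE,
            aes(label = paste("max spd", max)),
            x = 5, y = 2) +
  geom_text(data = mean_lab, inherit.aes = FALSE,
            aes(label = paste("sess", sess, " mean ", mean),
                y = 1.7 - 0.1 * sess),
            x = 5)
```

Because max_lab and mean_lab both carry an id column, facet_grid(~ id) should route each label to its own panel. Printing p draws the plot.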
ggplot results:
Thanks in advance for any help/advice!
gj