
Caching a huge data table with knitr


I am trying to cache a big data.table and then make a plot from it. The code is as follows:

```{r gen-data, tidy=TRUE, warning=FALSE, tidy.opts=list(width.cutoff=60), cache=TRUE, cache.lazy=FALSE}
library(data.table)
# read the gzipped reference file, rename the columns, keep only the six scaffolds of interest
DT = fread("reference.txt.gz", header = FALSE)
vc = c("chromosome_1", "chromosome_2", "chromosome_3", "chromosome_4", "chromosome_5", "chromosome_6")
colnames(DT) = c("chrom", "position", "score", "corrected base", "score of the corrected base")
DT = setDT(DT, key = "chrom")[J(vc), nomatch = 0]
```
```{r, cache=TRUE, tidy=TRUE, warning=FALSE, tidy.opts=list(width.cutoff=60), dependson='gen-data'}
library(ggplot2)
library(ggtext)   # element_markdown()
library(ghibli)   # ghibli_palette()
library(plotly)   # ggplotly()
# mean score in 100 kb bins, one panel per scaffold
plot = ggplot(data = DT) + geom_line(aes(x = position, y = score, group = 1), stat = "summary_bin", fun.y = "mean", binwidth = 100000, color = ghibli_palette("MononokeMedium")[2])
ttle = paste0("coverage of the 6 longest scaffolds of Shasta + instagraal assembly")
plot = plot + labs(
  title = ttle) + theme(plot.title = element_markdown(lineheight = 1.5, size = 12), legend.text = element_markdown(size = 14))
plot = plot + theme(axis.title = element_markdown(size = 12)) + theme(axis.text.x = element_text(size = 5)) + theme(axis.text.y = element_text(size = 3))
plot = plot + theme(legend.title = element_markdown(size = 12))
p = plot + facet_wrap(~chrom, scales = "free_x") + xlab("position") + ylab("mean score per 100 Kb window")
v = ggplotly(p) %>%
  layout(
    xaxis = list(automargin = TRUE),
    yaxis = list(automargin = TRUE)
  )
v
```

What I expected was that the first chunk would read the data into a data.table, apply the relevant selection, and cache the resulting DT object, so that the second chunk could just reuse it.

However, the first chunk is re-evaluated every time I knit, no matter what. I must be doing something wrong, but I can't see what.
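
In case it helps, this is the stripped-down pattern I expected to work (the chunk labels and the final `summary()` call are just placeholders for this example):

````
```{r gen-data, cache=TRUE, cache.lazy=FALSE}
library(data.table)
# expensive step: read the gzipped file once; on later knits, knitr should
# reload DT from the cache instead of re-running fread()
DT <- fread("reference.txt.gz", header = FALSE)
```

```{r use-data, cache=TRUE, dependson='gen-data'}
# dependson='gen-data' should invalidate this chunk only when the cached
# gen-data chunk (its code or options) changes
summary(DT)
```
````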

Thanks for any help.

