
Garbage collection in an R process running inside Docker


I have a process where we currently handle a large amount of data per day, run a map-reduce style set of functions over it, and use only the output of those functions. The code sequence currently looks like the below:

lapply(start_times, function(start_time) {
  # <get_data>
  # <setofoperations>
})

So currently we loop over the start times, which lets us pull the data for a particular day, analyse it, and output data frames of results per day. The set of operations is a series of functions that each take a data frame, work on it, and return a data frame.
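
For context, a fuller skeleton of the loop looks something like the following. This is a minimal sketch: get_day_data() and run_operations() are hypothetical stand-ins for our real functions, and the explicit rm()/gc() calls are the kind of cleanup we have been experimenting with:

daily_results <- lapply(start_times, function(start_time) {
  day_data <- get_day_data(start_time)   # hypothetical: pull one day's raw data
  result <- run_operations(day_data)     # hypothetical: the chain of data frame functions
  rm(day_data)                           # drop the large intermediate explicitly
  gc()                                   # ask R to collect before the next iteration
  result                                 # keep only the per-day output data frame
})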

While running this in a Docker container with a memory limit, we often see the process run out of memory when it is dealing with large data (around 250-500 MB) over a period of several days, and R doesn't seem to be able to garbage-collect effectively.

I'm monitoring the container with cAdvisor and can see memory spikes, but I haven't been able to get a better understanding from that alone.
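
To correlate the cAdvisor spikes with what R itself thinks it is holding, I have been logging gc() output around each iteration, roughly like this (a minimal sketch; the message format is just a placeholder):

for (start_time in start_times) {
  gc(reset = TRUE)                       # reset the "max used" counters for this day
  # <get_data>
  # <setofoperations>
  usage <- gc()                          # 2 x 6 matrix of Ncells/Vcells usage after collection
  message(sprintf("%s: %.1f Mb held after gc, %.1f Mb max used",
                  start_time, sum(usage[, 2]), sum(usage[, 6])))
}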

  1. If R does a lazy gc, the process should ideally be able to reuse memory over and over; is there something happening that isn't captured by the gc process?

  2. How can an R process reclaim more memory when it's the only primary process running in the Docker container? (One pattern I'm considering is sketched after this list.)
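
On question 2, since a collected R heap is not always returned to the OS, one workaround I'm considering is running each day's work in a short-lived child R process, so that all of its memory goes back to the container when the child exits. A minimal sketch using the callr package; the helper functions are still hypothetical and would need to be available inside the child (e.g. from a package it loads), since they are not picked up from this session's global environment:

library(callr)

daily_results <- lapply(start_times, function(start_time) {
  # callr::r() starts a fresh R session, evaluates the function there,
  # returns its value, and exits, releasing the child's memory to the OS.
  callr::r(
    function(st) {
      library(ourpackage)                # hypothetical package providing the helpers
      day_data <- get_day_data(st)       # hypothetical data-fetching helper
      run_operations(day_data)           # hypothetical analysis pipeline
    },
    args = list(st = start_time)
  )
})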

