Channel: Active questions tagged r - Stack Overflow

What if the data is too big for one reducer (RHadoop)?


I'm new to big data and Hadoop. I'm trying to find a median with MapReduce. From what I know, the mappers pass all the data to one reducer, and that one reducer then sorts it and finds the middle value using the median() function.

R runs in memory, so what if the data is too big to fit in one reducer, which runs on a single computer?

Here is an example of my code to find the median with RHadoop.

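library(rmr2)  # assuming the rmr2 package, which provides mapreduce() and keyval()

# Map: tag every value with the same key so that all values go to one reducer.
map <- function(k, v) {
    key <- "median"
    keyval(key, v)
}

# Reduce: compute the median, in memory, of the values received for the key.
reduce <- function(k, v) {
    keyval(k, median(v))
}

# `random` is the input data set, created earlier (not shown here).
medianMR <- mapreduce(
    input = random, output = "/tmp/ex3",
    map = map, reduce = reduce
)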
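
To make the memory concern concrete, here is a minimal sketch in plain base R (no Hadoop or rmr2 involved) of one possible way to avoid sending every raw value to a single reducer: each chunk emits only per-bin counts, and the combined counts show which bin holds the middle-ranked value. The names `chunks`, `breaks` and `map_counts`, the value range of 0 to 100 and the bin width of 1 are all assumptions made for illustration. This is a different technique from the median() reducer above, and it only narrows the median down to a bin.

```r
# Plain-R simulation of the idea; this is not RHadoop code.
set.seed(42)
values <- runif(10001, min = 0, max = 100)       # hypothetical data set
chunks <- split(values, rep_len(1:10, 10001))    # stands in for the input splits

breaks <- seq(0, 100, by = 1)                    # assumed known range and bin width

# "Map" step: a chunk emits one count per bin instead of its raw values.
map_counts <- function(v) {
  as.vector(table(cut(v, breaks = breaks, include.lowest = TRUE)))
}

# "Reduce" step: add up the small count vectors from all chunks.
total_counts <- Reduce(`+`, lapply(chunks, map_counts))

# Find the first bin whose cumulative count reaches the middle rank.
n <- sum(total_counts)
median_bin <- which(cumsum(total_counts) >= (n + 1) / 2)[1]
approx_median <- c(lower = breaks[median_bin], upper = breaks[median_bin + 1])
print(approx_median)
```

The combining step only ever holds one count per bin, so its memory use depends on the number of bins rather than on the number of values. If an exact median is needed, a second pass could collect just the values that fall inside the reported bin, which should be a much smaller set.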
