Channel: Active questions tagged r - Stack Overflow

Pattern replacement doesn't aggregate rows with the same character strings [duplicate]


I generate reports in which I replace certain values using gsub or the %in% operator. The issue is that after several strings are replaced with the same value, the associated numeric values are not aggregated: when I then look them up with match, it picks up only the first occurrence of the repeated string and ignores the others.

Sample Code:

 o <- data.frame(branch = c('MDB','PMP','MWC'),val = c(1.1,0.9,0.75), stringsAsFactors = FALSE)
 o$branch <- gsub('MDB','Others',o$branch)
 o$branch <- gsub('PMP','Others',o$branch)

# o$branch[o$branch %in% c('MDB','PMP')] <- 'Others'

 o

#>   branch  val
#> 1 Others 1.10
#> 2 Others 0.90
#> 3    MWC 0.75

 p <- data.frame(branch = c('Others','MWC'),rev = c(1,1.25), stringsAsFactors = FALSE)
 p

#>   branch  rev
#> 1 Others 1.00
#> 2    MWC 1.25

 p$rev <- o$val[match(p$branch,o$branch)]
 p

#>   branch  rev
#> 1 Others 1.10
#> 2    MWC 0.75

As shown above, after I use gsub on the "o" data frame there are two "Others" rows, whereas I need a single "Others" row with the corresponding "val" column aggregated to (1.10 + 0.90) = 2.00. My final "p" data frame should have the value 2.00 for "Others" instead of 1.10. I ran the report a few times and got a deflated value each time. Could someone let me know how to correct this?
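
For illustration, one possible way to get the 2.00 described above is to collapse the duplicated labels before the lookup, since match returns only the position of the first matching element. This is a minimal sketch, assuming a sum per branch is the aggregation wanted; the name `o_sum` is hypothetical and not part of the original code.

```r
# Rebuild the sample data from the question
o <- data.frame(branch = c('MDB', 'PMP', 'MWC'),
                val = c(1.1, 0.9, 0.75),
                stringsAsFactors = FALSE)

# Relabel both branches in one step (the %in% variant from the question)
o$branch[o$branch %in% c('MDB', 'PMP')] <- 'Others'

# Collapse rows that now share a label: one row per branch, val summed
# (o_sum is a hypothetical name used only in this sketch)
o_sum <- aggregate(val ~ branch, data = o, FUN = sum)

p <- data.frame(branch = c('Others', 'MWC'),
                rev = c(1, 1.25),
                stringsAsFactors = FALSE)

# match() still returns only the first hit, but each label is now unique
p$rev <- o_sum$val[match(p$branch, o_sum$branch)]
p

#>   branch  rev
#> 1 Others 2.00
#> 2    MWC 0.75
```

Aggregating first keeps the match lookup unchanged. If other columns need to be carried along, or a different summary such as a mean is needed, the `aggregate` call is the line to adjust.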

