Channel: Active questions tagged r - Stack Overflow

Tidying financial data with mixed decimal and grouping digits

Context

I need to clean financial data with mixed number formats. The data was entered manually by different departments: some use "." as the decimal separator and "," as the grouping separator (e.g. US notation: $1,000,000.00), while others use "," as the decimal separator and "." as the grouping separator (e.g. the notation used in certain European countries: $1.000.000,00).

Input:

Here's a fictional example set:

df <- data.frame(Y2019= c("17.530.000,03","28000000.05", "256.000,23", "23,000", 
                           "256.355.855","2565467,566","225,4534.126") 
)
          Y2019
1 17.530.000,03
2   28000000.05
3    256.000,23
4        23,000
5   256.355.855
6   2565467,566
7  225,4534.126

Desired result:

         Y2019
1  17530000.03
2  28000000.05
3    256000.23
4        23000
5    256355855
6  2565467.566
7  2254534.126

My attempt:

I got pretty close by treating the first "," or "." found when scanning from the right as the decimal separator and stripping the remaining separators. However, some entries have no decimal part (e.g. entries 4 and 5), so this strategy misreads their last grouping separator as a decimal separator.
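
One way to extend that right-to-left idea is sketched below in base R. It is a heuristic, not a definitive parser, and it rests on assumptions the data cannot confirm: when both separators occur, the rightmost one is taken as the decimal separator; when a single kind of separator occurs more than once, it is taken as a grouping separator; and when a separator occurs exactly once, it is read as grouping only if it is preceded by 1 to 3 digits and followed by exactly 3 digits (so "23,000" becomes 23000), otherwise as decimal. The function names `tidy_one` and `tidy_amount` are made up for this illustration.

```r
# Heuristic parser for a single amount string (hypothetical helper).
tidy_one <- function(s) {
  # Drop currency symbols, spaces and anything else that is not a digit,
  # a separator or a minus sign.
  s <- gsub("[^0-9.,-]", "", s)
  seps <- regmatches(s, gregexpr("[.,]", s))[[1]]
  if (length(seps) == 0) return(as.numeric(s))

  last <- seps[length(seps)]
  if (length(unique(seps)) == 2) {
    dec <- last                      # both kinds present: rightmost is decimal
  } else if (length(seps) > 1) {
    dec <- NA_character_             # one kind, repeated: grouping only
  } else if (grepl("^-?[0-9]{1,3}[.,][0-9]{3}$", s)) {
    dec <- NA_character_             # ambiguous "23,000" case: ASSUMED grouping
  } else {
    dec <- last                      # single separator that cannot be grouping
  }

  if (is.na(dec)) {
    out <- gsub("[.,]", "", s)       # remove every separator
  } else {
    grp <- setdiff(c(".", ","), dec)
    out <- gsub(grp, "", s, fixed = TRUE)   # remove grouping separators
    out <- sub(dec, ".", out, fixed = TRUE) # normalise the decimal separator
  }
  as.numeric(out)
}

# Vectorised wrapper for a whole column.
tidy_amount <- function(x) {
  vapply(as.character(x), tidy_one, numeric(1), USE.NAMES = FALSE)
}

df <- data.frame(Y2019 = c("17.530.000,03", "28000000.05", "256.000,23", "23,000",
                           "256.355.855", "2565467,566", "225,4534.126"))
df$Y2019 <- tidy_amount(df$Y2019)
```

Note that the single-separator rule is a guess: an entry such as "23,000" could just as well mean 23.000 with three decimals, and nothing in the string itself settles that. If the department that produced each row is known, choosing the convention per department would be more reliable than inferring it per value.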

Any input is greatly appreciated!

