Channel: Active questions tagged r - Stack Overflow

How to convert incorrectly-written dates (e.g. "1990-4-31") from 'character' to 'Date' format


I have a large dataset of scraped reports. The date information is in the text of the reports, and I have extracted it into a character vector of the following format:

date_vec <- c("2001-4-31", "2000-12-31", "2003-6-31")

However, as the example shows, some of the reports contain human errors. When I try to convert the vector to "Date" format, as.Date(date_vec) doesn't work, because "2001-4-31" and "2003-6-31" are not real dates (April and June have only 30 days).
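For reference, a small sketch of how the invalid entries can be located. It assumes an explicit format string is passed to as.Date, in which case entries that do not parse should come back as NA; the variable names `parsed` and `bad` are only illustrative.

```r
date_vec <- c("2001-4-31", "2000-12-31", "2003-6-31")

# With an explicit format, entries that are not real calendar dates become NA
parsed <- as.Date(date_vec, format = "%Y-%m-%d")

# Logical index of the entries that failed to parse
bad <- is.na(parsed)

# The original strings that need fixing
date_vec[bad]
```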

I want to convert the data to "Date" format by approximating each invalid entry to the nearest Date value that makes sense, so that I get something like the following:

date_vec
[1] "2001-4-30" "2000-12-31" "2003-6-30"

Other than a brute-force way of creating a list of common mistakes and checking for them, is there a good way to do that?
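One possible approach, sketched in base R under the assumption that the only error is a day number that exceeds the length of the month: split each string into year, month, and day, compute the last valid day of that month, and cap the day at that value. The function name `clamp_to_month_end` is hypothetical, and other kinds of mistakes (a month of 13, a day of 0, malformed strings) are not handled here.

```r
# Hypothetical helper: cap an out-of-range day at the last day of its month.
# Assumes input strings of the form "YYYY-M-D" with a valid year and month.
clamp_to_month_end <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  year  <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- as.integer(vapply(parts, `[`, character(1), 2))
  day   <- as.integer(vapply(parts, `[`, character(1), 3))

  # Year and month of the following month (December rolls over to January)
  next_year  <- year + (month == 12L)
  next_month <- month %% 12L + 1L

  # Last day of the month = first day of the next month minus one day
  last_day <- as.Date(sprintf("%04d-%02d-01", next_year, next_month),
                      format = "%Y-%m-%d") - 1
  days_in_month <- as.integer(format(last_day, "%d"))

  # Cap the day, then rebuild the date from the corrected parts
  day <- pmin(day, days_in_month)
  as.Date(sprintf("%04d-%02d-%02d", year, month, day), format = "%Y-%m-%d")
}

date_vec <- c("2001-4-31", "2000-12-31", "2003-6-31")
clamp_to_month_end(date_vec)
# [1] "2001-04-30" "2000-12-31" "2003-06-30"
```

Because the month length is derived from the calendar rather than from a lookup table, leap years are covered as well: "2000-2-31" would be capped at February 29.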

