Channel: Active questions tagged r - Stack Overflow

parse dates from multiple columns with NAs and dates hidden in text


I have a data.frame with dates spread across several columns in a messy format: the column year contains years and NAs; the column date_old contains values in the format Month DD or DD (or a date range), or NAs; and the column hidden_date contains NAs or free text with a date embedded either in the format .... YYYY .... or in the format .... DD Month YYYY .... (where .... stands for arbitrary text of variable length).

An example data.frame looks like this:

df <- data.frame(year = c("1992", "1993", "1995", NA),
                 date_old = c("February 15", "October 02-24", "15", NA),
                 hidden_date = c(NA, NA, "The hidden date is 15 July 1995", "The hidden date is 2005"))

I want a single date column in the format YYYY-MM-DD, taking the first day of a date range such as 02-24 and filling unknown components with zeroes (for example, 2005-00-00 when only the year is known).

I have not had any success with parse_date_time (from lubridate) so far. The expected output is:

  year      date_old                     hidden_date        date
1 1992   February 15                            <NA>  1992-02-15
2 1993 October 02-24                            <NA>  1993-10-02
3 1995            15 The hidden date is 15 July 1995  1995-07-15
4 <NA>          <NA>         The hidden date is 2005  2005-00-00

How do I best go about this?
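
One possible way to approach this, sketched in base R rather than with parse_date_time: extract year, month and day separately with regular expressions, take each component from the first column that supplies it, and paste the pieces together. The helper names first_match and pad2 are made up for this illustration. The sketch assumes English month names spelled out in full, a four-digit year, and that a day in hidden_date always directly precedes the month name. The result is a character column, because a value such as 2005-00-00 is not a valid Date.

```r
df <- data.frame(year = c("1992", "1993", "1995", NA),
                 date_old = c("February 15", "October 02-24", "15", NA),
                 hidden_date = c(NA, NA, "The hidden date is 15 July 1995",
                                 "The hidden date is 2005"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: first regex match per element, NA where nothing matches.
first_match <- function(x, pattern) {
  out <- rep(NA_character_, length(x))
  hit <- !is.na(x) & grepl(pattern, x)
  out[hit] <- regmatches(x[hit], regexpr(pattern, x[hit]))
  out
}

# Hypothetical helper: two-digit component, "00" when unknown.
pad2 <- function(x) ifelse(is.na(x), "00", sprintf("%02d", as.integer(x)))

# Alternation of full English month names (assumption: no abbreviations).
months_rx <- paste(month.name, collapse = "|")

# Year: the year column first, otherwise a four-digit number in the text.
yr <- ifelse(is.na(df$year), first_match(df$hidden_date, "[0-9]{4}"), df$year)

# Month: date_old first, otherwise a month name found in hidden_date.
mon_name <- first_match(df$date_old, months_rx)
mon_name <- ifelse(is.na(mon_name),
                   first_match(df$hidden_date, months_rx), mon_name)
mon <- match(mon_name, month.name)

# Day: the first number in date_old (so "02-24" yields "02"),
# otherwise the number directly before a month name in hidden_date.
day <- first_match(df$date_old, "[0-9]{1,2}")
day_month <- first_match(df$hidden_date,
                         paste0("[0-9]{1,2} (", months_rx, ")"))
day <- ifelse(is.na(day), first_match(day_month, "[0-9]{1,2}"), day)

# Assemble YYYY-MM-DD, with zeroes for unknown components.
df$date <- paste(ifelse(is.na(yr), "0000", yr), pad2(mon), pad2(day), sep = "-")
```

For the example data this gives 1992-02-15, 1993-10-02, 1995-07-15 and 2005-00-00. When two columns disagree about a component, the order of the fallbacks decides which one wins, so that order would need adjusting for data where hidden_date should take precedence.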

