Channel: Active questions tagged r - Stack Overflow

Evaluate a condition in a row and return a certain value if the condition is met

I am fairly new to R and am trying to fill a column with a value chosen by a condition, which is proving rather difficult for me to figure out!

I have two data frames I'm working with. One (let's call it df1) generally has about 70,000 rows, each with a single Unix timestamp. The other (let's say df2) gives, for multiple participants, a start time and a finish time (both of which I have converted to Unix timestamps) and the name of the activity completed between them.

I have managed to filter df2 down to the participant whose data is in df1. It looks like this:

head(df2, 5)
      Name       Period.Name   Start.Time    End.Time    Unix.Start.Time    Unix.End.Time
27  Name 1          Period 1     17:59:40    18:11:00         1579075181       1579075860
53  Name 1          Period 2     18:11:59    18:15:13         1579075919       1579076114
79  Name 1          Period 3     18:17:55    18:23:22         1579076275       1579076603
96  Name 1          Period 4     18:24:58    18:31:56         1579076699       1579077116
131 Name 1          Period 5     18:37:45    18:45:30         1579077465       1579077930

and df1 looks like this:

head(df1, 20)
   data.point     Label    Timestamp     Name   dateCode
1           0   Label 1   1579075180   Name 1     200115
2           1   Label 1   1579075181   Name 1     200115
3           1   Label 1   1579075182   Name 1     200115
4           2   Label 1   1579075183   Name 1     200115
5           2   Label 1   1579075184   Name 1     200115
6           2   Label 1   1579075185   Name 1     200115
7           1   Label 1   1579075186   Name 1     200115
8           1   Label 1   1579075187   Name 1     200115
9           1   Label 1   1579075188   Name 1     200115
10          3   Label 1   1579075189   Name 1     200115
11          3   Label 1   1579075190   Name 1     200115
12          3   Label 1   1579075191   Name 1     200115
13          3   Label 1   1579075192   Name 1     200115
14          4   Label 1   1579075193   Name 1     200115
15          4   Label 1   1579075194   Name 1     200115
16          4   Label 1   1579075195   Name 1     200115
17          2   Label 1   1579075196   Name 1     200115
18          2   Label 1   1579075197   Name 1     200115
19          1   Label 1   1579075198   Name 1     200115
20          0   Label 1   1579075199   Name 1     200115

I am trying to create a new column in df1 that holds the matching period name from df2$Period.Name whenever the df1$Timestamp value falls between df2$Unix.Start.Time and df2$Unix.End.Time, so that it looks like this:

   data.point     Label    Timestamp     Name   dateCode    Period
1           0   Label 1   1579075180   Name 1     200115      Null
2           1   Label 1   1579075181   Name 1     200115  Period 1
3           1   Label 1   1579075182   Name 1     200115  Period 1
4           2   Label 1   1579075183   Name 1     200115  Period 1
5           2   Label 1   1579075184   Name 1     200115  Period 1
6           2   Label 1   1579075185   Name 1     200115  Period 1
7           1   Label 1   1579075186   Name 1     200115  Period 1
8           1   Label 1   1579075187   Name 1     200115  Period 1
9           1   Label 1   1579075188   Name 1     200115  Period 1
10          3   Label 1   1579075189   Name 1     200115  Period 1
...
1001        3   Label 1   1579075916   Name 1     200115      Null
1002        3   Label 1   1579075917   Name 1     200115      Null
1003        3   Label 1   1579075918   Name 1     200115      Null
1004        4   Label 1   1579075919   Name 1     200115  Period 2
1005        4   Label 1   1579075920   Name 1     200115  Period 2
1006        4   Label 1   1579075921   Name 1     200115  Period 2
1007        2   Label 1   1579075922   Name 1     200115  Period 2
1008        2   Label 1   1579075923   Name 1     200115  Period 2
1009        1   Label 1   1579075924   Name 1     200115  Period 2
1010        0   Label 1   1579075925   Name 1     200115  Period 2
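
Put another way, the Period column is an interval lookup: each Timestamp gets the name of the one period whose start and end enclose it, and nothing otherwise. A minimal sketch of that rule, looping over the periods, might look like the following. The miniature df1 and df2 here are made-up stand-ins with the same column names, both endpoints are assumed to be inclusive, and NA is used where the table above shows Null.

```r
# Miniature stand-ins for the real data (values chosen for illustration only)
df2 <- data.frame(
  Period.Name     = c("Period 1", "Period 2"),
  Unix.Start.Time = c(1579075181, 1579075919),
  Unix.End.Time   = c(1579075860, 1579076114),
  stringsAsFactors = FALSE
)
df1 <- data.frame(
  Timestamp = c(1579075180, 1579075181, 1579075860, 1579075900, 1579075919)
)

# Start with "no period" everywhere, then fill in one period at a time
df1$Period <- NA_character_
for (i in seq_len(nrow(df2))) {
  # TRUE for rows whose timestamp lies inside period i (endpoints assumed inclusive)
  hit <- df1$Timestamp >= df2$Unix.Start.Time[i] &
         df1$Timestamp <= df2$Unix.End.Time[i]
  df1$Period[hit] <- df2$Period.Name[i]
}

print(df1)
```

The loop runs once per row of df2, not once per row of df1, so it should stay manageable when df2 holds a handful of periods and df1 holds tens of thousands of timestamps.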

This process is run a few times a week, and each time both data frames have a different number of rows and the timestamps are different.

I have tried the ifelse function, but I haven't been able to figure out how to check each df1$Timestamp value against all of the start/end pairs in df2 and return the Period.Name from the row whose interval contains it.
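
For what it's worth, ifelse on its own compares vectors element by element, so it cannot search all of df2 for each timestamp. One possible way around that, sketched here with base R's findInterval, is to first find the last period that starts at or before each timestamp and then check the timestamp against that period's end. This assumes the periods in df2 do not overlap, treats both endpoints as inclusive, and uses made-up miniature data with the same column names; NA stands in for Null.

```r
# Miniature stand-ins for the real data (values chosen for illustration only)
df2 <- data.frame(
  Period.Name     = c("Period 1", "Period 2"),
  Unix.Start.Time = c(1579075181, 1579075919),
  Unix.End.Time   = c(1579075860, 1579076114),
  stringsAsFactors = FALSE
)
df1 <- data.frame(
  Timestamp = c(1579075180, 1579075181, 1579075860, 1579075900, 1579075919)
)

# findInterval needs the break points in increasing order
df2 <- df2[order(df2$Unix.Start.Time), ]

# Index of the last period starting at or before each timestamp (0 = before all periods)
idx <- findInterval(df1$Timestamp, df2$Unix.Start.Time)
idx[idx == 0] <- NA

# Keep the candidate period only if the timestamp is not past its end
inside <- !is.na(idx) & df1$Timestamp <= df2$Unix.End.Time[idx]
df1$Period <- ifelse(inside, df2$Period.Name[idx], NA_character_)

print(df1)
```

Because nothing here depends on the number of rows or on the particular timestamps, the same lines could be rerun on each new pair of data frames.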

Thanks in advance!