Quantcast
Channel: Active questions tagged r - Stack Overflow
Viewing all articles
Browse latest Browse all 201839

How to update a data.frame based on information from another data.frame

$
0
0

I have two tables: Display and Review. The Review table contains information on reviews on products of an online store. Each row represents the date of the review as well as the cumulative number of reviews and the average rating for the product up to the date.

page_id<-c("1072659", "1072659" , "1072659","1072650","1072660","1072660")  
review_id<-c("1761023","1761028","1762361","1918387","1761427","1863914")
date<-as.Date(c("2013-07-11","2013-08-12","2014-07-15","2014-09-10","2013-07-27","2014-08-12"),format = "%Y-%m-%d")
cumulative_No_reviews<-c(1,2,3,1,1,2)
average_rating<-c(5,3.5,4,3,5,5)
Review<-data.frame(page_id,review_id,date,cumulative_No_reviews,average_rating)
page_id        review_id          date    cumulative_No_reviews   average_rating
1072659          1761023        2013-07-11      1                       5
1072659          1761028        2013-08-12      2                       3.5
1072659          1762361        2014-07-15      3                       4
1072650          1918387        2014-09-10      1                       3
1072660          1761427        2013-07-27      1                       5
1072660          1863914        2014-08-12      2                       5

The Display table captures the data on customers’ visit to product pages.

page_id<-c("1072659","1072659","1072659","1072650","1072650","1072660","1072660","1072660")
date<-as.Date(c("2013-07-10","2013-08-03","2015-02-11","2014-08-10","2014-09-09","2013-08-12","2014-09-12","2015-08-12"),format = "%Y-%m-%d")
Display<-data.frame(page_id,date)
page_id         date        
1072659     2013-07-10      
1072659     2013-08-03      
1072659     2015-02-11      
1072650     2014-08-10  
1072650     2014-09-09      
1072660     2013-08-12      
1072660     2014-09-12      
1072660     2015-08-12      

I’d like to add two column to the Display table (call it Display2) in a way that it reflects the latest review information up the point of visit for each product, as follows:

page_id<-c("1072659","1072659","1072659","1072650","1072650","1072660","1072660","1072660")
date<-as.Date(c("2013-07-10","2013-08-03","2015-02-11","2014-08-10","2014-09-09","2013-08-12","2014-09-12","2015-08-12"),format = "%Y-%m-%d")
cumulative_No_reviews<-c(0,1,3,0,0,1,2,2)
average_rating<-c(NA,5,4,NA,NA,5,5,5)
Display2<-data.frame(page_id,date,cumulative_No_reviews,average_rating)
 page_id            date        cumulative_No_reviews   average_rating
 1072659        2013-07-10                 0                NA
 1072659        2013-08-03                 1                5
 1072659        2015-02-11                 3                4
 1072650        2014-08-10                 0                NA
 1072650        2014-09-09                 0                NA
 1072660        2013-08-14                 1                5
 1072660        2014-09-11                 2                5
 1072660        2015-08-12                 2                5

I would appreciate your help with this.


Viewing all articles
Browse latest Browse all 201839

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>