Quantcast
Channel: Active questions tagged r - Stack Overflow
Viewing all articles
Browse latest Browse all 205466

Find next 2 rows with same value and paste rows into new columns behind original entry with dplyr

$
0
0

This is the first time I am posting a question here, so please be gentle :)
I have a data frame with goal and corner statistics from the English football/soccer league (Premier League) with one game per row. head(premierleague) will give you something like this (made up data):

| Home          | Home_goals    | Away          | Away_goals    | Home_Corners  | Away_Corners  |
|------------   |------------   |-----------    |------------   |-------------- |-------------- |
| Tottenham     | 1             | Arsenal       | 0             | 5             | 2             |
| Man United    | 2             | Watford       | 1             | 7             | 4             |
| Man City      | 3             | West Ham      | 0             | 10            | 2             |
| Chelsea       | 2             | Arsenal       | 1             | 7             | 6             |
| Tottenham     | 4             | Norwich       | 1             | 6             | 0             |
| Man United    | 2             | Liverpool     | 2             | 4             | 7             |
| Tottenham     | 0             | Man City      | 2             | 3             | 8             |

I would like to find for each entry in the column Home (first one is in this case Tottenham) the next 2 matching entries (rows 5 and 7) and paste them into new columns in row 1.
I want to do this for each row in my data frame and also keep all rows. I just want to add the statistics of the next two games as new columns:
Home_2
Home_goals_2
Away_2 and so on.

I honestly have no idea how to even google this and as far as my experience goes with stackoverflow, I am sure some of you will solve that within minutes :) Any help is highly appreciated.

Thanks so much in advance
Philipp

Edit:

Don't really know if I can attach something here, but the dataframe is this:

premierleague <- data.frame("Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"), 
                            "Home_goals" = c(1,2,3,2,4,2,0), 
                            "Away" = c ("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"), 
                            "Away_goals" = c(0,1,0,1,1,2,2), 
                            "Home_corners" = c(5,7,10,7,6,4,3), 
                            "Away_corners" = c(2,4,2,6,0,7,8))

### The desired result looks like this

premierleague_new <- data.frame(
  "Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"), 
  "Home_goals" = c(1,2,3,2,4,2,0), 
  "Away" = c("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"), 
  "Away_goals" = c(0,1,0,1,1,2,2), 
  "Home_corners" = c(5,7,10,7,6,4,3), 
  "Away_corners" = c(2,4,2,6,0,7,8),
  "Home_goals_2" = c(4,2,NA,NA,0,NA,NA),
  "Away_2" = c("Norwich", "Liverpool",NA,NA,"ManCity",NA,NA),
  "Away_goal_2" = c(1,2,NA,NA,2,NA,NA),
  "Home_corn_2" = c(6,4,NA,NA,3,NA,NA),
  "Away_corn_2" = c(0,7,NA,NA,8,NA,NA),
  "Home_goal_3" = c(0,NA,NA,NA,NA,NA,NA),
  "Away_3" = c("ManCity",NA,NA,NA,NA,NA,NA),
  "Away_goal_3" = c(2,NA,NA,NA,NA,NA,NA),
  "Home_corners_3" = c(3,NA,NA,NA,NA,NA,NA),
  "Away_corners_3" = c(8,NA,NA,NA,NA,NA,NA)
                                 )

Tottenham is the only team with all in all 3 games, so all columns are filled for Tottenham in row 1.

The 2nd entry for Tottenham in row 5 has only values for the second game because there is only one below that with Tottenham as home team in this example.

I hope, it's clearer now. Should be at least reproduceable.


Viewing all articles
Browse latest Browse all 205466

Trending Articles