Channel: Active questions tagged r - Stack Overflow
R: Test for overlap of name values in dataframe

I have a dataframe filled with names.

For a given row in the dataframe, I'd like to compare that row to every row above it in the df and determine whether the number of matching names is less than or equal to 4 for each of those rows.

Toy example, where row 3 is the row of interest:

  1. "Jim","Dwight","Michael","Andy","Stanley","Creed"

  2. "Jim","Dwight","Angela","Pam","Ryan","Jan"

  3. "Jim","Dwight","Angela","Pam","Creed","Ryan" <--- row of interest

So first we'd compare row 3 to row 1 and see that the name overlap is 3 (Jim, Dwight, Creed), which meets the <= 4 criterion.

Then we'd compare row 3 to row 2 and see that the name overlap is 5 (Jim, Dwight, Angela, Pam, Ryan), which fails the <= 4 criterion. Row 3 therefore fails the overall condition of having an overlap <= 4 with every row above it.
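
To make the comparison concrete, here is a minimal sketch of the check for a single row of interest. The column names V1..V6 and the helper overlap_with_rows_above() are made up for illustration, and the sketch assumes every column is a character column.

```r
# Toy data from the example above; the column names V1..V6 are assumed.
df <- data.frame(
  V1 = c("Jim", "Jim", "Jim"),
  V2 = c("Dwight", "Dwight", "Dwight"),
  V3 = c("Michael", "Angela", "Angela"),
  V4 = c("Andy", "Pam", "Pam"),
  V5 = c("Stanley", "Ryan", "Creed"),
  V6 = c("Creed", "Jan", "Ryan"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of distinct names that row i shares
# with each row above it (rows 1 .. i - 1).
overlap_with_rows_above <- function(df, i) {
  m <- as.matrix(df)
  vapply(
    seq_len(i - 1),
    function(j) length(intersect(m[i, ], m[j, ])),
    integer(1)
  )
}

overlaps <- overlap_with_rows_above(df, 3)
overlaps            # 3 5
all(overlaps <= 4)  # FALSE, because the overlap with row 2 is 5
```

Note that intersect() counts each distinct name once, so a name repeated within a row would not inflate the overlap.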

Right now I am doing this operation using a for loop, but it is much too slow for the size of dataframe I am working with.
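
One possible way to avoid the explicit loop, sketched under assumptions rather than offered as a definitive answer: build a 0/1 row-by-name incidence matrix and use tcrossprod() to get all pairwise overlap counts at once. The helper passes_overlap_check() and the column names are hypothetical. The approach allocates an n x n matrix, so for very large dataframes it may need to be applied in blocks of rows.

```r
# Toy data from the question; the column names V1..V6 are assumed.
df <- data.frame(
  V1 = c("Jim", "Jim", "Jim"),
  V2 = c("Dwight", "Dwight", "Dwight"),
  V3 = c("Michael", "Angela", "Angela"),
  V4 = c("Andy", "Pam", "Pam"),
  V5 = c("Stanley", "Ryan", "Creed"),
  V6 = c("Creed", "Jan", "Ryan"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row, TRUE when it shares at most
# max_overlap distinct names with every row above it.
passes_overlap_check <- function(df, max_overlap = 4) {
  m <- as.matrix(df)

  # counts[i, k] = how often name k appears in row i
  counts <- unclass(table(row(m), m))

  # 0/1 incidence matrix, so a name repeated in a row counts once
  inc <- (counts > 0) * 1

  # shared[i, j] = number of distinct names rows i and j have in common
  shared <- tcrossprod(inc)

  # Keep only comparisons with rows above (j < i)
  shared[upper.tri(shared, diag = TRUE)] <- 0

  apply(shared, 1, max) <= max_overlap
}

result <- passes_overlap_check(df)
unname(result)  # TRUE TRUE FALSE
```

The first row has nothing above it, so it passes trivially. Whether this is fast enough depends on the number of rows and distinct names, which the question does not state.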
