Channel: Active questions tagged r - Stack Overflow

Is there a way to optimize this code, as this is taking hours to execute [closed]

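# cdata has 99653 rows, sdata has 3226 rows.
# Default every row to "Customer"; a row becomes "Supplier" on its first match.
cdata$category <- "Customer"
cdata$su_name <- "Null"

for (i in seq_len(nrow(cdata))) {
  for (j in seq_len(nrow(sdata))) {
    # fixed = TRUE treats the dialed digits as a literal substring, not a regex
    if (grepl(cdata$LegDigitsDialed[i], sdata$SavedPhone[j], fixed = TRUE)) {
      cdata$category[i] <- "Supplier"
      cdata$su_name[i] <- sdata$sushortname[j]
      break  # stop here so a later non-matching row cannot overwrite the match
    }
  }
}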

I have two data frames, and I want to categorize each element of a column in the first one based on whether it appears, as a partial string match, in a column of the second one.

My data looks like this:

> cdata
LegDigitsDialed
"a"
"b"
"c"

> sdata
SavedPhone
"aa"
"c"

What I want is:

LegDigitsDialed     category
"a"                 "Supplier"
"b"                 "Customer"
"c"                 "Supplier"
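
To make the example reproducible, the toy data can be built and checked as below. This is only a sketch: it assumes both columns are character vectors and that "presence" means a literal substring match (hence `fixed = TRUE`). The name `is_supplier` is made up for illustration.

```r
# Toy versions of the two data frames (columns assumed to be character)
cdata <- data.frame(LegDigitsDialed = c("a", "b", "c"), stringsAsFactors = FALSE)
sdata <- data.frame(SavedPhone = c("aa", "c"), stringsAsFactors = FALSE)

# is_supplier (hypothetical name): TRUE when the dialed value occurs as a
# literal substring of at least one saved phone
is_supplier <- vapply(
  cdata$LegDigitsDialed,
  function(d) any(grepl(d, sdata$SavedPhone, fixed = TRUE)),
  logical(1),
  USE.NAMES = FALSE
)

cdata$category <- ifelse(is_supplier, "Supplier", "Customer")
print(cdata)
```

On the toy data this labels "a" and "c" as Supplier and "b" as Customer, which matches the table above.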

So basically my pseudo code is:

for (i = 1; i < 100000; i++)
  for (j = 1; j < 3500; j++)
      {
        if (SavedPhone[j] contains LegDigitsDialed[i]) // partial string matching
            populate a different column of row i with some value
        else
            populate a different column of row i with some other value
      }
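
One way the inner loop might be collapsed is to let `grepl` scan all saved phones at once for each dialed value, so only the outer loop remains. The sketch below assumes literal substring matching and that the first matching supplier should win. The function `categorize`, its argument names, and the supplier names "S1" and "S2" are hypothetical, not part of my real data.

```r
# Hypothetical helper: for each dialed value, find the first saved phone that
# contains it as a literal substring, then derive category and supplier name.
categorize <- function(dialed, saved, supplier_names) {
  # Index of the first matching saved phone, or NA when nothing matches
  first_hit <- vapply(
    dialed,
    function(d) which(grepl(d, saved, fixed = TRUE))[1],
    integer(1),
    USE.NAMES = FALSE
  )
  data.frame(
    category = ifelse(is.na(first_hit), "Customer", "Supplier"),
    su_name = ifelse(is.na(first_hit), "Null", supplier_names[first_hit]),
    stringsAsFactors = FALSE
  )
}

# Usage on the toy data, with made-up supplier names
res <- categorize(c("a", "b", "c"), c("aa", "c"), c("S1", "S2"))
print(res)
```

With the real data the call would presumably be `categorize(cdata$LegDigitsDialed, sdata$SavedPhone, sdata$sushortname)`. If many dialed numbers repeat, which is an assumption about the data, running it on the unique values and mapping back with `match()` would avoid redundant work.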

This R script has been running for over 24 hours now, and only one third of the records have been processed. Is there any way to optimize this code?

