Channel: Active questions tagged r - Stack Overflow

Function to loop through partial dependence command in R?

I'm trying to write a function that will calculate the partial dependence of all the variables in a model and store them in a data frame. But I'm new to loops in R and I'm not sure how to achieve this. Below is some example code to explain what I'm trying to achieve.

Setting up the model:

x1 <- rnorm(100,0,1)
x2 <- rnorm(100,0,1)
x3 <- rnorm(100,0,1)
x4 <- rnorm(100,0,1)
x5 <- rnorm(100,0,1)
y <- x1*100 + x2*10
df <- data.frame(x1,x2,x3,x4,x5,y)

library(randomForest)
rf <- randomForest(y~., data=df)

Then I'm using the pdp package in R to calculate the partial dependence (PD). What I'm trying to achieve is a function that calculates the PD for each variable and then stores those values in a data frame. For example, if I were to calculate the PD for each variable manually, I would do something like this:

library(pdp)
pdp  <- partial(rf, pred.var = "x1")
pdp2 <- partial(rf, pred.var = "x2")

:
 etc           
:
pdp5 <- partial(rf, pred.var = "x5")

and then create a data frame of the values and all the y-hats, like so:

pdpDF <- data.frame(pdp,pdp2,...,pdp5)

But I would like to automate the process, and I'm not sure how to do this in R. Very naively, I would say it would look something like this:

xVars <- df[-6] # remove y
for (i in 1:length(xVars)) {
  pdpValues <- partial(rf, pred.var = xVars[i]) # calc pdp for each variable
  pdpVal <- cbind(all the pdpValues for each variable) # column bind all the values
  pdpDF <- data.frame(pdpVal) # create df
}

but I have no idea how to make this work. Any suggestions?
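One possible way to do this, as a minimal sketch rather than a definitive answer: loop over the predictor names (not the columns themselves) with lapply(), call partial() once per name, and bind the results afterwards. The sketch assumes that partial() returns a data frame with one column named after the predictor and one named yhat, and it passes train = df explicitly so that the training data does not have to be recovered from the model call. The names pdpList and pdpWide and the column names variable, value and yhat_<name> are made up for illustration.

```r
library(randomForest)
library(pdp)

# Same setup as in the question (seed added only for reproducibility)
set.seed(1)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100),
                 x4 = rnorm(100), x5 = rnorm(100))
df$y <- df$x1 * 100 + df$x2 * 10
rf <- randomForest(y ~ ., data = df)

# Predictor names as a character vector, excluding the response
xVars <- setdiff(names(df), "y")

# One partial dependence data frame per predictor, kept in a named list
pdpList <- lapply(xVars, function(v) partial(rf, pred.var = v, train = df))
names(pdpList) <- xVars

# Long format: one row per (variable, grid value) pair
pdpDF <- do.call(rbind, lapply(xVars, function(v) {
  pd <- pdpList[[v]]
  data.frame(variable = v, value = pd[[v]], yhat = pd$yhat)
}))

# Wide format: grid values and yhat side by side for each predictor.
# This only works if every predictor produces the same number of grid rows.
pdpWide <- do.call(cbind, lapply(xVars, function(v) {
  pd <- pdpList[[v]]
  setNames(data.frame(pd[[v]], pd$yhat), c(v, paste0("yhat_", v)))
}))

head(pdpDF)
```

The long format is usually easier to plot or filter by variable; the wide format is closer to the data.frame(pdp, pdp2, ..., pdp5) layout described above, but it breaks if the grids differ in length (for example with a categorical predictor).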

