Channel: Active questions tagged r - Stack Overflow

R: automatically compute a data matrix with log and polynomial features

Background

I'm writing my own function. It takes a data frame as input, with a variable number of features. The features' types may also differ, e.g. numeric, factor and chr.

I want to maximise my likelihood function, which is defined on an extended data matrix containing each feature's log transformation and all terms up to quadratic order, e.g. the columns intercept + log(feature1) + feature1 + feature1^2 + log(feature2) + ... + feature1*feature2 + ... + feature_{n-1}*feature_n

Take the built-in dataset iris as an example:

Code:

str(iris)

Out:

'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

As we can see, the first 4 features, from Sepal.Length to Petal.Width, are numeric. I want to build a model on their log transformations and terms up to quadratic order, so I want to output a data matrix like the following:

Code:

colnames(model.matrix(
  ~ 1 +
    log(Sepal.Length) + poly(Sepal.Length, degree = 2) +
    log(Sepal.Width)  + poly(Sepal.Width, degree = 2) +
    log(Petal.Length) + poly(Petal.Length, degree = 2) +
    log(Petal.Width)  + poly(Petal.Width, degree = 2) +
    Sepal.Length * Sepal.Width + Sepal.Length * Petal.Length +
    Sepal.Length * Petal.Width + Sepal.Width * Petal.Length +
    Sepal.Width * Petal.Width  + Petal.Length * Petal.Width,
  data = iris
))

Out:

 [1] "(Intercept)"                     "log(Sepal.Length)"
 [3] "poly(Sepal.Length, degree = 2)1" "poly(Sepal.Length, degree = 2)2"
 [5] "log(Sepal.Width)"                "poly(Sepal.Width, degree = 2)1"
 [7] "poly(Sepal.Width, degree = 2)2"  "log(Petal.Length)"
 [9] "poly(Petal.Length, degree = 2)1" "poly(Petal.Length, degree = 2)2"
[11] "log(Petal.Width)"                "poly(Petal.Width, degree = 2)1"
[13] "poly(Petal.Width, degree = 2)2"  "Sepal.Length"
[15] "Sepal.Width"                     "Petal.Length"
[17] "Petal.Width"                     "Sepal.Length:Sepal.Width"
[19] "Sepal.Length:Petal.Length"       "Sepal.Length:Petal.Width"
[21] "Sepal.Width:Petal.Length"        "Sepal.Width:Petal.Width"
[23] "Petal.Length:Petal.Width"

The problem

The problem is that typing out a formula with poly by hand is not practical, especially when we have hundreds of features! My function should process the data frame automatically.

I know model.matrix can expand my original dataset, such as iris. It can even deal with factor and chr features automatically, converting them to dummy variables. And poly can expand a feature to higher orders, but it does not provide log transformations.
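To make the two behaviours just described concrete, here is a minimal sketch on iris. It only illustrates what model.matrix and poly already do; the object names mm and p are arbitrary.

```r
# A factor column is expanded into dummy (treatment-contrast) columns.
mm <- model.matrix(~ Sepal.Length + Species, data = iris)
colnames(mm)
# "(Intercept)" "Sepal.Length" "Speciesversicolor" "Speciesvirginica"

# poly() returns one column per degree (orthogonal by default),
# but nothing resembling a log term.
p <- poly(iris$Sepal.Length, degree = 2)
dim(p)
# 150 2
```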

My question is how to get the log transformation and all terms up to quadratic order for any given data frame automatically. I want to share my new model with others, so my function should work on other people's data sets as well.
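One possible direction, sketched under assumptions rather than offered as a definitive answer: build the formula as a string from the column names and hand it to model.matrix. The helper name build_design_matrix is hypothetical. The sketch assumes every numeric column is strictly positive (so log is finite), has no missing values, and has at least three distinct values (so poly with degree 2 works).

```r
# Hypothetical helper (illustrative name, not an existing API).
# Assumes numeric columns are strictly positive and free of NAs.
build_design_matrix <- function(df) {
  is_num     <- vapply(df, is.numeric, logical(1))
  num_vars   <- names(df)[is_num]
  other_vars <- names(df)[!is_num]

  # Backtick-quote names so non-syntactic column names still parse.
  q <- function(x) sprintf("`%s`", x)

  terms <- character(0)
  if (length(num_vars) > 0) {
    terms <- c(terms,
               sprintf("log(%s)", q(num_vars)),
               sprintf("poly(%s, degree = 2)", q(num_vars)))
  }
  if (length(num_vars) > 1) {
    # Pairwise products only (":"), without re-adding raw main effects.
    pairs <- combn(num_vars, 2)
    terms <- c(terms, sprintf("%s:%s", q(pairs[1, ]), q(pairs[2, ])))
  }
  # Factor and chr columns are dummy-coded by model.matrix itself.
  terms <- c(terms, q(other_vars))

  f <- as.formula(paste("~", paste(terms, collapse = " + ")))
  model.matrix(f, data = df)
}

X <- build_design_matrix(iris)
dim(X)
```

Note that this sketch uses ":" rather than "*" for the cross terms. With "*", model.matrix also adds the raw main-effect columns (as in the output above), and each of those is a linear combination of the intercept and the first poly column of the same feature, which may matter when maximising a likelihood.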

