Channel: Active questions tagged r - Stack Overflow

Why is a keras neural network model with one output neuron so much better than the same network but with two output neurons?


Using the code below, I train two networks. The networks are identical except that the second one (model2) has two output neurons instead of one. The target for the first network (model1) is the (log-transformed) sale price of some houses; the target for the second network (model2) is that same sale price, supplied twice, once per output.

library(keras)
library(deepviz)
library(caret)

data(Sacramento)

x_train <- as.matrix(Sacramento[, c("beds", "baths", "sqft")])
x_train <- scale(x_train)          # standardize the predictors
y_train <- log(Sacramento$price)   # target: log sale price

# model1: one output neuron
input  <- layer_input(shape = 3)
hidden <- layer_dense(input, units = 4, activation = "sigmoid", use_bias = TRUE)
output <- layer_dense(hidden, units = 1, activation = "linear", use_bias = FALSE)

model1<-keras_model(inputs=input, outputs=output)

model1 %>% compile(
  optimizer = "rmsprop",
  loss = 'mse',
  metrics = c('mean_squared_error')
)

model1 %>% fit(
  x_train,
  y_train,
  epochs = 100,
  batch_size = 10,
  validation_split = 0.2
)

# model2: two output neurons
input  <- layer_input(shape = 3)
hidden <- layer_dense(input, units = 4, activation = "sigmoid", use_bias = FALSE)
outputs <- list(
  layer_dense(hidden, units = 1, activation = "linear", use_bias = FALSE),
  layer_dense(hidden, units = 1, activation = "linear", use_bias = FALSE)
)

model2<-keras_model(inputs=input, outputs=outputs)

model2 %>% compile(
  optimizer = "rmsprop",
  loss = 'mse',
  metrics = c('mean_squared_error')
)

model2 %>% fit(
  x_train,
  list(y_train, y_train),
  epochs = 100,
  batch_size = 10,
  validation_split = 0.2
)
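
One detail worth keeping in mind when reading the logs below: with multiple outputs, Keras reports `loss` as the weighted sum of the per-output losses, with all weights defaulting to 1. The weighting can be set explicitly at compile time; a minimal sketch (the 0.5 values are just an illustration, not part of the runs above):

```r
# Average, rather than sum, the two per-output MSE losses:
model2 %>% compile(
  optimizer = "rmsprop",
  loss = "mse",
  loss_weights = list(0.5, 0.5),  # default is 1 per output, i.e. a plain sum
  metrics = c("mean_squared_error")
)
```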

The result of the first network (model1) is:

# loss: 0.1118 - mean_squared_error: 0.1118 - val_loss: 0.1124 - val_mean_squared_error: 0.1124

The result of the second network (model2) is:

# loss: 0.8307 - dense_84_loss: 0.4268 - dense_85_loss: 0.4115 - dense_84_mean_squared_error: 0.4223 - dense_85_mean_squared_error: 0.4084 - val_loss: 0.9867 - val_dense_84_loss: 0.4844 - val_dense_85_loss: 0.4878 - val_dense_84_mean_squared_error: 0.4918 - val_dense_85_mean_squared_error: 0.4950

Why is the performance of model2 substantially worse than that of model1? Shouldn't both models perform approximately the same? Even allowing for the fact that model2's reported `loss` is the sum of the two per-output losses, each output's MSE (about 0.42) is still roughly four times model1's (about 0.11).

Here are the network weights of model2 (obtained with `get_weights(model2)`):

[[1]]
          [,1]       [,2]      [,3]        [,4]
[1,]  0.320372 -0.1731332 0.5624840 -0.47232226
[2,] -2.519146  0.2757542 0.8330284 -0.01051062
[3,] -1.838133  0.4940852 0.7297845  0.37787062

[[2]]
         [,1]
[1,] 5.708315
[2,] 6.818939
[3,] 5.347315
[4,] 6.232662

[[3]]
         [,1]
[1,] 6.026654
[2,] 5.976376
[3,] 6.360726
[4,] 5.751414

Obviously, the connection weights into the two output neurons ([[2]] and [[3]] above) are quite different. Why is that? Shouldn't they be approximately identical, given that both outputs are trained on the same target?
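
As an aside, each `layer_dense()` call draws its own random initial weights, so the two output layers never start from the same point. A sketch of forcing both output layers to start identically, building on the `hidden` layer defined above (the seed value is arbitrary, and this assumes the seeded initializer produces the same values on each call):

```r
init <- initializer_glorot_uniform(seed = 42)  # shared, seeded initializer
outputs <- list(
  layer_dense(hidden, units = 1, activation = "linear",
              use_bias = FALSE, kernel_initializer = init),
  layer_dense(hidden, units = 1, activation = "linear",
              use_bias = FALSE, kernel_initializer = init)
)
```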

