Channel: Active questions tagged r - Stack Overflow

Kmeans: Wrong size of clusters


I am running the k-means algorithm in R on the Heart Disease UCI dataset. I expect to get 2 clusters of sizes 138 and 165, matching the class sizes in the dataset's target column.

Steps:

  1. Store the dataset in a data frame:
df <- read.csv(".../heart.csv", fileEncoding = "UTF-8-BOM")
  2. Extract the features:
features <- subset(df, select = -target)
  3. Normalize them:
# Min-max scaling: map each column to the range [0, 1]
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}

features <- data.frame(sapply(features, normalize))
  4. Run the algorithm:
set.seed(0)
cluster <- kmeans(features, 2)
cluster$size

Output:

[1]  99 204
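
To see how the clusters line up with the labels, the cluster assignments can be cross-tabulated against the target column. Below is a minimal sketch of that check. It assumes a made-up data frame in place of heart.csv, so the `toy` data, its column names `a` and `b`, and the `nstart = 25` setting are illustrative assumptions, not part of my actual run:

```r
# Toy stand-in for heart.csv: two numeric features plus a binary target.
# The data and column names are invented for illustration only.
set.seed(0)
toy <- data.frame(
  a = c(rnorm(20, mean = 0), rnorm(30, mean = 5)),
  b = c(rnorm(20, mean = 0), rnorm(30, mean = 5)),
  target = rep(c(0, 1), times = c(20, 30))
)

# Min-max scale each feature column to [0, 1], as in step 3.
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
toy_features <- data.frame(lapply(subset(toy, select = -target), normalize))

# nstart > 1 reruns k-means from several random starts and keeps the best fit.
toy_cluster <- kmeans(toy_features, centers = 2, nstart = 25)

# Rows are cluster ids, columns are the known labels.
# Cluster ids are arbitrary, so cluster 1 need not correspond to target 0.
confusion <- table(cluster = toy_cluster$cluster, target = toy$target)
print(confusion)

# Label counts, i.e. the sizes one might expect the clusters to have.
print(table(toy$target))
```

On the real data the equivalent call would be `table(cluster$cluster, df$target)`. Note that k-means never sees the target column, so nothing in the algorithm ties the cluster sizes to the label counts.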

Why do the cluster sizes differ from the class sizes in the dataset?

