I'm trying to use simulations to draw samples of size 1 to 1000 from a population with a mean of 50 and a standard deviation of 10, calculate the mean of each sample, and make a plot that shows how that mean changes as the sample size increases. Based on that plot, is the sample mean a biased or an unbiased estimate of the population mean?
Below is what I have done:
# uncorrected (population) standard deviation: divides by n, not n - 1
sd_uncorrected <- function(x) {
  return(sqrt(sum((x - mean(x))^2) / length(x)))
}
population <- rnorm(n = 1000, mean = 50, sd = 10)
population_mean <- mean(population)
population_std <- sd_uncorrected(population)
paste('population mean=',population_mean)
paste('population std = ', population_std)
sample_size <- 1000 # how many elements we want to sample
sample_n <- sample(population, size = sample_size, replace = FALSE)
sample_n
mean(sample_n)
sd_uncorrected(sample_n)
n_experiments <- 1000 # we will sample 1000 times
sample_size <- 10 # how many elements we want to sample?
sample_means <- c()
library(ggplot2)
sample_means_df <- data.frame(means=sample_means)
ggplot(sample_means_df, aes(x = means)) + geom_histogram() +
  geom_vline(xintercept = population_mean, color = 'red') + # population mean
  geom_vline(xintercept = mean(sample_means_df$means), color = 'black') # mean of the sample means
I'm getting the following error message and I don't know what to do about it. Can someone please help me?
Error in FUN(X[[i]], ...) : object 'means' not found
In addition: Warning message:
In mean.default(sample_means_df$means) :
argument is not numeric or logical: returning NA
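For what it's worth, the error looks like it comes from sample_means never being filled: it is created with c() (which is NULL), and data.frame() drops NULL arguments, so the data frame has no means column for aes() to find. Below is a minimal sketch of one way to fill it and plot the mean against the sample size. It rests on a few assumptions that are not in the code above: a fixed seed, one sample per sample size, drawing directly with rnorm() instead of from the 1000-element population vector, and the column name n.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the run is reproducible

# c() is NULL, and data.frame() drops NULL arguments,
# so no 'means' column is ever created
empty_df <- data.frame(means = c())
ncol(empty_df)  # 0

# assumption: one sample for each size from 1 to 1000
sample_sizes <- 1:1000

# draw a sample of size n from N(mean = 50, sd = 10) and keep its mean
sample_means <- sapply(sample_sizes, function(n) {
  mean(rnorm(n, mean = 50, sd = 10))
})

# 'n' is an illustrative column name, not something from the original code
sample_means_df <- data.frame(n = sample_sizes, means = sample_means)

# sample mean against sample size, with the true mean as a reference line
p <- ggplot(sample_means_df, aes(x = n, y = means)) +
  geom_point(alpha = 0.4) +
  geom_hline(yintercept = 50, color = "red") +
  labs(x = "sample size", y = "sample mean")
print(p)
```

In a plot like this the points should scatter on both sides of the red line at every sample size and tighten around it as n grows, which is the pattern you would expect from an unbiased estimate.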