2024-04-04 Update
- MLE_theta (the estimator theta-hat) is a RV because X is a RV. One realization of the data gives the point estimate.
- It's OK that we only get one realization of MLE_theta. Theoretically, we study how MLE_theta would behave if we repeated the experiment; that is, we are more interested in E(MLE_theta). Since bias(MLE_theta) = E(MLE_theta - true theta) = E(MLE_theta) - true theta = 0 here (in the normal-mean model the MLE is the sample mean, which is exactly unbiased; in general the MLE is only asymptotically unbiased), we can trust the MLE_theta from one realization: on average, the MLE estimator recovers the true theta. See the sketch after this list.
- see mle.R in the static/R folder
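For the normal-mean model used in mle.R, this is easy to verify in closed form: maximizing the log-likelihood over mu gives the sample mean, so E(MLE_theta) = E(mean(X)) = mu. A minimal sketch, assuming the same N(mu, 1) setup as the code below:

set.seed(1017)
x_one <- rnorm(100, mean = 1) ## one realization, same setup as the code below
mean(x_one)                   ## closed-form MLE of mu = the sample mean
## the grid search below should land near this value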
2024-04-11 Update
- Fisher information is defined from the log-likelihood of theta with 1 observation. The log-likelihood function values change because X is a RV: X changes -> different log-likelihood values -> different score values (the first derivative) at every theta -> Fisher information takes the expectation of (score)^2 over X at a given theta (i.e., theta is held constant). So we can compute the Fisher information at different thetas, e.g., FI(theta = 1), FI(theta = 2). Observed information refers to one realization of X.
- The Cramer-Rao inequality gives a lower bound on the variance of any unbiased estimator: Var(theta_hat) >= 1/(n * FI(theta)). The MLE attains this bound asymptotically, i.e., it has the smallest variance achievable by an unbiased estimator in large samples. See the numerical check after this list.
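A quick numerical check of both bullets, assuming the N(mu, 1) model used below: the score of one observation is d/dmu log f(x; mu) = x - mu, so FI(mu) = E[(X - mu)^2] = 1 at every mu, and the Cramer-Rao bound for an unbiased estimator from n iid observations is 1/(n * FI(mu)) = 1/n. A minimal Monte Carlo sketch (the seed and sample sizes are illustrative, not from the original notes):

set.seed(42)                        ## illustrative seed
mu_true <- 1
x_pop <- rnorm(1e6, mean = mu_true) ## many draws from the population
score <- x_pop - mu_true            ## score of one observation, evaluated at the true mu
mean(score^2)                       ## Monte Carlo FI(mu); should be close to 1
1 / (100 * mean(score^2))           ## Cramer-Rao bound for n = 100; about 0.01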
set.seed(1017)
## population mean mu = 1; draw one sample x
mu <- 1; n <- 100; x <- rnorm(n, mean = mu)
library(dplyr)
## mu candidates
mu_list <- seq(-2, 2, 0.01)
## likelihood values for the observed data x
likelihood_value <- sapply(mu_list, FUN = function(mu_cand){
  ## likelihood value under candidate mean mu_cand
  dnorm(x,             ## the observed data
        mean = mu_cand ## the candidate mu (x-axis of the likelihood plot)
  ) %>%
    prod() ## x1, x2, ..., xn are iid, so multiply all pdfs to get the likelihood
})
likelihood_dat <- data.frame(
  theta = mu_list,
  likelihood_value = likelihood_value
)
## theta is mu
MLE_theta <- likelihood_dat$theta[which.max(likelihood_dat$likelihood_value)]
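## A numerically safer variant (a sketch, not part of the original mle.R):
## multiplying many small densities can underflow, so work on the
## log scale; the log-likelihood is maximized at the same theta.
loglik_value <- sapply(mu_list, FUN = function(mu_cand){
  sum(dnorm(x, mean = mu_cand, log = TRUE)) ## sum of log pdfs = log of the product
})
mu_list[which.max(loglik_value)] ## same MLE_theta as the product version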
library(ggplot2)
likelihood_dat %>%
  ggplot(aes(x = theta, y = likelihood_value)) +
  geom_point(size = 0.5) +
  theme_bw() ## theme_bw needs parentheses to be called
# this plot shows MLE_theta under **one**
# experiment/realization (one random sample X1, X2, ..., Xn);
# since X is a RV, repeating the random sampling process
# gives a distribution of MLE_theta. This is very important:
# in practice, we only get one sample (one draw of X1, X2, ..., Xn),
# i.e., a single sample point from the distribution of MLE_theta
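## Instead of a grid we could maximize the log-likelihood directly,
## e.g. with stats::optimize (a sketch; the notes use the grid search above):
neg_loglik <- function(mu_cand) -sum(dnorm(x, mean = mu_cand, log = TRUE))
optimize(neg_loglik, interval = c(-2, 2))$minimum ## continuous MLE, close to mean(x)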
## Let's repeat the experiment to
## get the distribution of MLE_theta
set.seed(0629)
MLE_thetas <- sapply(1:1000, FUN = function(i){
  x <- rnorm(n, mean = mu) ## sample from the true population
  ## calculate the likelihood = product of pdfs under each mu_cand
  likelihood_values <- sapply(mu_list,
                              FUN = function(mu_cand){
                                dnorm(x, mean = mu_cand) %>% prod()
                              })
  ## which mu_cand has the maximum likelihood?
  mu_list[which.max(likelihood_values)]
})
## we repeated the experiment 1000 times, so we
## now have (an estimate of) the distribution of MLE_theta
density(MLE_thetas) %>% plot()
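## Under standard MLE asymptotics, MLE_theta is approximately
## N(theta, 1/(n * FI(theta))) = N(1, 1/100) here; overlaying the
## theoretical curve is a quick visual check (a sketch, not in the notes):
curve(dnorm(x, mean = mu, sd = sqrt(1 / n)), add = TRUE, col = "red", lty = 2)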
## one nice property: in this normal-mean model the MLE is unbiased,
## so on average E(MLE_theta) = true theta (bias = 0)
mean(MLE_thetas)
#> [1] 0.99903
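## The spread of the simulated MLEs can be checked against the
## Cramer-Rao bound 1/(n * FI(theta)) = 1/100 (a sketch; the sample
## mean attains this bound exactly in the normal model):
var(MLE_thetas) ## should be close to 0.01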
MLE, Fisher information, and Cramer-Rao notes: Reference 1 and Reference 2