General rule: the prior (non-informative or informative) is usually chosen according to the purpose of the study.
non-informative:
- Intended for situations where scientific objectivity is at a premium, for example when presenting results to a regulator or in a scientific journal. In essence, the Bayesian apparatus is then being used as a convenient way of dealing with complex multi-dimensional models.
- Though conjugate priors are computationally convenient, objective Bayesians instead prefer priors that do not strongly influence the posterior distribution. Such a prior is called a non-informative (or uninformative) prior.
- The term “non-informative” is misleading, since all priors contain some information, so such priors are generally better referred to as “vague” or “diffuse.”
Problems with using a uniform prior as a non-informative prior
- A uniform distribution for $\theta$ does not generally imply a uniform distribution for functions of $\theta$ (uniformity is preserved under one-to-one transformations in the discrete case, but not in the continuous case).
- This becomes more problematic in higher dimensions: a flat prior over an unbounded high-dimensional space is improper, that is, it does not integrate to a finite value. In addition, the flat prior becomes very informative: it implies that most of the probability mass lies far from the origin, out toward infinity.
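Both points can be checked numerically. The following is a minimal sketch, not a definitive treatment: the transform $\phi = \theta^2$, the bounded cube $[-1, 1]^d$ (a stand-in for an unbounded flat prior), and the helper names `histogram_fractions` and `fraction_inside_unit_ball` are all assumptions chosen for illustration.

```python
import math
import random


def histogram_fractions(samples, n_bins=5, lo=0.0, hi=1.0):
    """Fraction of samples in each of n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        idx = min(int((s - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(samples) for c in counts]


def fraction_inside_unit_ball(dim, n_samples, rng):
    """Fraction of Uniform([-1, 1]^dim) draws whose Euclidean norm is <= 1."""
    inside = 0
    for _ in range(n_samples):
        sq_norm = sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(dim))
        if sq_norm <= 1.0:
            inside += 1
    return inside / n_samples


rng = random.Random(0)  # fixed seed so the run is reproducible

# Point 1: theta is uniform on (0, 1), but phi = theta^2 is not.
theta = [rng.random() for _ in range(100_000)]
phi = [t ** 2 for t in theta]  # illustrative transform (assumption)
theta_fracs = histogram_fractions(theta)
phi_fracs = histogram_fractions(phi)
print("theta bin fractions:", [round(f, 3) for f in theta_fracs])
print("phi bin fractions:  ", [round(f, 3) for f in phi_fracs])

# Point 2: under a flat prior on a cube, the mass near the origin
# shrinks quickly as the dimension grows.
frac_2d = fraction_inside_unit_ball(2, 20_000, rng)
frac_10d = fraction_inside_unit_ball(10, 20_000, rng)
print("mass inside unit ball, d=2: ", round(frac_2d, 3))
print("mass inside unit ball, d=10:", round(frac_10d, 4))
```

The bins of $\theta$ each hold about one fifth of the samples, while the first bin of $\phi$ holds roughly $\sqrt{0.2} \approx 0.45$ of them, so a prior that is flat in $\theta$ is far from flat in $\phi$. In the second part, the share of mass within distance 1 of the origin falls from about $\pi/4$ in two dimensions to well under one percent in ten.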
The goal is to find a prior distribution that has as little impact as possible on the posterior.
Jeffreys prior
Harold Jeffreys proposed a prior that is invariant under one-to-one (monotone) transformations of the parameter.
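As a sketch of why the invariance holds (the symbols $p_J$, $I(\theta)$, and $\phi = h(\theta)$ are notation introduced here for illustration), the Jeffreys prior is defined through the Fisher information:

$$p_J(\theta) \propto \sqrt{I(\theta)}, \qquad I(\theta) = \mathbb{E}_{x \mid \theta}\left[\left(\frac{\partial}{\partial \theta} \log p(x \mid \theta)\right)^2\right]$$

For a smooth one-to-one reparameterization $\phi = h(\theta)$, the chain rule gives

$$I(\phi) = I(\theta)\left(\frac{d\theta}{d\phi}\right)^2 \quad\Longrightarrow\quad \sqrt{I(\phi)} = \sqrt{I(\theta)}\,\left|\frac{d\theta}{d\phi}\right|$$

which is exactly the change-of-variables rule for densities, so constructing the prior in $\theta$ and then transforming gives the same result as constructing it directly in $\phi$. For example, for a Bernoulli likelihood with success probability $\theta$, $I(\theta) = \frac{1}{\theta(1-\theta)}$, so $p_J(\theta) \propto \theta^{-1/2}(1-\theta)^{-1/2}$, which is a $\mathrm{Beta}(1/2, 1/2)$ distribution rather than the uniform $\mathrm{Beta}(1, 1)$.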
informative:
- The use of informative prior distributions explicitly acknowledges that the analysis is based on more than the immediate data at hand, whose relevance to the parameters of interest is modelled through the likelihood, and also includes a considered judgement concerning plausible values of the parameters based on external information.
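To make the contrast with a vague prior concrete, here is a minimal sketch using the conjugate Beta-Binomial model. The data (7 successes in 10 trials), the two priors, and the helper names `beta_binomial_posterior` and `beta_mean` are hypothetical choices for illustration, not taken from the references.

```python
def beta_binomial_posterior(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 7 successes in 10 trials.
successes, failures = 7, 3

# Hypothetical priors: a flat one and one encoding a strong external
# belief that the success probability is near 0.5.
priors = {
    "vague Beta(1, 1)": (1.0, 1.0),
    "informative Beta(20, 20)": (20.0, 20.0),
}

posterior_means = {}
for name, (a, b) in priors.items():
    post_a, post_b = beta_binomial_posterior(a, b, successes, failures)
    posterior_means[name] = beta_mean(post_a, post_b)
    print(f"{name}: posterior Beta({post_a:g}, {post_b:g}), "
          f"mean = {posterior_means[name]:.3f}")
```

With the vague prior the posterior mean (8/12) stays close to the observed frequency of 0.7, while the informative prior pulls it toward 0.5 (27/50), which is the sense in which the analysis rests on more than the data at hand.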
References: (reading in the following order is highly recommended)
[1] Harvard Statistics 110, Lecture 10: 5-11 min
[2] Harvard Statistics 110, Lecture 14: 42-48 min
[3] Stanford CS109, Lecture 06, 2020