\[P(H \mid D) = \frac{P(D \mid H) \cdot P(H)}{P(D)}\]

the posterior is:

\[P(H|D)\]

the likelihood is:

\[P(D|H)\]

the marginal likelihood is:

\[P(D)\]

the prior is:

\[P(H)\]
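The update above can be sketched numerically for a discrete hypothesis space. This is a minimal illustration, assuming a coin whose bias is either 0.5 ("fair") or 0.8 ("biased") and a single observed head; the hypothesis names and numbers are illustrative, not from the notes.

```python
def posterior(prior, likelihood):
    """Return P(H|D) for each hypothesis via Bayes' theorem."""
    # numerator of Bayes' theorem: P(D|H) * P(H) for each hypothesis
    numer = {h: likelihood[h] * prior[h] for h in prior}
    # marginal likelihood P(D): sum of the numerators over all hypotheses
    p_d = sum(numer.values())
    return {h: n / p_d for h, n in numer.items()}

# D = one observed head; P(D|fair) = 0.5, P(D|biased) = 0.8
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.8}
post = posterior(prior, likelihood)
# post["biased"] = 0.4 / 0.65 ≈ 0.615 — one head shifts belief toward "biased"
```

Note that the posterior probabilities sum to 1 by construction: dividing by \(P(D)\) is exactly the normalisation step.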

Rosetta Stone

Hypothesis

  • event
  • proposition
  • statement

a proposition is something that can be true or false, e.g. "my age is 47"

Likelihood

  • sampling distribution
  • probability model for data
  • generative model
  • Likelihood function
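The "likelihood function" entry above is worth a sketch: with the data held fixed, \(P(D \mid H)\) becomes a function of the hypothesis. A minimal example, assuming a binomial model (7 heads in 10 flips) that is illustrative rather than taken from the notes:

```python
from math import comb

def binomial_likelihood(theta, heads=7, flips=10):
    """P(D|H): probability of the fixed data (7 heads in 10 flips)
    as a function of the hypothesised coin bias theta."""
    return comb(flips, heads) * theta**heads * (1 - theta)**(flips - heads)

# Evaluated over candidate biases, the likelihood peaks at theta = 0.7,
# the observed proportion of heads.
thetas = [i / 10 for i in range(1, 10)]
best = max(thetas, key=binomial_likelihood)
```

The same formula serves as a sampling distribution (data varies, \(\theta\) fixed) and as a likelihood function (data fixed, \(\theta\) varies), which is why the two terms appear together in the list above.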

Marginal Likelihood

  • evidence
  • prior predictive probability
  • normalising constant

In Bayesian statistics, the normalizing constant is often referred to as the “prior predictive probability” because it represents the probability of observing the data under the prior distribution, before any data is actually observed. Here’s why:

  1. Role in Bayes’ Theorem: In Bayes’ Theorem, the normalizing constant ensures that the posterior distribution is a valid probability distribution. It is calculated as the integral (or sum) of the likelihood times the prior over all possible parameter values.

  2. Prior Predictive Distribution: This constant is the marginal likelihood of the data, obtained by integrating the product of the likelihood and the prior over all parameter values. It reflects the probability of the data under the model, considering all possible parameter values weighted by their prior probabilities.

  3. Interpretation: By evaluating how well the model, with its prior beliefs, predicts the observed data, the prior predictive probability provides a measure of model fit before updating beliefs with the actual data.

Thus, the normalizing constant is called the prior predictive probability because it quantifies the probability of the data averaged over the prior distribution, serving as a bridge between prior beliefs and observed evidence.
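Steps 1–2 above can be written out directly for a discrete parameter: the normalizing constant is the likelihood averaged over the prior. A small sketch, reusing the illustrative fair/biased coin setup (names and numbers are assumptions, not from the text):

```python
def prior_predictive(prior, likelihood):
    """Marginal likelihood P(D) = sum over H of P(D|H) * P(H):
    the likelihood of the data averaged over the prior."""
    return sum(likelihood[h] * prior[h] for h in prior)

prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.8}  # P(heads | H)
p_d = prior_predictive(prior, likelihood)  # 0.5*0.5 + 0.8*0.5 = 0.65
```

For a continuous parameter the sum becomes an integral, but the interpretation is identical: before seeing the data, the model (prior plus likelihood) assigns this probability to observing it.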
