The concept of the significance test as it appears in 'classical' statistics does not fall within the Bayesian philosophy. In classical statistics a hypothesis is tested by constructing an appropriate test statistic and obtaining the distribution of this statistic under a null hypothesis (e.g. the treatment difference is zero).
Acceptance or rejection of the null hypothesis is then determined by the position of the test statistic within this 'null' distribution. Specifically, we calculate the probability under the null hypothesis of obtaining values of the test statistic which are as, or more, extreme than the observed value. Probabilities below 0.05 are often seen as sufficient evidence to reject a null hypothesis. The closest equivalent using Bayesian methods is achieved through the use of probability intervals. The value of α for the probability interval which has zero on its boundary can be used to provide a 'Bayesian' p-value to examine the plausibility that a parameter is zero. This is equivalent to a two-sided 'classical' p-value. However, it has the advantage of being exact and there are no potential inaccuracies in obtaining a test statistic (based on standard error estimates) or the degrees of freedom for its distribution.
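To make this concrete, here is a minimal sketch (with simulated draws standing in for real MCMC output) of how such a 'Bayesian' p-value can be computed from posterior samples: the smallest two-sided probability interval with zero on its boundary corresponds to doubling the posterior tail probability beyond zero.

```python
# Sketch: a 'Bayesian' p-value from posterior samples of a treatment
# difference. The samples here are simulated (hypothetical posterior);
# in practice they would come from an MCMC run.
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(loc=1.2, scale=0.6, size=10_000)

# Posterior probability in the smaller tail beyond zero, doubled to
# give a two-sided 'Bayesian' p-value.
tail = min((samples < 0).mean(), (samples > 0).mean())
bayes_p = 2 * tail
print(f"Bayesian p-value: {bayes_p:.4f}")
```

For these illustrative values (posterior mean 1.2, standard deviation 0.6) the difference is about two posterior standard deviations from zero, so the value comes out close to the familiar two-sided normal tail probability.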
2.3.4 Specifying non-informative prior distributions
We have given little indication so far of the distributional form that non-informative priors should take. The requirement is that a non-informative prior for a parameter should have minimal influence on the results obtained for that parameter. The theoretical background of how we set non-informative priors is not easily accessible to those without a background in Bayesian statistics, so we will outline the methods which can be used in the following section. Prior to that, though (excusing the pun), we will take the pragmatic approach and simply describe some distributions which have been suggested to provide non-informative priors for mixed models.
For the fixed effects (t and p in the cross-over model), there are (at least) two suitable non-informative priors:
• uniform distribution (−∞, ∞), p(θ) ∝ c, i.e. a flat prior,
• normal distribution with mean zero and a very large variance, K.
We note that as K tends to ∞, so the normal distribution tends to the uniform (−∞, ∞) distribution. For the practitioner there is the question of how big a number is 'very large'. This depends on the scale on which observations are being recorded. Recording distances in millimetres gives larger numbers than recording in kilometres. K should be chosen such that its square root is at least an order of magnitude larger than any of the observations.
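The effect of a large K can be seen in the conjugate normal-normal model, where the posterior mean has a closed form. The sketch below uses made-up observations and an assumed known residual variance; as K grows, the posterior mean collapses onto the sample mean, so the prior stops influencing the result.

```python
# Sketch: conjugate normal-normal update with a normal(0, K) prior on
# the mean. The data and residual variance are hypothetical.
import numpy as np

y = np.array([4.1, 5.3, 3.8, 4.9, 5.0])  # hypothetical observations
sigma2 = 1.0                              # residual variance, assumed known
mu0 = 0.0                                 # prior mean
n, ybar = len(y), y.mean()

for K in [1.0, 1e2, 1e6]:
    post_prec = n / sigma2 + 1 / K                        # posterior precision
    post_mean = (n * ybar / sigma2 + mu0 / K) / post_prec # shrinks towards mu0
    print(f"K = {K:>9g}: posterior mean = {post_mean:.6f} "
          f"(sample mean = {ybar:.6f})")
```

The shrinkage factor towards the prior mean is (1/K)/(n/σ² + 1/K), which vanishes as K grows; by K = 10⁶ the posterior mean and sample mean agree to several decimal places.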
For the variance components (σp² and σ² in the cross-over model), any of the following distributions may provide suitable non-informative priors:
• uniform distribution, p(θ) ∝ c, i.e. a flat prior,
• reciprocal distribution, p(θ) ∝ c/θ (c = constant),
• inverse gamma distribution (K, K), where K is a very small number.
In this book we will not describe the inverse gamma distribution, other than to note that it is a two-parameter distribution and that, as the parameters tend to zero, the distribution tends to the reciprocal distribution. The practical guidance on the choice of K is again that it should be at least an order of magnitude smaller than the observations.
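This limiting relationship can be checked numerically. The sketch below (with arbitrary evaluation points) compares density ratios, which are free of the normalising constant; as K shrinks, the ratio approaches θ₂/θ₁, exactly what the reciprocal prior p(θ) ∝ c/θ gives.

```python
# Sketch: the inverse gamma (K, K) density approaches proportionality
# to 1/theta as K tends to zero. Evaluation points are arbitrary.
import math

def invgamma_pdf(x, a, b):
    # density of the inverse gamma distribution, shape a and scale b:
    # b^a / Gamma(a) * x^(-a-1) * exp(-b/x)
    return b ** a / math.gamma(a) * x ** (-a - 1) * math.exp(-b / x)

theta1, theta2 = 0.5, 2.0
for K in [1.0, 0.1, 0.001]:
    ratio = invgamma_pdf(theta1, K, K) / invgamma_pdf(theta2, K, K)
    print(f"K = {K:>5}: density ratio = {ratio:.4f} "
          f"(reciprocal prior gives {theta2 / theta1:.4f})")
```

With θ₁ = 0.5 and θ₂ = 2, the reciprocal prior gives a ratio of 4; by K = 0.001 the inverse gamma ratio matches it to three decimal places.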
In practice, it often makes little difference to the results obtained from the posterior distribution whichever of these priors is chosen. An exception is when the true value of a variance component is close to zero. Under these circumstances the posterior distribution arising from the uniform prior will differ from those arising from the alternative choices of prior. We note, though, that many statisticians would be unlikely to choose the uniform prior for variance components, because it is known that variance components cannot be negative. However, it is this prior which leads to a posterior density that is exactly proportional to the likelihood!
We now introduce in more detail a general approach to the setting of non-informative priors. This section will be of greatest interest to those readers who wish to extend their knowledge of the Bayesian approach.