Beta Distribution - Parameter Estimation - Fisher Information Matrix

Fisher Information Matrix

Let a random variable X have a probability density f(x;α). The partial derivative with respect to the (unknown, and to be estimated) parameter α of the log likelihood function is called the score. The second moment of the score is called the Fisher information:


\mathcal{I}(\alpha) = \operatorname{E}\left[\left(\frac{\partial}{\partial \alpha} \ln \mathcal{L}(\alpha\mid X)\right)^{2}\right],

The expectation of the score is zero, therefore the Fisher information is also the second moment centered around the mean of the score: the variance of the score.
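As a concrete illustration, consider the Beta(α, β) density with β treated as known, so that α is the single unknown parameter; the score is then ln x - ψ(α) + ψ(α + β), and the Fisher information has the closed form ψ₁(α) - ψ₁(α + β), where ψ and ψ₁ denote the digamma and trigamma functions. The following minimal Python sketch (with arbitrary illustrative parameter values and sample size) checks numerically that the score has mean zero and that its variance equals the Fisher information.

# Monte Carlo sketch: for Beta(alpha, beta) with beta known, check that the
# score has mean ~0 and that its variance matches the analytic Fisher
# information I(alpha) = trigamma(alpha) - trigamma(alpha + beta).
# Parameter values and sample size are arbitrary illustrative choices.
import numpy as np
from scipy.special import digamma, polygamma
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)
alpha, beta = 2.0, 3.0
x = beta_dist.rvs(alpha, beta, size=1_000_000, random_state=rng)

# Score: d/d(alpha) ln f(x; alpha, beta) = ln x - digamma(alpha) + digamma(alpha + beta)
score = np.log(x) - digamma(alpha) + digamma(alpha + beta)

fisher_analytic = polygamma(1, alpha) - polygamma(1, alpha + beta)
print(np.mean(score))   # expectation of the score: ~ 0
print(np.var(score))    # variance of the score: ~ fisher_analytic
print(fisher_analytic)

The three printed values make the statements above tangible: the score averages to zero, and its variance is the Fisher information.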

If the log likelihood function is twice differentiable with respect to the parameter α, and under certain regularity conditions, then the Fisher information may also be written as follows (which is often a more convenient form for calculation purposes):


\mathcal{I}(\alpha) = -\operatorname{E}\left[\frac{\partial^{2}}{\partial \alpha^{2}} \ln \mathcal{L}(\alpha\mid X)\right].

Thus, the Fisher information is the negative of the expectation of the second derivative of the log likelihood function with respect to the parameter α. Therefore, the Fisher information is a measure of the curvature of the log likelihood function of α: a flat log likelihood curve (low curvature, and therefore a large radius of curvature) has low Fisher information, while a sharply curved log likelihood curve (high curvature, and therefore a small radius of curvature) has high Fisher information. When the Fisher information matrix is computed at the estimates of the parameters ("the observed Fisher information matrix"), it is equivalent to replacing the true log likelihood surface by a Taylor series approximation, taken as far as the quadratic terms.

The word "information," in the context of Fisher information, refers to information about the parameters: information relevant to estimation, sufficiency, and the variances of estimators. The Cramér–Rao bound states that the inverse of the Fisher information is a lower bound on the variance of any unbiased estimator of a parameter α:
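Continuing the same illustrative Beta example with β known, the sketch below finds the maximum likelihood estimate of α numerically and approximates the observed Fisher information, the negative second derivative of the log likelihood at the estimate, by a central finite difference; for N iid observations it should be close to N times the per-observation information I(α̂). All names and numbers here are illustrative assumptions.

# Observed Fisher information sketch: -d^2/da^2 log L, evaluated at the MLE,
# compared with N * I(alpha_hat). Values are illustrative, not prescriptive.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import polygamma
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(1)
alpha_true, beta_known, n = 2.0, 3.0, 5_000
x = beta_dist.rvs(alpha_true, beta_known, size=n, random_state=rng)

def loglik(a):
    return np.sum(beta_dist.logpdf(x, a, beta_known))

# MLE of alpha (beta held fixed), found by direct numerical maximization
alpha_hat = minimize_scalar(lambda a: -loglik(a), bounds=(0.1, 20), method="bounded").x

# Central finite difference for -d^2/da^2 log L at alpha_hat
h = 1e-4
observed_info = -(loglik(alpha_hat + h) - 2 * loglik(alpha_hat) + loglik(alpha_hat - h)) / h**2

expected_info = n * (polygamma(1, alpha_hat) - polygamma(1, alpha_hat + beta_known))
print(observed_info, expected_info)   # the two should be close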


\operatorname{var}\left[\hat{\alpha}\right] \, \geq \, \frac{1}{\mathcal{I}\left(\alpha\right)}.

The precision with which one can estimate the parameter α is therefore limited by the Fisher information of the log likelihood function. The Fisher information is a measure of the minimum error involved in estimating a parameter of a distribution, and it can be viewed as a measure of the resolving power of an experiment needed to discriminate between two alternative hypotheses about a parameter.
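A small simulation makes the bound concrete (again a sketch with arbitrary values, continuing the Beta example with β known): the empirical variance of the maximum likelihood estimate of α over many replications is compared with 1/(N·I(α)). Since the MLE is asymptotically unbiased and efficient, its variance should lie at or slightly above the bound for moderate N.

# Cramér–Rao bound sketch: empirical variance of the MLE of alpha over many
# replications versus 1 / (n * I(alpha)). All values are illustrative.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import polygamma
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(2)
alpha_true, beta_known, n, reps = 2.0, 3.0, 200, 2_000

def mle_alpha(x):
    nll = lambda a: -np.sum(beta_dist.logpdf(x, a, beta_known))
    return minimize_scalar(nll, bounds=(0.1, 20), method="bounded").x

estimates = [mle_alpha(beta_dist.rvs(alpha_true, beta_known, size=n, random_state=rng))
             for _ in range(reps)]

crb = 1.0 / (n * (polygamma(1, alpha_true) - polygamma(1, alpha_true + beta_known)))
print(np.var(estimates), crb)   # empirical variance vs. Cramér–Rao lower bound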

When there are N parameters, the Fisher information takes the form of an N×N positive semidefinite symmetric matrix, the Fisher information matrix, with typical element:

\left(\mathcal{I}(\theta)\right)_{i,j} = \operatorname{E}\left[\left(\frac{\partial}{\partial \theta_{i}} \ln \mathcal{L}(\theta\mid X)\right)\left(\frac{\partial}{\partial \theta_{j}} \ln \mathcal{L}(\theta\mid X)\right)\right],

where θ = (θ_1, ..., θ_N) is the parameter vector.

Under certain regularity conditions, the Fisher information matrix may also be written in the following form, which is often more convenient for computation:

\left(\mathcal{I}(\theta)\right)_{i,j} = -\operatorname{E}\left[\frac{\partial^{2}}{\partial \theta_{i} \, \partial \theta_{j}} \ln \mathcal{L}(\theta\mid X)\right].
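For the Beta distribution itself, both parameters α and β are typically unknown, and the second-derivative form above yields a well-known closed-form per-observation Fisher information matrix in terms of the trigamma function ψ₁. The sketch below (with arbitrary illustrative parameter values) builds this matrix, checks numerically that it is symmetric positive definite, and inverts it to obtain the per-observation Cramér–Rao bound matrix.

# Per-observation Fisher information matrix of Beta(a, b), from the
# second-derivative form, with psi_1 the trigamma function:
#   I = [[psi_1(a) - psi_1(a+b),        -psi_1(a+b)],
#        [       -psi_1(a+b),    psi_1(b) - psi_1(a+b)]]
# Parameter values are illustrative.
import numpy as np
from scipy.special import polygamma

def beta_fisher_matrix(a, b):
    t = polygamma(1, a + b)                  # trigamma(a + b)
    return np.array([[polygamma(1, a) - t, -t],
                     [-t, polygamma(1, b) - t]])

I = beta_fisher_matrix(2.0, 3.0)
print(I)
print(np.linalg.eigvalsh(I))   # all eigenvalues > 0: positive definite
print(np.linalg.inv(I))        # per-observation Cramér–Rao bound matrix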

With X_1, ..., X_N iid random variables, an N-dimensional "box" can be constructed with sides X_1, ..., X_N. Costa and Cover show that the (Shannon) differential entropy h(X) is related to the volume of the typical set (having the sample entropy close to the true entropy), while the Fisher information is related to the surface of this typical set.

