Fisher Information Matrix
Let a random variable X have a probability density f(x; α). The partial derivative of the log likelihood function with respect to the (unknown, and to be estimated) parameter α is called the score. The second moment of the score is called the Fisher information:

$$\mathcal{I}(\alpha) = \operatorname{E}\!\left[\left(\frac{\partial}{\partial \alpha} \ln f(X;\alpha)\right)^{2}\right].$$
Since the expectation of the score is zero, the Fisher information is also the second moment of the score centered around its mean: the variance of the score.
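As a quick numerical check of these two facts, here is a minimal Python sketch. The exponential model, sample size, and variable names are our own illustration, not from the source; for the exponential density f(x; α) = α·exp(−αx), the score is 1/α − x and the Fisher information is 1/α².

```python
import numpy as np

# Exponential model f(x; alpha) = alpha * exp(-alpha * x); the score is
# d/d(alpha) log f = 1/alpha - x, and the Fisher information is 1/alpha^2.
alpha = 2.0
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0 / alpha, size=1_000_000)  # numpy uses scale = 1/rate

score = 1.0 / alpha - x
print(score.mean())   # close to 0: the score has zero expectation
print(score.var())    # close to 1/alpha^2 = 0.25: the variance of the score = I(alpha)
```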
If the log likelihood function is twice differentiable with respect to the parameter α, then under certain regularity conditions the Fisher information may also be written as follows (often a more convenient form for calculation):

$$\mathcal{I}(\alpha) = -\operatorname{E}\!\left[\frac{\partial^{2}}{\partial \alpha^{2}} \ln f(X;\alpha)\right].$$
Thus, the Fisher information is the negative of the expectation of the second derivative of the log likelihood function with respect to the parameter α. Fisher information is therefore a measure of the curvature of the log likelihood function of α. A flat log likelihood curve, with low curvature (and therefore a large radius of curvature), carries low Fisher information; a sharply peaked log likelihood curve, with large curvature (and therefore a small radius of curvature), carries high Fisher information. When the Fisher information matrix is evaluated at the estimates of the parameters (giving "the observed Fisher information matrix"), this is equivalent to replacing the true log likelihood surface by a Taylor series approximation taken as far as the quadratic terms. The word information, in the context of Fisher information, refers to information about the parameters: it bears on estimation, sufficiency, and the variances of estimators.
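As an illustration of the observed Fisher information, here is a minimal Python sketch (the Bernoulli model, sample size, and function names are our own assumptions, not from the source) that approximates it as the negative second derivative of the log likelihood at the maximum likelihood estimate, via a central finite difference:

```python
import numpy as np

def loglik(alpha, x):
    # Bernoulli log likelihood: sum_i [x_i log(alpha) + (1 - x_i) log(1 - alpha)]
    return np.sum(x * np.log(alpha) + (1 - x) * np.log(1 - alpha))

def observed_fisher_information(alpha_hat, x, h=1e-5):
    # Negative central-difference second derivative of the log likelihood at alpha_hat
    d2 = (loglik(alpha_hat + h, x) - 2 * loglik(alpha_hat, x)
          + loglik(alpha_hat - h, x)) / h**2
    return -d2

rng = np.random.default_rng(0)
x = rng.binomial(1, 0.3, size=1000)
alpha_hat = x.mean()  # the maximum likelihood estimate of the Bernoulli parameter
print(observed_fisher_information(alpha_hat, x))
# For a Bernoulli sample, the observed information at the MLE equals
# n / (alpha_hat * (1 - alpha_hat)), so the value is easy to verify.
```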
The Cramér–Rao bound states that the inverse of the Fisher information is a lower bound on the variance of any unbiased estimator of the parameter α:

$$\operatorname{var}(\hat{\alpha}) \ge \frac{1}{\mathcal{I}(\alpha)}.$$

The precision with which one can estimate the parameter α is thus limited by the Fisher information of the log likelihood function. The Fisher information is a measure of the minimum error involved in estimating a parameter of a distribution, and it can be viewed as a measure of the resolving power of an experiment needed to discriminate between two alternative hypotheses about a parameter.
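To see the bound in action, the following sketch (again a Bernoulli model of our own choosing) compares the Monte Carlo variance of the maximum likelihood estimator with the Cramér–Rao lower bound 1/(n·I(α)) for n iid observations:

```python
import numpy as np

alpha, n, trials = 0.3, 200, 20_000
rng = np.random.default_rng(1)

# The MLE of the Bernoulli parameter from each simulated sample is the sample mean
estimates = rng.binomial(n, alpha, size=trials) / n

fisher_per_obs = 1.0 / (alpha * (1.0 - alpha))  # I(alpha) for a single observation
crb = 1.0 / (n * fisher_per_obs)                # Cramér–Rao lower bound for n observations

print(f"empirical variance of the MLE: {estimates.var():.6f}")
print(f"Cramér–Rao lower bound:        {crb:.6f}")
# The sample mean attains the bound here: both are about alpha*(1-alpha)/n = 0.00105.
```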
When there are N parameters, the Fisher information takes the form of an N × N positive semidefinite symmetric matrix, the Fisher Information Matrix, with typical element:

$$\mathcal{I}_{i,j}(\boldsymbol{\alpha}) = \operatorname{E}\!\left[\frac{\partial}{\partial \alpha_i} \ln f(X;\boldsymbol{\alpha}) \, \frac{\partial}{\partial \alpha_j} \ln f(X;\boldsymbol{\alpha})\right].$$
Under certain regularity conditions, the Fisher Information Matrix may also be written in the following form, which is often more convenient for computation:

$$\mathcal{I}_{i,j}(\boldsymbol{\alpha}) = -\operatorname{E}\!\left[\frac{\partial^{2}}{\partial \alpha_i \, \partial \alpha_j} \ln f(X;\boldsymbol{\alpha})\right].$$
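As a two-parameter illustration (our own example, not from the source: a normal model N(μ, σ²), whose analytic Fisher information matrix is diag(1/σ², 2/σ²)), the matrix can be estimated as the Monte Carlo average of the outer product of the score vector:

```python
import numpy as np

mu, sigma = 1.0, 2.0
rng = np.random.default_rng(2)
x = rng.normal(mu, sigma, size=500_000)

# Score vector: partial derivatives of log f(x; mu, sigma) for one observation
score_mu = (x - mu) / sigma**2
score_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3

scores = np.stack([score_mu, score_sigma])  # shape (2, n)
fim = scores @ scores.T / x.size            # Monte Carlo estimate of E[score score^T]

print(fim)
# Analytic Fisher information matrix: [[1/sigma^2, 0], [0, 2/sigma^2]] = [[0.25, 0], [0, 0.5]]
```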
With N iid random variables X₁, …, X_N, an N-dimensional "box" can be constructed with sides X_i. Costa and Cover show that the (Shannon) differential entropy h(X) is related to the volume of the typical set (having the sample entropy close to the true entropy), while the Fisher information is related to the surface of this typical set.