Interpretation
In the definitions above, the functions $T(x)$, $\eta(\theta)$, and $A(\eta)$ may appear to have been defined arbitrarily. However, each of them plays a significant role in the resulting probability distribution.
- $T(x)$ is a sufficient statistic of the distribution. For exponential families, the sufficient statistic is a function of the data that fully summarizes the data within the density function. Two data sets $x$ and $y$ may be quite different, but if $T(x) = T(y)$ then their densities depend on the parameter in exactly the same way (they can differ only through the parameter-free factor $h$). The dimension of $T(x)$ equals the number of components of $\theta$, and $T(x)$ captures all of the information in the data that is relevant to the parameter $\theta$. The sufficient statistic of a set of independent identically distributed observations is simply the sum of the individual sufficient statistics. Together with the number of observations, this sum encapsulates all the information needed to describe the posterior distribution of the parameters given the data (and hence to derive any desired estimate of the parameters). This important property is further discussed below.
- $\eta$ is called the natural parameter. The set of values of $\eta$ for which the normalizing integral $\int h(x)\,\exp(\eta \cdot T(x))\,dx$ is finite is called the natural parameter space. It can be shown that the natural parameter space is always convex.
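- $A(\eta)$ is called the log-partition function because it is the logarithm of a normalization factor, without which the density $f_X(x \mid \theta)$ would not be a probability distribution ("partition function" is often used in statistics as a synonym of "normalization factor"):
  $$A(\eta) = \ln\left(\int_X h(x)\,\exp\bigl(\eta(\theta) \cdot T(x)\bigr)\,dx\right)$$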
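As an illustration of the first item in the list above, the following minimal sketch uses a normal model, for which the per-observation sufficient statistic can be taken as $T(x) = (x, x^2)$. The function names and the two small data sets are hypothetical, chosen only so that both samples have the same size and the same summed statistic; the sketch is not taken from the article.

```python
import math


def sufficient_statistic(data):
    """Sum of per-observation statistics T(x) = (x, x^2) for a normal model."""
    return (sum(data), sum(x * x for x in data))


def normal_log_likelihood(data, mu, sigma):
    """Joint log-density of i.i.d. N(mu, sigma^2) observations."""
    n = len(data)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2))


# Two different samples (illustrative values) with equal size,
# equal sum, and equal sum of squares.
data_a = [0.0, 3.0, 3.0]
data_b = [1.0, 1.0, 4.0]

stat_a = sufficient_statistic(data_a)
stat_b = sufficient_statistic(data_b)

# Because the summed statistics agree, the log-likelihoods agree
# for any choice of (mu, sigma); one arbitrary choice is shown here.
ll_a = normal_log_likelihood(data_a, mu=1.3, sigma=0.7)
ll_b = normal_log_likelihood(data_b, mu=1.3, sigma=0.7)
```

Any inference about the mean and variance that is based on the likelihood therefore cannot distinguish the two samples, even though the individual observations differ.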
The function $A$ is important in its own right, because the mean, variance and other moments of the sufficient statistic $T(x)$ can be derived simply by differentiating $A(\eta)$. For example, because $\ln x$ is one of the components of the sufficient statistic of the gamma distribution, $\operatorname{E}[\ln x]$ can be easily determined for this distribution using $A(\eta)$. (Technically, this is true because $K(u \mid \eta) = A(\eta + u) - A(\eta)$ is the cumulant generating function of the sufficient statistic.)
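The gamma example can be checked numerically. The sketch below assumes the shape/rate parameterization with density proportional to $x^{\alpha-1} e^{-\beta x}$, so that $T(x) = (\ln x, x)$, the natural parameters are $\eta_1 = \alpha - 1$ and $\eta_2 = -\beta$, and $A(\eta) = \ln\Gamma(\eta_1 + 1) - (\eta_1 + 1)\ln(-\eta_2)$. The helper names are hypothetical, and the derivatives are approximated by central differences rather than computed in closed form.

```python
import math


def log_partition(eta1, eta2):
    """A(eta) for the gamma distribution with T(x) = (ln x, x).

    Assumed natural parameters: eta1 = alpha - 1, eta2 = -beta (beta is the rate).
    """
    return math.lgamma(eta1 + 1.0) - (eta1 + 1.0) * math.log(-eta2)


def central_diff(f, x, step=1e-5):
    """Central-difference approximation of the derivative of f at x."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def gamma_moments(alpha, beta):
    """Return (E[ln x], E[x]) as the partial derivatives of A at eta."""
    eta1, eta2 = alpha - 1.0, -beta
    mean_log_x = central_diff(lambda e: log_partition(e, eta2), eta1)
    mean_x = central_diff(lambda e: log_partition(eta1, e), eta2)
    return mean_log_x, mean_x


mean_log_x, mean_x = gamma_moments(alpha=2.0, beta=3.0)
```

Under these assumptions the second component should agree with the familiar mean $\alpha/\beta$, and the first with $\psi(\alpha) - \ln\beta$, where $\psi$ is the digamma function.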