Statistical Hypothesis Testing - Definition of Terms

The following definitions are mainly based on the exposition in the book by Lehmann and Romano:

Statistical hypothesis: A statement about the parameters describing a population (not a sample).
Statistic: A value calculated from a sample, often to summarize the sample for comparison purposes.
Simple hypothesis: Any hypothesis which specifies the population distribution completely.
Composite hypothesis: Any hypothesis which does not specify the population distribution completely.
Null hypothesis (H₀): A hypothesis (often simple) associated with a contradiction to a theory one would like to prove.
Alternative hypothesis (H₁): A hypothesis (often composite) associated with a theory one would like to prove.
Statistical test: A procedure whose inputs are samples and whose result is a hypothesis.
Region of acceptance: The set of values of the test statistic for which we fail to reject the null hypothesis.
Region of rejection / Critical region: The set of values of the test statistic for which the null hypothesis is rejected.
Critical value: The threshold value delimiting the regions of acceptance and rejection for the test statistic.
Power of a test (1 − β): The probability that the test correctly rejects the null hypothesis when it is false. It is the complement of the false negative rate, β. Power is termed sensitivity in biostatistics. ("This is a sensitive test. Because the result is negative, we can confidently say that the patient does not have the condition.") See sensitivity and specificity and Type I and type II errors for exhaustive definitions.
Size / Significance level of a test (α): For a simple null hypothesis, this is the test's probability of incorrectly rejecting the null hypothesis, that is, the false positive rate. For a composite null hypothesis, it is the least upper bound (supremum) of the probability of rejecting the null hypothesis over all cases covered by the null hypothesis. The complement of the false positive rate, (1 − α), is termed specificity in biostatistics. ("This is a specific test. Because the result is positive, we can confidently say that the patient has the condition.") See sensitivity and specificity and Type I and type II errors for exhaustive definitions.
p-value: The probability, assuming the null hypothesis is true, of observing a result at least as extreme as the test statistic.
Statistical significance test: A predecessor to the statistical hypothesis test (see the Origins section). An experimental result was said to be statistically significant if a sample was sufficiently inconsistent with the (null) hypothesis. This was variously considered common sense, a pragmatic heuristic for identifying meaningful experimental results, a convention establishing a threshold of statistical evidence or a method for drawing conclusions from data. The statistical hypothesis test added mathematical rigor and philosophical consistency to the concept by making the alternative hypothesis explicit. The term is loosely used to describe the modern version which is now part of statistical hypothesis testing.
Conservative test: A test is conservative if, when constructed for a given nominal significance level, the true probability of incorrectly rejecting the null hypothesis is never greater than the nominal level.
Exact test: A test in which the significance level or critical value can be computed exactly, without any approximation. In some contexts this term is restricted to tests applied to categorical data and to permutation tests, in which computations are carried out by complete enumeration of all possible outcomes and their probabilities.
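
Several of the terms above (critical value, region of rejection, size, p-value, power, conservative test, exact test) can be illustrated together with a small one-sided binomial test. The following Python sketch is illustrative only: the function names upper_tail and critical_value and the numbers (n = 20 trials, H₀: p = 0.5 against H₁: p > 0.5, nominal α = 0.05, 16 observed successes, alternative p = 0.7) are assumptions chosen for the example, not part of the source. All probabilities are obtained by complete enumeration, so the test is exact, and because the count of successes is discrete its true size falls below the nominal level, which makes it conservative.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by complete enumeration."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def critical_value(n, p0, alpha):
    """Smallest c with P(X >= c | p0) <= alpha; H0 is rejected when X >= c."""
    for c in range(n + 2):  # c = n + 1 gives an empty rejection region
        if upper_tail(c, n, p0) <= alpha:
            return c


# Hypothetical setting: 20 trials, H0: p = 0.5, H1: p > 0.5, nominal level 0.05.
n, p0, alpha = 20, 0.5, 0.05

c = critical_value(n, p0, alpha)   # region of rejection is {c, ..., n}
size = upper_tail(c, n, p0)        # true probability of a false rejection
p_val = upper_tail(16, n, p0)      # p-value for 16 observed successes
power = upper_tail(c, n, 0.7)      # chance of rejecting H0 when p is really 0.7

print(f"critical value: {c}")
print(f"true size: {size:.4f} (nominal {alpha})")
print(f"p-value for 16 successes: {p_val:.4f}")
print(f"power at p = 0.7: {power:.4f}")
```

In this setting the critical value is 15 and the true size is about 0.021, below the nominal 0.05: rejecting at 14 or more successes would push the false positive rate above 0.05, so the nominal level cannot be attained exactly without randomization.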

A statistical hypothesis test compares a test statistic (z or t, for example) to a threshold. The choice of test statistic (see the formulas in the table below) is based on optimality: for a fixed Type I error rate, these statistics minimize the Type II error rate (equivalently, they maximize power). The following terms describe tests in terms of such optimality:

Most powerful test: For a given size or significance level, the test with the greatest power (probability of rejection) for a given value of the parameter(s) being tested, contained in the alternative hypothesis.
Uniformly most powerful test (UMP): A test with the greatest power for all values of the parameter(s) being tested, contained in the alternative hypothesis.
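
The idea of comparing tests by power at a fixed size can be sketched numerically. The one-sided z-test for a normal mean with known standard deviation is a standard example of a uniformly most powerful test. The Python sketch below compares it with a deliberately wasteful rival of the same size, namely the same z-test applied to only half of the observations. The function name z_test_power, the parameter values, and the choice of rival are assumptions made for illustration; the comparison shows what "greater power at the same significance level" means but does not by itself prove optimality.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def z_test_power(mu1, mu0, sigma, n, alpha):
    """Power of the one-sided z-test of H0: mu = mu0 against H1: mu > mu0.

    The test rejects when z = (mean - mu0) / (sigma / sqrt(n)) exceeds the
    critical value; mu1 is the true mean under which power is evaluated.
    """
    critical = STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = (mu1 - mu0) * sqrt(n) / sigma
    return 1.0 - STD_NORMAL.cdf(critical - shift)


# Hypothetical setting: known sigma = 1, n = 25 observations, alpha = 0.05.
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05
alternatives = [0.1, 0.2, 0.4, 0.6]

# Same size, different power: all observations versus only half of them.
full = [z_test_power(m, mu0, sigma, n, alpha) for m in alternatives]
half = [z_test_power(m, mu0, sigma, n // 2, alpha) for m in alternatives]

for m, f, h in zip(alternatives, full, half):
    print(f"mu1 = {m:.1f}: power with all data {f:.3f}, with half {h:.3f}")
```

Both tests have size α, since evaluating the power function at mu1 = mu0 returns α for any n, but the test that uses all of the data has higher power at every alternative listed.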
