Insensitivity To The Type of Sampling
If the data form a "population sample", then the cell probabilities p̂ij are interpreted as the frequencies of each of the four groups in the population, as defined by their X and Y values. In many settings it is impractical to obtain a population sample, so a selected sample is used instead. For example, we may choose to sample units with X = 1 with a given probability f, regardless of their frequency in the population (which means sampling units with X = 0 with probability 1 − f). In this situation, our data would have the following joint probabilities:
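| | Y = 1 | Y = 0 |
| X = 1 | f·p11 / (p11 + p10) | f·p10 / (p11 + p10) |
| X = 0 | (1 − f)·p01 / (p01 + p00) | (1 − f)·p00 / (p01 + p00) |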
The odds ratio (p11·p00) / (p01·p10) for this distribution does not depend on the value of f. This shows that the odds ratio (and consequently the log odds ratio) is invariant to non-random sampling based on one of the variables being studied. Note, however, that the standard error of the log odds ratio does depend on the value of f. This fact is exploited in two important situations:
- Suppose it is inconvenient or impractical to obtain a population sample, but it is practical to obtain a convenience sample of units with different X values, such that within the X = 0 and X = 1 subsamples the Y values are representative of the population (i.e. they follow the correct conditional probabilities).
- Suppose the marginal distribution of one variable, say X, is very skewed. For example, if we are studying the relationship between high alcohol consumption and pancreatic cancer in the general population, the incidence of pancreatic cancer would be very low, so a very large population sample would be needed to obtain even a modest number of pancreatic cancer cases. However, we could use data from hospitals to contact most or all of their pancreatic cancer patients, and then randomly sample an equal number of subjects without pancreatic cancer (this is called a "case-control study").
In both these settings, the odds ratio can be calculated from the selected sample, without biasing the results relative to what would have been obtained for a population sample.
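The invariance can be checked numerically. The following is a minimal sketch, not a definitive implementation: the population probabilities, the sample size n = 1000, and the function names are all hypothetical choices made for illustration. It assumes that units with X = 1 are drawn with probability f while Y follows the population conditional probabilities within each X group, and it uses the usual large-sample approximation for the standard error of the log odds ratio (the square root of the sum of reciprocal expected cell counts).

```python
import math

def selected_sample_table(p11, p10, p01, p00, f):
    """Joint cell probabilities when X = 1 is sampled with probability f
    and Y keeps its population conditional distribution given X."""
    row1 = p11 + p10  # population P(X = 1)
    row0 = p01 + p00  # population P(X = 0)
    return (f * p11 / row1, f * p10 / row1,
            (1 - f) * p01 / row0, (1 - f) * p00 / row0)

def odds_ratio(q11, q10, q01, q00):
    """Odds ratio (q11 * q00) / (q01 * q10) of a 2x2 table."""
    return (q11 * q00) / (q01 * q10)

def approx_se_log_or(cells, n):
    """Large-sample standard error of the log odds ratio,
    using expected cell counts n * q (an approximation)."""
    return math.sqrt(sum(1.0 / (n * q) for q in cells))

# Hypothetical population probabilities (p11, p10, p01, p00), illustration only.
pop = (0.02, 0.08, 0.09, 0.81)

results = {}
for f in (0.1, 0.5, 0.9):
    cells = selected_sample_table(*pop, f)
    results[f] = (odds_ratio(*cells), approx_se_log_or(cells, n=1000))
    print(f"f = {f}: odds ratio = {results[f][0]:.4f}, "
          f"approx. SE of log odds ratio = {results[f][1]:.4f}")
```

With these illustrative values the odds ratio is 2.25 for every f (up to floating-point rounding), while the approximate standard error changes with f, which is why the choice of sampling fraction still matters for precision.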