Precautions in the Usage of Dummy Variables
1. If a qualitative explanatory variable has n categories, only (n − 1) dummy variables should be introduced for it. For example, gender has 2 categories (female / male), so a dummy should be created for only one of them, either female or male, and not both. The regression equation should include that dummy along with the intercept term. If this rule is not followed, the regression falls into the dummy variable trap: a situation of perfect collinearity (perfect multicollinearity), in which there is an exact linear relationship among the explanatory variables. To see why, note that the intercept is implicitly a column whose value is 1 for every observation. Because every observation belongs to exactly one category, adding up the columns of all n dummies for the qualitative variable also gives 1 for every observation, which reproduces the intercept column exactly. This is the perfect collinearity that causes the dummy variable trap. It can be avoided either by introducing (n − 1) dummies plus the intercept term, or by introducing n dummies and no intercept term. In both cases there is no exact linear relationship among the explanatory variables. The latter is not recommended, however, because it makes it harder to test whether the dummy categories differ from the base category. Hence the (n − 1) dummies plus intercept approach is the one generally advised for forming the regression model.
2. The category for which the dummy is excluded or not assigned is known as the base group or the benchmark category. This category is the omitted category and all comparisons are made against this category. That is why it is also known as the comparison, reference or control category. This category should be carefully identified and omitted from the assignment of dummy variables.
3. Since the base category does not have a dummy variable, all dummies are zero for its observations, so the intercept term is equal to the mean value of this category (in a model whose only regressors are the dummies). The intercept is thus the value against which the categories having dummies are compared.
4. For comparison against the benchmark category, the "slope" coefficients of the dummy variables in the regression equation are considered. These "slope" coefficients are called differential intercept coefficients because they indicate by how much the mean value of each dummy category differs from the mean value of the benchmark category (which is equal to the intercept).
5. The choice of the benchmark category is entirely at the discretion of the researcher. Whichever category is chosen, the researcher must interpret the results accordingly: the intercept term is the mean value of the benchmark category, and the coefficients of all other dummy categories are differences from that benchmark.
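The points above can be checked numerically. The following is a minimal sketch using NumPy on a small made-up data set; the group labels, the values of y, and all variable names are assumptions chosen for illustration and are not part of the original text. It shows that an intercept plus all n dummies gives a rank-deficient design matrix (the trap), while an intercept plus (n − 1) dummies recovers the base-category mean as the intercept and the differences from it as the dummy coefficients.

```python
import numpy as np

# Hypothetical data: one qualitative variable with n = 3 categories.
groups = np.array(["a", "a", "b", "b", "c", "c"])
y = np.array([1.0, 3.0, 4.0, 6.0, 9.0, 11.0])  # group means: a=2, b=5, c=10

intercept = np.ones(len(y))              # implicit column of 1s
d_a = (groups == "a").astype(float)      # one dummy per category
d_b = (groups == "b").astype(float)
d_c = (groups == "c").astype(float)

# Dummy variable trap: intercept plus all n dummies.
# d_a + d_b + d_c reproduces the intercept column exactly,
# so the 4 columns are linearly dependent.
X_trap = np.column_stack([intercept, d_a, d_b, d_c])
rank_trap = np.linalg.matrix_rank(X_trap)   # 3, fewer than the 4 columns

# Recommended form: intercept plus (n - 1) dummies, with "a" as base category.
X_ok = np.column_stack([intercept, d_b, d_c])
rank_ok = np.linalg.matrix_rank(X_ok)       # 3, full column rank

# Ordinary least squares fit of y on the (n - 1)-dummy design.
coef, *_ = np.linalg.lstsq(X_ok, y, rcond=None)

print("rank with n dummies + intercept:    ", rank_trap)
print("rank with n - 1 dummies + intercept:", rank_ok)
print("intercept (mean of base category a):", coef[0])
print("differential intercepts for b and c:", coef[1], coef[2])
```

With this data the fitted intercept is approximately 2 (the mean of the base category a), and the two dummy coefficients are approximately 3 and 8, the amounts by which the means of b and c exceed it. Choosing a different base category would change these numbers but not the fitted group means.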