An Introduction to the Blinder-Oaxaca Decomposition
Karen Tao, Researcher
August 11, 2021
The Blinder-Oaxaca decomposition is often used to study differences in labor-market outcomes between groups, such as the gender wage gap experienced by women. For example, wages can be modeled as the dependent variable, while the explanatory variables may include education and prior experience. The method shows how the difference in a dependent variable, such as wages, can be separated into explained and unexplained portions by looking at the explanatory variables.
Let’s say you are studying the weight differences between men and women. You collected data from individuals who disclosed their height, age, mother’s weight, father’s weight, whether they smoke, the number of hours of physical activity they engage in each week, and their weekly caloric intake. These are the explanatory variables that may affect the outcome variable of weight.
Let’s imagine that during your data analysis, you found the mean body weight for men was 50 pounds more than the mean body weight for women. This mean difference is what the Blinder-Oaxaca decomposition focuses on. To carry out the analysis, separate regressions are fit for men and women, with weight as the dependent variable. The regression coefficients can then be read as estimates of how body weight changes with a one-unit change in each independent variable. The raw difference in weight can then be decomposed into a portion attributed to group differences in the independent variables, called the explained difference, and a remaining portion, called the unexplained difference.
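For readers who prefer notation, one common way to write this split is the twofold decomposition below. It is a sketch under a stated assumption: the men's coefficients are used as the reference for valuing differences in the explanatory variables, which is one of several possible choices. The symbols are introduced here for illustration: \bar{Y} is a group's mean weight, \bar{X} is the vector of a group's mean explanatory variables (including a constant 1 for the intercept), \hat{\beta} is a group's vector of estimated regression coefficients, and the subscripts M and W stand for men and women.

```latex
\bar{Y}_M - \bar{Y}_W
  = \underbrace{(\bar{X}_M - \bar{X}_W)^{\top} \hat{\beta}_M}_{\text{explained}}
  + \underbrace{\bar{X}_W^{\top} (\hat{\beta}_M - \hat{\beta}_W)}_{\text{unexplained}}
```

The identity holds because a least-squares regression with an intercept passes through the group means, so each group's mean weight equals its mean explanatory variables multiplied by its own coefficients.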
For example, you may learn from the regressions that an individual’s weight increases by 5 pounds with each additional inch in height. You may then look at your data and realize that the mean height for men in your sample is 3 inches more than the mean height for women. Height therefore “explains” 15 pounds (5 pounds per inch times 3 inches) of the 50-pound difference between men and women, and this amount is part of the explained difference. You continue in the same way with each of the remaining independent variables. After you work through all of the variables, you learn that together they explain 40 pounds of the mean weight difference for your sample, leaving 10 pounds as the unexplained portion of the 50-pound difference.
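The same bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any particular study: it uses height as the only explanatory variable, takes the men's coefficients as the reference (an assumption, since other reference choices exist), and runs on made-up toy numbers. The function names `fit_line` and `twofold_decomposition` are hypothetical helpers written for this example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    slope = covariance / variance
    return y_bar - slope * x_bar, slope


def twofold_decomposition(x_a, y_a, x_b, y_b):
    """Split mean(y_a) - mean(y_b) into explained and unexplained parts.

    Assumption: group A's coefficients are the reference.
    """
    intercept_a, slope_a = fit_line(x_a, y_a)
    intercept_b, slope_b = fit_line(x_b, y_b)
    gap = mean(y_a) - mean(y_b)
    # Explained: the difference in mean height, valued at group A's slope.
    explained = slope_a * (mean(x_a) - mean(x_b))
    # Unexplained: differences in intercepts and slopes, at group B's mean height.
    unexplained = (intercept_a - intercept_b) + (slope_a - slope_b) * mean(x_b)
    return gap, explained, unexplained


# Made-up toy data (heights in inches, weights in pounds), for illustration only.
men_height = [66, 68, 70, 72, 74]
men_weight = [170, 182, 188, 201, 209]
women_height = [62, 64, 66, 68, 70]
women_weight = [128, 135, 146, 150, 161]

gap, explained, unexplained = twofold_decomposition(
    men_height, men_weight, women_height, women_weight
)
print(f"raw gap:     {gap:.1f} lb")
print(f"explained:   {explained:.1f} lb")
print(f"unexplained: {unexplained:.1f} lb")
```

Whatever data you pass in, the explained and unexplained parts add up to the raw gap. With more explanatory variables, the same logic applies with one term per variable, which is how the 15 pounds from height would combine with the other variables to reach the 40 explained pounds in the example.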
As the name suggests, the independent variables selected cannot explain this 10-pound portion of the difference. It is possible that you were unable to collect data on a variable that may explain it, for example, social media usage. Influencers on social media are often perceived as ideal, leading their followers to want to gain or lose weight to conform to the beauty standards of their community. This tendency may affect men and women differently. In addition, differences in attitudes toward weight gain for men and women may not be easily quantified. For example, overweight women may have a harder time shopping for outfits than overweight men when society offers “big and tall” stores for men but not for women. Overall, the unexplained portion of the Blinder-Oaxaca decomposition represents the part of the mean difference that cannot be explained by the independent variables examined.
It is important to see how attitudes affect body weight in this example. Attitudes or biases in society may affect how much time someone spends exercising or what they choose to include in their diet. In a gender wage gap study, societal attitudes may similarly affect whether girls enter a STEM program, directly shaping the type of education girls attain and their potential wage outcomes.
Recently, the UDRC completed a research project on the gender wage gap and built models using the Blinder-Oaxaca decomposition to study the wage differences between men and women. Check back for our research report to learn what we found using the Blinder-Oaxaca decomposition.