What we have described here is actually a best-case scenario, in which fairness can be enforced through simple changes that affect performance for each group. In practice, fairness algorithms may behave much more radically and unpredictably. This survey found that, on average, most algorithms in computer vision improved fairness by harming all groups, for example by decreasing recall and accuracy. Unlike in our hypothetical, where we decreased the harm suffered by one group, leveling down can make everyone directly worse off.

Leveling down runs counter to the objectives of algorithmic fairness and broader equality goals in society: to improve outcomes for historically disadvantaged or marginalized groups. Lowering performance for high-performing groups does not self-evidently benefit worse-performing groups. Moreover, leveling down can harm historically disadvantaged groups directly. The choice to remove a benefit rather than share it with others shows a lack of concern, solidarity, and willingness to take the opportunity to actually fix the problem. It stigmatizes historically disadvantaged groups and solidifies the separateness and social inequality that led to the problem in the first place.

When we build AI systems to make decisions about people’s lives, our design decisions encode implicit value judgments about what should be prioritized. Leveling down is a consequence of the choice to measure and redress fairness solely in terms of disparity between groups, while ignoring utility, welfare, priority, and other goods that are central to questions of equality in the real world. It is not the inevitable fate of algorithmic fairness; rather, it is the result of taking the path of least mathematical resistance, not of any overarching societal, legal, or ethical reasoning.
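
A minimal sketch can make this concrete. The recall figures, group names, and helper functions below are all invented for illustration and are not drawn from any study or real system. The sketch compares a disparity-only measure (the gap in recall between groups) with the recall of the worst-off group across three hypothetical scenarios.

```python
# Hypothetical per-group recall values, invented purely for illustration.
SCENARIOS = {
    "baseline":      {"group_a": 0.90, "group_b": 0.70},
    "leveling_down": {"group_a": 0.70, "group_b": 0.70},
    "leveling_up":   {"group_a": 0.90, "group_b": 0.90},
}


def recall_gap(recalls):
    """Disparity-only measure: largest difference in recall between groups."""
    return max(recalls.values()) - min(recalls.values())


def worst_group_recall(recalls):
    """Welfare-style measure: recall of the worst-performing group."""
    return min(recalls.values())


def summarize(scenarios):
    """Return gap, worst-group recall, and mean recall for each scenario."""
    summary = {}
    for name, recalls in scenarios.items():
        summary[name] = {
            "gap": round(recall_gap(recalls), 2),
            "worst": round(worst_group_recall(recalls), 2),
            "mean": round(sum(recalls.values()) / len(recalls), 2),
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(SCENARIOS).items():
        print(name, stats)
```

Under these assumed numbers, leveling down and leveling up both reduce the gap to zero, so a measure based on disparity alone cannot tell them apart. Only the worst-group and mean figures show that one scenario helps no one while the other improves outcomes for the disadvantaged group.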

To move forward we have three options: 

• We can continue to deploy biased systems that ostensibly benefit only one privileged segment of the population while severely harming others. 
• We can continue to define fairness in formalistic mathematical terms, and deploy AI that is less accurate for all groups and actively harmful for some groups. 
• We can take action and achieve fairness through “leveling up.” 

We believe leveling up is the only morally, ethically, and legally acceptable path forward. The challenge for the future of fairness in AI is to create systems that are substantively fair, not only procedurally fair through leveling down. Leveling up is a more complex challenge: It needs to be paired with active steps to root out the real-life causes of biases in AI systems. Technical solutions are often only a Band-Aid to deal with a broken system. Improving access to health care, curating more diverse data sets, and developing tools that specifically target the problems faced by historically disadvantaged communities can help make substantive fairness a reality.

This is a much more complex challenge than simply tweaking a system to make two numbers equal between groups. It may require not only significant technological and methodological innovation, including redesigning AI systems from the ground up, but also substantial social changes in areas such as health care access and expenditures. 

Difficult though it may be, this refocusing on “fair AI” is essential. AI systems make life-changing decisions. Choices about how they should be fair, and to whom, are too important for fairness to be treated as a simple mathematical problem to be solved. Yet that is the status quo, and it has resulted in fairness methods that achieve equality through leveling down. Thus far, we have created methods that are mathematically fair, but cannot and do not demonstrably benefit disadvantaged groups.

This is not enough. Existing tools are treated as a solution to algorithmic fairness, but thus far they do not deliver on their promise. Their morally murky effects make them less likely to be used and may be slowing down real solutions to these problems. What we need are systems that are fair through leveling up, that help groups with worse performance without arbitrarily harming others. This is the challenge we must now solve. We need AI that is substantively, not just mathematically, fair.