Big Tech Tries to Fight Racist and Sexist Data
The trouble is, no machine can be better than its underlying training data. That’s baked in.
The fact that AI can pass on bias and prejudice is now widely recognized, probably because recent incidents of apparently racist or sexist algorithms involved big companies like Google and Amazon. A better understanding of how bad data gets encoded might make that bias easier to prevent.
The large-scale machine learning AI that undergirds most recent advances relies on immense quantities of data. As the system feeds on the data provided, thousands of small adjustments are made to internal parameters to tweak how the data will be categorized. This tweaking encodes the attributes of the training data. So, if the original training data is biased, the training is biased and the results will be biased.
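To make that chain concrete, here is a toy sketch, not any company's actual system. Everything in it is an assumption made up for illustration: the hiring scenario, the two groups, the thresholds, and the function names. The "historical" records hire group 1 only at a much higher skill level than group 0, and a small logistic regression is then fitted to those records with plain gradient descent.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_biased_history(n=200, seed=0):
    """Made-up hiring records: group 1 needed far more skill to be hired."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        skill = rng.random()        # 0.0 (low) to 1.0 (high)
        group = rng.randint(0, 1)   # hypothetical demographic flag
        threshold = 0.5 if group == 0 else 0.8  # the bias lives here
        hired = 1 if skill > threshold else 0
        rows.append((skill, group, hired))
    return rows


def train(rows, epochs=1000, lr=1.0):
    """Fit logistic regression by many small adjustments to three parameters."""
    w_skill = w_group = bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        g_skill = g_group = g_bias = 0.0
        for skill, group, hired in rows:
            error = sigmoid(w_skill * skill + w_group * group + bias) - hired
            g_skill += error * skill
            g_group += error * group
            g_bias += error
        # Nudge each parameter a little in the direction that reduces error.
        w_skill -= lr * g_skill / n
        w_group -= lr * g_group / n
        bias -= lr * g_bias / n
    return w_skill, w_group, bias


def predict(weights, skill, group):
    w_skill, w_group, bias = weights
    return sigmoid(w_skill * skill + w_group * group + bias)


weights = train(make_biased_history())
p_group0 = predict(weights, skill=0.65, group=0)
p_group1 = predict(weights, skill=0.65, group=1)
print(f"weight learned for group membership: {weights[1]:.2f}")
print(f"same skill, group 0: {p_group0:.2f}   group 1: {p_group1:.2f}")
```

Nobody told the model to penalize group 1. The negative weight on group membership comes entirely from the records it was fitted to, so two candidates with identical skill get different scores.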
The industry is trying to address the problem. Microsoft and IBM have both acknowledged bias. Along with other tech companies, they are offering improved tools to help address it. IBM says, for example, that it has identified over 180 human biases that it claims its tools can help assess.
So good training data is necessary if we’re to use machine learning-based AI. But what if, at least in some applications, good training data is not enough? Recently, a reporter raised this question about police use of facial recognition and crime prediction systems, which hardened and hid the bias:
These technologies, academics and journalists found, were directly exacerbating the hardships already faced by communities of color. Algorithmic sentencing was helping incarcerate minorities for even longer. Predictive policing was allowing law enforcement to justify their over-policing of minority communities. Facial recognition software was enabling the police to arrest people with little to no evidence.
Ali Breland, “Woke AI Won’t Save Us” at Mother Jones
Breland reviews the debate between the “inclusion-ists” (those who believe the answer is better training and better data) and the “abolitionists” (those who argue that the tech cannot be redeemed).
In this case, both sides are wrong. The problem lies not in the technology, but in our belief in technology. Neil Postman (1931–2003), while a communications professor at New York University, published Technopoly (1992). He defines a technopoly as a society that believes “The primary, if not the only, goal of human labor and thought is efficiency, that technical calculation is in all respects superior to human judgment… and that the affairs of citizens are best guided and conducted by experts.”
The problem with machine learning-based AI is not so much its inherent bias (none of us is bias-free) as the delegation to a machine of what should be a human decision. Like a magnifying glass, a machine can help us see what we might miss. But because there is no ghost in the machine, it cannot decide what to do with what it, and we, see. Choosing requires a mind; machines are mindless and cannot choose.
Dehumanization—defining people as outside our common humanity—is the precursor to hurting them. And giving to machines decisions that belong to people reduces those affected to nothing more than bits of flesh to manage.
We need to improve the training data used for machine learning-based AI. But we also need to stop ceding our choices to machines.
As Postman observed regarding statistics in Technopoly: “But statistics, like any other technology, has a tendency to run out of control, to occupy more of our mental space than it warrants, to invade realms of discourse where it can only wreak havoc. When it is out of control, statistics buries in a heap of trivia what is necessary to know.”
We must use our tools to enhance our humanity and treatment of one another, not to push each other, under the rubric of efficiency, into boxes and slots. We can do better. And all of us deserve better.