“Dirty by Nature” Data Sets: Facial Recognition Technology Raises Concerns

The sweeping use of facial recognition software across public and private sectors has raised alarm bells in communities of color, for good reason. The data that feed the software, the photographic technology in the software, the application of the software—all these factors work together against darker-skinned people.

Datasets are “dirty by nature,” reflecting human bias that colors the outcome of any facial recognition program. A notorious example involves Beauty.AI, a remote beauty pageant launched in 2016 to demonstrate that what we think of as an inherently subjective physical quality can be judged biometrically through machine learning. Because the all-robot jury in that inaugural event was fed a photographic diet of former beauty queens, the predictive algorithm yielded a predictable result, picking winners based solely on skin tone.

That application of AI may have been frivolous, but the lesson is serious given the increasing use of facial recognition technology at all levels of government. The real-life consequences of this particular type of algorithm bias can be severe.

The problem has more than one source. Besides the fact that people of color are underrepresented in the datasets that train facial recognition software, the photographic technology involved may have been optimized for lighter skin tones. And race isn’t the only factor at play. Women are harder to accurately identify, perhaps because they are more likely to change their appearance with makeup. Facial hair and glasses can also confuse the software.

A Massachusetts Institute of Technology study revealed both racial and gender bias in three major types of facial recognition software used in law enforcement. The software correctly identified gender 99 percent of the time in photos of white males, but misidentified gender as much as 35 percent of the time in photos of darker-skinned females.

A test of Idemia facial recognition software—used by police and border agents in the United States, France and Australia—confirmed the bias against women of color. “At sensitivity settings where Idemia’s algorithms falsely matched different white women’s faces at a rate of one in 10,000, it falsely matched black women’s faces about once in 1,000—10 times more frequently,” Wired reports.
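To make the arithmetic behind that comparison concrete, here is a minimal Python sketch. It uses only the two rates quoted above; the function names `false_match_rate` and `disparity` are illustrative assumptions, not part of any NIST or Idemia tooling, and the code does not reproduce how the test itself was run.

```python
from fractions import Fraction


def false_match_rate(false_matches: int, impostor_comparisons: int) -> Fraction:
    """Share of different-person comparisons wrongly reported as a match."""
    if impostor_comparisons <= 0:
        raise ValueError("impostor_comparisons must be positive")
    return Fraction(false_matches, impostor_comparisons)


def disparity(rates: dict, group: str, baseline: str) -> Fraction:
    """How many times higher one group's false match rate is than the baseline's."""
    return rates[group] / rates[baseline]


# Rates as quoted in the Wired report; the keys are plain labels (hypothetical names).
rates = {
    "white women": false_match_rate(1, 10_000),
    "black women": false_match_rate(1, 1_000),
}

ratio = disparity(rates, "black women", "white women")
print(f"{float(ratio):.0f}x")  # prints "10x"
```

The point of the ratio is that a single sensitivity setting does not produce a single error rate: the same threshold that yields one false match in 10,000 for one group yields ten times as many for another.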

The National Institute of Standards and Technology, which conducted the test, has also found facial recognition software to be disproportionately inaccurate in identifying Asians, Native Americans—anyone who isn’t Caucasian. Age is a problem too. NIST found a 10 times higher rate of misidentification of seniors compared to middle-aged people.

Given the scope of the problem, tech companies are reevaluating how their products are used. In June 2020, IBM announced it would no longer sell software for “mass surveillance or racial profiling.” Some municipal and state governments have limited or prohibited the use of facial recognition software by law enforcement.

In September 2020 the city of Portland, Oregon, went further than that, banning the use of facial recognition technology by businesses. Such software is used in the private sector to verify employee work hours and prevent shoplifting, among other efficiency and cost-saving measures.

That move was met by pushback from the International Biometrics + Identity Association (IBIA), which asserted that language in the Portland City Council’s original proposal reflected a fundamental misunderstanding of the science behind facial recognition technology—in particular, the idea that non-physiological inferences can be made about a person based on biometric data.

“Facial recognition algorithms as a source of information about an individual’s characteristics is not science,” the IBIA wrote. “One cannot infer emotion, patriotism, criminal inclinations, sexual orientation, or other characteristics from a mathematical template of the face. This is not facial recognition.

“Conflating this with facial recognition only confuses the issues and will certainly preclude an informed discussion on the public safety and security benefits of facial recognition technology.”

With facial recognition technology already in widespread use, it’s doubtful that this genie can be put back in the bottle—nor should it be. But given the legitimate concerns about accuracy and privacy, there is a pressing need for well-designed legal safeguards and serious—and sustained—scrutiny.

