Self-Driving Cars May Have A Racism Problem
Qatar Tribune
Theodore Kim

Tesla recently announced the newest version of its self-driving car software, following the software's role in a dozen reported collisions with emergency vehicles that are the subject of a federal agency probe.

While these collisions happened for a variety of reasons, a major factor may be that the artificial intelligence driving the car is not used to seeing flashing lights and vehicles pulled over on the shoulder, so the underlying algorithms react in unpredictable and catastrophic ways.

Modern AI systems are "trained" on massive datasets of photographs and video footage from various sources, and use that training to determine appropriate behaviour. But, if the footage doesn't include lots of examples of specific behaviours, like how to slow down near emergency vehicles, the AI will not learn the appropriate behaviours. Thus, they crash into ambulances.

Given these types of disastrous failures, one recent trend in machine learning is to identify these neglected cases and create "synthetic" training data to help the AI learn. Using the same algorithms that Hollywood used to assemble the Incredible Hulk in The Avengers: Endgame from a stream of ones and zeros, photorealistic images of emergency vehicles that never existed in real life are conjured from the digital ether and fed to the AI.

I have been designing and using these algorithms for the last 20 years, starting with the software used to generate the sorting hat in Harry Potter And The Sorcerer's Stone, up through recent films from Pixar, where I used to be a senior research scientist.

Using these algorithms to train AIs is extremely dangerous, because they were specifically designed to depict white humans.
All the sophisticated physics, computer science and statistics that undergird this software were designed to realistically depict the diffuse glow of pale, white skin and the smooth glints in long, straight hair.

In contrast, computer graphics researchers have not systematically investigated the shine and gloss that characterises dark and Black skin, or the characteristics of Afro-textured hair. As a result, the physics of these visual phenomena are not encoded in the Hollywood algorithms.

To be sure, synthetic Black people have been depicted in film, such as in last year's Pixar movie Soul. But behind the scenes, the lighting artists found that they had to push the software far outside its default settings and learn all new lighting techniques to create these characters. These tools were not designed to make non-white humans; even the most technically sophisticated artists in the world strained to use them effectively.

Regardless, these same white-human generation algorithms are currently being used by start-up companies like Datagen and Synthesis AI to generate "diverse" human datasets specifically for consumption by tomorrow's AIs.

A critical examination of some of their results reveals the same patterns. White skin is faithfully depicted, but the characteristic shine of Black skin is either disturbingly missing, or distressingly overlit.

Once the data from these flawed algorithms are ingested by AIs, the provenance of their malfunctions will become near-impossible to diagnose. When Tesla Roadsters start disproportionately running over Black paramedics, or Oakland residents with natural hairstyles, the cars won't be able to report that "nobody told me how Black skin looks in real life". The behaviour of artificial neural networks is notoriously difficult to trace back to specific problems in their training sets, making the source of the issue extremely opaque.

Synthetic training data are a convenient shortcut when real-world collection is too expensive.
But AI practitioners should be asking themselves: Given the possible consequences, is it worth it? If the answer is no, they should be pushing to do things the hard way: by collecting the real-world data.

Hollywood should do its part and invest in the research and development of algorithms that are rigorously, measurably, demonstrably capable of depicting the full spectrum of humanity.

Not only will it expand the range of stories that can be told, but it could literally save someone's life. Otherwise, even though you may recognise that Black lives matter, pretty soon your car won't.

(Theodore Kim is an associate professor of computer science at Yale University.)