We're outsourcing ever more of our decision making to algorithms, partly as a matter of convenience, and partly because algorithms are ostensibly free of some of the biases that humans suffer from. Ostensibly. As it turns out, algorithms that are trained on data that's already subject to human biases can readily recapitulate them, as we've seen in places like the banking and judicial systems. Other algorithms have just turned out to be not especially good.
Now, researchers at Stanford have identified another area with potential issues: the speech-recognition algorithms that do everything from basic transcription to letting our phones fulfill our requests. These algorithms seem to have more issues with the speech patterns used by African Americans, although there's a chance that geography plays a part, too.
A non-comedy of errors
Voice-recognition systems have become so central to modern technology that most of the large companies in the space have developed their own. For the study, the research team tested systems from Amazon, Apple, Google, IBM, and Microsoft. While some of these systems are sold as services to other businesses, the ones from Apple and Google are as close as your phone. Their growing role in daily life makes their failures intensely frustrating, so the researchers decided to have a look at whether those failures display any sort of bias.
To do so, the researchers obtained large collections of spoken words. Two of these were dominated by a single group: African Americans from a community in North Carolina, and whites in Northern California. The remaining samples came from mixed communities: Rochester, New York; Sacramento, California; and Washington, DC. These recordings were run through each of the five systems, and accuracy was determined by comparing each system's output to transcripts produced by human transcribers.
Based on a score called word error rate (which counts inserted and missing words, as well as misinterpretations), all of the systems did well, with scores under 0.5. (Apple's system was the worst and Microsoft's the best by this measure.) In all cases, the recordings of African American speakers ended up with word error rates that were worse than the ones produced from recordings of white speakers—in general, the errors nearly doubled.
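For readers curious how that score is calculated: word error rate is the word-level edit distance between a reference transcript and a system's output, divided by the number of words in the reference. The sketch below is illustrative only—it is not the study's code, and the `wer` helper is a name chosen here for convenience.

```python
# Word error rate (WER): the minimum number of word substitutions,
# insertions, and deletions needed to turn the system's output into
# the reference transcript, divided by the reference length.

def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard dynamic-programming (Levenshtein) table over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)

# One dropped word out of six gives a WER of about 0.17.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

A score of 0.17 thus means roughly one error for every six words spoken—the rate the study reports for white women—while 0.41 approaches one error in every two or three words.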
The effect was more pronounced among African American males. White men and women had error rates that were statistically indistinguishable, at 0.21 and 0.17, respectively. The rate for African American women averaged 0.30, while for men it rose to 0.41.
How important are these differences? The authors suggest it depends on how you define usability—above a certain error rate, it becomes more annoying to fix an automated transcript than to type it yourself, or your phone will end up doing the wrong thing more often than you're happy with. The authors tested how often individual chunks of text exceeded a conservative word error rate of 0.5. They found that over 20 percent of the phrases spoken by African Americans would fail this standard; fewer than 2 percent of those spoken by whites would.
So what's going on? There may be a bit of a geographical issue. California speakers are often considered accent-free from an American perspective, and the two samples from that state had very low error rates. Rochester had a rate similar to California's, while the District of Columbia had one closer to that of the rural North Carolina town. If there is a geographic influence, separating it out will require a much larger sample.
After that, the researchers analyzed the language usage itself. Since they didn't have access to the algorithms used by these systems, they turned to some open-source