Weapons of Math Destruction –– Talking Tech w/ Cathy O’Neil

Speaker 0 0:10 – 0:14

Welcome to Tech Talk by CDT.

Speaker 1 0:16 – 1:11

Welcome to CDT's Tech Talk, where we dish on tech and Internet policy while also explaining what these policies mean to our daily lives. I'm Brian Wesolowski, and it's time to talk tech. Is math biased? That seems like a silly question. In math, there is right and wrong, so how could it possibly be biased? But humans use math and develop mathematical models and algorithms, and, well, humans can certainly be biased. So maybe, in fact, math can be biased. In her new book, Weapons of Math Destruction, Cathy O'Neil explores how big data is increasing inequality and threatening democracy, which is actually the subtitle of the book. Cathy is a mathematician herself and the author of a very cool blog, mathbabe.org. I am thrilled to welcome to Tech Talk today Cathy O'Neil. Welcome, Cathy. [Speaker 0: Thanks so much for having me.] And thanks so much for rerecording. We had a little snafu last time, so it just shows what a wonderful person you are.

Speaker 0 1:12 – 1:17

So I don't think we could possibly be as good this time around, but we're gonna try. [Speaker 1: We are gonna try.]

Speaker 1 1:18 – 1:24

So tell me a bit. The title and premise of the book probably surprise a lot of people. How can math be a weapon of destruction?

Speaker 0 1:26 – 1:35

Well, just to be clear, I think of math itself as being completely innocent in this whole endeavor. I think math can be weaponized.

Speaker 1 1:36 – 1:36

Okay.

Speaker 0 1:37 – 3:53

So it's not math's fault, to be clear. Math itself is not the problem. But people wield mathematical algorithms in bizarre and unreasonable ways, and it blinds people. I'll say a little bit more about that. [Speaker 1: Okay.] I witnessed this first in the financial crisis. Actually, everyone witnessed it. It was the AAA ratings on mortgage-backed securities. AAA ratings were supposed to signify something mathematically rigorous: the math PhDs were in the back room checking all the data around these mortgage-backed securities, double-checking and triple-checking that they were gonna be safe investments. That was the idea. But, of course, that wasn't what was actually happening. They were actually kind of disciplined lies. They were essentially gaming these risk models with bad proxy data and bad assumptions, in particular the assumption that home prices across the nation could never go down, that we could never have highly correlated defaults, even though the actual kinds of mortgages that were being put into these mortgage-backed securities were of worse and worse quality. So they should have known better. They didn't know better, and they sort of made mathematics look bad. So that's the cowardly inside story. The reason they worked, which is a slightly different story, and this is gonna come back to us, is that people are afraid of mathematics, and they also trust mathematics. It's that combination of fear and trust that allows mathematics to be weaponized like this. It's a perfect mechanism whereby you can do whatever corrupt process you wanna do and then slap the imprimatur of mathematics on it and say, because this is mathematical, you should both trust this and fear this, and not ask any questions.

Speaker 1 3:54 – 4:30

Yeah. I think I see that a lot. You hear the term algorithms thrown about. And certainly even I, until I started working at CDT, didn't have a full sense of how often algorithms were used to make decisions about us. It's just one of those things that's kind of unseen. And then when you hear, oh, well, there's a formula for this, you do feel a little bit more comfortable about that. What are some of the other examples of where decisions are made about us using these magical algorithms and math?

Speaker 0 4:31 – 6:18

Well, I mean, they're literally all around us. Right? So Facebook's algorithm decides which things to show us. And, between you and me, it's not a very good algorithm in terms of its effect on democracy and the informed citizenry that is required for democracy. The example that I first witnessed after leaving finance was the teacher value-added model. My friend was the principal of a school in Downtown Brooklyn, and her teachers were being evaluated by this new teacher assessment algorithm. And she said to me, you know, Cathy, all my teachers are getting these mysterious scores. They're unexplained. I don't understand them. Some of them are terrible scores. If they get enough bad scores, they're not gonna be able to get tenure. What can I do? And I said, well, mathematics is supposed to clarify, not obfuscate. So you should ask your Department of Education contact, can you explain these scores to me? And she said, well, I tried that. And I said, well, what happened? She said, well, the Department of Education contact I had told me, it's math, you wouldn't understand it. [Speaker 1: Wow.] So yeah. That was when I was like, holy crap, they've weaponized math again. It's not finance anymore. It's not tricking investors in Norway into buying mortgage-backed securities that are rotten at the core. But it's still the same tactic, which is: don't look under the covers, because this is math, and you are not an expert in math, and you should be ashamed of yourself. We're gonna literally produce shame in you so that you will not ask any questions. And for me, that's the first sign of a weapon of math destruction.

Speaker 1 6:19 – 6:55

And if anyone should be asking questions, you'd want teachers to be asking questions. I mean, part of their job is asking questions and then imparting that knowledge to our youth. So that seems like an important one. You alluded to this a bit with Facebook's algorithm. And, of course, we had an election. When we first recorded, it was pre-election. Now it's post-election. And, yikes, what a different world we're in. And, certainly, fake news is grabbing headlines everywhere. Do you think that math, I mean, you kind of suggested it, played a role in the election, or at least in informing democracy?

Speaker 0 6:58 – 9:13

Yeah. And I'll bring Google in as well. [Speaker 1: Sure.] Google and Facebook, among others, have made just tons and tons of money on the assumption, which we have mostly bought into, that they can do stuff with algorithms that used to require human beings. And Facebook has been the gatekeeper for information, and so has Google. They're literally the gatekeepers for information. For Facebook, the type of information that they gatekeep is essentially news. More than half the people, especially millennials, get their news from Facebook. They don't go directly to news websites anymore. They literally get their news from Facebook. As for Google, I just saw an ad last night while I was watching a football game. Don't tell anyone I was watching football. I was trying to boycott football, but post-Trump, I'm just like, whatever, we're all gonna die. We need something to distract us. Right? I either have to watch football or smoke cigarettes, so this is slightly less bad. But I was watching football, and a commercial came on by Google saying, you know, hey, Google, what does a blue whale sound like? It was depicting this father reading to his daughter, I think. The idea being, oh, I don't have the answer to this question my daughter is asking me in my brain right now, so I'll just refer to Google, and Google will give me that information. [Speaker 1: Right.] And it's trustworthy, and it's safe for my children. And that is a thing that they are claiming. Right? And they're claiming they can do this with algorithms. And yet, soon after the election, when you googled who won the popular vote in the US election, the answer came out Trump. [Speaker 1: Wow.] Yeah. We have a problem. Facebook is giving us fake news, and it's putting it in the same context as real news. That's a problem, and that's a gatekeeping problem. So both of these large companies are making billions and billions of dollars of profit, and it's not uncorrelated to the fact that we trust them, and that we trust them in particular to feed us the information that we need to survive as citizens. And it's a problem.

Speaker 1 9:14 – 9:58

And they just become something that's kind of a given in our home. I think the example you were talking about was probably Google's new Home product. A lot of people think of Amazon's Alexa. I remember early on, going back to football, it was one game into the season, and if you asked Alexa who the best team in football was, it was the Buffalo Bills, which we all know is never true. And it was just because, alphabetically, they were the first team with one win. So it's, you know, truth as given to you by these companies. And more and more, if you're doing it through a connected device, you don't even have access to that laptop or that interface to fact-check it, and that's scary.

Speaker 0 10:00 – 10:55

Yeah. I mean, we're essentially being asked to believe that this algorithm is better than our own understanding of the world. And at the same time, they're saying, well, we can't possibly use humans to help us. This is Facebook's stance and Google's stance. We can't possibly hire all those journalists that we put out of work to help us gatekeep, because, well, it's just too scaled up. It's like, well, you actually ruined an enormous industry which was doing this job. And it just makes me wonder about some of the fake news websites that are all over Facebook: what would have happened to them thirty years ago, before we had large-scale devastation of the media?

Speaker 1 10:56 – 10:59

Yeah. No, I don't think that's an understatement.

Speaker 0 11:01 – 11:26

I'm not arguing that the media does a perfect job of gatekeeping. It doesn't. And we can all talk about the specific newspapers that we think have skewed views. But one thing that's great about those kind of half-good gatekeepers is that people know there's gatekeeping going on, and there's accountability.

Speaker 1 11:26 – 11:27

Right.

Speaker 0 11:28 – 11:46

A limited amount of accountability, but there really is accountability. There's a policy, and people know that the editors are making those decisions. And they can complain. And if they don't like it, they can go to another newspaper. Right now, what we have is one choice, Facebook. No accountability. No stated policy.

Speaker 1 11:47 – 12:12

And decisions being made based on, you know, the mystical math. Let's go back to your book a little bit, because I think we could talk about elections and fake news endlessly, and it's certainly something that needs to be explored in much more depth. In your book, one of the most powerful examples, I thought, was where you're talking about predictive policing and recidivism. Could you explain a little bit how math is used in a destructive manner there?

Speaker 0 12:13 – 15:10

Sure. So I should mention what those two algorithms do first. The predictive policing algorithm uses arrest records to try to predict future arrests. And they usually think specifically about the location of those arrests. So if they find that a bunch of arrests have happened in a certain location, they tend to send police back to that location to make more arrests. The problem with that is that it is essentially a pseudoscientific justification for continued uneven policing. We have uneven policing in this country. And by that, I really mean racist policing, where we have way more police in poor black neighborhoods than in richer, whiter neighborhoods. And when we're focusing on arrests as proxies for crime, then literally just the history of arrests in these neighborhoods is reason enough, if you use this algorithm, to say, oh, the algorithm told us to go back to the very same neighborhoods and look for more crime. What it does is propagate this biased policing system. Now I wanna go back to the distinction I wanna make between crime and arrest. The problem is that there really is a big difference. The statistic I like to mention is that blacks and whites smoke pot at about the same rate, but blacks are arrested for smoking pot about four times as often. [Speaker 1: Wow.] And it actually depends on the jurisdiction. It could be up to 10 times as often as whites. So what that's telling you is that arrests are not the same thing as crimes. And the extent to which they agree or disagree really depends on the police practices rather than on the people who are doing the crime so much. So anyway, that's predictive policing. It creates this feedback loop where we just keep doing the same thing. [Speaker 1: The inherent biases are programmed into that.] Yeah. Exactly. The bias is in the data, and then the data is fed to the computer. And the computer says, guess what? Send police back to those same places. Another thought experiment I like to have is this: if, after the financial crisis, we had sent police down to Wall Street to stop and frisk all the bankers and arrest them for their crimes, then we'd be having police constantly patrolling Wall Street. Because, trust me, there are plenty of drug users on Wall Street. I don't doubt that for a second. But they're just not getting arrested for it. So that's predictive policing. And then further downstream, in terms of the data, there's something called recidivism risk algorithms. These are given to judges when they're sentencing, when they're deciding on parole, when they're deciding on bail, but I'll focus on sentencing. And the idea is... go ahead.
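The feedback loop described in this answer can be sketched in a few lines of Python. This is only an illustrative toy model, not how any real predictive policing product works: the function name, the two neighborhoods, the patrol counts, and the crime rate are all hypothetical. It assumes both neighborhoods have the same underlying crime rate, that recorded arrests scale with how many officers are present, and that next round's patrols are split in proportion to recorded arrests.

```python
def simulate_patrols(initial_patrols, true_crime_rates, rounds):
    """Reallocate a fixed patrol budget in proportion to recorded arrests.

    All inputs are hypothetical; this is a toy model of the feedback loop,
    not a description of any deployed system.
    """
    patrols = list(initial_patrols)
    budget = sum(patrols)
    history = [tuple(patrols)]
    for _ in range(rounds):
        # Recorded arrests depend on how many officers are looking,
        # not only on how much crime actually occurs.
        arrests = [p * rate for p, rate in zip(patrols, true_crime_rates)]
        total_arrests = sum(arrests)
        # The "prediction": send officers where past arrests were recorded.
        patrols = [budget * a / total_arrests for a in arrests]
        history.append(tuple(patrols))
    return history


# Two neighborhoods with identical true crime rates but uneven starting patrols.
history = simulate_patrols(
    initial_patrols=[80, 20],
    true_crime_rates=[0.1, 0.1],
    rounds=5,
)
for round_number, allocation in enumerate(history):
    print(round_number, allocation)
```

Under these assumptions the allocation never moves away from the initial 80/20 split, even though the two neighborhoods are identical in actual crime. The arrest data appears to confirm the original deployment because it was produced by that deployment.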

Speaker 1 15:10 – 15:12

Oh, nope. Didn't say anything.

Speaker 0 15:12 – 16:59

Oh, sorry. I heard a noise. So I should mention that recidivism risk means the risk of coming back to prison after leaving prison. Ninety-seven percent of prisoners eventually leave prison. So the question is, are they coming back? And the tendency is for judges to give people longer sentences when they have a higher risk of recidivism. And you could already ask the question, wait, does that make any sense? Aren't we preemptively punishing somebody for something they haven't done? And the answer is, absolutely. But there's a long-standing tradition of doing this, and I guess the argument is that they are protecting public safety. You don't want people who are likely to commit another violent crime out there, because it gives them more opportunity to do so. I should mention that not all the data going into these recidivism risk algorithms pertains to violent crime. So that's one problem. But the more general thing is that, putting aside the question of whether we should be sending people to prison for longer because of their future risk, not their actual crimes, the way these risk scores are created is very, very biased. There are two sources of data that go into these risk scores, and two sources of bias. The first is the arrest record that I talked about before, which we know is biased. And the second is these questionnaires that are given to the defendants. And the questionnaires have a bunch of questions that are proxies for race and class. So the questions are things like, did you graduate from high school? Do you have a job? Have you gotten married? Do you live in a high-crime neighborhood? None of them say, are you black? None of them say, are you poor? But they might as well, because, you know, that's how we figure out stuff with data.

Speaker 1 17:00 – 17:22

Yeah. And Facebook got in trouble for this a little bit too with their ethnic affinity targeting. They kind of did everything but say black, white, Latino, and they came up with all these different ways to figure out race. So it sounds very similar, but with far more severe outcomes here.
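The point about proxy questions can be illustrated with a small Python sketch. Everything here is invented for illustration: the respondents, the group labels, and the scoring rule are hypothetical and are not taken from any real risk assessment tool. The sketch assumes a score that simply counts "risky" questionnaire answers and never sees the group label.

```python
from statistics import mean

# Hypothetical respondents. The "group" field is never passed to the score.
RESPONDENTS = [
    {"group": "A", "finished_high_school": True, "employed": True, "high_crime_area": False},
    {"group": "A", "finished_high_school": True, "employed": True, "high_crime_area": False},
    {"group": "A", "finished_high_school": True, "employed": False, "high_crime_area": False},
    {"group": "A", "finished_high_school": False, "employed": True, "high_crime_area": True},
    {"group": "B", "finished_high_school": True, "employed": False, "high_crime_area": True},
    {"group": "B", "finished_high_school": False, "employed": False, "high_crime_area": True},
    {"group": "B", "finished_high_school": True, "employed": True, "high_crime_area": True},
    {"group": "B", "finished_high_school": False, "employed": True, "high_crime_area": False},
]


def risk_score(answers):
    """Count 'risky' answers. The group label is not an input."""
    return (
        int(not answers["finished_high_school"])
        + int(not answers["employed"])
        + int(answers["high_crime_area"])
    )


def average_score_by_group(respondents):
    """Average the group-blind score separately for each group."""
    scores = {}
    for person in respondents:
        # Strip the group label before scoring, so the score is "blind" to it.
        answers = {k: v for k, v in person.items() if k != "group"}
        scores.setdefault(person["group"], []).append(risk_score(answers))
    return {group: mean(values) for group, values in scores.items()}


averages = average_score_by_group(RESPONDENTS)
print(averages)
```

In this made-up data, group B averages a full point higher than group A even though the score never asks about group membership. That is the sense in which questions correlated with race or class can stand in for asking about them directly.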

Speaker 0 17:24 – 19:55

Yeah. I mean, that's the recurring theme in my book: when we were given the Internet and when we were given big data, we were promised somehow that we were going to transcend race and class. And the opposite is true. [Speaker 1: The great equalizing force.] Yeah. That just hasn't worked out so well. We happen to be cleaving along those same old lines as we always have, which are race and class and ethnicity. And it's really sad, and it's being propagated by the technology and the tools that we've created. Facebook is a great example. In one chapter, I talk about payday lending and for-profit colleges. The for-profit colleges I looked at in my book were Corinthian College, Everest College, and all the ones that you know about, ITT Tech and University of Phoenix. I guess I should be precise. Corinthian College, which got in trouble with the attorney general of California, got in trouble for going after single black mothers who were desperate for a better life. They would find these people online using the same sort of tailored ad technology that I was developing as a data scientist, I should say. [Speaker 1: Wow.] It's a confession. And on Facebook as well, of course. And once these recruiters targeted these people and got them to give their phone number, they would call them multiple times a day. And this is part of their recruiting manual: find their pain points. Find out what hurt them the most in their daily lives, and promise that those pain points would go away once they were signed up to get mostly online education with these for-profit colleges. What they actually got were enormous federal aid loans. So, in other words, student debt and very little actual education. Even for the very few people who graduate from these places, and the graduation rates are abysmal, once they have a diploma from one of these for-profit colleges, it's worth no more than a high school diploma when they're actually applying for jobs. So it's a very, very bad deal. And, again, it's highly tailored, highly targeted advertising along race and class. It's very, very discouraging when you think about the promise of big data way back when.

Speaker 1 19:56 – 20:20

Yeah. So let's go to the promise of big data, because my hunch is that, like me, you still hold out some hope for this being able to do good. Math, as you said before, is being leveraged in negative ways. Can it be leveraged in positive ways? Is there any hope for unlocking the [Speaker 0: Absolutely.] power of big data, as so many people say?

Speaker 0 20:22 – 22:08

I mean, number one, absolutely. Number two, it's gonna be hard to get there. [Speaker 1: Okay.] So let's start with the absolutely, because I actually am a huge fan of big data, if we could use it not to screw people over but to help people. Take the recidivism risk algorithms: I find them fascinating. And the fact that we're using all those proxies in the questionnaire is a testament to the fact that we, as a society, send black people and poor people to prison far too often. So the question we should be asking isn't, can we blame individuals for these society-wide problems? The question we should be asking is, can we find interventions that will address the issue of why black people and poor people find themselves in prison time after time? How do we improve society? And I think the results of these recidivism risk algorithms, the same exact algorithms, could be used to investigate that question, to address a society-wide problem rather than just to assign personal blame. The question of who needs a boost in terms of education could be addressed using the same kind of targeting algorithms. And then it comes to the question of how hard it's gonna be to get us to use these algorithms to help people rather than to exploit people. And there, it's a little trickier, because none of the things that I suggest we should do with these algorithms to turn them around from exploitative to positive involve profiting.

Speaker 1 22:12 – 22:12

Good point.

Speaker 0 22:13 – 22:43

And so we have lots of incentives for companies to prey upon the most vulnerable people in the country. We have very few incentives for those same companies to say, hey, can I give you an opportunity that will actually help you? So that's where, as I like to say in my last chapter, I think I said, the free market isn't gonna solve this problem.

Speaker 1 22:44 – 23:14

Yeah. Well, and the funny thing is you see a lot of policymakers just saying, well, if the companies would change their practices. On some level, their motives are very different than the public sector's. So having policies around this may be important. A couple more questions, or just one more question, actually. So you have a new company coming out too, ORCAA, as you pronounce it on your blog. Is this gonna be what you're focusing on moving forward, auditing companies and all that sort of stuff to make sure that their algorithms are fair and just, or helping them with that?

Speaker 0 23:16 – 25:23

Yeah. So ORCAA, the idea of it is an algorithmic auditing company. And I'm hoping to get companies to work with me to help make sure that their algorithms are legal and fair and nondiscriminatory and meaningful. Because there are lots of different ways that algorithms can fail, and I don't have to focus singularly on racist or sexist algorithms. There are also just algorithms that are not as meaningful as the designers would have hoped for them to be. So I wanted to use my expertise to help clarify what algorithms are actually doing. Because another thing is that algorithms are often presented as complete black boxes that there's just no way of understanding. They're just too complex and sophisticated. That's, of course, not actually true. There are ways of learning aggregate statistics about what an algorithm is doing, to get a crude but important view on whether that algorithm is doing something harmful or helpful, depending on what you mean by harmful and helpful. So that's the idea of ORCAA. To be honest, in the world of President Trump, I'm not expecting that many regulators to be on top of this question of racist algorithms. Because, to be honest, I feel like my book is saying we need to do more than just pay lip service to fairness and to fighting racism and sexism. And I feel like in the world of President Trump, we might not even be paying lip service to this. I'm so sorry. On the other hand, there are individual situations where there are laws on the books, anti-discrimination laws, or the Americans with Disabilities Act saying that you can't force people to take health exams when they're getting hired for a job. We have cases of things that look very much like that going on with personality tests. So I think there are individual cases where companies will be like, holy crap, I need to make sure that I'm not gonna get sued by an individual. [Speaker 1: That's good.] And in that case, I would love them to come talk to me.

Speaker 1 25:24 – 25:59

Yeah. I was gonna ask that. If it's not gonna be government, and I agree with you, it's probably not gonna be a great atmosphere under the Trump administration, most likely, although maybe they'll surprise us, unlikely: what can an individual do who's feeling helpless and wants to understand the algorithms that are around them, making decisions about them? Is there anything we can do?

Speaker 0 26:02 – 27:01

Well, I guess it's important to remark that many of the algorithms that are sizing us up and down are invisible to us, or we don't even know about them. But on the other hand, there are some examples, like the teacher value-added model, where teachers are told, this is your score on a scale of zero to 100, and we're not gonna explain it to you, and the stakes are gonna be high anyway. Those are examples where I want people to start pushing back. I want people to demand explanation. I want them to demand accountability. When there's a high-stakes decision being made about you, then you should understand the things that go into it. You should understand the data. You should understand the sensitivities to that data. And if you want to have a list of questions to ask, then please come to me, and I'll help you ask those questions. That is exactly the kind of thing that I'm intending to try to work on for the rest of my life, which is [Speaker 1: That's awesome.] expanding accountability to these kinds of algorithmic, opaque systems.

Speaker 1 27:02 – 27:21

Well, Cathy, it seems like your work is more important than ever. A lot of people are saying that, but I think this is genuinely true for what you're doing. Thank you so much for joining Tech Talk. Weapons of Math Destruction is Cathy's book. It's getting great reviews. It's a wonderful read. Go out and get yourself a copy. Thank you so much for joining us today, Cathy.

Speaker 0 27:22 – 27:24

It was really my pleasure. Thank you.
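The idea raised earlier in the conversation, that aggregate statistics can give a crude but important view of what an opaque algorithm is doing, can be sketched in Python as follows. This is not a description of any particular audit method: the record format, the group labels, and the numbers are hypothetical. The sketch assumes an auditor can see, for each person, which group they belong to, whether the algorithm flagged them as high risk, and what actually happened afterward.

```python
from collections import defaultdict

# Hypothetical audit log: (group, flagged_high_risk, reoffended)
RECORDS = [
    ("A", True, True), ("A", False, False), ("A", False, False),
    ("A", True, False), ("A", False, True),
    ("B", True, True), ("B", True, False), ("B", True, False),
    ("B", False, False), ("B", True, True),
]


def audit(records):
    """Compute per-group flag rates and false positive rates.

    The false positive rate is the share of people who did NOT reoffend
    but were flagged as high risk anyway.
    """
    counts = defaultdict(lambda: {"n": 0, "flagged": 0, "negatives": 0, "false_pos": 0})
    for group, flagged, reoffended in records:
        c = counts[group]
        c["n"] += 1
        c["flagged"] += int(flagged)
        if not reoffended:
            c["negatives"] += 1
            c["false_pos"] += int(flagged)
    report = {}
    for group, c in counts.items():
        report[group] = {
            "flag_rate": c["flagged"] / c["n"],
            # Undefined when a group has no non-reoffenders in the log.
            "false_positive_rate": (
                c["false_pos"] / c["negatives"] if c["negatives"] else None
            ),
        }
    return report


for group, stats in audit(RECORDS).items():
    print(group, stats)
```

Neither number requires access to the algorithm's internals, only to its outputs and the eventual outcomes. A large gap between groups on either measure does not by itself prove the algorithm is unfair, but it is the kind of signal that tells someone which questions to ask.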

Speaker 1 27:30 – 27:47

That's it for this episode of Tech Talk. And if you're interested in the issue of algorithms, CDT is doing a lot of great work on algorithms and inequality, led by our very own Ali Lange. Definitely reach out to us if you are interested in the issue. I'm Brian Wesolowski. Thanks so much for listening.
