Cathy O’Neil calls herself a data skeptic. A former hedge fund analyst with a PhD in mathematics from Harvard University, the Occupy Wall Street activist left finance after witnessing the damage wrought by faulty math in the wake of the housing crash.

In her latest book, “Weapons of Math Destruction,” O’Neil warns that the statistical models hailed by big data evangelists as the solution to today’s societal problems, such as deciding which teachers to fire or which criminals should receive longer prison terms, can codify biases and exacerbate inequalities. “Models are opinions embedded in mathematics,” she writes.

Although algorithms are everywhere, the most dangerous ones, according to O’Neil, have three characteristics: scale, secrecy and the capacity to do harm.

Recently reached by phone, O’Neil spoke about the prevalence of these “weapons of math destruction” across different industries. The conversation has been edited for length and clarity.

When did you first realize that big data could be used to perpetuate inequality?

I found out that the work I was doing on tailored advertising was a mechanism for for-profit colleges to find vulnerable, single black mothers. Find their pain points and promise them a better life if they signed up for online courses, which in the meantime loaded them up with debt and gave them a useless education. I was like, “That’s not helping anyone; that’s making their struggles worse, and it’s happening on my watch because I am the one building the technology for this to work very efficiently.”

What is a new example of a weapon of math destruction?

Recently, I was convinced by Mona Chalabi, who is a journalist at the Guardian but who also spent time at FiveThirtyEight, that political polls are actually weapons of math destruction. They’re very influential; people spend enormous amounts of time on them. They’re relatively opaque. But most importantly, they’re destructive in various ways. In particular, they actually affect people’s voting patterns. The day before the election, if people think their candidate is definitely going to win, then why bother voting? Polls can change people’s actual behavior, which disrupts democracy in a direct way.

People are trying to analyze how demographics shaped the election results. The answer is that we’re probably not going to know or have enough information to make an educated guess until much later.

That’s right. Also, there really were new things about this election cycle that we did not have data on, so we couldn’t account for them. But I’m not suggesting that all we need to do is correct the polls and next time they’ll be more accurate and therefore better. I’m actually trying to make the argument that we should just not do them. I honestly feel like if we had a thought experiment where nobody did polls and nobody talked about polls and we all just talked about the actual issues of the campaign, then we’d have a much better democracy.

In your book, you describe some relatively well-known examples of potentially harmful algorithms, such as value-added models that grade public schoolteachers based on student test scores. You tried to get the source code behind that model from the Department of Education in New York City, but you weren’t able to. Their defense was probably that if people knew how the scores were calculated, then teachers would be able to game the system to get higher scores.

Well, the very teachers whose jobs are on the line don’t understand how they’re being evaluated. I think that’s a question of justice. Everyone should have the right to know how they’re being evaluated at their job. And I should have the right to understand those models as well because I’m a taxpayer, and the job is a government position. The Freedom of Information Act should apply.

Also, if you use the word “gaming,” first you’re implying that there’s a bad actor involved, which sometimes there is. Second, you can really only game a model if it’s weak. The weakness of the teacher value-added model is that it’s statistically terrible. Anybody whose job is on the line deserves to understand that weakness. And deserves to, for that matter, take advantage of it if they can. But my goal isn’t for a bunch of teachers to sneakily get better scores. My goal is for the model itself to be held to high standards.

In some cases, the policymakers themselves probably don’t even know how the scores are calculated.

In the case that I wrote about in my book, nobody in New York City had access to that formula. Nobody. The Department of Education did not know how to explain the scores that they were giving out to teachers.

Los Angeles’s Department of Children and Family Services has been exploring a risk-modeling algorithm called AURA. It was developed by SAS, a private contractor, and it scores children according to their risk of being abused so that social workers can better target their efforts. Something like this could be a weapon of math destruction — it has scale, and the formula is secret — or it could be benign.

Or even positive. It really depends on what exactly they’re doing with those scores. It also depends on how those scores are created. Even if they’re being somewhat punitive, if they’re doing it in a way that has been discussed as morally fair, then that’s probably still OK. If they’re finding kids at risk of child abuse and they’re removing them from families when they have just cause, then we should think of that as a good thing. What would not be OK is if the score was elevated simply because somebody happened to be black or happened to be poor.

So you’re less worried about models that target people in order to help.