Feds use Rand formula to spot discrimination. The GOP calls it junk science

By James Rufus Koren

Aug. 28, 2016 3 AM PT

Marc Elliott didn’t know he’d become a player in the financial world until he received an unexpected email from a friend.

It read simply, “Did you know you just cost Ally Financial $80 million?”

Until that moment nearly three years ago, the Rand Corp. statistician hadn’t known an algorithm he’d devised years earlier for healthcare research had found its way from Rand’s headquarters in Santa Monica to the halls of a powerful financial regulator in Washington, D.C.

Or that the agency, the Consumer Financial Protection Bureau, had used his breakthrough formula to underpin racial discrimination allegations against auto lending companies, starting with former General Motors lending arm Ally Financial, which paid $80 million to settle in 2013.

“My first reaction was just that it had really moved along,” said Elliott, 49, who has spent much of his nearly 21 years at Rand researching healthcare issues, not finance. “I hadn’t been aware at all.”

And it’s gone much further since then.

If you have a credit card, a car loan or almost any type of debt other than a mortgage, there’s a chance your name and address have been run through Elliott’s algorithm, a complex formula that crunches data from the Census Bureau.

But as it has become more widely used, Elliott’s work and the CFPB’s application of it have found their way into the middle of a fight between the federal consumer watchdog and politicians who want to scrap the agency. Some congressional Republicans have gone so far as to call the CFPB’s use of Elliott’s system “junk science.”

His algorithm is a tool that estimates the probability that someone is white, black, Asian or Hispanic based only on their address and last name. The CFPB has relied on it to accuse some of the country’s largest auto lenders, including the financing arms of Toyota and Honda, of discrimination.

Car dealerships often add an extra bit of interest, called a markup, on top of the rate charged by a lender, ostensibly to pay the dealership for its work arranging the loan. The CFPB, using Elliott’s system to look at tens of thousands of loans, has alleged that dealers charge larger markups to minority borrowers.

To Republicans who have fought to limit the agency created in the aftermath of the financial crisis, the algorithm encapsulates how the CFPB has overstepped its bounds, using a novel statistical method to indirectly regulate a class of businesses — car dealers — outside its jurisdiction.

To the auto lending industry, it’s a tool used to imply that it allows racist practices, a damning claim that lenders think ought to be backed up with more than a math equation.

“You’re using an imperfect tool to result in some pretty serious headlines,” said Scott Pearson, an attorney who represents lending firms. “That’s why they don’t like it. They think it’s unfair.”

‘Unsolvable problem’

Put Elliott in a lineup of bankers and he’d be the obvious outsider — the numbers man and policy wonk with the rumpled shirt and tousled gray hair.

Yet he has become a minor figure in modern finance.

Since that first settlement with Ally, reached in late 2013, the CFPB has employed Elliott’s system to reach multimillion-dollar settlements with other big auto lenders, most recently a $22-million deal with Toyota Motor Credit announced in February.

Soon after the CFPB said it was using the algorithm, lenders and consultants to the finance industry took notice. For instance, Wolters Kluwer Financial Services, a provider of compliance software for lenders, quickly integrated the algorithm into its programs.

“For anyone anticipating the possibility that the CFPB will be doing an examination, we use it,” said Stephen Cross, a senior director at Wolters Kluwer. “The fact we can do that is always a selling point to those clients.”

Among the firms that employ Elliott’s algorithm to look at whether they might be discriminating are banks that issue credit cards and even some online lending companies.

“Big players are absolutely spending time trying to make sure they aren’t going to be found to have violated fair-lending laws,” said Pearson, a partner in the Los Angeles office of law firm Ballard Spahr.

Elliott’s algorithm is what’s called a proxy method — a way to figure out something unknown by looking at things that are known.

A health insurer, for instance, might want to know if its black patients get the same treatment as white patients, but the insurer might not ask its members to identify themselves by race when signing up for a policy.

That very question is what led Rand researchers, 16 years ago, to start developing the system that Elliott would later help refine and complete.

Called Bayesian Improved Surname Geocoding, or BISG, Elliott’s system is built on top of two sets of census data: information about the ethnic makeup of individual neighborhoods and a list of last names broken down by how common they are among people of six racial categories.

The algorithm combines the data sets to give the probability that someone falls into one of six categories: Asian, Hispanic, black, white, multiracial or American Indian/Alaska Native.

It’s complicated but intuitive. If your last name is Rodriguez and you live in a mostly Hispanic neighborhood, there’s an awfully good chance you’re Hispanic. If your name is Smith and you live in a mostly white neighborhood, there’s a good chance you’re white.
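The combination can be sketched in a few lines of Python. This is a minimal illustration, not Rand's or the CFPB's code: it assumes a standard Bayes' rule update in which surname and neighborhood are treated as independent given race, the function name `bisg_posterior` is hypothetical, and every number is made up rather than taken from census data. It also uses four categories instead of six to keep the example short.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and neighborhood evidence with Bayes' rule.

    p_race_given_surname: share of people with this surname in each category.
    p_geo_given_race: share of each category's population living in this
        neighborhood (the likelihood of the address given race).
    Assumes surname and neighborhood are independent given race.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has a nonzero probability")
    # Rescale so the probabilities across categories sum to 1.
    return {race: value / total for race, value in unnormalized.items()}


# Made-up inputs for illustration only (not census figures):
# a surname shared mostly by white and black people...
surname_probs = {"white": 0.70, "black": 0.25, "hispanic": 0.03, "asian": 0.02}
# ...and a neighborhood where black residents are relatively concentrated.
geo_likelihood = {"white": 0.0001, "black": 0.0010, "hispanic": 0.0002, "asian": 0.0001}

posterior = bisg_posterior(surname_probs, geo_likelihood)
for race, prob in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{race}: {prob:.1%}")
```

With these invented inputs, the surname alone points toward white, but the address shifts most of the probability to black, which is the kind of correction the combined method is meant to provide.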

BISG combines two older, less accurate methods of guessing race: geocoding, which looks only at where someone lives, and surname analysis, which looks only at last names. Both systems have weaknesses that Elliott’s combined method sought to address.

Surname analysis works well for Asians and Hispanics, who have more distinctive last names, but it doesn’t work so well for blacks and whites, who share many last names.

For geocoding, the opposite is true: It does a better job of distinguishing between blacks and whites, who are more likely to live in heavily black or white neighborhoods, than of picking out Asians and Hispanics.

Rand researcher Allen Fremont had started using geocoding to look for racial disparities in healthcare starting in 2000, and by 2004 was looking for a more accurate method of estimating patients’ race. A chance encounter with Rand demographer Peter Morrison over lunch led him to the idea of combining surname analysis with geocoding.

Fremont and Morrison created their own system, but they needed a hardcore numbers guy who could refine what they were attempting.

In 2005, they called in Elliott, one of a team of statisticians at Rand and something of a star within the organization.

Elliott devised a system and kept refining it until, in 2009, he, Fremont, Morrison and other Rand researchers published a paper laying out Bayesian Improved Surname Geocoding.

“This is the way it goes with Marc and so many people here,” Fremont said. “You pose a perplexing, somewhat unsolvable problem, and they come up with a solution statistically.”

‘Going around a corner’

Elliott seems to relish that type of complex problem, seeing such problems not as obstacles but as opportunities to learn something new. That’s a constant aim for Elliott, a polymath with wide-ranging interests at work and at home.

During his two decades at Rand, he has worked on projects in fields as varied as military labor economics, social psychology and childhood obesity.

At home, he cooks, using recipes only as suggestions. He reads anything in sight. He taught himself to play the piano and sings Beatles tunes with his teenage son and daughter. He rarely sits still.

“He’s got a lot going on. That’s the way he keeps himself busy and intellectually stimulated,” said his wife, Megan. “We do not do cruises.”

As often as he can, Elliott hikes, whether in the Sierras — not far from his home in Sacramento — or wherever his travels take him.

“There’s a pleasure in not only being in a beautiful place but also going around a corner and not really knowing what you’re going to see,” he said.

That curiosity is what led Elliott decades ago to abandon graduate studies in psychology and instead pursue a master’s degree — and later a PhD — in statistics. He’d always been good at math, and even enjoyed it, but statistics, he said, isn’t just about numbers.

“There’s inherently some creativity involved,” he said. “The challenge is to take a complex problem in the real world and figure out the parts you can translate into the realm of numbers.”

Estimating race and measuring discrimination are just that type of complex problem, and Elliott said lenders are not the first group to balk at his algorithm.

Winston Wong, a doctor and executive at Kaiser Permanente who oversees projects aimed at addressing treatment disparities, said the healthcare provider uses BISG regularly, but was initially skeptical.

“People asked, ‘How trustworthy is the data?’ ” he recalled. “Are we going to draw conclusions from a model that uses mathematical algorithms to direct where our attention is going to be?”

But skeptics, Wong said, were won over once Kaiser’s own studies showed that Elliott’s system was reliably predictive.

Still, even for people who think they might have been overcharged for their car loans, the idea of predicting race based on last name and address seems odd. In cases against auto lenders, the bureau has used BISG to determine which customers should be compensated by lenders.

Joyce Jefferson, a Compton resident who read about February’s Toyota settlement, thought she might be a victim of discrimination but said she was nevertheless uncomfortable with the process.

“It is very weird. How are you going to know these people were overcharged?” she said.

Jefferson won’t receive a settlement — though she bought a Toyota, her car loan came from another lender — but her case is still instructive.

Given her last name and Compton address, the BISG system estimates that there’s a 97% chance she’s black, which she is. But she hasn’t always lived in Compton. Using two previous addresses, BISG makes a still accurate but much less certain guess.

If Jefferson still lived on Colorado Boulevard in Eagle Rock, BISG would give her a 69% chance of being black and a 20% chance of being white. At a previous address in the high desert city of Apple Valley, BISG would guess there’s a 63% chance that she’s black — and a 27% chance that she’s white.

Though the BISG estimate in all three instances indicates that Jefferson is most likely black, she worries that the system will miss others.

“I think they’re going to lose a lot of people that were overcharged,” Jefferson said.

That’s one of the same arguments raised by Republican members of the House Financial Services Committee. A January report written by GOP committee staff argued that using BISG to determine who should be compensated could result in money intended for minority borrowers ending up in the hands of white borrowers.

The Wall Street Journal found at least one such case, reporting last year that a white man in Alabama received a letter indicating he would soon receive a settlement check from Ally Financial. Elliott himself cautions that BISG was designed to look at large groups of people, not to guess the race of individuals.

“If you want to know the difference in the percent of people with diabetes among people who are black and people who are white, you can answer that question much more accurately than asking, ‘Is this particular person black or white?’” he said. “That’s an inherently harder question.”
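One common way to put that distinction into practice is to skip the step of labeling any individual and instead weight each person's outcome by his or her estimated probabilities. The sketch below assumes that approach. It is not a description of the CFPB's or Rand's actual procedure, the helper `weighted_group_mean` is hypothetical, and the markups and probabilities are invented.

```python
def weighted_group_mean(values, probs):
    """Average `values`, weighting each person by the estimated
    probability that he or she belongs to the group of interest."""
    total_weight = sum(probs)
    if total_weight == 0:
        raise ValueError("group has zero total probability")
    return sum(v * p for v, p in zip(values, probs)) / total_weight


# Invented data: dealer markups (percentage points) for four borrowers
# and each borrower's estimated probability of being black.
markups = [1.0, 2.0, 1.5, 2.5]
p_black = [0.1, 0.9, 0.2, 0.8]
# Simplification for the example: only two categories, so the
# probability of being white is the remainder.
p_white = [1.0 - p for p in p_black]

mean_black = weighted_group_mean(markups, p_black)
mean_white = weighted_group_mean(markups, p_white)
gap = mean_black - mean_white
print(f"estimated markup gap: {gap:.2f} percentage points")
```

No borrower is ever assigned a race here. Each contributes to both group averages in proportion to the estimate, so an uncertain guess about one person counts for less than a confident one.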

CFPB spokesman Sam Gilford said that while the agency does use BISG to determine settlements, it also asks consumers to state their ethnicity.

Congressional Republicans have other complaints with the application of Elliott’s system.

In a statement last year, Rep. Jeb Hensarling (R-Texas), chairman of the House Financial Services Committee, said the bureau is using a flawed analysis to overstep its authority and extract huge settlements from car lenders.

“It is irresponsibly branding companies with the stigma of racial discrimination based on nothing more than junk science,” Hensarling said. “Why? To cudgel those companies into enormous monetary settlements without ever having to go to court.”

Part of Hensarling’s complaint goes to a larger issue: Republicans’ opposition to discrimination claims based on what’s known as disparate impact — the notion that policies that appear to be colorblind can be discriminatory if they harm minority groups.

The 2016 Republican Party platform goes so far as to call for ending the use of disparate impact claims when enforcing federal lending laws. Hensarling last year said the CFPB’s use of disparate impact amounted to “inventing discrimination.”

Gilford, though, said that the agency is authorized to look at disparate impacts under federal fair lending laws and that other regulators have done so for decades.

Congressional Republicans also argue that the CFPB doesn’t look at any other factors beyond a borrower’s race and the rate they paid — not at income or credit score or whether borrowers shopped around before going to a particular dealership.

An industry-sponsored report from consulting firm Charles River Associates said those factors ought to be taken into account because doing so would dramatically reduce the differences between white and minority borrowers.

The CFPB, though, has said those factors should figure into the initial interest rate borrowers are charged, not the additional markup tacked on by dealerships.

‘Dig deeper’

On most of these questions, Elliott studiously avoids taking a position. He developed an algorithm, not a regulatory action plan, and his expertise is in healthcare, not finance.

He had no idea how the CFPB would use his work, though he doesn’t seem bothered by the resulting uproar.

Rand researchers, Elliott said, are no strangers to their work finding its way into high-profile, often-controversial decisions — for instance, whether to end the military’s ban on openly gay troops, a subject Rand was asked to weigh in on in the early 1990s.

“If you’re going to do policy research, things that matter are inevitably in the political domain,” he said. “If you let that get to you too much, you can’t keep doing this.”

Still, in his own work, Elliott has used his algorithm to ask questions more nuanced than the ones the CFPB is looking into.

On one recent project, he and his team looked at flu vaccination rates, studying who gets them, who doesn’t and why. That involves looking at vaccination rates by race, as well as lots of other factors, such as how often patients go to the doctor or whether they believe themselves to be in excellent or poor health.

That’s because, Elliott said, simply finding that blacks or Latinos are less likely to get vaccinated than whites doesn’t present an obvious way forward. It’s a blunt tool.

“It tells you there’s a difference, but it doesn’t tell you what’s behind it,” he said. “You want to dig deeper and figure out why the differences you’re seeing exist, and you want to develop a plan to improve them.”

james.koren@latimes.com
