Q&A: Behind the scenes of The Times’ coronavirus tracking effort
As the novel coronavirus began hitting the United States — and with it, Southern California — the L.A. Times data and graphics team knew it needed to figure out how to visually represent the spread and scope of the outbreak.
The team of about 20 reporters, computer programmers and designers wanted to make sure that readers could contextualize the flood of information coming at them.
And for several weeks now they have been building and maintaining a page that collects data from more than 60 public health department websites and produces a comprehensive count of cases and deaths that is ahead of state tallies.
Readers have responded with an outpouring of interest. Even Barbara Ferrer, director of the Los Angeles County Department of Public Health, called the tracker impressive.
We asked four journalists — Ben Welsh, Sean Greene, Swetha Kannan and Casey Miller — behind the team effort about the challenges they faced and how their project works.
What makes this project stand out?
The tracker has a detailed map of Southern California in addition to county-by-county details on the virus’ spread, the latest testing efforts and closures for all of California’s counties.
The state reports and updates the number of confirmed cases most days, but the information can be days behind. Instead, journalists are following cases in each of California’s counties and assembling a data set that can’t be found anywhere else.
“Our tracker has been responding to what readers want to know,” Greene said. “We’ve gotten a lot of good feedback, and we try to add features and make this a page that responds to what people want to know. I like to think that we’re listening, and we want to hear what people have to say about it.”
Five times a day, journalists comb through websites to pull the latest numbers for updates. The team is also tracking demographic information about the victims of COVID-19 when available.
“Rightfully so, ours is much more focused on California,” said Casey Miller, referring to other organizations’ more national and global trackers. “That’s who our readers are.”
What problems is this tracker trying to solve?
In a time when information on the novel coronavirus is everywhere, Greene said that the data and graphics team is trying to give readers important context.
“There’s numbers flying at people all over the place,” Greene said. “What we’ve tried to do is take a careful look at them and organize them in a way they can find the right answers for themselves.”
Greene likes to think readers are finding comfort in the page.
Coronavirus is “all they hear about, all they read about, but they come here and they find a kind of odd peace with the data, and I find that kind of interesting,” he said. “There’s sort of comfort in knowledge.”
Welsh said he is inspired and heartened by the dozens and dozens of emails he receives about the project daily.
“There’s no one I’d rather work for than our subscribers,” he said.
What challenges did you face?
When the project launched, the daily tally of positive cases and victims was sourced from the California Department of Public Health website. The team quickly found that the state’s daily totals lagged behind reality. Instead, the team began collecting the data itself by keeping tabs on each county’s public health website and compiling a spreadsheet with the numbers.
“Those agencies are routinely publishing totals on their websites that are days ahead of what the state puts out,” Welsh said.
Since each of California’s counties collects data differently, figuring out how to count cases was time-consuming. Some jurisdictions counted nonresidents, such as people on cruise ships, in their tallies. Eventually, the decision was made to focus on residents of California.
Kannan said she didn’t realize how scattered the data really could be.
“It’s a huge challenge to keep collecting this data,” Kannan said.
How was the page put together?
“Like any good data journalism project, this one started with a spreadsheet,” Welsh said.
The team made a list of all the local agencies it needed to check for the latest numbers and bookmarked each website. Five times a day, a journalist will go through each website and log the new totals into a spreadsheet. There’s also a separate sheet where deaths are counted.
“That spreadsheet is essentially a database we’re building ourselves row by row and bookmark by bookmark,” Welsh said.
Once the data collection is complete, a Python script in a Jupyter Notebook downloads the information. The code transforms and prepares the data to be published.
“It’s using all the fundamentals that we bring to every project,” Welsh said.
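For readers curious what that transform step might look like, here is a minimal sketch, not The Times' actual code. It assumes the spreadsheet is exported as CSV with hypothetical columns `county`, `date` and `confirmed_cases`, and that each row logs a cumulative total. The sample rows and every number in them are invented for illustration.

```python
import csv
import io

# Hypothetical CSV export of the hand-built spreadsheet.
# Column names and all figures are invented for illustration only.
SAMPLE_CSV = """county,date,confirmed_cases
Los Angeles,2020-03-24,662
Los Angeles,2020-03-25,799
Orange,2020-03-24,152
Orange,2020-03-25,187
"""


def latest_totals(csv_text):
    """Return each county's most recently logged cumulative case total."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        county = row["county"]
        # ISO-formatted dates (YYYY-MM-DD) compare correctly as strings.
        if county not in latest or row["date"] >= latest[county][0]:
            latest[county] = (row["date"], int(row["confirmed_cases"]))
    return {county: cases for county, (_, cases) in latest.items()}


def statewide_total(totals):
    """Sum the per-county totals into a single statewide count."""
    return sum(totals.values())


totals = latest_totals(SAMPLE_CSV)
print(totals)
print(statewide_total(totals))
```

Keeping only the newest row per county means a missed update for one county leaves its last known total in place rather than dropping it from the statewide sum.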
How can other newsrooms learn from what The Times did?
“This is just a really good example of how coders, reporters and designers can come together and create something that is really useful and powerful for the public,” Greene said. “I think anyone could have gotten this data. It’s a matter of dedicating the time and staff power to do it and to keep up with it.”
Miller said the project has truly been a team effort.
“The whole team has really pitched in on this, and it’s all the better for it,” she said. “It’s not just three people who are stuck doing this. It’s kind of everybody doing this together.”