Readers have submitted hundreds of questions about The Times’ coronavirus tracker, which compiles the latest data on the spread of COVID-19 in California. Here are answers to the most common queries.
- Where does your data come from?
- Why does your count vary from what’s published by the state?
- Why does the number of new cases and deaths dip every weekend?
- Why don't you publish a statewide total of recoveries?
- Why do your numbers sometimes vary from other websites?
- Why do your totals differ from those announced by Los Angeles County?
- Why can't I find a nursing home I'm looking for?
- How is the doubling time calculated?
- Can I have access to your underlying data?
- Who can I contact if I find an error?
Where does your data come from?
Figures for cases and deaths are first identified by 61 county and city health departments across the state. These agencies independently compile lists of COVID patients in their area and post them on the web.
Reporters in The Times’ Data and Graphics Department visit each agency website, logging the latest totals for cases, deaths and recoveries in a Times database. A computer program developed by The Times draws from that database and updates the tracker’s totals, charts and maps.
The data collection effort is conducted in partnership with journalists at the San Francisco Chronicle, the San Diego Union-Tribune, KQED, KPCC, CapRadio and CalMatters. Also joining is Big Local News, a project of the Stanford Journalism and Democracy Initiative, which will help Stanford students and journalists at smaller outlets access the data.
Data on testing, hospitalizations, prisons and skilled-nursing facilities are drawn from a variety of government sources and consolidated into The Times’ system.
Why does your count vary from what’s published by the state?
The figures posted online by local health agencies — and gathered soon afterward by The Times — are then entered into a state database that compiles the grand totals announced by the governor and the state’s Department of Public Health. There are often delays in the process.
Because The Times collects data from the local agencies directly, its figures are typically days ahead of state tallies. Officials acknowledge this lag and do not dispute The Times’ method.
Why does the number of new cases and deaths dip every weekend?
Due to bottlenecks at testing labs and other delays in government bureaucracy, the totals posted by local agencies do not reflect the precise date when patients test positive or die. That is especially true on weekends and holidays, when the government’s compilation of new data slows due to reductions in staffing.
That leads to a predictable drop in the number of new cases and deaths announced on weekends and holidays. It also results in a predictable surge in new cases and deaths at the start of each work week.
For this reason, The Times looks to seven-day averages of new cases and deaths instead of single-day totals to evaluate the latest trends. Readers are advised to do the same.
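To illustrate why a seven-day average smooths out the weekend dip and the Monday surge, here is a minimal Python sketch. The daily counts are invented for the example and the function name `seven_day_average` is hypothetical; this is not The Times’ actual code.

```python
def seven_day_average(daily_counts):
    """Return the trailing seven-day mean for each day that has a full week of data."""
    window = 7
    averages = []
    for end in range(window, len(daily_counts) + 1):
        week = daily_counts[end - window:end]
        averages.append(sum(week) / window)
    return averages


# Hypothetical new-case counts for two weeks, Monday through Sunday.
# Weekend reports are low and Monday reports are high, as described above.
daily_new_cases = [
    140, 100, 100, 100, 100, 40, 20,
    140, 100, 100, 100, 100, 40, 20,
]

averages = seven_day_average(daily_new_cases)
print(averages)
```

In this made-up series the single-day counts swing between 20 and 140, but every seven-day window contains exactly one of each weekday, so the average stays flat at 600 / 7, or roughly 86 cases a day.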
Why don't you publish a statewide total of recoveries?
While most local health agencies publish a tally of how many patients have recovered from the virus, some of the most populous — like Los Angeles County — do not provide this information. The result is that there is no way to compile a reliable statewide figure.
The Times gathers recovered patient totals where available. Those figures are published on the more detailed pages for each of the state’s 58 counties.
Why do your numbers sometimes vary from other websites?
There are several reasons the charts published by The Times can differ from those found elsewhere.
The Times’ count relies on the latest reports from the 61 local health agencies across the state, which are the most up-to-date figures available. Some agencies publish more precise charts that reflect the date when each patient tested positive, rather than the day cases were first announced.
That method offers a more accurate reflection of the spread of the virus, but it can take days or weeks to gather the necessary data. Not all agencies publish it, making it impossible to compile statewide. Due to the delays and the lack of universal availability, The Times chooses to plot tallies using the date that cases and deaths are reported.
Additionally, some local agencies revise past tallies after errors are identified and corrected. In cases where these updates are announced and posted online, The Times seeks to reconcile its count to match.
Finally, some local agencies choose to exclude cases from jails and prisons. The Times includes them.
Why do your totals differ from those announced by Los Angeles County?
Cases and deaths in Los Angeles County are initially compiled by three local agencies: the L.A. County Department of Public Health, the Long Beach Department of Health and Human Services and the Pasadena Public Health Department.
The county department compiles figures from all three agencies and releases the combined total each day. The data it includes from Long Beach and Pasadena are often a day behind. The Times gathers more up-to-date totals from the websites of the two city agencies instead.
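The arithmetic behind the difference can be illustrated with a short Python sketch. All figures are invented, and treating each total as a simple sum of three agency counts is an assumption made for illustration; it does not describe either organization’s actual process.

```python
# Hypothetical cumulative case counts; none of these are real figures.
county_health_dept_cases = 10_000  # areas served directly by the county department
long_beach_yesterday = 480         # city figure as of the previous day
long_beach_today = 500             # city figure posted today
pasadena_yesterday = 190
pasadena_today = 200

# County announcement: the city figures it includes may lag by a day.
county_announced_total = (
    county_health_dept_cases + long_beach_yesterday + pasadena_yesterday
)

# Times-style total: the latest figures taken from each city's own website.
times_total = county_health_dept_cases + long_beach_today + pasadena_today

difference = times_total - county_announced_total
print(county_announced_total, times_total, difference)
```

With these made-up numbers, the two totals differ by the 30 cases the cities reported since the previous day.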
Why can't I find a nursing home I'm looking for?
Case and death counts for nursing homes and assisted-living facilities are drawn from two state sources: the California Department of Public Health, which oversees skilled-nursing facilities, and the California Department of Social Services, which oversees residential-care facilities for the elderly and other adult residential facilities.
The agencies post new totals most days. Times reporters collect those totals and enter them into a combined database. To appear on our tracker, a facility must be licensed by one of the two agencies and appear on either list.
How is the doubling time calculated?
The growth in coronavirus cases has followed a similar pattern around the world. Left unchecked, the virus spreads in a predictable way, with each day’s count increasing by a reliable percentage over the previous day. In mathematics, this pattern is known as exponential growth.
One way to measure the speed of the spread is by calculating the amount of time it would take for an area’s total cases to double. For instance, if, over a series of days, an area reports case totals of 100, followed by 130, followed by 170, followed by 220, the case count is growing by about 30% per day. Looked at another way, the total number of cases doubled in less than three days.
To gauge recent trends, The Times applies this process to the last seven days of data for the state and each of its 58 counties. Statistical software fits those figures to an exponential curve to estimate the current rate of growth. With some algebra, that percentage growth can be translated into a doubling time. In the previous example, a 30% growth rate is equivalent to a doubling time of about 2.6 days.
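The algebra can be sketched in Python. If cases grow by a fraction r each day, the doubling time in days is log(2) / log(1 + r). The code below estimates r by fitting a straight line to the logarithm of the case totals with least squares, which is one common way to fit an exponential curve. The article does not say which software or fitting method The Times uses, so that choice is an assumption, and the function names are hypothetical.

```python
import math


def daily_growth_rate(totals):
    """Estimate daily growth by least-squares fitting log(total) against the day index."""
    n = len(totals)
    days = range(n)
    logs = [math.log(t) for t in totals]
    mean_day = sum(days) / n
    mean_log = sum(logs) / n
    # Slope of the best-fit line through (day, log(total)).
    slope = (
        sum((d - mean_day) * (y - mean_log) for d, y in zip(days, logs))
        / sum((d - mean_day) ** 2 for d in days)
    )
    # A slope of s in log space means totals multiply by exp(s) each day.
    return math.exp(slope) - 1


def doubling_time(growth_rate):
    """Days needed for the total to double at a constant daily growth rate."""
    return math.log(2) / math.log(1 + growth_rate)


# The example totals from the text: 100, 130, 170, 220.
example_totals = [100, 130, 170, 220]
rate = daily_growth_rate(example_totals)
print(round(rate, 2))
print(round(doubling_time(0.30), 1))
```

Run on the four totals from the example, the fit gives a growth rate of roughly 30% a day, and a 30% rate converts to a doubling time of about 2.6 days, matching the figures in the text.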
Can I have access to your underlying data?
In an effort to aid scientists and researchers in the fight against COVID-19, The Times has released its database of California coronavirus cases to the public.
The database is available on GitHub, a popular website for hosting data and computer code. The files are updated daily at github.com/datadesk/california-coronavirus-data.
Who can I contact if I find an error?
Compiling The Times database requires hours of data entry each day. Mistakes happen. If you see information that you believe is incorrect or out of date, please contact Data and Graphics Editor Ben Welsh at firstname.lastname@example.org.