Monday, April 6, 2020

All Models are Wrong But Some Models are Useful

With the title quotation from George Box ringing in our heads, a round up of useful articles:

Da Mail: Coronavirus peak death rate will strike U.S. in 11 days when 2,644 people will die in 24 hours as shocking graphs reveal grim state-by-state breakdown of when hospitals will be overwhelmed and how many will die

Many more graphs at the link. But don't just see the central trend. See the ranges too. From the Peacock, Bill Gates calls coronavirus pandemic a ‘nightmare scenario,’ but predicts lower death toll than Trump. Well, my cynical political guess is that Fauci, Birx and Trump are all overstating the predicted deaths so they can claim victory when the numbers come in lower. Based on that alone, I'm going to predict 50k. If it comes in lower than 50k, it probably wasn't worth killing the economy over.

Stacy McCain, Coronavirus ‘Myths’ and Real Numbers
There has been a lot of noise in the media about Republican governors in some states being reluctant to impose statewide stay-at-home orders, and people in rural America failing to follow “social distancing” guidelines. Supposedly, this is a result of their believing “myths” that they are somehow safe from coronavirus, but because I’m not a mindreader, I can’t presume to know what people believe. Certainly there have been outbreaks in rural communities, as in the case of Dougherty County, Georgia, where two large funerals Feb. 29 and March 7 acted as “super-spreading” events, infecting dozens of people, 90% of whom were black.
. . .
It must be understood that risk is a matter of statistical probability. Your infection risk is lower in rural Minnesota than it is in Detroit or in New York City, but being “low risk” is not the same thing as being “safe.” In a pandemic, nobody is at zero risk. But even the most drastic governmental restrictions will not lower the risk to zero. Italy has been under a nationwide lockdown order since March 11. Friday, they reported 4,585 new COVID-19 cases — and that’s good news, because the daily number of new cases has declined 30% since March 21, when Italy reported 6,557 new cases. So, after three weeks of lockdown, Italy has already “flattened the curve” (i.e., the number of new cases has already peaked), but they’re still reporting thousands of new cases daily and people continue to die. That’s simply the reality, and nothing we can do in the United States will prevent our outbreak from following the same trajectory. Once we have passed the crisis point — once the outbreaks in New York, in Detroit and other “danger zones” have passed their peak, straining available medical resources — it’s not as if we will then return to a condition of “safety.”
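The 30% decline cited in the passage above can be checked from the two daily counts it quotes. A minimal Python sketch, where `percent_decline` is an illustrative helper name and not something from the quoted article:

```python
def percent_decline(earlier: float, later: float) -> float:
    """Percentage drop from `earlier` to `later` (positive means a decline)."""
    if earlier <= 0:
        raise ValueError("earlier count must be positive")
    return (earlier - later) / earlier * 100.0


# Daily new-case counts for Italy as quoted in the passage above.
march_21_cases = 6557
friday_cases = 4585

decline = percent_decline(march_21_cases, friday_cases)
print(f"Decline in daily new cases: {decline:.1f}%")
```

The result rounds to the roughly 30% stated in the quote. It compares two single days, so it says nothing about day-to-day reporting noise.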

The pandemic will run its course, and a certain number of people will die, because the death toll for May is pretty much already baked into the pie, so to speak. Once you have imposed the most drastic lockdown measures, there is really nothing more you can do, in terms of “flattening the curve,” but at some point the pandemic will peak — reaching the “apex,” as New York Gov. Andrew Cuomo says repeatedly in his daily briefings — and then you should be prepared to begin a return to normal. Italy has already passed its “apex,” but their hospital system is still overwhelmed and they are still recording more than 700 COVID-19 deaths daily.
From Maggie's Farm, Covid-19: What to do when you have no meaningful data yet

Five Thirty Eight, Coronavirus Case Counts Are Meaningless*
If you’re a regular reader of FiveThirtyEight, you’re probably used to looking at data in sports — where basically everything that happens on a basketball court or a baseball diamond is recorded — or in electoral politics, when polls (in theory, anyway) survey a random sample of the population. COVID-19 statistics, especially the number of reported cases, are not at all like that. The data, at best, is highly incomplete, and often the tip of the iceberg for much larger problems. And data on tests and the number of reported cases is highly nonrandom. In many parts of the world today, health authorities are still trying to triage the situation with a limited number of tests available. Their goal in testing is often to allocate scarce medical care to the patients who most need it — rather than to create a comprehensive dataset for epidemiologists and statisticians to study.

But if you’re not accounting for testing patterns, it can throw your conclusions entirely out of whack. You don’t just run the risk of being a little bit wrong: Your analysis could be off by an order of magnitude. Or even worse, you might be led in the opposite direction of what is actually happening. A country where the case count is increasing because it’s doing more testing, for instance, might actually be getting its epidemic under control. Alternatively, in a country where the reported number of new cases is declining, the situation could actually be getting worse, either because its system is too overwhelmed to do adequate testing or because it’s ramping down on testing for PR reasons.

Failure to account for testing strategies can also render comparisons between states and countries meaningless. According to two recent epidemiological studies, which tried to infer the true number of infected people from the reported number of deaths, there is roughly a 20-fold difference in case detection rates between the countries that are doing the best job of it, such as Norway, and the worst job, such as the United Kingdom. (The United States is probably somewhere in the middle of the pack by this standard.) That means, for example, that in one country that reports 1,000 COVID-19 cases, there could actually be 5,000 infected people, and in another country that reports 1,000 cases, there might be 100,000!

There is also a lot of uncertainty about the true numbers of infections within a given country. According to an expert survey published by FiveThirtyEight, the number of detected cases in the United States could underestimate the true number of infected people by anywhere from a multiple of two times to 100 times. The same holds in other countries. A recent paper published by Imperial College London estimated that the true number of people who had been infected with the coronavirus in the U.K. as of March 30 was somewhere between 800,000 and 3.7 million — as compared to a reported case count through that date of just 22,141.
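To make the arithmetic in the FiveThirtyEight excerpt concrete, here is a rough Python sketch. The 20% and 1% detection rates are not stated in the excerpt; they are simply what its 1,000-to-5,000 and 1,000-to-100,000 examples imply. The function names are illustrative only.

```python
def implied_infections(reported_cases: int, detection_rate: float) -> float:
    """True infections implied by a reported count and a case-detection rate."""
    if not 0 < detection_rate <= 1:
        raise ValueError("detection_rate must be in (0, 1]")
    return reported_cases / detection_rate


def undercount_multiple(estimated_infections: float, reported_cases: int) -> float:
    """How many true infections there are per reported case."""
    return estimated_infections / reported_cases


# Detection rates implied by the excerpt's example (assumption: 20% vs 1%,
# which is the 20-fold gap it describes).
best_case = implied_infections(1000, 0.20)
worst_case = implied_infections(1000, 0.01)
print(f"1,000 reported cases could mean {best_case:,.0f} or {worst_case:,.0f} infections")

# Imperial College range for the U.K. versus the reported count, as quoted.
uk_reported = 22141
low_multiple = undercount_multiple(800_000, uk_reported)
high_multiple = undercount_multiple(3_700_000, uk_reported)
print(f"U.K. undercount: roughly {low_multiple:.0f}x to {high_multiple:.0f}x")
```

On the quoted figures, the U.K. range works out to roughly 36 to 167 true infections per reported case, which is why raw case counts are hard to compare across countries.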
Locomotive Breath, Morning Coffee
Almost 287% Wuhan Corona Virus-free. Because without a hard numerator/denominator, it's all just academic chicken scratch.
Analyst Discovers a Major Flaw in IHME Model Used by White House; Actual Numbers Are a Fraction of Expected
"Davis writes that if we’re going to shut down the entire nation’s economy to “flatten the curve” based on the projections of a single model, it shouldn’t be too much to ask that the model approximate reality when it comes to hospitalizations."
Models are notoriously inaccurate. E.g., economic forecasts (unexpectedly!), global warming, etc. Yet, gross over-reactions are often the result. And people frequently prefer tribalism to rational debate about these numbers. I get called everything but a child of God when I simply ask, "How is it every year tens of millions of flu infections, hundreds of thousands of hospitalizations for flu, and tens of thousands of flu deaths, do not overwhelm our medical system, or our economy, but this mysterious Wuhan Corona Virus with far fewer numbers has plunged the entire world into medieval darkness?"

"Shut up!," they yell.
And Anthony Watts from WUWT, #coronavirus How to analyze and not analyze #COVID-19 deaths
Don’t look just at deaths from coronavirus, look at cumulative deaths from comorbidities. Since most people dying from coronavirus also exhibit comorbidities,[1] and it is unclear how deaths are assigned to the former rather than one of the co-morbidities and whether there is a uniform accepted methodology from one doctor to another (or one hospital to another or one country to another) in the assignments, it is not clear how much credence can be given to coronavirus death estimates at this time.

This also means that we shouldn’t attempt cross-country and cross-jurisdictional comparisons because they could mislead. It is best to look at (and compare) aggregate excess deaths from all co-morbidities rather than just one or another co-morbidity. I would suggest looking at excess deaths against an average over the last 5-10 years for both all-cause deaths and deaths from all coronavirus-plus-comorbidities to get an idea about how devastating coronavirus has been versus an average year.
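The excess-deaths comparison suggested above might be sketched like this in Python. Every number below is made up purely for illustration, and the helper name `excess_deaths` is hypothetical; real figures would have to come from official mortality records.

```python
from statistics import mean
from typing import Sequence


def excess_deaths(observed: float, baseline_years: Sequence[float]) -> float:
    """Observed deaths minus the average for the same period in prior years."""
    if not baseline_years:
        raise ValueError("need at least one baseline year")
    return observed - mean(baseline_years)


# HYPOTHETICAL all-cause death counts for the same month in five prior years.
baseline = [10_000, 10_200, 9_900, 10_100, 9_800]
# HYPOTHETICAL count for the same month this year.
observed_this_year = 11_500

excess = excess_deaths(observed_this_year, baseline)
print(f"Excess deaths versus the 5-year average: {excess:,.0f}")
```

Because it compares all-cause totals, this approach does not depend on how any individual death was attributed on the certificate.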

To compare deaths between jurisdictions, don’t look at absolute deaths, look at death rates, based on population sizes. It makes no sense to compare absolute numbers of deaths in Italy, UK, San Marino, and Sweden against those in the U.S.
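A small Python sketch of the per-capita point: the two regions and all their numbers are hypothetical, chosen only to show how the ranking by absolute deaths can flip once population is taken into account.

```python
def deaths_per_100k(deaths: int, population: int) -> float:
    """Deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# HYPOTHETICAL regions: (deaths, population). Not real data.
regions = {
    "Large Region A": (2_000, 10_000_000),
    "Tiny Region B": (50, 30_000),
}

rates = {name: deaths_per_100k(d, p) for name, (d, p) in regions.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 100,000")
```

In this made-up case the large region has forty times the deaths but a much lower death rate, which is the distortion the quoted advice warns about.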

Each area is different. From where I sit — in Northern Virginia — New York is another country. And from upstate New York, New York City is also another country. Risk factors such as population density, use of mass transit, presence of people who have recently travelled elsewhere, norms regarding appropriate social distance, household size, age composition of households, and all the other coronavirus risk factors are likely to be different in each area. One should, therefore, expect each location would have its own curve that would have to be flattened. Some areas may literally be “ahead of the curve” since these areas have had some advance warning before the virus was brought into their communities and may not need to take drastic measures to flatten the curve. Aggregating data across urban and rural areas does not make much sense.
When Mom died at 93 years old, from complications from a fall that caused her leg to become anoxic, the death certificate listed diabetes, COPD and congestive heart failure as contributing to her death. Does that mean they get to count those for government statistics, and use them in political propaganda? My guess is yes.

So bottom line? We still don't know how serious this is, but don't worry, we're about to find out.
