posts: 35911
Data license: CC-BY
id | title | slug | type | status | content | archieml | archieml_update_statistics | published_at | updated_at | gdocSuccessorId | authors | excerpt | created_at_in_wordpress | updated_at_in_wordpress | featured_image | formattingOptions | markdown | wpApiSnapshot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
35911 | How epidemiological models of COVID-19 help us estimate the true number of infections | covid-models | post | publish | <!-- wp:html --> <div class="blog-info"> <p>Our World in Data presents the data and research to make progress against the world’s largest problems.<br>Our main publication on the pandemic is here: <strong><a href="https://ourworldindata.org/coronavirus" target="_blank" rel="noopener noreferrer">Coronavirus Pandemic (COVID-19)</a></strong>.<br><br>We are grateful to the researchers whose work we cover in this post for giving helpful feedback and suggestions. Thank you.</p> </div> <!-- /wp:html --> <!-- wp:owid/last-updated --> <!-- wp:paragraph {"placeholder":"Enter last updated information..."} --> <p>We update the model estimates with the latest available data each week. Last update: <strong>16 January 2022</strong>.</p> <!-- /wp:paragraph --> <!-- /wp:owid/last-updated --> <!-- wp:paragraph --> <p>A key limitation in our understanding of the COVID-19 pandemic is that we do not know the <em>true</em> number of infections. Instead, we only know of infections that have been confirmed by a test – the confirmed cases. But because many infected people never get tested,{ref}Infected people might not get tested for several reasons, such as not having easy access to testing or not even knowing they are infected because they have no symptoms (though they are still able to transmit the virus). Such asymptomatic infections are estimated to be 10–70% of total infections. Source: <a rel="noreferrer noopener" href="https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html" target="_blank">CDC COVID-19 Pandemic Planning Scenarios</a>.{/ref} we know that confirmed cases are only a fraction of true infections. How small a fraction though?</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>To answer this question, several research groups have developed epidemiological models of COVID-19. 
These models use the data we have – confirmed cases and deaths, testing rates, and more – plus a range of assumptions and epidemiological knowledge to estimate true infections and other important metrics.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4></h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The chart here shows the mean estimates of the true number of daily new infections in the United States from four of the most prominent models.{ref}There are many models in use besides these four, including other ones by the research groups we cover here. We chose these four models because they are prominent, have been used by policymakers, and have been updated regularly. We use them more for illustration than completeness.{/ref} For comparison, the number of confirmed cases is also shown.</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li><a href="http://ourworldindata.org/covid-models#imperial-college-london-icl">Imperial College London (ICL)</a></li><li><a href="http://ourworldindata.org/covid-models#institute-for-health-metrics-and-evaluation-ihme">The Institute for Health Metrics and Evaluation (IHME)</a></li><li><a href="http://ourworldindata.org/covid-models#youyang-gu-yyg">Youyang Gu (YYG)</a></li><li><a href="http://ourworldindata.org/covid-models#london-school-of-hygiene-tropical-medicine-lshtm">The London School of Hygiene & Tropical Medicine (LSHTM)</a></li></ul> <!-- /wp:list --> <!-- wp:html --> <iframe src="https://ourworldindata.org/grapher/daily-new-estimated-infections-of-covid-19" loading="lazy" style="width: 100%; height: 1000px; border: 0px none;"></iframe> <!-- /wp:html --> <!-- wp:paragraph --> <p>Two things are clear from this chart: All four models agree that true infections <em>far outnumber</em> confirmed cases. 
But the models disagree on by how much, and on how infections have changed over time.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>When the number of confirmed cases in the US reached a peak in late July 2020, the IHME and LSHTM models estimated that the true number of infections was about twice as high as confirmed cases, the ICL model estimated it was nearly three times as high, and Youyang Gu's model estimated it was more than <em>six times</em> as high. Back in March 2020, the estimated discrepancy between confirmed cases and true infections was many times higher still.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In this post we examine these four models and how they differ by unpacking their essential elements: what they are used for, how they work, the data they are based on, and the assumptions they make.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>We also aim to make the model estimates easily accessible in our interactive charts, allowing you to quickly explore different models of the pandemic for most countries in the world. To do this, simply click "Change country" on each chart.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Three of the four models we look at are “SEIR”{ref}Pronounced by saying each letter, “S-E-I-R.”{/ref} models,{ref}The London School model is not an SEIR model.{/ref} which simulate how individuals in a population move through four states of a COVID-19 infection: being <strong>S</strong>usceptible, <strong>E</strong>xposed, <strong>I</strong>nfectious, and <strong>R</strong>ecovered (or deceased). How individuals move through these states is determined by different model “parameters,” of which there are many. 
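</p> <!-- /wp:paragraph -->
<!-- wp:paragraph --> <p>As a rough illustration of the four states described above, the sketch below steps a deterministic, daily SEIR model forward in time. It is not the code of any of the four models covered here: the function name, the parameter names (<code>beta</code>, <code>sigma</code>, <code>gamma</code>), and all values are hypothetical, chosen only to show how people flow from one state to the next.</p> <!-- /wp:paragraph -->

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days):
    """Discrete-time (daily step) deterministic SEIR sketch.

    beta  : transmission rate per day (hypothetical value)
    sigma : rate of moving from Exposed to Infectious (1 / latent period)
    gamma : rate of moving from Infectious to Recovered (1 / infectious period)
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        # Flows between states during one day.
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_recovered = gamma * i
        # Update each state; the total population stays constant.
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


# Illustrative values only, not estimates for COVID-19.
history = simulate_seir(
    population=1_000_000,
    initial_infectious=10,
    beta=0.4,
    sigma=1 / 5,
    gamma=1 / 7,
    days=120,
)
```

<!-- wp:paragraph --> <p>Real models have many more parameters than this sketch. 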
Two key ones are the effective reproduction number (Rt){ref}Also called "time-varying" reproduction number.{/ref} – how many other people a person with COVID-19 infects at a given time – and the infection fatality rate (IFR) – the percent of people infected with a disease who die from it.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>You can learn more about how SEIR models work by exploring these resources:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li><a href="https://covid19-projections.com/model-details/" target="_blank" rel="noreferrer noopener">Youyang Gu’s Model Details</a> (for a brief read)</li><li><a href="https://youtu.be/Lcx2a1jXISc" target="_blank" rel="noreferrer noopener">COVID Act Now’s COVID Data 101: What is an SEIR model?</a> (for a brief video)</li><li><a href="https://medium.com/data-for-science/epidemic-modeling-102-all-covid-19-models-are-wrong-but-some-are-useful-c81202cc6ee9" target="_blank" rel="noreferrer noopener">Bruno Gonçalves’s Epidemic Modeling 102: All CoVID-19 models are wrong, but some are useful</a> (for a more in-depth read)</li></ul> <!-- /wp:list --> <!-- wp:heading {"level":3} --> <h3>Imperial College London (ICL)</h3> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>Age-structured SEIR model focused on low- and middle-income countries (details as of 23 August 2020)</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>This chart shows the ICL model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p> <!-- /wp:paragraph --> <!-- wp:html --> <iframe src="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-icl-model" loading="lazy" style="width: 100%; height: 1000px; border: 0px none;"></iframe> <!-- /wp:html --> <!-- wp:heading {"level":5} --> <h5><strong>Website</strong></h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p><a rel="noreferrer noopener" href="https://mrc-ide.github.io/global-lmic-reports/" target="_blank">https://mrc-ide.github.io/global-lmic-reports/</a></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Regions covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>164 countries and territories across the world</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Time covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The first date covered is the estimated start of the pandemic for each country. The model makes projections that extend 90 days past the latest date of update.{ref}While projections are an important aspect of what this and some other models are used for, we do not cover them in this article.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Update frequency</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>About 2–3 times per week</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model is a stochastic SEIR variant with multiple infectious states to reflect different COVID-19 severities, such as mild or asymptomatic versus severe.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model used for?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>ICL describes its model as a tool to help countries understand at what stage the country is in its epidemic (e.g., before or after a peak) and how healthcare demand might change in the future under three policy scenarios. 
These scenarios are designed to provide a counterfactual of what could happen if current interventions were maintained, increased, or relaxed and are therefore not intended to forecast future mortality.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>ICL uses the model estimates to write reports for individual low- and middle-income countries (LMICs) that are relatively early in their epidemics; these reports are focused on the next 28 days. The downloadable model estimates additionally include data for some high-income countries later in their epidemics (e.g., the US and EU countries) and projections 90 days into the future.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Based on the model ICL publishes estimates of the following metrics:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Hospital and ICU demand (to-date and projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li></ul> <!-- /wp:list --> <!-- wp:heading {"level":5} --> <h5>What data is the model based on?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model is “fit” to data on confirmed deaths{ref}As reported by the European Centre for Disease Prevention and Control (ECDC).{/ref} by using an estimated IFR to “back-calculate” how many infections would have been likely over the previous weeks to produce that number of deaths. 
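</p> <!-- /wp:paragraph -->
<!-- wp:paragraph --> <p>The idea of back-calculation can be sketched in a few lines. This is a simplified illustration, not ICL's fitting procedure: it assumes a constant IFR and a single fixed delay between infection and death, and the function name, the delay, and the numbers are all hypothetical.</p> <!-- /wp:paragraph -->

```python
def back_calculate_infections(daily_deaths, ifr, delay_days):
    """Estimate the daily infections that would produce the observed deaths.

    Assumption: deaths on day t come from infections on day t - delay_days,
    so infections[t - delay_days] = deaths[t] / ifr.
    The last `delay_days` days cannot be estimated yet and are left as None.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be a fraction between 0 and 1")
    n = len(daily_deaths)
    infections = [None] * n
    for t in range(delay_days, n):
        infections[t - delay_days] = daily_deaths[t] / ifr
    return infections


# Toy inputs: an IFR of 0.5% and an unrealistically short 3-day delay
# so that the example stays small.
deaths = [0, 0, 1, 2, 4, 6, 9]
estimated = back_calculate_infections(deaths, ifr=0.005, delay_days=3)
```

<!-- wp:paragraph --> <p>The ICL model itself is more involved than this sketch. 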
It uses mobility data – from <a rel="noreferrer noopener" href="https://ourworldindata.org/covid-mobility-trends" target="_blank">Google</a> or, if unavailable, inferred from <a rel="noreferrer noopener" href="https://www.acaps.org/covid19-government-measures-dataset" target="_blank">ACAPS government measures data</a> – to modulate the Rt, the key parameter on how transmission is changing.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Additionally, the model uses age- and country-specific data on demographics, patterns of social contact, hospital availability, and the risk of hospitalization and death, though the availability of this data varies by country.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What are key assumptions and potential limitations?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model uses an estimated IFR for each country calculated by applying age-specific IFRs observed in China and Europe (of about 0.6–1%) to that country’s age distribution. In countries like many LMICs with younger populations than in China and Europe, this results in IFR estimates of typically 0.2–0.3% because younger populations have lower associated mortality rates. These lower mortality rates, however, assume access to sufficient healthcare, which might not always be the case in LMICs. Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model assumes that the number of confirmed deaths is equal to the true number of deaths. But <a rel="noreferrer noopener" href="https://ourworldindata.org/excess-mortality-covid" target="_blank">research on excess mortality</a> and known limitations to testing and reporting capacity suggest that confirmed deaths are often fewer than true deaths. 
Where this is the case the model likely underestimates the true health burden.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model assumes that the change in transmission over time is a function of average mobility trends for places like stores and workplaces but not parks and residential areas.{ref}The model assumes that in parks “significant contact events are negligible” and that an “increase in residential movement will not change household contacts.”{/ref} If these assumptions about mobility and transmission do not hold, the model might not accurately track the pandemic.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Like all models, this one makes many assumptions, and we cover only a few key ones here. For a full list see <a rel="noreferrer noopener" href="https://mrc-ide.github.io/global-lmic-reports/parameters.html" target="_blank">the model methods description</a>.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>Institute for Health Metrics and Evaluation (IHME)</h3> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>Hybrid statistical/SEIR model (details as of 23 August 2020)</h5> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>Update: IHME <a rel="noreferrer noopener" href="https://www.healthdata.org/covid/data-downloads" data-type="URL" target="_blank">announced</a> that "after December 16, 2022, IHME will pause its COVID-19 modeling for the foreseeable future."</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>This chart shows the IHME model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p> <!-- /wp:paragraph --> <!-- wp:html --> <iframe src="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-ihme-model" loading="lazy" style="width: 100%; height: 1000px; border: 0px none;"></iframe> <!-- /wp:html --> <!-- wp:heading {"level":5} --> <h5><strong>Website</strong></h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p><a href="https://covid19.healthdata.org/" target="_blank" rel="noreferrer noopener">https://covid19.healthdata.org/</a></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Regions covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>159 countries and territories across the world including subnational data for the US and several other countries</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Time covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The first date covered varies by country<strong>.</strong> The model makes projections that extend approximately 90–120 days past the latest date of update.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Update frequency</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>About once a week (though not all countries are updated each time)</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model is a hybrid with two main components: a statistical “death model” component produces death estimates that are used to fit an SEIR model component.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Note that the model has had two significant updates since its initial publication:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li><a href="http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_050420.pdf" target="_blank" rel="noreferrer noopener">The SEIR component was added on 4 May 2020</a></li><li><a 
href="http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_05.30.2020.pdf" target="_blank" rel="noreferrer noopener">The death model component was updated on 29 May 2020</a></li></ul> <!-- /wp:list --> <!-- wp:heading {"level":5} --> <h5>What is the model used for?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>IHME describes its model as a tool to help government officials understand how different policy decisions could impact the course of the pandemic and to plan for changing healthcare demand.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model makes deaths projections that have been highly publicized and sometimes criticized.{ref}For example: Sharon Begley (2020, 17 Apr.) “<a rel="noreferrer noopener" href="https://www.statnews.com/2020/04/17/influential-covid-19-model-uses-flawed-methods-shouldnt-guide-policies-critics-say/" target="_blank">Influential Covid-19 model uses flawed methods and shouldn’t guide U.S. policies, critics say.</a>” STAT News.{/ref} Though much of the criticism was leveled at a previous version of the model, known as “CurveFit,” that was used before the SEIR component was added on 4 May. 
The projections are currently made under three scenarios.{ref}For more details about the scenarios see the <a rel="noreferrer noopener" href="http://www.healthdata.org/covid/faqs" target="_blank">model FAQs</a>.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Based on the model, IHME publishes estimates of the following metrics:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Hospital, ICU, and ventilator demand (to-date and projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li><li>Testing levels (projected)</li><li>Mobility, as a proxy for social distancing (projected)</li></ul> <!-- /wp:list --> <!-- wp:heading {"level":5} --> <h5>What data is the model based on?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The death model uses data on confirmed cases, confirmed deaths,{ref}Confirmed cases and deaths data as reported by Johns Hopkins University and several official sources.{/ref} and testing.{ref}As reported by the COVID Tracking Project (for the US), official sources (Brazil and Dominican Republic), and Our World in Data (all other countries).{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The SEIR model is fit to the output of the death model by using an estimated IFR to back-calculate the true number of infections.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model uses several other types of data to simulate transmission and disease progression: mobility, social distancing policies, population density, pneumonia seasonality and death rate, air pollution, altitude, smoking rates, and self-reported contacts and mask use. 
Details on the sources of these data can be found on the <a rel="noreferrer noopener" href="http://www.healthdata.org/covid/faqs" target="_blank">model FAQs</a> and <a rel="noreferrer noopener" href="http://www.healthdata.org/covid/updates" target="_blank">estimation updates</a> pages.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What are key assumptions and potential limitations?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model uses an estimated IFR based on data from the Diamond Princess cruise ship and New Zealand. Though IHME does not give numbers for these, the Diamond Princess IFR has been estimated at 0.6% (95% uncertainty interval of 0.2–1.3%).{ref}Russell et al (2020). Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship. Eurosurveillance, 25(12). <a href="https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256" target="_blank" rel="noreferrer noopener">https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256</a>{/ref} Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The death model makes several assumptions about the relationship between confirmed deaths, confirmed cases, and testing levels. For example, that a decreasing <em>case</em> fatality rate (CFR) – the ratio of <em>confirmed</em> deaths to <em>confirmed</em> cases{ref}The CFR is similar to the IFR but uses the <em>confirmed</em> deaths and cases reported by countries. In contrast, the IFR uses true deaths and infections, which are generally not known and have to be estimated.{/ref} – is reflective of increasing testing and a shift toward testing mild or asymptomatic cases. 
But the CFR could also decrease for other reasons, such as improved treatment or a decline in the average age of infected people.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model assumes that the change in transmission over time is a function of several data inputs (listed above), like mobility and population density. If these assumptions do not hold – for example, because the data is less relevant or its relationship with transmission is misspecified – the model might not accurately track the pandemic.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>More details are discussed in the <a href="http://www.healthdata.org/covid/faqs" target="_blank" rel="noreferrer noopener">model FAQs</a> and in different <a href="http://www.healthdata.org/covid/updates" target="_blank" rel="noreferrer noopener">estimation update reports</a>.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>Youyang Gu (YYG)</h3> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>SEIR model with machine learning layer (details as of 23 August 2020)</h5> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>Update: Youyang Gu <a rel="noreferrer noopener" href="https://youyanggu.com/blog/six-months-later" data-type="URL" data-id="https://youyanggu.com/blog/six-months-later" target="_blank">announced</a> that 5 October 2020 is the final model update</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>This chart shows the YYG model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p> <!-- /wp:paragraph --> <!-- wp:html --> <iframe src="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-yyg-model" loading="lazy" style="width: 100%; height: 1000px; border: 0px none;"></iframe> <!-- /wp:html --> <!-- wp:heading {"level":5} --> <h5><strong>Website</strong></h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p><a href="https://covid19-projections.com/" target="_blank" rel="noreferrer noopener">https://covid19-projections.com/</a></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Regions covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>71 countries across the world including subnational data for the US and Canada</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Time covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The first date covered varies by country<strong>.</strong> The model makes projections that extend approximately 90 days past the latest date of update.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Update frequency</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Daily</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model consists of an SEIR base with a machine learning layer on top to search for the parameters that minimize the error between the model estimates and the observed data.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model used for?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Youyang describes his model as making projections of true infections and deaths that optimize for forecast accuracy. 
Though he also stresses that his projections cover a range of possible outcomes, and that projections are not “wrong” if they help shape a different outcome in the future.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Based on the model Youyang publishes estimates of the following metrics:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li><li>Tests per day targets (projected)</li></ul> <!-- /wp:list --> <!-- wp:paragraph --> <p>The model does not focus on projections under different scenarios, but has explored what would have happened if the US had mandated social distancing <a href="https://covid19-projections.com/us-1weekearlier" target="_blank" rel="noreferrer noopener">one week earlier</a> or <a href="https://covid19-projections.com/us-1weeklater" target="_blank" rel="noreferrer noopener">one week later</a>, or <a href="https://covid19-projections.com/us-self-quarantine" target="_blank" rel="noreferrer noopener">if 20% of infected people immediately self-quarantined</a>.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What data is the model based on?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model is fit to data on confirmed deaths{ref}As reported by Johns Hopkins University. The data is smoothed before fitting.{/ref} by using an estimated IFR to back-calculate the true number of infections. Confirmed cases and hospitalization data are sometimes used to help set bounds for the machine learning parameter search.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What are key assumptions and potential limitations?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model uses an estimated IFR for each region based initially on that region’s observed CFR. 
The IFR is then decreased{ref}Except in “later-impacted regions like Latin America, we wait an additional 3 months before beginning to decrease the IFR.”{/ref} linearly over the span of three months until it is 30% of its initial value to reflect the lower average age of infections and improving treatments. Currently, the IFR is estimated to be 0.2–0.4% in most of the US and Europe. Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model assumes there will be unreported deaths for the "first few weeks” of a region’s pandemic, and that this underreporting will decrease until the number of confirmed deaths equals true deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model makes assumptions about how reopening will affect social distancing and ultimately transmission. For example, if reopening causes a resurgence of infections, the model assumes regions will take action to reduce transmission, which is modeled by limiting the Rt. It also assumes a reopening date for regions (especially outside the US and Europe) where the true date is unknown.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model was created and optimized for the US. 
Thus for other countries the model estimates might be less accurate.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>For a full list of assumptions and limitations see <a href="https://covid19-projections.com/about/#assumptions" target="_blank" rel="noreferrer noopener">the model "About" page</a>.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>London School of Hygiene & Tropical Medicine (LSHTM)</h3> <!-- /wp:heading --> <!-- wp:heading {"level":5} --> <h5>Statistical model estimating underreporting of infections (details as of 23 August 2020)</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>This chart shows the LSHTM model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown.</p> <!-- /wp:paragraph --> <!-- wp:html --> <iframe src="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-lshtm-model" loading="lazy" style="width: 100%; height: 1000px; border: 0px none;"></iframe> <!-- /wp:html --> <!-- wp:heading {"level":5} --> <h5><strong>Website</strong></h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p><a rel="noreferrer noopener" href="https://cmmid.github.io/topics/covid19/global_cfr_estimates.html" target="_blank">https://cmmid.github.io/topics/covid19/global_cfr_estimates.html</a></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Regions covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>159 countries and territories across the world (those with at least 10 confirmed deaths out of a total of 210)</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Time covered</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The first date covered varies by country<strong>. 
</strong>The model does not make projections.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>Update frequency</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>About once a week</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model starts with a country’s CFR and adjusts it for the fact that there is a delay of roughly 2–3 weeks between case confirmation and death (or recovery).{ref}The typical CFR calculation divides confirmed deaths by confirmed cases <em>reported on the same day</em>, but those deaths were actually caused by cases confirmed roughly 2–3 weeks before.{/ref} This delay-adjusted CFR is then compared to a baseline, delay-adjusted CFR to estimate the "ascertainment rate" – the proportion of all <em>symptomatic</em> infections that have actually been confirmed.{ref}All but a trivial number of confirmed cases are assumed to be symptomatic.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>This estimated ascertainment rate is then used to adjust the number of confirmed cases{ref}This data is first smoothed.{/ref} to estimate the true number of symptomatic infections. 
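</p> <!-- /wp:paragraph -->
<!-- wp:paragraph --> <p>The two steps so far can be sketched as follows. This is a hedged illustration rather than LSHTM's code: it assumes the "comparison" with the baseline is a simple ratio (baseline CFR divided by the country's delay-adjusted CFR, capped at 100%), it uses the 1.4% baseline CFR that the model assumes, and the function name and input values are hypothetical.</p> <!-- /wp:paragraph -->

```python
def estimate_symptomatic_infections(confirmed_cases, delay_adjusted_cfr,
                                    baseline_cfr=0.014):
    """Scale confirmed cases up by an estimated ascertainment rate.

    Assumption: ascertainment rate = baseline CFR / delay-adjusted CFR,
    capped at 1.0 (it is a proportion of symptomatic infections).
    """
    if delay_adjusted_cfr <= 0:
        raise ValueError("delay_adjusted_cfr must be positive")
    ascertainment = min(1.0, baseline_cfr / delay_adjusted_cfr)
    symptomatic_infections = confirmed_cases / ascertainment
    return ascertainment, symptomatic_infections


# Hypothetical country: 10,000 confirmed cases and a delay-adjusted CFR of 2.8%.
ascertainment, symptomatic = estimate_symptomatic_infections(
    confirmed_cases=10_000,
    delay_adjusted_cfr=0.028,
)
```

<!-- wp:paragraph --> <p>With these hypothetical inputs, half of symptomatic infections would be confirmed, so 10,000 confirmed cases would imply about 20,000 symptomatic infections. 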
To finally estimate <em>total</em> infections, the symptomatic infections estimate is adjusted to include <em>asymptomatic</em> infections, which are estimated to compose between 10–70% (median 50%) of total infections.{ref}In accordance with this methodology and in consultation with the LSHTM researchers, we perform these calculations to produce the estimates of total infections presented here.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What is the model used for?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>LSHTM describes its model as a tool to help understand the level of undetected epidemic progression and to aid response planning, such as when to introduce and relax control measures.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Based on the model LSHTM publishes estimates of the ascertainment rate.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What data is the model based on?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model is based on data on confirmed deaths and confirmed cases.{ref}Both as reported by the ECDC.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":5} --> <h5>What are key assumptions and potential limitations?</h5> <!-- /wp:heading --> <!-- wp:paragraph --> <p>The model assumes a baseline, delay-adjusted CFR of 1.4% and that any difference between that and a country’s delay-adjusted CFR is entirely due to under-ascertainment. But many other factors likely play a role, such as the burden on the healthcare system, COVID-19 risk factors in the population, the ages of those infected, and more.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The assumed baseline CFR is based on data from China and does not account for different age distributions outside China. 
This causes the ascertainment rate to be overestimated in countries with younger populations and underestimated in countries with older populations.{ref}In a secondary analysis the LSHTM researchers do adjust the baseline CFR for different age distributions. But this has its own assumptions and limitations and is thus not clearly a better approach. More details can be found in <a rel="noreferrer noopener" href="https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf" target="_blank">the full report</a>.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The model assumes that the number of confirmed deaths is equal to the true number of deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Reported deaths data is sometimes changed retroactively, which can be challenging for the model and might affect its estimates.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>More assumptions and limitations are discussed in <a rel="noreferrer noopener" href="https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf" target="_blank">the full report</a>.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>How should we think about these models and their estimates?</h3> <!-- /wp:heading --> <!-- wp:paragraph --> <p>All four models we looked at agree that true infections far outnumber confirmed cases, but they disagree by how much. We now have some insight into these differences: The models all differ to some degree in what they are used for, how they work, the data they are based on, and the assumptions they make.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Making these differences transparent helps us understand how we should think about these models and their estimates. 
For example, understanding that some models are used for scenario planning and not forecasting (like ICL’s) while others are optimized for forecast accuracy (like Youyang’s) puts their estimates in context. And the models all make different assumptions that each have limitations; we can decide if those limitations are relevant to a given situation.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In the end, though, we still want to have confidence that models can track the pandemic accurately. We can calibrate our confidence in different models by giving their estimates a reality check.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>One way to do this is to compare model estimates against some observed “ground truth” data. For example, if a model is forecasting the number of deaths four weeks from now, we can wait four weeks and compare the forecast to the deaths that actually occur.{ref}Though we still need to consider that such forecasts might not track what actually occurs if they help shape a different outcome in the future.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Some current efforts to score forecasts for accuracy are by <a rel="noreferrer noopener" href="https://github.com/youyanggu/covid19-forecast-hub-evaluation" target="_blank">Youyang Gu</a>, <a rel="noreferrer noopener" href="http://www.healthdata.org/research-article/predictive-performance-international-covid-19-mortality-forecasting-models" target="_blank">IHME</a>, <a rel="noreferrer noopener" href="https://zoltardata.com/about" target="_blank">The Zoltar Project</a>, and <a href="https://covidcompare.io/" target="_blank" rel="noreferrer noopener">Covid Compare</a>.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>But sometimes the ground truth is not easily observed, as is the case with the true number of infections. 
Here we have to look for <em>converging evidence</em> from other research, such as from seroprevalence studies that test for COVID-19 antibodies in the blood serum to estimate how many people have ever been infected.{ref}The LSHTM researchers, for example, compared their model estimates to seroprevalence estimates and found good agreement. You can read more about this in <a rel="noreferrer noopener" href="https://cmmid.github.io/topics/covid19/Under-Reporting.html" target="_blank">their full report.</a>{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>By gaining a deeper, more nuanced understanding of these models and their strengths and weaknesses, we can use them as valuable tools to help make progress against the pandemic.</p> <!-- /wp:paragraph --> | { "id": "wp-35911", "slug": "covid-models", "content": { "toc": [], "body": [ { "type": "text", "value": [ { "text": "Our World in Data presents the data and research to make progress against the world\u2019s largest problems.", "spanType": "span-simple-text" }, { "spanType": "span-newline" }, { "text": "Our main publication on the pandemic is here: ", "spanType": "span-simple-text" }, { "children": [ { "url": "https://ourworldindata.org/coronavirus", "children": [ { "text": "Coronavirus Pandemic (COVID-19)", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "spanType": "span-bold" }, { "text": ".", "spanType": "span-simple-text" }, { "spanType": "span-newline" }, { "spanType": "span-newline" }, { "text": "We are grateful to the researchers whose work we cover in this post for giving helpful feedback and suggestions. Thank you.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "We update the model estimates with the latest available data each week. 
Last update:\u00a0", "spanType": "span-simple-text" }, { "children": [ { "text": "16 January 2022", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "A key limitation in our understanding of the COVID-19 pandemic is that we do not know the ", "spanType": "span-simple-text" }, { "children": [ { "text": "true", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " number of infections. Instead, we only know of infections that have been confirmed by a test \u2013 the confirmed cases. But because many infected people never get tested,{ref}Infected people might not get tested for several reasons, such as not having easy access to testing or not even knowing they are infected because they have no symptoms (though they are still able to transmit the virus). Such asymptomatic infections are estimated to be 10\u201370% of total infections. Source: ", "spanType": "span-simple-text" }, { "url": "https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html", "children": [ { "text": "CDC COVID-19 Pandemic Planning Scenarios", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".{/ref} we know that confirmed cases are only a fraction of true infections. How small a fraction though?", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "To answer this question, several research groups have developed epidemiological models of COVID-19. 
These models use the data we have \u2013 confirmed cases and deaths, testing rates, and more \u2013 plus a range of assumptions and epidemiological knowledge to estimate true infections and other important metrics.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The chart here shows the mean estimates of the true number of daily new infections in the United States from four of the most prominent models.{ref}There are many models in use besides these four, including other ones by the research groups we cover here. We chose these four models because they are prominent, have been used by policymakers, and have been updated regularly. We use them more for illustration than completeness.{/ref} For comparison, the number of confirmed cases is also shown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "url": "http://ourworldindata.org/covid-models#imperial-college-london-icl", "children": [ { "text": "Imperial College London (ICL)", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "http://ourworldindata.org/covid-models#institute-for-health-metrics-and-evaluation-ihme", "children": [ { "text": "The Institute for Health Metrics and Evaluation (IHME)", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "http://ourworldindata.org/covid-models#youyang-gu-yyg", "children": [ { "text": "Youyang Gu (YYG)", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "http://ourworldindata.org/covid-models#london-school-of-hygiene-tropical-medicine-lshtm", "children": [ { "text": "The London School of Hygiene & Tropical Medicine (LSHTM)", "spanType": "span-simple-text" } ], "spanType": 
"span-link" } ], "parseErrors": [] } ], "parseErrors": [] }, { "url": "https://ourworldindata.org/grapher/daily-new-estimated-infections-of-covid-19", "type": "chart", "parseErrors": [] }, { "type": "text", "value": [ { "text": "Two things are clear from this chart: All four models agree that true infections ", "spanType": "span-simple-text" }, { "children": [ { "text": "far outnumber", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " confirmed cases. But the models disagree by how much, and how infections have changed over time.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "When the number of confirmed cases in the US reached a peak in late July 2020, the IHME and LSHTM models estimated that the true number of infections was about twice as high as confirmed cases, the ICL model estimated it was nearly three times as high, and Youyang Gu's model estimated it was more than ", "spanType": "span-simple-text" }, { "children": [ { "text": "six times", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " as high. Back in March the estimated discrepancy between confirmed cases and true infections was even many times higher.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In this post we examine these four models and how they differ by unpacking their essential elements: what they are used for, how they work, the data they are based on, and the assumptions they make.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "We also aim to make the model estimates easily accessible in our interactive charts, allowing you to quickly explore different models of the pandemic for most countries in the world. 
To do this simply click \"Change country\" on each chart.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Three of the four models we look at are \u201cSEIR\u201d{ref}Pronounced by saying each letter, \u201cS-E-I-R.\u201d{/ref} models,{ref}The London School model is not an SEIR model.{/ref} which simulate how individuals in a population move through four states of a COVID-19 infection: being ", "spanType": "span-simple-text" }, { "children": [ { "text": "S", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": "usceptible, ", "spanType": "span-simple-text" }, { "children": [ { "text": "E", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": "xposed, ", "spanType": "span-simple-text" }, { "children": [ { "text": "I", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": "nfectious, and ", "spanType": "span-simple-text" }, { "children": [ { "text": "R", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": "ecovered (or deceased). How individuals move through these states is determined by different model \u201cparameters,\u201d of which there are many. 
Two key ones are the effective reproduction number (Rt){ref}Also called \"time-varying\" reproduction number.{/ref} \u2013\u00a0how many other people a person with COVID-19 infects at a given time \u2013 and the infection fatality rate (IFR) \u2013 the percent of people infected with a disease who die from it.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "You can learn more about how SEIR models work by exploring these resources:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "url": "https://covid19-projections.com/model-details/", "children": [ { "text": "Youyang Gu\u2019s Model Details", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " (for a brief read)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://youtu.be/Lcx2a1jXISc", "children": [ { "text": "COVID Act Now\u2019s COVID Data 101: What is an SEIR model?", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " (for a brief video)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://medium.com/data-for-science/epidemic-modeling-102-all-covid-19-models-are-wrong-but-some-are-useful-c81202cc6ee9", "children": [ { "text": "Bruno Gon\u00e7alves\u2019s Epidemic Modeling 102: All CoVID-19 models are wrong, but some are useful", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " (for a more in-depth read)", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "text": [ { "text": "Imperial College London (ICL)", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "Age-structured SEIR model focused on low- and middle-income countries (details as of 23 August 2020)", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, 
"parseErrors": [] }, { "type": "text", "value": [ { "text": "This chart shows the ICL model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click \"Change country.\" The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "url": "https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-icl-model", "type": "chart", "parseErrors": [] }, { "text": [ { "children": [ { "text": "Website", "spanType": "span-simple-text" } ], "spanType": "span-bold" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://mrc-ide.github.io/global-lmic-reports/", "children": [ { "text": "https://mrc-ide.github.io/global-lmic-reports/", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "text": [ { "text": "Regions covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "164 countries and territories across the world", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Time covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The first date covered is the estimated start of the pandemic for each country. 
The model makes projections that extend 90 days past the latest date of update.{ref}While projections are an important aspect of what this and some other models are used for, we do not cover them in this article.{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Update frequency", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "About 2\u20133 times per week", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model is a stochastic SEIR variant with multiple infectious states to reflect different COVID-19 severities, such as mild or asymptomatic versus severe.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model used for?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "ICL describes its model as a tool to help countries understand at what stage the country is in its epidemic (e.g., before or after a peak) and how healthcare demand might change in the future under three policy scenarios. These scenarios are designed to provide a counterfactual of what could happen if current interventions were maintained, increased, or relaxed and are therefore not intended to forecast future mortality.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "ICL uses the model estimates to write reports for individual low- and middle-income countries (LMICs) that are relatively early in their epidemics; these reports are focused on the next 28 days. 
The downloadable model estimates additionally include data for some high-income countries later in their epidemics (e.g., the US and EU countries) and projections 90 days into the future.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Based on the model, ICL publishes estimates of the following metrics:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "True infections (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Confirmed deaths (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Hospital and ICU demand (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Effective reproduction number, Rt (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "text": [ { "text": "What data is the model based on?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model is \u201cfit\u201d to data on confirmed deaths{ref}As reported by the European Centre for Disease Prevention and Control (ECDC).{/ref} by using an estimated IFR to \u201cback-calculate\u201d how many infections would have been likely over the previous weeks to produce that number of deaths. 
It uses mobility data \u2013 from ", "spanType": "span-simple-text" }, { "url": "https://ourworldindata.org/covid-mobility-trends", "children": [ { "text": "Google", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " or, if unavailable, inferred from ", "spanType": "span-simple-text" }, { "url": "https://www.acaps.org/covid19-government-measures-dataset", "children": [ { "text": "ACAPS government measures data", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " \u2013 to modulate the Rt, the key parameter describing how transmission is changing.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Additionally, the model uses age- and country-specific data on demographics, patterns of social contact, hospital availability, and the risk of hospitalization and death, though the availability of this data varies by country.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What are key assumptions and potential limitations?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model uses an estimated IFR for each country calculated by applying age-specific IFRs observed in China and Europe (of about 0.6\u20131%) to that country\u2019s age distribution. In countries with younger populations than China and Europe, as is the case for many LMICs, this typically results in IFR estimates of 0.2\u20130.3% because younger populations have lower associated mortality rates. These lower mortality rates, however, assume access to sufficient healthcare, which might not always be the case in LMICs. Differences between the estimated and true IFRs could impact the accuracy of model estimates.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes that the number of confirmed deaths is equal to the true number of deaths. 
But ", "spanType": "span-simple-text" }, { "url": "https://ourworldindata.org/excess-mortality-covid", "children": [ { "text": "research on excess mortality", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " and known limitations to testing and reporting capacity suggest that confirmed deaths are often fewer than true deaths. Where this is the case the model likely underestimates the true health burden.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes that the change in transmission over time is a function of average mobility trends for places like stores and workplaces but not parks and residential areas.{ref}The model assumes that in parks \u201csignificant contact events are negligible\u201d and that an \u201cincrease in residential movement will not change household contacts.\u201d{/ref} If these assumptions about mobility and transmission do not hold, the model might not accurately track the pandemic.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Like all models, this one makes many assumptions, and we cover only a few key ones here. 
For a full list see ", "spanType": "span-simple-text" }, { "url": "https://mrc-ide.github.io/global-lmic-reports/parameters.html", "children": [ { "text": "the model methods description", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Institute for Health Metrics and Evaluation (IHME)", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "Hybrid statistical/SEIR model (details as of 23 August 2020)", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "text": [ { "text": "Update: IHME ", "spanType": "span-simple-text" }, { "url": "https://www.healthdata.org/covid/data-downloads", "children": [ { "text": "announced", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " that \"after December 16, 2022, IHME will pause its COVID-19 modeling for the foreseeable future.\"", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "This chart shows the IHME model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click \"Change country.\" The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "url": "https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-ihme-model", "type": "chart", "parseErrors": [] }, { "text": [ { "children": [ { "text": "Website", "spanType": "span-simple-text" } ], "spanType": "span-bold" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://covid19.healthdata.org/", "children": [ { "text": "https://covid19.healthdata.org/", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "text": [ { "text": "Regions covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "159 countries and territories across the world including subnational data for the US and several other countries", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Time covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The first date covered varies by country", "spanType": "span-simple-text" }, { "children": [ { "text": ".", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": " The model makes projections that extend approximately 90\u2013120 days past the latest date of update.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Update frequency", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "About once a week (though not all countries are updated each time)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model is a hybrid with two main 
components: a statistical \u201cdeath model\u201d component produces death estimates that are used to fit an SEIR model component.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Note that the model has had two significant updates since its initial publication:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "url": "http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_050420.pdf", "children": [ { "text": "The SEIR component was added on 4 May 2020", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "type": "text", "value": [ { "url": "http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_05.30.2020.pdf", "children": [ { "text": "The death model component was updated on 29 May 2020", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] } ], "parseErrors": [] }, { "text": [ { "text": "What is the model used for?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "IHME describes its model as a tool to help government officials understand how different policy decisions could impact the course of the pandemic and to plan for changing healthcare demand.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model makes deaths projections that have been highly publicized and sometimes criticized.{ref}For example: Sharon Begley (2020, 17 Apr.) \u201c", "spanType": "span-simple-text" }, { "url": "https://www.statnews.com/2020/04/17/influential-covid-19-model-uses-flawed-methods-shouldnt-guide-policies-critics-say/", "children": [ { "text": "Influential Covid-19 model uses flawed methods and shouldn\u2019t guide U.S. 
policies, critics say.", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": "\u201d STAT News.{/ref} Much of the criticism, though, was leveled at a previous version of the model, known as \u201cCurveFit,\u201d that was used before the SEIR component was added on 4 May 2020. The projections are currently made under three scenarios.{ref}For more details about the scenarios, see the ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/covid/faqs", "children": [ { "text": "model FAQs", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Based on the model, IHME publishes estimates of the following metrics:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "True infections (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Confirmed deaths (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Hospital, ICU, and ventilator demand (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Effective reproduction number, Rt (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Testing levels (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Mobility, as a proxy for social distancing (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "text": [ { "text": "What data is the model based on?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The death model uses data on confirmed cases, confirmed 
deaths,{ref}Confirmed cases and deaths data as reported by Johns Hopkins University and several official sources.{/ref} and testing.{ref}As reported by the COVID Tracking Project (for US), official sources (Brazil and Dominican Republic), and Our World in Data (all other countries).{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The SEIR model is fit to the output of the death model by using an estimated IFR to back-calculate the true number of infections.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model uses several other types of data to simulate transmission and disease progression: mobility, social distancing policies, population density, pneumonia seasonality and death rate, air pollution, altitude, smoking rates, and self-reported contacts and mask use. Details on the sources of these data can be found on the ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/covid/faqs", "children": [ { "text": "model FAQs", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " and ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/covid/updates", "children": [ { "text": "estimation updates", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " pages.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What are key assumptions and potential limitations?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model uses an estimated IFR based on data from the Diamond Princess cruise ship and New Zealand. Though IHME does not give numbers for these, the Diamond Princess IFR has been estimated at 0.6% (95% uncertainty interval of 0.2\u20131.3%).{ref}Russell et al (2020). 
Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship. Eurosurveillance, 25(12). ", "spanType": "span-simple-text" }, { "url": "https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256", "children": [ { "text": "https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": "{/ref} Differences between the estimated and true IFRs could impact the accuracy of model estimates.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The death model makes several assumptions about the relationship between confirmed deaths, confirmed cases, and testing levels. For example, it assumes that a decreasing ", "spanType": "span-simple-text" }, { "children": [ { "text": "case", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " fatality rate (CFR) \u2013 the ratio of ", "spanType": "span-simple-text" }, { "children": [ { "text": "confirmed", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " deaths to ", "spanType": "span-simple-text" }, { "children": [ { "text": "confirmed", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " cases{ref}The CFR is similar to the IFR but uses the ", "spanType": "span-simple-text" }, { "children": [ { "text": "confirmed", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " deaths and cases reported by countries. In contrast, the IFR uses true deaths and infections, which are generally not known and have to be estimated.{/ref} \u2013 reflects increasing testing and a shift toward testing mild or asymptomatic cases. 
But the CFR could also decrease for other reasons, such as improved treatment or a decline in the average age of infected people.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes that the change in transmission over time is a function of several data inputs (listed above), like mobility and population density. If these assumptions do not hold \u2013 for example, because the data is less relevant or its relationship with transmission is misspecified \u2013 the model might not accurately track the pandemic.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "More details are discussed in the ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/covid/faqs", "children": [ { "text": "model FAQs", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " and in different ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/covid/updates", "children": [ { "text": "estimation update reports", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Youyang Gu (YYG)", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "SEIR model with machine learning layer (details as of 23 August 2020)", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "text": [ { "text": "Update: Youyang Gu ", "spanType": "span-simple-text" }, { "url": "https://youyanggu.com/blog/six-months-later", "children": [ { "text": "announced", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " that 5 October 2020 is the final model update", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "This chart shows the YYG model\u2019s estimates of the 
true number of daily new infections in the United States. To see the estimates for other countries click \"Change country.\" The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "url": "https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-yyg-model", "type": "chart", "parseErrors": [] }, { "text": [ { "children": [ { "text": "Website", "spanType": "span-simple-text" } ], "spanType": "span-bold" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://covid19-projections.com/", "children": [ { "text": "https://covid19-projections.com/", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "text": [ { "text": "Regions covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "71 countries across the world including subnational data for the US and Canada", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Time covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The first date covered varies by country", "spanType": "span-simple-text" }, { "children": [ { "text": ".", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": " The model makes projections that extend approximately 90 days past the latest date of update.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Update frequency", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Daily", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model?", "spanType": "span-simple-text" } ], 
"type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model consists of an SEIR base with a machine learning layer on top to search for the parameters that minimize the error between the model estimates and the observed data.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model used for?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Youyang describes his model as making projections of true infections and deaths that optimize for forecast accuracy. Though he also stresses that his projections cover a range of possible outcomes, and that projections are not \u201cwrong\u201d if they help shape a different outcome in the future.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Based on the model Youyang publishes estimates of the following metrics:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "True infections (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Confirmed deaths (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Effective reproduction number, Rt (to-date and projected)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Tests per day targets (projected)", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model does not focus on projections under different scenarios, but has explored what would have happened if the US had mandated social distancing ", "spanType": "span-simple-text" }, { "url": "https://covid19-projections.com/us-1weekearlier", "children": [ { "text": "one week earlier", "spanType": 
"span-simple-text" } ], "spanType": "span-link" }, { "text": " or ", "spanType": "span-simple-text" }, { "url": "https://covid19-projections.com/us-1weeklater", "children": [ { "text": "one week later", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", or ", "spanType": "span-simple-text" }, { "url": "https://covid19-projections.com/us-self-quarantine", "children": [ { "text": "if 20% of infected people immediately self-quarantined", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What data is the model based on?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model is fit to data on confirmed deaths{ref}As reported by Johns Hopkins University. The data is smoothed before fitting.{/ref} by using an estimated IFR to back-calculate the true number of infections. Confirmed cases and hospitalization data are sometimes used to help set bounds for the machine learning parameter search.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What are key assumptions and potential limitations?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model uses an estimated IFR for each region based initially on that region\u2019s observed CFR. The IFR is then decreased{ref}Except in \u201clater-impacted regions like Latin America, we wait an additional 3 months before beginning to decrease the IFR.\u201d{/ref} linearly over the span of three months until it is 30% of its initial value to reflect the lower average age of infections and improving treatments. Currently, the IFR is estimated to be 0.2\u20130.4% in most of the US and Europe. 
Differences between the estimated and true IFRs could impact the accuracy of model estimates.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes there will be unreported deaths for the \"first few weeks\u201d of a region\u2019s pandemic, and that this underreporting will decrease until the number of confirmed deaths equals true deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model makes assumptions about how reopening will affect social distancing and ultimately transmission. For example, if reopening causes a resurgence of infections, the model assumes regions will take action to reduce transmission, which is modeled by limiting the Rt. It also assumes a reopening date for regions (especially outside the US and Europe) where the true date is unknown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model was created and optimized for the US. 
Thus for other countries the model estimates might be less accurate.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "For a full list of assumptions and limitations see ", "spanType": "span-simple-text" }, { "url": "https://covid19-projections.com/about/#assumptions", "children": [ { "text": "the model \"About\" page", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "London School of Hygiene & Tropical Medicine (LSHTM)", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "Statistical model estimating underreporting of infections (details as of 23 August 2020)", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "This chart shows the LSHTM model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click \"Change country.\" The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "url": "https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-lshtm-model", "type": "chart", "parseErrors": [] }, { "text": [ { "children": [ { "text": "Website", "spanType": "span-simple-text" } ], "spanType": "span-bold" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "url": "https://cmmid.github.io/topics/covid19/global_cfr_estimates.html", "children": [ { "text": "https://cmmid.github.io/topics/covid19/global_cfr_estimates.html", "spanType": "span-simple-text" } ], "spanType": "span-link" } ], "parseErrors": [] }, { "text": [ { "text": "Regions covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "159 countries and territories across the world (those with at least 10 confirmed deaths out of a total of 210)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Time covered", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The first date covered varies by country", "spanType": "span-simple-text" }, { "children": [ { "text": ". 
", "spanType": "span-simple-text" } ], "spanType": "span-bold" }, { "text": "The model does not make projections.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Update frequency", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "About once a week", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model starts with a country\u2019s CFR and adjusts it for the fact that there is a delay of roughly 2\u20133 weeks between case confirmation and death (or recovery).{ref}The typical CFR calculation divides confirmed deaths by confirmed cases ", "spanType": "span-simple-text" }, { "children": [ { "text": "reported on the same day", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": ", but those deaths were actually caused by cases confirmed roughly 2\u20133 weeks before.{/ref} This delay-adjusted CFR is then compared to a baseline, delay-adjusted CFR to estimate the \"ascertainment rate\" \u2013 the proportion of all ", "spanType": "span-simple-text" }, { "children": [ { "text": "symptomatic", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " infections that have actually been confirmed.{ref}All but a trivial number of confirmed cases are assumed to be symptomatic.{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "This estimated ascertainment rate is then used to adjust the number of confirmed cases{ref}This data is first smoothed.{/ref} to estimate the true number of symptomatic infections. 
To finally estimate ", "spanType": "span-simple-text" }, { "children": [ { "text": "total", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " infections, the symptomatic infections estimate is adjusted to include ", "spanType": "span-simple-text" }, { "children": [ { "text": "asymptomatic", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " infections, which are estimated to compose between 10\u201370% (median 50%) of total infections.{ref}In accordance with this methodology and in consultation with the LSHTM researchers, we perform these calculations to produce the estimates of total infections presented here.{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What is the model used for?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "LSHTM describes its model as a tool to help understand the level of undetected epidemic progression and to aid response planning, such as when to introduce and relax control measures.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Based on the model LSHTM publishes estimates of the ascertainment rate.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What data is the model based on?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model is based on data on confirmed deaths and confirmed cases.{ref}Both as reported by the ECDC.{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "What are key assumptions and potential limitations?", "spanType": "span-simple-text" } ], "type": "heading", "level": 4, "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes a baseline, delay-adjusted CFR of 1.4% and that any difference between that and a 
country\u2019s delay-adjusted CFR is entirely due to under-ascertainment. But many other factors likely play a role, such as the burden on the healthcare system, COVID-19 risk factors in the population, the ages of those infected, and more.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The assumed baseline CFR is based on data from China and does not account for different age distributions outside China. This causes the ascertainment rate to be overestimated in countries with younger populations and underestimated in countries with older populations.{ref}In a secondary analysis the LSHTM researchers do adjust the baseline CFR for different age distributions. But this has its own assumptions and limitations and is thus not clearly a better approach. More details can be found in ", "spanType": "span-simple-text" }, { "url": "https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf", "children": [ { "text": "the full report", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The model assumes that the number of confirmed deaths is equal to the true number of deaths. 
As noted before, this is often not the case, and thus the model might underestimate the true health burden.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Reported deaths data is sometimes changed retroactively, which can be challenging for the model and might affect its estimates.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "More assumptions and limitations are discussed in ", "spanType": "span-simple-text" }, { "url": "https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf", "children": [ { "text": "the full report", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "How should we think about these models and their estimates?", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "type": "text", "value": [ { "text": "All four models we looked at agree that true infections far outnumber confirmed cases, but they disagree by how much. We now have some insight into these differences: The models all differ to some degree in what they are used for, how they work, the data they are based on, and the assumptions they make.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Making these differences transparent helps us understand how we should think about these models and their estimates. For example, understanding that some models are used for scenario planning and not forecasting (like ICL\u2019s) while others are optimized for forecast accuracy (like Youyang\u2019s) puts their estimates in context. 
And the models all make different assumptions that each have limitations; we can decide if those limitations are relevant to a given situation.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In the end, though, we still want to have confidence that models can track the pandemic accurately. We can calibrate our confidence in different models by giving their estimates a reality check.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "One way to do this is to compare model estimates against some observed \u201cground truth\u201d data. For example, if a model is forecasting the number of deaths four weeks from now, we can wait four weeks and compare the forecast to the deaths that actually occur.{ref}Though we still need to consider that such forecasts might not track what actually occurs if they help shape a different outcome in the future.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Some current efforts to score forecasts for accuracy are by ", "spanType": "span-simple-text" }, { "url": "https://github.com/youyanggu/covid19-forecast-hub-evaluation", "children": [ { "text": "Youyang Gu", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", ", "spanType": "span-simple-text" }, { "url": "http://www.healthdata.org/research-article/predictive-performance-international-covid-19-mortality-forecasting-models", "children": [ { "text": "IHME", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", ", "spanType": "span-simple-text" }, { "url": "https://zoltardata.com/about", "children": [ { "text": "The Zoltar Project", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", and ", "spanType": "span-simple-text" }, { "url": "https://covidcompare.io/", "children": [ { "text": "Covid Compare", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { 
"text": ".{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "But sometimes the ground truth is not easily observed, as is the case with the true number of infections. Here we have to look for ", "spanType": "span-simple-text" }, { "children": [ { "text": "converging evidence", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": " from other research, such as from seroprevalence studies that test for COVID-19 antibodies in the blood serum to estimate how many people have ever been infected.{ref}The LSHTM researchers, for example, compared their model estimates to seroprevalence estimates and found good agreement. You can read more about this in ", "spanType": "span-simple-text" }, { "url": "https://cmmid.github.io/topics/covid19/Under-Reporting.html", "children": [ { "text": "their full report.", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": "{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "By gaining a deeper, more nuanced understanding of these models and their strengths and weaknesses, we can use them as valuable tools to help make progress against the pandemic.", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "type": "article", "title": "How epidemiological models of COVID-19 help us estimate the true number of infections", "authors": [ "Charlie Giattino" ], "excerpt": "We know that confirmed COVID-19 cases are only a fraction of true infections. How small a fraction though?", "dateline": "August 24, 2020", "subtitle": "We know that confirmed COVID-19 cases are only a fraction of true infections. 
How small a fraction though?", "sidebar-toc": false, "featured-image": "covid-models.png" }, "createdAt": "2021-07-20T14:57:57.000Z", "published": false, "updatedAt": "2023-03-17T08:37:44.000Z", "revisionId": null, "publishedAt": "2020-08-24T11:00:25.000Z", "relatedCharts": [], "publicationContext": "listed" } |
{ "errors": [ { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag owid/last-updated" }, { "name": "exepcted a single plain text element, got zero", "details": "Found 0 elements after transforming to archieml" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" } ], "numBlocks": 135, "numErrors": 8, "wpTagCounts": { "html": 6, "list": 6, "heading": 44, "paragraph": 79, "owid/last-updated": 1 }, "htmlTagCounts": { "p": 80, "h3": 5, "h4": 1, "h5": 38, "ul": 6, "div": 1, "iframe": 5 } } |
2020-08-24 11:00:25 | 2024-02-16 14:22:50 | 1zqQ2XFnpAruU88eep_E_HKb4VluUdUNnS2Z6r12Fpwc | [ "Charlie Giattino" ] |
We know that confirmed COVID-19 cases are only a fraction of true infections. How small a fraction though? | 2021-07-20 14:57:57 | 2023-03-17 08:37:44 | https://ourworldindata.org/wp-content/uploads/2020/08/covid-models.png | {} |
Our World in Data presents the data and research to make progress against the world’s largest problems. Our main publication on the pandemic is here: **[Coronavirus Pandemic (COVID-19)](https://ourworldindata.org/coronavirus)**. We are grateful to the researchers whose work we cover in this post for giving helpful feedback and suggestions. Thank you. We update the model estimates with the latest available data each week. Last update: **16 January 2022**. A key limitation in our understanding of the COVID-19 pandemic is that we do not know the _true_ number of infections. Instead, we only know of infections that have been confirmed by a test – the confirmed cases. But because many infected people never get tested,{ref}Infected people might not get tested for several reasons, such as not having easy access to testing or not even knowing they are infected because they have no symptoms (though they are still able to transmit the virus). Such asymptomatic infections are estimated to be 10–70% of total infections. Source: [CDC COVID-19 Pandemic Planning Scenarios](https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html).{/ref} we know that confirmed cases are only a fraction of true infections. How small a fraction though? To answer this question, several research groups have developed epidemiological models of COVID-19. These models use the data we have – confirmed cases and deaths, testing rates, and more – plus a range of assumptions and epidemiological knowledge to estimate true infections and other important metrics. ### The chart here shows the mean estimates of the true number of daily new infections in the United States from four of the most prominent models.{ref}There are many models in use besides these four, including other ones by the research groups we cover here. We chose these four models because they are prominent, have been used by policymakers, and have been updated regularly. 
We use them more for illustration than completeness.{/ref} For comparison, the number of confirmed cases is also shown. * [Imperial College London (ICL)](http://ourworldindata.org/covid-models#imperial-college-london-icl) * [The Institute for Health Metrics and Evaluation (IHME)](http://ourworldindata.org/covid-models#institute-for-health-metrics-and-evaluation-ihme) * [Youyang Gu (YYG)](http://ourworldindata.org/covid-models#youyang-gu-yyg) * [The London School of Hygiene & Tropical Medicine (LSHTM)](http://ourworldindata.org/covid-models#london-school-of-hygiene-tropical-medicine-lshtm) <Chart url="https://ourworldindata.org/grapher/daily-new-estimated-infections-of-covid-19"/> Two things are clear from this chart: All four models agree that true infections _far outnumber_ confirmed cases. But the models disagree on by how much, and on how infections have changed over time. When the number of confirmed cases in the US reached a peak in late July 2020, the IHME and LSHTM models estimated that the true number of infections was about twice as high as confirmed cases, the ICL model estimated it was nearly three times as high, and Youyang Gu's model estimated it was more than _six times_ as high. Back in March 2020 the estimated discrepancy between confirmed cases and true infections was many times higher still. In this post we examine these four models and how they differ by unpacking their essential elements: what they are used for, how they work, the data they are based on, and the assumptions they make. We also aim to make the model estimates easily accessible in our interactive charts, allowing you to quickly explore different models of the pandemic for most countries in the world. To do this, simply click "Change country" on each chart. 
Three of the four models we look at are “SEIR”{ref}Pronounced by saying each letter, “S-E-I-R.”{/ref} models,{ref}The London School model is not an SEIR model.{/ref} which simulate how individuals in a population move through four states of a COVID-19 infection: being **S**usceptible, **E**xposed, **I**nfectious, and **R**ecovered (or deceased). How individuals move through these states is determined by different model “parameters,” of which there are many. Two key ones are the effective reproduction number (Rt){ref}Also called "time-varying" reproduction number.{/ref} – how many other people a person with COVID-19 infects at a given time – and the infection fatality rate (IFR) – the percent of people infected with a disease who die from it. You can learn more about how SEIR models work by exploring these resources: * [Youyang Gu’s Model Details](https://covid19-projections.com/model-details/) (for a brief read) * [COVID Act Now’s COVID Data 101: What is an SEIR model?](https://youtu.be/Lcx2a1jXISc) (for a brief video) * [Bruno Gonçalves’s Epidemic Modeling 102: All CoVID-19 models are wrong, but some are useful](https://medium.com/data-for-science/epidemic-modeling-102-all-covid-19-models-are-wrong-but-some-are-useful-c81202cc6ee9) (for a more in-depth read) ## Imperial College London (ICL) #### Age-structured SEIR model focused on low- and middle-income countries (details as of 23 August 2020) This chart shows the ICL model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown. 
<Chart url="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-icl-model"/> #### **Website** [https://mrc-ide.github.io/global-lmic-reports/](https://mrc-ide.github.io/global-lmic-reports/) #### Regions covered 164 countries and territories across the world #### Time covered The first date covered is the estimated start of the pandemic for each country. The model makes projections that extend 90 days past the latest date of update.{ref}While projections are an important aspect of what this and some other models are used for, we do not cover them in this article.{/ref} #### Update frequency About 2–3 times per week #### What is the model? The model is a stochastic SEIR variant with multiple infectious states to reflect different COVID-19 severities, such as mild or asymptomatic versus severe. #### What is the model used for? ICL describes its model as a tool to help countries understand at what stage the country is in its epidemic (e.g., before or after a peak) and how healthcare demand might change in the future under three policy scenarios. These scenarios are designed to provide a counterfactual of what could happen if current interventions were maintained, increased, or relaxed and are therefore not intended to forecast future mortality. ICL uses the model estimates to write reports for individual low- and middle-income countries (LMICs) that are relatively early in their epidemics; these reports are focused on the next 28 days. The downloadable model estimates additionally include data for some high-income countries later in their epidemics (e.g., the US and EU countries) and projections 90 days into the future. Based on the model ICL publishes estimates of the following metrics: * True infections (to-date and projected) * Confirmed deaths (projected) * Hospital and ICU demand (to-date and projected) * Effective reproduction number, Rt (to-date and projected) #### What data is the model based on? 
The model is “fit” to data on confirmed deaths{ref}As reported by the European Centre for Disease Prevention and Control (ECDC).{/ref} by using an estimated IFR to “back-calculate” how many infections would have been likely over the previous weeks to produce that number of deaths. It uses mobility data – from [Google](https://ourworldindata.org/covid-mobility-trends) or, if unavailable, inferred from [ACAPS government measures data](https://www.acaps.org/covid19-government-measures-dataset) – to modulate the Rt, the key parameter on how transmission is changing. Additionally, the model uses age- and country-specific data on demographics, patterns of social contact, hospital availability, and the risk of hospitalization and death, though the availability of this data varies by country. #### What are key assumptions and potential limitations? The model uses an estimated IFR for each country calculated by applying age-specific IFRs observed in China and Europe (of about 0.6–1%) to that country’s age distribution. In countries like many LMICs with younger populations than in China and Europe, this results in IFR estimates of typically 0.2–0.3% because younger populations have lower associated mortality rates. These lower mortality rates, however, assume access to sufficient healthcare, which might not always be the case in LMICs. Differences between the estimated and true IFRs could impact the accuracy of model estimates. The model assumes that the number of confirmed deaths is equal to the true number of deaths. But [research on excess mortality](https://ourworldindata.org/excess-mortality-covid) and known limitations to testing and reporting capacity suggest that confirmed deaths are often fewer than true deaths. Where this is the case the model likely underestimates the true health burden. 
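The back-calculation and age-weighting described above can be illustrated with a minimal Python sketch. It is not ICL's implementation: the actual model is a stochastic, age-structured SEIR model, whereas this sketch assumes a single fixed delay between infection and death and a constant IFR. The function names, the age groups, and all numbers below are hypothetical and chosen only for illustration.

```python
def population_ifr(age_shares, age_ifrs):
    """Weight age-specific IFRs by the share of the population in each age group."""
    if len(age_shares) != len(age_ifrs):
        raise ValueError("need one IFR per age group")
    if abs(sum(age_shares) - 1.0) > 1e-9:
        raise ValueError("age shares must sum to 1")
    return sum(share * ifr for share, ifr in zip(age_shares, age_ifrs))


def back_calculate_infections(daily_deaths, ifr, delay_days):
    """Estimate daily new infections from daily confirmed deaths.

    Simplifying assumption: every death occurs exactly `delay_days` after
    infection, so infections on day t are deaths on day t + delay_days
    divided by the IFR. The last `delay_days` days cannot be estimated
    because their deaths have not been observed yet.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be a fraction in (0, 1]")
    if delay_days < 0:
        raise ValueError("delay_days must be non-negative")
    return [deaths / ifr for deaths in daily_deaths[delay_days:]]


# Hypothetical inputs, for illustration only.
age_shares = [0.5, 0.4, 0.1]        # young, middle-aged, older
age_ifrs = [0.0001, 0.002, 0.05]    # made-up age-specific IFRs
ifr = population_ifr(age_shares, age_ifrs)

daily_deaths = [0, 0, 1, 2, 4]      # made-up confirmed deaths per day
infections = back_calculate_infections(daily_deaths, ifr, delay_days=2)
print(ifr, infections)
```

Because the estimate divides deaths by the IFR, it is sensitive to both inputs: halving the assumed IFR doubles the estimated infections, and any undercounting of deaths carries through proportionally.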
The model assumes that the change in transmission over time is a function of average mobility trends for places like stores and workplaces but not parks and residential areas.{ref}The model assumes that in parks “significant contact events are negligible” and that an “increase in residential movement will not change household contacts.”{/ref} If these assumptions about mobility and transmission do not hold, the model might not accurately track the pandemic. Like all models, this one makes many assumptions, and we cover only a few key ones here. For a full list see [the model methods description](https://mrc-ide.github.io/global-lmic-reports/parameters.html). ## Institute for Health Metrics and Evaluation (IHME) #### Hybrid statistical/SEIR model (details as of 23 August 2020) #### Update: IHME [announced](https://www.healthdata.org/covid/data-downloads) that "after December 16, 2022, IHME will pause its COVID-19 modeling for the foreseeable future." This chart shows the IHME model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown. <Chart url="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-ihme-model"/> #### **Website** [https://covid19.healthdata.org/](https://covid19.healthdata.org/) #### Regions covered 159 countries and territories across the world including subnational data for the US and several other countries #### Time covered The first date covered varies by country. The model makes projections that extend approximately 90–120 days past the latest date of update. #### Update frequency About once a week (though not all countries are updated each time) #### What is the model? 
The model is a hybrid with two main components: a statistical “death model” component produces death estimates that are used to fit an SEIR model component. Note that the model has had two significant updates since its initial publication: * [The SEIR component was added on 4 May 2020](http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_050420.pdf) * [The death model component was updated on 29 May 2020](http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_05.30.2020.pdf) #### What is the model used for? IHME describes its model as a tool to help government officials understand how different policy decisions could impact the course of the pandemic and to plan for changing healthcare demand. The model makes death projections that have been highly publicized and sometimes criticized.{ref}For example: Sharon Begley (2020, 17 Apr.) “[Influential Covid-19 model uses flawed methods and shouldn’t guide U.S. policies, critics say.](https://www.statnews.com/2020/04/17/influential-covid-19-model-uses-flawed-methods-shouldnt-guide-policies-critics-say/)” STAT News.{/ref} However, much of the criticism was leveled at a previous version of the model, known as “CurveFit,” that was used before the SEIR component was added on 4 May 2020. The projections are currently made under three scenarios.{ref}For more details about the scenarios see the [model FAQs](http://www.healthdata.org/covid/faqs).{/ref} Based on the model IHME publishes estimates of the following metrics: * True infections (to-date and projected) * Confirmed deaths (projected) * Hospital, ICU, and ventilator demand (to-date and projected) * Effective reproduction number, Rt (to-date and projected) * Testing levels (projected) * Mobility, as a proxy for social distancing (projected) #### What data is the model based on? 
The death model uses data on confirmed cases, confirmed deaths,{ref}Confirmed cases and deaths data as reported by Johns Hopkins University and several official sources.{/ref} and testing.{ref}As reported by the COVID Tracking Project (for US), official sources (Brazil and Dominican Republic), and Our World in Data (all other countries).{/ref} The SEIR model is fit to the output of the death model by using an estimated IFR to back-calculate the true number of infections. The model uses several other types of data to simulate transmission and disease progression: mobility, social distancing policies, population density, pneumonia seasonality and death rate, air pollution, altitude, smoking rates, and self-reported contacts and mask use. Details on the sources of these data can be found on the [model FAQs](http://www.healthdata.org/covid/faqs) and [estimation updates](http://www.healthdata.org/covid/updates) pages. #### What are key assumptions and potential limitations? The model uses an estimated IFR based on data from the Diamond Princess cruise ship and New Zealand. Though IHME does not give numbers for these, the Diamond Princess IFR has been estimated at 0.6% (95% uncertainty interval of 0.2–1.3%).{ref}Russell et al. (2020). Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship. Eurosurveillance, 25(12). [https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256](https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256){/ref} Differences between the estimated and true IFRs could impact the accuracy of model estimates. The death model makes several assumptions about the relationship between confirmed deaths, confirmed cases, and testing levels. For example, it assumes that a decreasing _case_ fatality rate (CFR) – the ratio of _confirmed_ deaths to _confirmed_ cases{ref}The CFR is similar to the IFR but uses the _confirmed_ deaths and cases reported by countries. 
In contrast, the IFR uses true deaths and infections, which are generally not known and have to be estimated.{/ref} – is reflective of increasing testing and a shift toward testing mild or asymptomatic cases. But the CFR could also decrease for other reasons, such as improved treatment or a decline in the average age of infected people. The model assumes that the change in transmission over time is a function of several data inputs (listed above), like mobility and population density. If these assumptions do not hold – for example, because the data is less relevant or its relationship with transmission is misspecified – the model might not accurately track the pandemic. More details are discussed in the [model FAQs](http://www.healthdata.org/covid/faqs) and in different [estimation update reports](http://www.healthdata.org/covid/updates). ## Youyang Gu (YYG) #### SEIR model with machine learning layer (details as of 23 August 2020) #### Update: Youyang Gu [announced](https://youyanggu.com/blog/six-months-later) that 5 October 2020 is the final model update This chart shows the YYG model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown. <Chart url="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-yyg-model"/> #### **Website** [https://covid19-projections.com/](https://covid19-projections.com/) #### Regions covered 71 countries across the world including subnational data for the US and Canada #### Time covered The first date covered varies by country. The model makes projections that extend approximately 90 days past the latest date of update. #### Update frequency Daily #### What is the model? 
The model consists of an SEIR base with a machine learning layer on top to search for the parameters that minimize the error between the model estimates and the observed data. #### What is the model used for? Youyang describes his model as making projections of true infections and deaths that optimize for forecast accuracy. He also stresses that his projections cover a range of possible outcomes, and that projections are not “wrong” if they help shape a different outcome in the future. Based on the model Youyang publishes estimates of the following metrics: * True infections (to-date and projected) * Confirmed deaths (projected) * Effective reproduction number, Rt (to-date and projected) * Tests per day targets (projected) The model does not focus on projections under different scenarios, but has explored what would have happened if the US had mandated social distancing [one week earlier](https://covid19-projections.com/us-1weekearlier) or [one week later](https://covid19-projections.com/us-1weeklater), or [if 20% of infected people immediately self-quarantined](https://covid19-projections.com/us-self-quarantine). #### What data is the model based on? The model is fit to data on confirmed deaths{ref}As reported by Johns Hopkins University. The data is smoothed before fitting.{/ref} by using an estimated IFR to back-calculate the true number of infections. Confirmed cases and hospitalization data are sometimes used to help set bounds for the machine learning parameter search. #### What are key assumptions and potential limitations? The model uses an estimated IFR for each region based initially on that region’s observed CFR. The IFR is then decreased{ref}Except in “later-impacted regions like Latin America, we wait an additional 3 months before beginning to decrease the IFR.”{/ref} linearly over the span of three months until it is 30% of its initial value to reflect the lower average age of infections and improving treatments. 
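The linear decrease in the IFR described above can be written as a small function. This is a sketch under stated assumptions rather than Youyang Gu's code: the function name is hypothetical, and "three months" is approximated as 90 days.

```python
def ifr_on_day(initial_ifr, days_since_decrease_began,
               ramp_days=90, floor_fraction=0.30):
    """Linearly lower the IFR to `floor_fraction` of its initial value.

    ramp_days=90 is an assumed stand-in for "three months".
    """
    # Fraction of the ramp completed, clamped to the range [0, 1].
    progress = min(max(days_since_decrease_began / ramp_days, 0.0), 1.0)
    return initial_ifr * (1.0 - (1.0 - floor_fraction) * progress)


# Hypothetical starting IFR of 1%, shown at a few points along the ramp.
for day in (0, 45, 90, 200):
    print(day, f"{ifr_on_day(0.01, day):.2%}")
```

After the ramp ends, the IFR stays at 30% of its starting value, which is why the later days in the loop print the same figure.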
Currently, the IFR is estimated to be 0.2–0.4% in most of the US and Europe. Differences between the estimated and true IFRs could impact the accuracy of model estimates. The model assumes there will be unreported deaths for the “first few weeks” of a region’s pandemic, and that this underreporting will decrease until the number of confirmed deaths equals true deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden. The model makes assumptions about how reopening will affect social distancing and ultimately transmission. For example, if reopening causes a resurgence of infections, the model assumes regions will take action to reduce transmission, which is modeled by limiting the Rt. It also assumes a reopening date for regions (especially outside the US and Europe) where the true date is unknown. The model was created and optimized for the US. Thus for other countries the model estimates might be less accurate. For a full list of assumptions and limitations see [the model "About" page](https://covid19-projections.com/about/#assumptions). ## London School of Hygiene & Tropical Medicine (LSHTM) #### Statistical model estimating underreporting of infections (details as of 23 August 2020) This chart shows the LSHTM model’s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click "Change country." The lines labeled “upper” and “lower” show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown. 
<Chart url="https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-lshtm-model"/> #### **Website** [https://cmmid.github.io/topics/covid19/global_cfr_estimates.html](https://cmmid.github.io/topics/covid19/global_cfr_estimates.html) #### Regions covered 159 countries and territories across the world (those with at least 10 confirmed deaths out of a total of 210) #### Time covered The first date covered varies by country. The model does not make projections. #### Update frequency About once a week #### What is the model? The model starts with a country’s CFR and adjusts it for the fact that there is a delay of roughly 2–3 weeks between case confirmation and death (or recovery).{ref}The typical CFR calculation divides confirmed deaths by confirmed cases _reported on the same day_, but those deaths were actually caused by cases confirmed roughly 2–3 weeks before.{/ref} This delay-adjusted CFR is then compared to a baseline, delay-adjusted CFR to estimate the "ascertainment rate" – the proportion of all _symptomatic_ infections that have actually been confirmed.{ref}All but a trivial number of confirmed cases are assumed to be symptomatic.{/ref} This estimated ascertainment rate is then used to adjust the number of confirmed cases{ref}This data is first smoothed.{/ref} to estimate the true number of symptomatic infections. To finally estimate _total_ infections, the symptomatic infections estimate is adjusted to include _asymptomatic_ infections, which are estimated to make up 10–70% (median 50%) of total infections.{ref}In accordance with this methodology and in consultation with the LSHTM researchers, we perform these calculations to produce the estimates of total infections presented here.{/ref} #### What is the model used for? LSHTM describes its model as a tool to help understand the level of undetected epidemic progression and to aid response planning, such as when to introduce and relax control measures. 
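The chain of adjustments described under "What is the model?" above can be sketched in a few lines. This is a simplified illustration, not the LSHTM code: it assumes the "comparison" of the two CFRs is a simple ratio capped at 100%, takes the delay-adjusted CFR as a given input, and uses the 1.4% baseline CFR and 50% median asymptomatic share quoted in this section. The function names and example numbers are hypothetical.

```python
def ascertainment_rate(delay_adjusted_cfr, baseline_cfr=0.014):
    """Share of symptomatic infections that were confirmed.

    Assumes the comparison is the ratio baseline / observed, capped at 1.
    """
    if delay_adjusted_cfr <= 0:
        raise ValueError("delay-adjusted CFR must be positive")
    return min(1.0, baseline_cfr / delay_adjusted_cfr)


def total_infections(confirmed_cases, delay_adjusted_cfr,
                     asymptomatic_share=0.5, baseline_cfr=0.014):
    """Scale confirmed cases up to symptomatic, then to total infections."""
    if not 0.0 <= asymptomatic_share < 1.0:
        raise ValueError("asymptomatic share must be in [0, 1)")
    rate = ascertainment_rate(delay_adjusted_cfr, baseline_cfr)
    symptomatic = confirmed_cases / rate
    # Symptomatic infections are (1 - asymptomatic_share) of the total.
    return symptomatic / (1.0 - asymptomatic_share)


# Hypothetical country: 1,000 confirmed cases, delay-adjusted CFR of 2.8%.
print(f"ascertainment: {ascertainment_rate(0.028):.0%}")
print(f"estimated total infections: {total_infections(1000, 0.028):,.0f}")
```

Because the asymptomatic share is only known to lie somewhere between 10% and 70%, rerunning the last step across that range gives a sense of how wide the resulting estimate of total infections can be.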
Based on the model LSHTM publishes estimates of the ascertainment rate. #### What data is the model based on? The model is based on data on confirmed deaths and confirmed cases.{ref}Both as reported by the ECDC.{/ref} #### What are key assumptions and potential limitations? The model assumes a baseline, delay-adjusted CFR of 1.4% and that any difference between that and a country’s delay-adjusted CFR is entirely due to under-ascertainment. But many other factors likely play a role, such as the burden on the healthcare system, COVID-19 risk factors in the population, the ages of those infected, and more. The assumed baseline CFR is based on data from China and does not account for different age distributions outside China. This causes the ascertainment rate to be overestimated in countries with younger populations and underestimated in countries with older populations.{ref}In a secondary analysis the LSHTM researchers do adjust the baseline CFR for different age distributions. But this has its own assumptions and limitations and is thus not clearly a better approach. More details can be found in [the full report](https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf).{/ref} The model assumes that the number of confirmed deaths is equal to the true number of deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden. Reported deaths data is sometimes changed retroactively, which can be challenging for the model and might affect its estimates. More assumptions and limitations are discussed in [the full report](https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf). ## How should we think about these models and their estimates? All four models we looked at agree that true infections far outnumber confirmed cases, but they disagree by how much. 
We now have some insight into these differences: The models all differ to some degree in what they are used for, how they work, the data they are based on, and the assumptions they make. Making these differences transparent helps us understand how we should think about these models and their estimates. For example, understanding that some models are used for scenario planning and not forecasting (like ICL’s) while others are optimized for forecast accuracy (like Youyang’s) puts their estimates in context. And the models all make different assumptions that each have limitations; we can decide if those limitations are relevant to a given situation. In the end, though, we still want to have confidence that models can track the pandemic accurately. We can calibrate our confidence in different models by giving their estimates a reality check. One way to do this is to compare model estimates against some observed “ground truth” data. For example, if a model is forecasting the number of deaths four weeks from now, we can wait four weeks and compare the forecast to the deaths that actually occur.{ref}Though we still need to consider that such forecasts might not track what actually occurs if they help shape a different outcome in the future. Some current efforts to score forecasts for accuracy are by [Youyang Gu](https://github.com/youyanggu/covid19-forecast-hub-evaluation), [IHME](http://www.healthdata.org/research-article/predictive-performance-international-covid-19-mortality-forecasting-models), [The Zoltar Project](https://zoltardata.com/about), and [Covid Compare](https://covidcompare.io/).{/ref} But sometimes the ground truth is not easily observed, as is the case with the true number of infections. 
Here we have to look for _converging evidence_ from other research, such as from seroprevalence studies that test for COVID-19 antibodies in the blood serum to estimate how many people have ever been infected.{ref}The LSHTM researchers, for example, compared their model estimates to seroprevalence estimates and found good agreement. You can read more about this in [their full report.](https://cmmid.github.io/topics/covid19/Under-Reporting.html){/ref} By gaining a deeper, more nuanced understanding of these models and their strengths and weaknesses, we can use them as valuable tools to help make progress against the pandemic. | { "id": 35911, "date": "2020-08-24T12:00:25", "guid": { "rendered": "https://owid.cloud/?p=35911" }, "link": "https://owid.cloud/covid-models", "meta": { "owid_publication_context_meta_field": { "latest": true, "homepage": true, "immediate_newsletter": true } }, "slug": "covid-models", "tags": [], "type": "post", "title": { "rendered": "How epidemiological models of COVID-19 help us estimate the true number of infections" }, "_links": { "self": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts/35911" } ], "about": [ { "href": "https://owid.cloud/wp-json/wp/v2/types/post" } ], "author": [ { "href": "https://owid.cloud/wp-json/wp/v2/users/44", "embeddable": true } ], "curies": [ { "href": "https://api.w.org/{rel}", "name": "wp", "templated": true } ], "replies": [ { "href": "https://owid.cloud/wp-json/wp/v2/comments?post=35911", "embeddable": true } ], "wp:term": [ { "href": "https://owid.cloud/wp-json/wp/v2/categories?post=35911", "taxonomy": "category", "embeddable": true }, { "href": "https://owid.cloud/wp-json/wp/v2/tags?post=35911", "taxonomy": "post_tag", "embeddable": true } ], "collection": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts" } ], "wp:attachment": [ { "href": "https://owid.cloud/wp-json/wp/v2/media?parent=35911" } ], "version-history": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts/35911/revisions", "count": 29 } 
], "wp:featuredmedia": [ { "href": "https://owid.cloud/wp-json/wp/v2/media/36208", "embeddable": true } ], "predecessor-version": [ { "id": 56345, "href": "https://owid.cloud/wp-json/wp/v2/posts/35911/revisions/56345" } ] }, "author": 44, "format": "standard", "status": "publish", "sticky": false, "content": { "rendered": "\n<div class=\"blog-info\">\n<p>Our World in Data presents the data and research to make progress against the world\u2019s largest problems.<br>Our main publication on the pandemic is here: <strong><a href=\"https://ourworldindata.org/coronavirus\" target=\"_blank\" rel=\"noopener noreferrer\">Coronavirus Pandemic (COVID-19)</a></strong>.<br><br>We are grateful to the researchers whose work we cover in this post for giving helpful feedback and suggestions. Thank you.</p>\n</div>\n\n\n\t<div class=\"wp-block-last-updated\">\n\t\t\n\n<p>We update the model estimates with the latest available data each week. Last update: <strong>16 January 2022</strong>.</p>\n\n\n\t</div>\n\n\n<p>A key limitation in our understanding of the COVID-19 pandemic is that we do not know the <em>true</em> number of infections. Instead, we only know of infections that have been confirmed by a test \u2013 the confirmed cases. But because many infected people never get tested,{ref}Infected people might not get tested for several reasons, such as not having easy access to testing or not even knowing they are infected because they have no symptoms (though they are still able to transmit the virus). Such asymptomatic infections are estimated to be 10\u201370% of total infections. Source: <a rel=\"noreferrer noopener\" href=\"https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html\" target=\"_blank\">CDC COVID-19 Pandemic Planning Scenarios</a>.{/ref} we know that confirmed cases are only a fraction of true infections. How small a fraction though?</p>\n\n\n\n<p>To answer this question, several research groups have developed epidemiological models of COVID-19. 
These models use the data we have \u2013 confirmed cases and deaths, testing rates, and more \u2013 plus a range of assumptions and epidemiological knowledge to estimate true infections and other important metrics.</p>\n\n\n\n<h4></h4>\n\n\n\n<p>The chart here shows the mean estimates of the true number of daily new infections in the United States from four of the most prominent models.{ref}There are many models in use besides these four, including other ones by the research groups we cover here. We chose these four models because they are prominent, have been used by policymakers, and have been updated regularly. We use them more for illustration than completeness.{/ref} For comparison, the number of confirmed cases is also shown.</p>\n\n\n\n<ul><li><a href=\"http://ourworldindata.org/covid-models#imperial-college-london-icl\">Imperial College London (ICL)</a></li><li><a href=\"http://ourworldindata.org/covid-models#institute-for-health-metrics-and-evaluation-ihme\">The Institute for Health Metrics and Evaluation (IHME)</a></li><li><a href=\"http://ourworldindata.org/covid-models#youyang-gu-yyg\">Youyang Gu (YYG)</a></li><li><a href=\"http://ourworldindata.org/covid-models#london-school-of-hygiene-tropical-medicine-lshtm\">The London School of Hygiene & Tropical Medicine (LSHTM)</a></li></ul>\n\n\n\n<iframe src=\"https://ourworldindata.org/grapher/daily-new-estimated-infections-of-covid-19\" loading=\"lazy\" style=\"width: 100%; height: 1000px; border: 0px none;\"></iframe>\n\n\n\n<p>Two things are clear from this chart: All four models agree that true infections <em>far outnumber</em> confirmed cases. 
But the models disagree by how much, and how infections have changed over time.</p>\n\n\n\n<p>When the number of confirmed cases in the US reached a peak in late July 2020, the IHME and LSHTM models estimated that the true number of infections was about twice as high as confirmed cases, the ICL model estimated it was nearly three times as high, and Youyang Gu’s model estimated it was more than <em>six times</em> as high. Back in March the estimated discrepancy between confirmed cases and true infections was even many times higher.</p>\n\n\n\n<p>In this post we examine these four models and how they differ by unpacking their essential elements: what they are used for, how they work, the data they are based on, and the assumptions they make.</p>\n\n\n\n<p>We also aim to make the model estimates easily accessible in our interactive charts, allowing you to quickly explore different models of the pandemic for most countries in the world. To do this simply click “Change country” on each chart.</p>\n\n\n\n<p>Three of the four models we look at are \u201cSEIR\u201d{ref}Pronounced by saying each letter, \u201cS-E-I-R.\u201d{/ref} models,{ref}The London School model is not an SEIR model.{/ref} which simulate how individuals in a population move through four states of a COVID-19 infection: being <strong>S</strong>usceptible, <strong>E</strong>xposed, <strong>I</strong>nfectious, and <strong>R</strong>ecovered (or deceased). How individuals move through these states is determined by different model \u201cparameters,\u201d of which there are many. 
Two key ones are the effective reproduction number (Rt){ref}Also called “time-varying” reproduction number.{/ref} \u2013 how many other people a person with COVID-19 infects at a given time \u2013 and the infection fatality rate (IFR) \u2013 the percent of people infected with a disease who die from it.</p>\n\n\n\n<p>You can learn more about how SEIR models work by exploring these resources:</p>\n\n\n\n<ul><li><a href=\"https://covid19-projections.com/model-details/\" target=\"_blank\" rel=\"noreferrer noopener\">Youyang Gu\u2019s Model Details</a> (for a brief read)</li><li><a href=\"https://youtu.be/Lcx2a1jXISc\" target=\"_blank\" rel=\"noreferrer noopener\">COVID Act Now\u2019s COVID Data 101: What is an SEIR model?</a> (for a brief video)</li><li><a href=\"https://medium.com/data-for-science/epidemic-modeling-102-all-covid-19-models-are-wrong-but-some-are-useful-c81202cc6ee9\" target=\"_blank\" rel=\"noreferrer noopener\">Bruno Gon\u00e7alves\u2019s Epidemic Modeling 102: All CoVID-19 models are wrong, but some are useful</a> (for a more in-depth read)</li></ul>\n\n\n\n<h3>Imperial College London (ICL)</h3>\n\n\n\n<h5>Age-structured SEIR model focused on low- and middle-income countries (details as of 23 August 2020)</h5>\n\n\n\n<p>This chart shows the ICL model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click “Change country.” The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p>\n\n\n\n<iframe src=\"https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-icl-model\" loading=\"lazy\" style=\"width: 100%; height: 1000px; border: 0px none;\"></iframe>\n\n\n\n<h5><strong>Website</strong></h5>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https://mrc-ide.github.io/global-lmic-reports/\" target=\"_blank\">https://mrc-ide.github.io/global-lmic-reports/</a></p>\n\n\n\n<h5>Regions covered</h5>\n\n\n\n<p>164 countries and territories across the world</p>\n\n\n\n<h5>Time covered</h5>\n\n\n\n<p>The first date covered is the estimated start of the pandemic for each country. The model makes projections that extend 90 days past the latest date of update.{ref}While projections are an important aspect of what this and some other models are used for, we do not cover them in this article.{/ref}</p>\n\n\n\n<h5>Update frequency</h5>\n\n\n\n<p>About 2\u20133 times per week</p>\n\n\n\n<h5>What is the model?</h5>\n\n\n\n<p>The model is a stochastic SEIR variant with multiple infectious states to reflect different COVID-19 severities, such as mild or asymptomatic versus severe.</p>\n\n\n\n<h5>What is the model used for?</h5>\n\n\n\n<p>ICL describes its model as a tool to help countries understand at what stage the country is in its epidemic (e.g., before or after a peak) and how healthcare demand might change in the future under three policy scenarios. These scenarios are designed to provide a counterfactual of what could happen if current interventions were maintained, increased, or relaxed and are therefore not intended to forecast future mortality.</p>\n\n\n\n<p>ICL uses the model estimates to write reports for individual low- and middle-income countries (LMICs) that are relatively early in their epidemics; these reports are focused on the next 28 days. 
The downloadable model estimates additionally include data for some high-income countries later in their epidemics (e.g., the US and EU countries) and projections 90 days into the future.</p>\n\n\n\n<p>Based on the model ICL publishes estimates of the following metrics:</p>\n\n\n\n<ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Hospital and ICU demand (to-date and projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li></ul>\n\n\n\n<h5>What data is the model based on?</h5>\n\n\n\n<p>The model is \u201cfit\u201d to data on confirmed deaths{ref}As reported by the European Centre for Disease Prevention and Control (ECDC).{/ref} by using an estimated IFR to \u201cback-calculate\u201d how many infections would have been likely over the previous weeks to produce that number of deaths. It uses mobility data \u2013 from <a rel=\"noreferrer noopener\" href=\"https://ourworldindata.org/covid-mobility-trends\" target=\"_blank\">Google</a> or, if unavailable, inferred from <a rel=\"noreferrer noopener\" href=\"https://www.acaps.org/covid19-government-measures-dataset\" target=\"_blank\">ACAPS government measures data</a> \u2013 to modulate the Rt, the key parameter on how transmission is changing.</p>\n\n\n\n<p>Additionally, the model uses age- and country-specific data on demographics, patterns of social contact, hospital availability, and the risk of hospitalization and death, though the availability of this data varies by country.</p>\n\n\n\n<h5>What are key assumptions and potential limitations?</h5>\n\n\n\n<p>The model uses an estimated IFR for each country calculated by applying age-specific IFRs observed in China and Europe (of about 0.6\u20131%) to that country\u2019s age distribution. In countries like many LMICs with younger populations than in China and Europe, this results in IFR estimates of typically 0.2\u20130.3% because younger populations have lower associated mortality rates. 
These lower mortality rates, however, assume access to sufficient healthcare, which might not always be the case in LMICs. Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p>\n\n\n\n<p>The model assumes that the number of confirmed deaths is equal to the true number of deaths. But <a rel=\"noreferrer noopener\" href=\"https://ourworldindata.org/excess-mortality-covid\" target=\"_blank\">research on excess mortality</a> and known limitations to testing and reporting capacity suggest that confirmed deaths are often fewer than true deaths. Where this is the case the model likely underestimates the true health burden.</p>\n\n\n\n<p>The model assumes that the change in transmission over time is a function of average mobility trends for places like stores and workplaces but not parks and residential areas.{ref}The model assumes that in parks \u201csignificant contact events are negligible\u201d and that an \u201cincrease in residential movement will not change household contacts.\u201d{/ref} If these assumptions about mobility and transmission do not hold, the model might not accurately track the pandemic.</p>\n\n\n\n<p>Like all models, this one makes many assumptions, and we cover only a few key ones here. For a full list see <a rel=\"noreferrer noopener\" href=\"https://mrc-ide.github.io/global-lmic-reports/parameters.html\" target=\"_blank\">the model methods description</a>.</p>\n\n\n\n<h3>Institute for Health Metrics and Evaluation (IHME)</h3>\n\n\n\n<h5>Hybrid statistical/SEIR model (details as of 23 August 2020)</h5>\n\n\n\n<h5>Update: IHME <a rel=\"noreferrer noopener\" href=\"https://www.healthdata.org/covid/data-downloads\" data-type=\"URL\" target=\"_blank\">announced</a> that “after December 16, 2022, IHME will pause its COVID-19 modeling for the foreseeable future.”</h5>\n\n\n\n<p>This chart shows the IHME model\u2019s estimates of the true number of daily new infections in the United States. 
To see the estimates for other countries click “Change country.” The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. For comparison, the number of confirmed cases is also shown.</p>\n\n\n\n<iframe src=\"https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-ihme-model\" loading=\"lazy\" style=\"width: 100%; height: 1000px; border: 0px none;\"></iframe>\n\n\n\n<h5><strong>Website</strong></h5>\n\n\n\n<p><a href=\"https://covid19.healthdata.org/\" target=\"_blank\" rel=\"noreferrer noopener\">https://covid19.healthdata.org/</a></p>\n\n\n\n<h5>Regions covered</h5>\n\n\n\n<p>159 countries and territories across the world including subnational data for the US and several other countries</p>\n\n\n\n<h5>Time covered</h5>\n\n\n\n<p>The first date covered varies by country<strong>.</strong> The model makes projections that extend approximately 90\u2013120 days past the latest date of update.</p>\n\n\n\n<h5>Update frequency</h5>\n\n\n\n<p>About once a week (though not all countries are updated each time)</p>\n\n\n\n<h5>What is the model?</h5>\n\n\n\n<p>The model is a hybrid with two main components: a statistical \u201cdeath model\u201d component produces death estimates that are used to fit an SEIR model component.</p>\n\n\n\n<p>Note that the model has had two significant updates since its initial publication:</p>\n\n\n\n<ul><li><a href=\"http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_050420.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">The SEIR component was added on 4 May 2020</a></li><li><a href=\"http://www.healthdata.org/sites/default/files/files/Projects/COVID/Estimation_update_05.30.2020.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">The death model component was updated on 29 May 2020</a></li></ul>\n\n\n\n<h5>What is the model used for?</h5>\n\n\n\n<p>IHME describes its model as a tool to help government officials understand how different policy 
decisions could impact the course of the pandemic and to plan for changing healthcare demand.</p>\n\n\n\n<p>The model makes death projections that have been highly publicized and sometimes criticized.{ref}For example: Sharon Begley (2020, 17 Apr.) \u201c<a rel=\"noreferrer noopener\" href=\"https://www.statnews.com/2020/04/17/influential-covid-19-model-uses-flawed-methods-shouldnt-guide-policies-critics-say/\" target=\"_blank\">Influential Covid-19 model uses flawed methods and shouldn\u2019t guide U.S. policies, critics say.</a>\u201d STAT News.{/ref} Much of the criticism, however, was leveled at a previous version of the model, known as \u201cCurveFit,\u201d that was used before the SEIR component was added on 4 May 2020. The projections are currently made under three scenarios.{ref}For more details about the scenarios, see the <a rel=\"noreferrer noopener\" href=\"http://www.healthdata.org/covid/faqs\" target=\"_blank\">model FAQs</a>.{/ref}</p>\n\n\n\n<p>Based on the model, IHME publishes estimates of the following metrics:</p>\n\n\n\n<ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Hospital, ICU, and ventilator demand (to-date and projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li><li>Testing levels (projected)</li><li>Mobility, as a proxy for social distancing (projected)</li></ul>\n\n\n\n<h5>What data is the model based on?</h5>\n\n\n\n<p>The death model uses data on confirmed cases, confirmed deaths,{ref}Confirmed cases and deaths data as reported by Johns Hopkins University and several official sources.{/ref} and testing.{ref}As reported by the COVID Tracking Project (for the US), official sources (Brazil and Dominican Republic), and Our World in Data (all other countries).{/ref}</p>\n\n\n\n<p>The SEIR model is fit to the output of the death model by using an estimated IFR to back-calculate the true number of infections.</p>\n\n\n\n<p>The model uses several other types of data to simulate 
transmission and disease progression: mobility, social distancing policies, population density, pneumonia seasonality and death rate, air pollution, altitude, smoking rates, and self-reported contacts and mask use. Details on the sources of these data can be found on the <a rel=\"noreferrer noopener\" href=\"http://www.healthdata.org/covid/faqs\" target=\"_blank\">model FAQs</a> and <a rel=\"noreferrer noopener\" href=\"http://www.healthdata.org/covid/updates\" target=\"_blank\">estimation updates</a> pages.</p>\n\n\n\n<h5>What are key assumptions and potential limitations?</h5>\n\n\n\n<p>The model uses an estimated IFR based on data from the Diamond Princess cruise ship and New Zealand. Though IHME does not give numbers for these, the Diamond Princess IFR has been estimated at 0.6% (95% uncertainty interval of 0.2\u20131.3%).{ref}Russell et al. (2020). Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship. Eurosurveillance, 25(12). <a href=\"https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256\" target=\"_blank\" rel=\"noreferrer noopener\">https://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256</a>{/ref} Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p>\n\n\n\n<p>The death model makes several assumptions about the relationship between confirmed deaths, confirmed cases, and testing levels. For example, it assumes that a decreasing <em>case</em> fatality rate (CFR) \u2013 the ratio of <em>confirmed</em> deaths to <em>confirmed</em> cases{ref}The CFR is similar to the IFR but uses the <em>confirmed</em> deaths and cases reported by countries. In contrast, the IFR uses true deaths and infections, which are generally not known and have to be estimated.{/ref} \u2013 reflects increasing testing and a shift toward testing mild or asymptomatic cases. 
But the CFR could also decrease for other reasons, such as improved treatment or a decline in the average age of infected people.</p>\n\n\n\n<p>The model assumes that the change in transmission over time is a function of several data inputs (listed above), like mobility and population density. If these assumptions do not hold \u2013 for example, because the data is less relevant or its relationship with transmission is misspecified \u2013 the model might not accurately track the pandemic.</p>\n\n\n\n<p>More details are discussed in the <a href=\"http://www.healthdata.org/covid/faqs\" target=\"_blank\" rel=\"noreferrer noopener\">model FAQs</a> and in different <a href=\"http://www.healthdata.org/covid/updates\" target=\"_blank\" rel=\"noreferrer noopener\">estimation update reports</a>.</p>\n\n\n\n<h3>Youyang Gu (YYG)</h3>\n\n\n\n<h5>SEIR model with machine learning layer (details as of 23 August 2020)</h5>\n\n\n\n<h5>Update: Youyang Gu <a rel=\"noreferrer noopener\" href=\"https://youyanggu.com/blog/six-months-later\" data-type=\"URL\" data-id=\"https://youyanggu.com/blog/six-months-later\" target=\"_blank\">announced</a> that 5 October 2020 is the final model update</h5>\n\n\n\n<p>This chart shows the YYG model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click “Change country.” The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p>\n\n\n\n<iframe src=\"https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-yyg-model\" loading=\"lazy\" style=\"width: 100%; height: 1000px; border: 0px none;\"></iframe>\n\n\n\n<h5><strong>Website</strong></h5>\n\n\n\n<p><a href=\"https://covid19-projections.com/\" target=\"_blank\" rel=\"noreferrer noopener\">https://covid19-projections.com/</a></p>\n\n\n\n<h5>Regions covered</h5>\n\n\n\n<p>71 countries across the world including subnational data for the US and Canada</p>\n\n\n\n<h5>Time covered</h5>\n\n\n\n<p>The first date covered varies by country<strong>.</strong> The model makes projections that extend approximately 90 days past the latest date of update.</p>\n\n\n\n<h5>Update frequency</h5>\n\n\n\n<p>Daily</p>\n\n\n\n<h5>What is the model?</h5>\n\n\n\n<p>The model consists of an SEIR base with a machine learning layer on top to search for the parameters that minimize the error between the model estimates and the observed data.</p>\n\n\n\n<h5>What is the model used for?</h5>\n\n\n\n<p>Youyang describes his model as making projections of true infections and deaths that optimize for forecast accuracy. 
He also stresses, however, that his projections cover a range of possible outcomes and that projections are not \u201cwrong\u201d if they help shape a different outcome in the future.</p>\n\n\n\n<p>Based on the model, Youyang publishes estimates of the following metrics:</p>\n\n\n\n<ul><li>True infections (to-date and projected)</li><li>Confirmed deaths (projected)</li><li>Effective reproduction number, Rt (to-date and projected)</li><li>Tests per day targets (projected)</li></ul>\n\n\n\n<p>The model does not focus on projections under different scenarios, but has explored what would have happened if the US had mandated social distancing <a href=\"https://covid19-projections.com/us-1weekearlier\" target=\"_blank\" rel=\"noreferrer noopener\">one week earlier</a> or <a href=\"https://covid19-projections.com/us-1weeklater\" target=\"_blank\" rel=\"noreferrer noopener\">one week later</a>, or <a href=\"https://covid19-projections.com/us-self-quarantine\" target=\"_blank\" rel=\"noreferrer noopener\">if 20% of infected people immediately self-quarantined</a>.</p>\n\n\n\n<h5>What data is the model based on?</h5>\n\n\n\n<p>The model is fit to data on confirmed deaths{ref}As reported by Johns Hopkins University. The data is smoothed before fitting.{/ref} by using an estimated IFR to back-calculate the true number of infections. Confirmed cases and hospitalization data are sometimes used to help set bounds for the machine learning parameter search.</p>\n\n\n\n<h5>What are key assumptions and potential limitations?</h5>\n\n\n\n<p>The model uses an estimated IFR for each region based initially on that region\u2019s observed CFR. The IFR is then decreased{ref}Except in \u201clater-impacted regions like Latin America, we wait an additional 3 months before beginning to decrease the IFR.\u201d{/ref} linearly over the span of three months until it is 30% of its initial value to reflect the lower average age of those infected and improving treatments. 
Currently, the IFR is estimated to be 0.2\u20130.4% in most of the US and Europe. Differences between the estimated and true IFRs could impact the accuracy of model estimates.</p>\n\n\n\n<p>The model assumes there will be unreported deaths for the “first few weeks\u201d of a region\u2019s pandemic, and that this underreporting will decrease until the number of confirmed deaths equals true deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden.</p>\n\n\n\n<p>The model makes assumptions about how reopening will affect social distancing and ultimately transmission. For example, if reopening causes a resurgence of infections, the model assumes regions will take action to reduce transmission, which is modeled by limiting the Rt. It also assumes a reopening date for regions (especially outside the US and Europe) where the true date is unknown.</p>\n\n\n\n<p>The model was created and optimized for the US. Thus for other countries the model estimates might be less accurate.</p>\n\n\n\n<p>For a full list of assumptions and limitations see <a href=\"https://covid19-projections.com/about/#assumptions\" target=\"_blank\" rel=\"noreferrer noopener\">the model “About” page</a>.</p>\n\n\n\n<h3>London School of Hygiene & Tropical Medicine (LSHTM)</h3>\n\n\n\n<h5>Statistical model estimating underreporting of infections (details as of 23 August 2020)</h5>\n\n\n\n<p>This chart shows the LSHTM model\u2019s estimates of the true number of daily new infections in the United States. To see the estimates for other countries click “Change country.” The lines labeled \u201cupper\u201d and \u201clower\u201d show the bounds of a 95% uncertainty interval. 
For comparison, the number of confirmed cases is also shown.</p>\n\n\n\n<iframe src=\"https://ourworldindata.org/grapher/daily-new-estimated-covid-19-infections-lshtm-model\" loading=\"lazy\" style=\"width: 100%; height: 1000px; border: 0px none;\"></iframe>\n\n\n\n<h5><strong>Website</strong></h5>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https://cmmid.github.io/topics/covid19/global_cfr_estimates.html\" target=\"_blank\">https://cmmid.github.io/topics/covid19/global_cfr_estimates.html</a></p>\n\n\n\n<h5>Regions covered</h5>\n\n\n\n<p>159 countries and territories across the world (those with at least 10 confirmed deaths out of a total of 210)</p>\n\n\n\n<h5>Time covered</h5>\n\n\n\n<p>The first date covered varies by country<strong>. </strong>The model does not make projections.</p>\n\n\n\n<h5>Update frequency</h5>\n\n\n\n<p>About once a week</p>\n\n\n\n<h5>What is the model?</h5>\n\n\n\n<p>The model starts with a country\u2019s CFR and adjusts it for the fact that there is a delay of roughly 2\u20133 weeks between case confirmation and death (or recovery).{ref}The typical CFR calculation divides confirmed deaths by confirmed cases <em>reported on the same day</em>, but those deaths were actually caused by cases confirmed roughly 2\u20133 weeks before.{/ref} This delay-adjusted CFR is then compared to a baseline, delay-adjusted CFR to estimate the “ascertainment rate” \u2013 the proportion of all <em>symptomatic</em> infections that have actually been confirmed.{ref}All but a trivial number of confirmed cases are assumed to be symptomatic.{/ref}</p>\n\n\n\n<p>This estimated ascertainment rate is then used to adjust the number of confirmed cases{ref}This data is first smoothed.{/ref} to estimate the true number of symptomatic infections. 
Finally, to estimate <em>total</em> infections, the symptomatic infections estimate is adjusted to include <em>asymptomatic</em> infections, which are estimated to make up 10\u201370% (median 50%) of total infections.{ref}In accordance with this methodology and in consultation with the LSHTM researchers, we perform these calculations to produce the estimates of total infections presented here.{/ref}</p>\n\n\n\n<h5>What is the model used for?</h5>\n\n\n\n<p>LSHTM describes its model as a tool to help understand the level of undetected epidemic progression and to aid response planning, such as when to introduce and relax control measures.</p>\n\n\n\n<p>Based on the model, LSHTM publishes estimates of the ascertainment rate.</p>\n\n\n\n<h5>What data is the model based on?</h5>\n\n\n\n<p>The model is based on data on confirmed deaths and confirmed cases.{ref}Both as reported by the ECDC.{/ref}</p>\n\n\n\n<h5>What are key assumptions and potential limitations?</h5>\n\n\n\n<p>The model assumes a baseline, delay-adjusted CFR of 1.4% and that any difference between that and a country\u2019s delay-adjusted CFR is entirely due to under-ascertainment. But many other factors likely play a role, such as the burden on the healthcare system, COVID-19 risk factors in the population, the ages of those infected, and more.</p>\n\n\n\n<p>The assumed baseline CFR is based on data from China and does not account for different age distributions outside China. This causes the ascertainment rate to be overestimated in countries with younger populations and underestimated in countries with older populations.{ref}In a secondary analysis, the LSHTM researchers do adjust the baseline CFR for different age distributions. But this has its own assumptions and limitations and is thus not clearly a better approach. 
More details can be found in <a rel=\"noreferrer noopener\" href=\"https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf\" target=\"_blank\">the full report</a>.{/ref}</p>\n\n\n\n<p>The model assumes that the number of confirmed deaths is equal to the true number of deaths. As noted before, this is often not the case, and thus the model might underestimate the true health burden.</p>\n\n\n\n<p>Reported deaths data is sometimes changed retroactively, which can be challenging for the model and might affect its estimates.</p>\n\n\n\n<p>More assumptions and limitations are discussed in <a rel=\"noreferrer noopener\" href=\"https://cmmid.github.io/topics/covid19/reports/UnderReporting.pdf\" target=\"_blank\">the full report</a>.</p>\n\n\n\n<h3>How should we think about these models and their estimates?</h3>\n\n\n\n<p>All four models we looked at agree that true infections far outnumber confirmed cases, but they disagree by how much. We now have some insight into these differences: The models all differ to some degree in what they are used for, how they work, the data they are based on, and the assumptions they make.</p>\n\n\n\n<p>Making these differences transparent helps us understand how we should think about these models and their estimates. For example, understanding that some models are used for scenario planning and not forecasting (like ICL\u2019s) while others are optimized for forecast accuracy (like Youyang\u2019s) puts their estimates in context. And the models all make different assumptions that each have limitations; we can decide if those limitations are relevant to a given situation.</p>\n\n\n\n<p>In the end, though, we still want to have confidence that models can track the pandemic accurately. We can calibrate our confidence in different models by giving their estimates a reality check.</p>\n\n\n\n<p>One way to do this is to compare model estimates against some observed \u201cground truth\u201d data. 
For example, if a model is forecasting the number of deaths four weeks from now, we can wait four weeks and compare the forecast to the deaths that actually occur.{ref}Though we still need to consider that such forecasts might not track what actually occurs if they help shape a different outcome in the future.</p>\n\n\n\n<p>Some current efforts to score forecasts for accuracy are by <a rel=\"noreferrer noopener\" href=\"https://github.com/youyanggu/covid19-forecast-hub-evaluation\" target=\"_blank\">Youyang Gu</a>, <a rel=\"noreferrer noopener\" href=\"http://www.healthdata.org/research-article/predictive-performance-international-covid-19-mortality-forecasting-models\" target=\"_blank\">IHME</a>, <a rel=\"noreferrer noopener\" href=\"https://zoltardata.com/about\" target=\"_blank\">The Zoltar Project</a>, and <a href=\"https://covidcompare.io/\" target=\"_blank\" rel=\"noreferrer noopener\">Covid Compare</a>.{/ref}</p>\n\n\n\n<p>But sometimes the ground truth is not easily observed, as is the case with the true number of infections. Here we have to look for <em>converging evidence</em> from other research, such as from seroprevalence studies that test for COVID-19 antibodies in the blood serum to estimate how many people have ever been infected.{ref}The LSHTM researchers, for example, compared their model estimates to seroprevalence estimates and found good agreement. You can read more about this in <a rel=\"noreferrer noopener\" href=\"https://cmmid.github.io/topics/covid19/Under-Reporting.html\" target=\"_blank\">their full report.</a>{/ref}</p>\n\n\n\n<p>By gaining a deeper, more nuanced understanding of these models and their strengths and weaknesses, we can use them as valuable tools to help make progress against the pandemic.</p>\n", "protected": false }, "excerpt": { "rendered": "We know that confirmed COVID-19 cases are only a fraction of true infections. 
How small a fraction though?", "protected": false }, "date_gmt": "2020-08-24T11:00:25", "modified": "2023-03-17T08:37:44", "template": "", "categories": [ 1 ], "ping_status": "closed", "authors_name": [ "Charlie Giattino" ], "modified_gmt": "2023-03-17T08:37:44", "comment_status": "closed", "featured_media": 36208, "featured_media_paths": { "thumbnail": "/app/uploads/2020/08/covid-models-150x79.png", "medium_large": "/app/uploads/2020/08/covid-models-768x402.png" } } |