Here we provide estimates of the total number of COVID-19 cases imported into each port-of-entry airport in the US, and from each country with flights to the US, for November 2020.

The prediction model is fit to the relationship between detected importations to US states from Italy during February and March 2020 and the estimated flow of infectious individuals through air travel (see the Importation Risk Model tab). A forecast is made for the current month based on the prior month’s international infections and the current month’s international flight bookings.

Because these estimates are based on the number of cases detected entering the US, they are certainly underestimates. Additional calibration is required to scale these values appropriately, but at present we do not have access to the data needed for this calibration. Nonetheless, the relative risk of the ports of entry and countries represented here remains valid. Note that current importations are likely inconsequential compared with local transmission in most parts of the US.

Predicted Imports for November 2020

Highest Risk Port of Entry Airports

We visualize the number of projected importations into the highest risk airports for November 2020 alongside the international air traffic passenger volume into each airport below. Error bars on the predicted number of importations represent +/- SE based on the estimate of \(a\). The Spearman rank correlation between predicted importations and estimated number of passengers at a given airport is 0.98.
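
As a minimal sketch of how such error bars could be formed, the Python snippet below rescales a point prediction by \(a\) plus or minus its standard error. This is our reading of the description, not a confirmed implementation. The coefficient and standard error are the values reported in the model summary on the Importation Risk Model tab; the `exposure` value and the function name are hypothetical.

```python
# Sketch only: error bars as (a - SE) * exposure and (a + SE) * exposure.
# A_HAT and A_SE are taken from the Poisson fit reported in this document;
# the exposure value used below is hypothetical.
A_HAT = 1.0322   # estimated coefficient of proportionality
A_SE = 0.1418    # its standard error


def importation_with_error_bars(exposure, a_hat=A_HAT, a_se=A_SE):
    """Return (lower, point, upper) predicted importations.

    `exposure` is source-country prevalence multiplied by passenger volume,
    summed over source countries for one airport.
    """
    point = a_hat * exposure
    lower = (a_hat - a_se) * exposure
    upper = (a_hat + a_se) * exposure
    return lower, point, upper


lower, point, upper = importation_with_error_bars(exposure=10.0)
print(f"{point:.2f} (+/- SE: {lower:.2f} to {upper:.2f})")
```

Because the prediction is linear in \(a\), bars built this way reflect only uncertainty in the coefficient, not sampling noise in the number of arriving cases.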

Highest Risk Source Countries

We visualize the number of projected importations from the highest risk countries alongside the international air traffic flow from each country and the estimated prevalence in that country below. Error bars on the predicted number of importations represent +/- SE based on the estimate of \(a\). The Spearman rank correlation between predicted importations and the estimated number of passengers from a given country is 0.62, while the correlation with prevalence in the source country is 0.24.
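
For readers who want to reproduce a rank correlation of this kind, the following Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks. The input values are hypothetical and are not the data behind the figures reported here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five airports (not data from this report).
predicted_imports = [12.1, 3.4, 8.0, 0.9, 5.5]
passenger_volume = [90000, 31000, 40000, 8000, 52000]
print(round(spearman(predicted_imports, passenger_volume), 2))
```

A rank-based measure is a natural choice here because the report emphasizes the relative ordering of airports and countries rather than the absolute number of importations.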

Importation Risk Model

Overview

The goal of this analysis is to estimate the number of COVID-19 importations at Port of Entry (POE) airports in the United States, based on the amount of international passenger flow to these airports, and the prevalence of COVID-19 in the population of the country of origin of the passengers.

Because of the nature of case reporting in the United States, we do not have data on the number of cases imported at each airport. Instead, we have data on the number of imported cases detected at the state level. To quantify how air travel passenger volume and international COVID-19 prevalence relate to importation risk, we therefore first fit the model at the US state level.

We fit a Poisson regression model to importations from Italy to the US in early 2020. The fit suggests that importations are directly proportional to the product of COVID-19 prevalence in the source country’s population and the number of passengers arriving from that country. The estimated coefficient is roughly equal to one, which suggests that people with COVID-19 are no more or less likely to travel than the general population of that country.

Since the relative risk of COVID importation at each POE airport is of use to policy makers, we use the Poisson model fit described above to predict importations from all countries outside of the United States to each POE airport. We estimate flight passenger volume at each US POE airport based on future booking itineraries collected by OAG. We estimate COVID prevalence in each source country by using the previous month’s reported cases.

Across all POE airports, the relative risk of importation is strongly correlated with the overall international passenger volume to each POE airport. However, for the airports with the highest risk of importations, the correlation is weaker, and risk also depends on each airport’s passenger volume from countries with high COVID-19 prevalence.

Limitations:

The estimated numbers of imported cases are likely underestimates because the model is built on cases detected upon arrival in the United States, and underdetection of cases from Italy during January-March 2020 is not built into the model.

Additionally, the model assumes that the prevalence of COVID-19 in Italy, and any other source country, is equal to the number of reported cases divided by its population. This is an underestimate of the true infection rate, and does not take into account changes in ascertainment rates through time, or from country to country.

Additionally, the model was built on actual passenger volume from OAG, whereas the predictive model uses booking itinerary data that are subject to change. This may also bias the relative risk assigned to each airport, because of unpredictable discrepancies between planned itineraries and true passenger volume. There is a two-month delay in obtaining the true passenger volume from OAG, so we can potentially correct for this going forward. However, given the changing nature of the pandemic and of flight usage behavior, it is unclear how stable the relationship between booking data and passenger volume will be.

Importation Predictions Based on Italy

Specifically, we fit a Poisson regression model to data on imported cases from Italy to the United States. We chose Italy for a number of reasons:

  1. The reporting of COVID prevalence in Italy is likely more reliable than in China at the early stages of each country’s epidemic.
  2. Italy was well known as an early hot-spot of COVID, so travel histories from Italy were often publicly reported in the media and by local public health officials.
  3. Imported cases from countries other than China and Italy, since they tended to occur later, are not publicly reported for most of the United States, with the exception of Florida. Additionally, later in the outbreak, cases with travel histories are often detected in locations with ongoing community spread, so it is difficult to conclude the source of infection is from overseas.

Using flight passenger volume data from January, February, and March 2020 and the prevalence of COVID-19 cases in Italy, we model the expected number of imported cases \(C_i\) in destination state \(i\) as

\[E[C_i] = \lambda = \beta_0 \frac{C_cn_{ic}}{p_c}\]

where \(n_{ic}\) is the monthly passenger volume from Italy with destination \(i\), \(p_c\) is the population of Italy, and \(C_c\) is the number of new reported cases in Italy.

In order to ensure that imported cases had COVID-19 exposure in Italy, and not any other countries, we only included cases that had travel history to Italy alone. Importations with travel history to multiple international destinations were excluded from the analysis.
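
The model above has a single covariate, no intercept, and an identity link, so its maximum likelihood estimate has a closed form: setting the score \(\sum_i C_i / \beta_0 - \sum_i x_i\) to zero, with \(x_i = C_c n_{ic} / p_c\), gives \(\hat\beta_0 = \sum_i C_i / \sum_i x_i\). The Python sketch below illustrates this; it is not the authors' code, the function names are ours, and the observations and the rounded population figure are hypothetical.

```python
from math import sqrt


def fit_identity_poisson(cases, exposure):
    """Fit E[cases_i] = beta * exposure_i by maximum likelihood.

    For a Poisson model with identity link, one covariate, and no intercept,
    the score equation sum(cases)/beta - sum(exposure) = 0 has a closed-form
    solution. Returns (beta_hat, standard_error).
    """
    total_cases = sum(cases)
    total_exposure = sum(exposure)
    if total_exposure <= 0:
        raise ValueError("total exposure must be positive")
    beta_hat = total_cases / total_exposure
    # Fisher information is sum(exposure) / beta, so Var(beta_hat) = beta / sum(exposure).
    std_error = sqrt(beta_hat / total_exposure)
    return beta_hat, std_error


def exposure_term(new_cases_source, passengers, population_source):
    """Covariate from the model above: C_c * n_ic / p_c."""
    return new_cases_source * passengers / population_source


# Hypothetical state-month observations (not the data used in this report).
italy_population = 60_000_000  # rounded, for illustration only
rows = [  # (new reported cases in Italy, passengers to state, detected imports)
    (1_000, 30_000, 1),
    (100_000, 12_000, 18),
    (100_000, 3_000, 6),
]
x = [exposure_term(c, n, italy_population) for c, n, _ in rows]
y = [imports for _, _, imports in rows]
beta_hat, se = fit_identity_poisson(y, x)
print(f"beta_hat = {beta_hat:.3f}, SE = {se:.3f}")
```

In this one-parameter setting, the closed form should agree with an iterative `glm` fit of the same specification, which makes it a convenient sanity check on fitted output.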

## 
## Call:
## glm(formula = total_cases_month_single_source ~ x_monthly - 1, 
##     family = poisson(link = "identity"), data = data_cases_monthly)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -5.6627  -0.3727   1.0234   2.1417   5.3003  
## 
## Coefficients:
##           Estimate Std. Error z value Pr(>|z|)    
## x_monthly   1.0322     0.1418    7.28 3.34e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance:    Inf  on 13  degrees of freedom
## Residual deviance: 101.79  on 12  degrees of freedom
## AIC: 138.6
## 
## Number of Fisher Scoring iterations: 3
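
The summary above can be checked with a few lines of arithmetic. The Python sketch below recomputes the Wald z statistic, an approximate 95% interval for the coefficient (using a normal approximation, which is our assumption), and the ratio of residual deviance to degrees of freedom. All inputs are copied from the summary.

```python
# Values copied from the model summary above.
estimate = 1.0322
std_error = 0.1418
residual_deviance = 101.79
residual_df = 12

# Wald z statistic: estimate divided by its standard error.
z_value = estimate / std_error

# Approximate 95% Wald interval (normal approximation, an assumption here).
ci_low = estimate - 1.96 * std_error
ci_high = estimate + 1.96 * std_error

# Ratio of residual deviance to degrees of freedom; values well above 1
# are commonly read as a sign of overdispersion relative to a Poisson model.
dispersion_ratio = residual_deviance / residual_df

print(f"z = {z_value:.2f}")
print(f"95% CI: {ci_low:.3f} to {ci_high:.3f}")
print(f"deviance / df = {dispersion_ratio:.2f}")
```

The approximate interval contains one, which is consistent with the statement that the coefficient is roughly equal to one. The deviance ratio is well above one, so intervals that rely on the Poisson variance assumption may be narrower than the data support.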

The following graphs visualize the predictions of our model fit to importations from Italy when Italy was the only international destination visited. They include bootstrapped prediction intervals. The black line is the mean prediction from the Poisson regression model. Dashed lines are the corresponding prediction intervals, computed from 1,000 simulations here to limit run time.

  • The first graph also plots the raw data for Italian importations.
  • The second graph plots data for all states and countries with recorded importations besides Italy and China.
  • The third graph is similar to the second graph but includes only importations to the state of Florida. Florida appears to have continued to record travel/importation history more consistently over time than other states. In particular, it recorded substantial importations from Spain, a major hotspot whose outbreak began later than Italy’s.

The international data are inherently much shakier than the Italy and China data because (i) outbreaks in these countries occurred later, making it less likely that US health departments recorded importations, and (ii) many of these countries had smaller outbreaks at the time that importations were still regularly recorded.
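
The report does not spell out how its bootstrapped prediction intervals were generated. As one possible approach, the Python sketch below uses a parametric simulation: it draws the coefficient from a normal distribution around its estimate, then draws a Poisson count around the resulting mean. This is a swapped-in illustration and not necessarily the scheme behind the graphs; the function names, the `exposure` value, and the truncation at zero are our assumptions.

```python
import random
from math import exp


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method.

    Suitable for small to moderate lam; exp(-lam) underflows for very large lam.
    """
    if lam <= 0:
        return 0
    threshold = exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def prediction_interval(exposure, a_hat, a_se, n_sims=1000, level=0.95, seed=1):
    """Simulated prediction interval for the number of importations."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        # Step 1: parameter uncertainty, truncated at zero so the mean stays valid.
        a_sim = max(0.0, rng.gauss(a_hat, a_se))
        # Step 2: Poisson sampling noise around the simulated mean.
        draws.append(poisson_draw(a_sim * exposure, rng))
    draws.sort()
    lo_idx = int((1 - level) / 2 * (n_sims - 1))
    hi_idx = int((1 + level) / 2 * (n_sims - 1))
    return draws[lo_idx], draws[hi_idx]


# Coefficient and SE from the model summary; the exposure is hypothetical.
low, high = prediction_interval(exposure=5.0, a_hat=1.0322, a_se=0.1418)
print(f"95% prediction interval: {low} to {high} importations")
```

Using more simulations stabilizes the interval endpoints at the cost of run time, which is the trade-off noted in the text above.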

Translating this to Importation Predictions

Here we estimate the number of COVID-19 importations into the United States by Port of Entry (POE) airport and source country. We have found that the predicted number of importations is roughly equal to (prevalence in the source country) × (passenger flow from that country to the location of interest). See the next tab for additional info. Since CDC intervention needs to happen at POE airports rather than in states, we can use this relationship to predict the number of cases flowing into a POE airport per month. We specify the importation risk to all POE airports (\(\mathbf{R}\)) as

\[\mathbf{R} = a (\mathbf{C} \oslash \mathbf{P}) \bullet \mathbf{N}\]

where \(n\) is the number of countries with flights into the US, \(\mathbf{C}\) is a \(1 \times n\) matrix of the total number of new COVID-19 cases recorded in source countries in the prior month, \(\mathbf{P}\) is a \(1 \times n\) matrix of the populations of the source countries, and \(\oslash\) represents item-by-item division of the matrices.

\(\mathbf{N}\) is a matrix representing the number of passenger bookings from all countries into US POE airports in the prediction month. The constant \(a\) is the coefficient of proportionality between the number of importations and the remainder of the right-hand side of the above equation. It is estimated from importations from Italy in the analysis on the next tab.
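
The calculation can be sketched in a few lines of Python. The sketch assumes that \(\mathbf{N}\) has one row per source country and one column per POE airport, and that \(\bullet\) denotes the matrix product; neither is stated explicitly in the text. The function name and all input values are hypothetical.

```python
def importation_risk(a, cases, population, bookings):
    """Compute R = a * (C / P) . N for every airport.

    cases, population: length-n lists, one entry per source country.
    bookings: n rows (countries) by m columns (airports) of passenger bookings
              (this orientation is an assumption).
    Returns (risk_by_airport, risk_by_country).
    """
    n = len(cases)
    if len(population) != n or len(bookings) != n:
        raise ValueError("cases, population and bookings must cover the same countries")
    prevalence = [c / p for c, p in zip(cases, population)]  # element-wise C / P
    n_airports = len(bookings[0])
    # contribution[c][j]: predicted imports from country c into airport j
    contribution = [
        [a * prevalence[c] * bookings[c][j] for j in range(n_airports)]
        for c in range(n)
    ]
    risk_by_airport = [sum(row[j] for row in contribution) for j in range(n_airports)]
    risk_by_country = [sum(row) for row in contribution]
    return risk_by_airport, risk_by_country


# Hypothetical inputs: two source countries, three POE airports.
cases = [50_000, 200_000]             # new cases in the prior month
population = [10_000_000, 50_000_000]
bookings = [
    [2_000, 500, 0],                  # country 0 -> airports 0, 1, 2
    [1_000, 3_000, 4_000],            # country 1 -> airports 0, 1, 2
]
by_airport, by_country = importation_risk(1.0322, cases, population, bookings)
print([round(r, 2) for r in by_airport])
print([round(r, 2) for r in by_country])
```

Keeping the country-by-airport contributions before summing makes it straightforward to report risk either by airport (column sums) or by source country (row sums), as in the two sets of plots.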

Historical Importation Risk

Historical Importation Risk to Port of Entry Airports

We visualize the number of projected importations into the highest risk airports alongside the international air traffic passenger volume into each airport by month of the pandemic. Error bars on the predicted number of importations represent +/- SE based on the estimate of \(a\). Only months in which at least one airport has more than one predicted importation are presented.

These plots are followed by a heatmap of the Spearman rank correlation of predicted number of importations at POE airports between months.

Historical Importation Risk from Source Countries

We visualize the number of projected importations from the highest risk countries alongside the international air traffic passenger booking data from each country and estimated prevalence by month of the pandemic. Error bars on the predicted number of importations represent +/- SE based on the estimate of \(a\). Only months with airports with >1 predicted importation are presented.

These plots are followed by a heatmap of the Spearman rank correlation of importation risk from countries between months.