Support hospitalizations in data pipeline #120
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,20 +32,30 @@ state_geos = locations %>% | |
filter(nchar(.data$geo_value) == 2) %>% | ||
pull(.data$geo_value) | ||
signals = c("confirmed_incidence_num", | ||
"deaths_incidence_num") | ||
"deaths_incidence_num", | ||
"confirmed_admissions_covid_1d") | ||
|
||
predictions_cards = get_covidhub_predictions(forecasters, | ||
signal = signals, | ||
ahead = 1:28, | ||
geo_values = state_geos, | ||
verbose = TRUE, | ||
use_disk = TRUE) | ||
use_disk = TRUE) %>% | ||
filter(!(incidence_period == "epiweek" & ahead > 4)) | ||
|
||
predictions_cards = predictions_cards %>% | ||
filter(!is.na(predictions_cards$target_end_date)) | ||
predictions_cards = predictions_cards %>% filter(target_end_date < today()) | ||
filter(!is.na(predictions_cards$target_end_date)) %>% | ||
filter(target_end_date < today()) | ||
|
||
# Only accept forecasts made Monday or earlier | ||
# For epiweek predictions, only accept forecasts made Monday or earlier. | ||
# target_end_date is the date of the last day (Saturday) in the epiweek | ||
# For daily predictions, accept any forecast where the target_end_date is later | ||
Is there a reason we aren't using the "Monday or earlier" cutoff for hospitalization data?
The hospitalization forecasts are produced for every day following the forecast date; the target is
This approach matches Dan's understanding.
Got it, thanks.
||
# than the forecast_date. | ||
predictions_cards = predictions_cards %>% | ||
filter(target_end_date - (forecast_date + 7 * ahead) >= -2) | ||
filter( | ||
(incidence_period == "epiweek" & target_end_date - (forecast_date + 7 * ahead) >= -2) | | ||
(incidence_period == "day" & target_end_date > forecast_date) | ||
) | ||
|
||
# And only a forecaster's last forecast if multiple were made | ||
predictions_cards = predictions_cards %>% | ||
|
@@ -91,22 +101,52 @@ state_scores = evaluate_covid_predictions(state_predictions, | |
geo_type = "state") | ||
|
||
source("score.R") | ||
print("Saving state confirmed incidence...") | ||
save_score_cards(state_scores, "state", signal_name = "confirmed_incidence_num", | ||
output_dir = opt$dir) | ||
print("Saving state deaths incidence...") | ||
save_score_cards(state_scores, "state", signal_name = "deaths_incidence_num", | ||
output_dir = opt$dir) | ||
if ( "confirmed_incidence_num" %in% unique(state_scores$signal)) { | ||
print("Saving state confirmed incidence...") | ||
save_score_cards(state_scores, "state", signal_name = "confirmed_incidence_num", | ||
output_dir = opt$dir) | ||
} else { | ||
warning("State confirmed incidence should generally be available. Please | ||
verify that you expect not to have any cases incidence forecasts") | ||
} | ||
if ( "deaths_incidence_num" %in% unique(state_scores$signal)) { | ||
print("Saving state deaths incidence...") | ||
save_score_cards(state_scores, "state", signal_name = "deaths_incidence_num", | ||
output_dir = opt$dir) | ||
} else { | ||
warning("State deaths incidence should generally be available. Please | ||
verify that you expect not to have any deaths incidence forecasts") | ||
} | ||
if ( "confirmed_admissions_covid_1d" %in% unique(state_scores$signal)) { | ||
print("Saving state hospitalizations...") | ||
save_score_cards(state_scores, "state", signal_name = "confirmed_admissions_covid_1d", | ||
output_dir = opt$dir) | ||
} | ||
|
||
print("Evaluating national forecasts") | ||
# COVIDcast does not return national level data, using CovidHubUtils instead | ||
nation_scores = evaluate_chu(nation_predictions, signals, err_measures) | ||
|
||
print("Saving nation confirmed incidence...") | ||
save_score_cards(nation_scores, "nation", | ||
signal_name = "confirmed_incidence_num", output_dir = opt$dir) | ||
print("Saving nation deaths incidence...") | ||
save_score_cards(nation_scores, "nation", signal_name = "deaths_incidence_num", | ||
output_dir = opt$dir) | ||
if ( "confirmed_incidence_num" %in% unique(state_scores$signal)) { | ||
print("Saving nation confirmed incidence...") | ||
save_score_cards(nation_scores, "nation", | ||
signal_name = "confirmed_incidence_num", output_dir = opt$dir) | ||
} else { | ||
warning("Nation confirmed incidence should generally be available. Please | ||
verify that you expect not to have any cases incidence forecasts") | ||
} | ||
if ( "deaths_incidence_num" %in% unique(state_scores$signal)) { | ||
print("Saving nation deaths incidence...") | ||
save_score_cards(nation_scores, "nation", signal_name = "deaths_incidence_num", | ||
output_dir = opt$dir) | ||
} else { | ||
warning("Nation deaths incidence should generally be available. Please | ||
verify that you expect not to have any deaths incidence forecasts") | ||
} | ||
if ( "confirmed_admissions_covid_1d" %in% unique(state_scores$signal)) { | ||
print("Saving nation hospitalizations...") | ||
save_score_cards(nation_scores, "nation", signal_name = "confirmed_admissions_covid_1d", | ||
output_dir = opt$dir) | ||
} | ||
|
||
print("Done") |
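The forecast-date cutoff in the first hunk can be tried out on toy data. The following is a minimal sketch, assuming `dplyr` is installed; the data frame `toy_cards` and its dates are made up for illustration and are not taken from the pipeline. Only the column names and the filter expression come from the diff.

```r
library(dplyr)

# Hypothetical stand-in for predictions_cards; 2021-03-06 is a Saturday.
toy_cards = data.frame(
  incidence_period = c("epiweek", "epiweek", "day", "day"),
  ahead            = c(1, 1, 3, 3),
  forecast_date    = as.Date(c("2021-03-01", "2021-03-02",
                               "2021-03-03", "2021-03-06")),
  target_end_date  = as.Date(rep("2021-03-06", 4))
)

kept = toy_cards %>%
  filter(
    # Epiweek: forecast made on Monday (2021-03-01) or earlier is kept,
    # the Tuesday forecast (2021-03-02) is dropped.
    (incidence_period == "epiweek" &
       target_end_date - (forecast_date + 7 * ahead) >= -2) |
    # Day: kept only if the target date is strictly after the forecast date.
    (incidence_period == "day" & target_end_date > forecast_date)
  )

print(kept)
```

In this toy input, the Monday epiweek forecast and the daily forecast made on 2021-03-03 survive; the Tuesday epiweek forecast and the daily forecast whose target equals its forecast date are removed.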
Just curious about this addition. It looks like this wasn't here before, yet we were still only saving aheads 1-4 for epiweek predictions (cases and deaths). I thought Jed made this cutoff elsewhere.
By default, `get_covidhub_predictions` uses `ahead = 1:4` for both day and epiweek forecasts. However, daily forecasts actually go up to aheads of 28. To get those without getting epiweek forecasts more than 4 weeks ahead, I switched the `ahead` setting to 1-28 and added the filter.

We could do two separate calls to `get_covidhub_predictions` here, one for cases + deaths and one for hospitalizations, with different `ahead` settings. However, the underlying `get_forecaster_predictions_alt` downloads all forecast files every time it's run (are you aware of any particular reason for this?), so the memory/speed tradeoff is poor at the moment.
Ah, perhaps because this was originally intended to be run in GitHub Actions, where the files wouldn't persist between sessions anyway.
I see, that makes sense. And yes I believe that is the case re: downloading files.
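The `ahead` handling described in this thread (request aheads 1 through 28, then drop epiweek rows beyond 4) can be illustrated without calling `get_covidhub_predictions`. This is a minimal sketch, assuming `dplyr` is installed; `toy_cards` is a hypothetical stand-in for the returned predictions, and only the column names and the filter expression are taken from the diff.

```r
library(dplyr)

# Hypothetical rows as they might look after requesting ahead = 1:28.
toy_cards = data.frame(
  incidence_period = c("epiweek", "epiweek", "day", "day"),
  ahead            = c(4, 5, 4, 28)
)

# Drop epiweek forecasts more than 4 weeks ahead; daily rows are untouched.
kept = toy_cards %>%
  filter(!(incidence_period == "epiweek" & ahead > 4))

print(kept)
```

The single-call-plus-filter approach keeps one download pass, at the cost of briefly holding epiweek rows that are later discarded.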