diff --git a/dashboard/about.md b/dashboard/about.md
new file mode 100644
index 0000000..819f19d
--- /dev/null
+++ b/dashboard/about.md
@@ -0,0 +1,69 @@
+### Who We Are
+
+The Forecast Evaluation Research Collaborative was founded by the [Reich Lab](https://reichlab.io/) at the University of Massachusetts Amherst and the Carnegie Mellon University [Delphi Group](https://delphi.cmu.edu). Both groups are funded by the CDC as Centers of Excellence for Influenza and COVID-19 Forecasting. We have partnered on this project to provide a robust set of tools and methods for evaluating the performance of epidemic forecasts.
+
+The collaborative’s mission is to help epidemiological researchers gain insights into the performance of their forecasts and, ultimately, to enable more accurate forecasting of epidemics.
+
+Both groups have led initiatives related to COVID-19 data and forecast curation. The Reich Lab has created the [COVID-19 Forecast Hub](https://covid19forecasthub.org/), a collaborative effort with over 80 groups submitting forecasts to be part of the official [CDC COVID-19 ensemble forecast](https://www.cdc.gov/coronavirus/2019-ncov/covid-data/mathematical-modeling.html). The Delphi Group has created COVIDcast, a platform for [epidemiological surveillance data](https://delphi.cmu.edu/covidcast/), and runs the [Delphi Pandemic Survey via Facebook](https://delphi.cmu.edu/covidcast/surveys/), which is a [valuable signal](https://delphi.cmu.edu/blog/2020/09/21/can-symptoms-surveys-improve-covid-19-forecasts/) for Delphi’s participation in the ensemble forecast.
+
+The Forecast Evaluation Dashboard is a collaborative project, made possible by the 13 pro bono Google.org Fellows who have spent 6 months working full-time with the Delphi Group. Google.org is [committed](https://www.google.org/covid-19/) to supporting the recovery of lives and communities impacted by COVID-19 and to investing in the science needed to mitigate the damage of future pandemics.
+
+#### **Collaborators**
+
+From the Forecast Hub: Estee Cramer, Nicholas Reich, [the COVID-19 Forecast Hub Team](https://covid19forecasthub.org/doc/team/)
+From the Delphi Research Group: Jed Grabman, Kate Harwood, Chris Scott, Jacob Bien, Daniel McDonald, Logan Brooks
+
+### About the Data
+
+#### **Sources**
+
+**Observed values** are from the [COVID-19 Data Repository](https://github.com/CSSEGISandData/COVID-19) by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.
+
+**Forecaster predictions** are drawn from the [COVID-19 Forecast Hub GitHub repository](https://github.com/reichlab/covid19-forecast-hub/).
+
+Data for the dashboard is pulled once a week from these sources, on Tuesdays.
+
+#### **Terms**
+
+* **Forecaster**
+
+ A model producing quantile predictions.
+
+* **Forecast**
+
+ A set of data that, for every location in a geo type, includes predictions of a target variable at a number of quantiles for each of a number of horizons.
+
+* **Target Variable**
+
+ What the forecast is predicting, e.g. “weekly incident cases”.
+
+* **Horizon**
+
+ The duration of time between when the prediction was made and the predicted event, typically in units of epidemiological weeks.
+
+* **Epidemiological Week (Epi-week)**
+
+ A week that starts on a Sunday, following [CDC convention](https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf). For dating forecasts: if the forecast date is a Sunday or Monday, the forecast is assigned to the epi-week that starts on that Sunday (going back a day if it is a Monday); if it is a Tuesday through Saturday, it is assigned to the epi-week that starts on the subsequent Sunday.
+
+* **Point Forecast**
+
+ The value that each forecaster picks as their “most likely” prediction. For many forecasters this is the median (0.5 quantile) of the predictive distribution; for others it might be the mean of the distribution.
+
+* **Geo Type**
+
+ U.S. states and territories, or the U.S. as a nation.
+
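The epi-week assignment described under **Epidemiological Week (Epi-week)** can be sketched as follows. This is a minimal illustration of the convention, not code from the dashboard; the function name is hypothetical.

```python
from datetime import date, timedelta

def epiweek_start(forecast_date: date) -> date:
    """Sunday starting the epi-week a forecast is assigned to:
    Sunday/Monday forecasts belong to the epi-week beginning that Sunday;
    Tuesday-Saturday forecasts roll forward to the next Sunday."""
    wd = forecast_date.weekday()  # Monday=0 ... Sunday=6
    if wd == 6:                   # Sunday: the epi-week starts today
        return forecast_date
    if wd == 0:                   # Monday: it started yesterday
        return forecast_date - timedelta(days=1)
    return forecast_date + timedelta(days=6 - wd)  # roll to next Sunday
```

For example, a forecast dated Monday 2021-03-08 is assigned to the epi-week starting Sunday 2021-03-07, while one dated Tuesday 2021-03-09 rolls forward to the epi-week starting 2021-03-14.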
+#### **Dashboard Inclusion Criteria**
+
+* Includes only the weekly incident deaths and weekly incident cases target variables
+* Includes only horizon < 5 weeks ahead
+* Includes only geo values that are 2 characters (states / territories / nation)
+* Includes only non-NA target dates (predictions whose target date is not in yyyy/mm/dd format are dropped)
+* Includes only predictions with at least 3 quantile values
+* Includes only one file per forecaster per week (according to forecast date). That file must be from a Sunday or Monday. If both are present, we keep the Monday data.
+* If a forecaster updates a file after that Monday, we do not include the new predictions
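
The per-row criteria above amount to a filter along these lines. This is a sketch only; the field names are hypothetical, not the dashboard's actual schema, and the file-level rules (one Sunday/Monday file per forecaster per week) are applied separately.

```python
import re

def meets_inclusion_criteria(row: dict) -> bool:
    """Row-level inclusion checks for a single prediction."""
    return (
        row["target_variable"] in {"weekly incident cases", "weekly incident deaths"}
        and row["horizon"] < 5                      # < 5 weeks ahead
        and len(row["geo_value"]) == 2              # states / territories / nation
        and row["target_date"] is not None          # non-NA target date
        and re.fullmatch(r"\d{4}/\d{2}/\d{2}", row["target_date"]) is not None
        and row["n_quantiles"] >= 3                 # at least 3 quantile values
    )
```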
+
+#### **Notes on the Data**
+
+* When totaling over all locations, we include states and territories but not nationwide forecasts. Only the states and territories common to the selected forecasters (over all time, among forecasters that have data for at least one location) are included.
+* We do include revisions of observed values, meaning the scores for forecasts made in the past can change. Scores change as our understanding of the truth changes.
\ No newline at end of file
diff --git a/dashboard/ae.md b/dashboard/ae.md
new file mode 100644
index 0000000..881e526
--- /dev/null
+++ b/dashboard/ae.md
@@ -0,0 +1 @@
+The **absolute error** of a forecast is calculated from the Point Forecast. Usually this is the 0.5 quantile (median) prediction, but forecasters can specify their own Point Forecast value. When none is provided explicitly, we use the 0.5 quantile prediction.
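
As a minimal sketch (not the dashboard's implementation; names and data structures are illustrative), the fallback works like this:

```python
def absolute_error(point, quantiles, observed):
    """Absolute error of a forecast. `point` is the forecaster-supplied
    point forecast, or None; `quantiles` maps quantile level -> predicted
    value. With no explicit point forecast, fall back to the 0.5 quantile."""
    if point is None:
        point = quantiles[0.5]
    return abs(observed - point)
```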
\ No newline at end of file
diff --git a/dashboard/app.R b/dashboard/app.R
index 6a46030..2cb76ad 100644
--- a/dashboard/app.R
+++ b/dashboard/app.R
@@ -28,17 +28,29 @@ s3bucket = tryCatch(
# Get and prepare data
getData <- function(filename){
if(!is.null(s3bucket)) {
- s3readRDS(object = filename, bucket = s3bucket)
- } else {
- path = ifelse(
- file.exists(filename),
- filename,
- file.path("../dist/",filename)
+ tryCatch(
+ {
+ s3readRDS(object = filename, bucket = s3bucket)
+ },
+      error = function(e) {
+        # Fall back to local files if the S3 read fails
+        getFallbackData(filename)
+ }
)
- readRDS(path)
+ } else {
+ getFallbackData(filename)
}
}
+getFallbackData = function(filename) {
+  # Read from the working directory if the file exists there, else from ../dist/
+  path = if (file.exists(filename)) filename else file.path("../dist/", filename)
+  readRDS(path)
+}
+
dfStateCases <- getData("score_cards_state_cases.rds")
dfStateDeaths <- getData("score_cards_state_deaths.rds")
dfNationCases = getData("score_cards_nation_cases.rds")
@@ -59,103 +71,15 @@ locationChoices = locationChoices[c(length(locationChoices), (1:length(locationC
coverageChoices = intersect(colnames(df), COVERAGE_INTERVALS)
# Score explanations
-wisExplanation = "
The
weighted interval score (WIS) is a proper score that combines a set of interval scores.
-See
this preprint about the WIS method for a more in depth explanation.
-TODO: How is it actually calculated from the intervals?
"
-aeExplanation = "
- The absolute error of a forecast is calculated from the Point Forecast.
- Usually this is the 50% quantile prediction, but forecasters can specify their own Point Forecast value.
- When none is provided explicity, we use the 50% quantile prediction.
-
"
-coverageExplanation = "
- The coverage plot shows how well a forecaster's confidence intervals performed on a given week, across all locations.
- The horizontal black line is the selected confidence interval, and the y-values are the percentage of time that the observed
- values of the target variable value fell into that confidence interval.
- A perfect forecaster on this measure would follow the black line.
-
- For example, a forecaster wants the observed values to be within the 50% confidence interval in 50% of locations for the given week.
- If the y-value is above the horizontal line, it means that the observed values fell within the forecaster's 50% CI more than 50% of
- the time, aka the forecaster's 50% CI was under-confident that week, or too wide. Conversely, if the y-value is below the line,
- it means that the forecaster's 50% CI was over-confident that week, or too narrow.
-
"
-
+wisExplanation = includeMarkdown("wis.md")
+aeExplanation = includeMarkdown("ae.md")
+coverageExplanation = includeMarkdown("coverageplot.md")
+# Truth data disclaimer
+observedValueDisclaimer =
+  "All forecasts are evaluated against the latest version of observed data. Scores of past forecasts may change as observed data are revised."
# About page content
-aboutPageText = HTML("
-
-
Who We Are
-This app was conceived and built in a collaboration between the Reich Lab's
Forecast Hub
-and Carnegie Mellon's
Delphi Research Group.
-
TODO: should there be more here about what each group is, and why we are collaborating (sharing resources and expertise).
-For instance, something
-about how the Forecast Hub gathers all the weekly forecasts, and Delphi's evalcast scores them?
-
-
Collaborators
-TODO: how should these be displayed?
-
-From the Forecast Hub: Nick Reich, Estee Cramer, Johannes Bracher, anyone else?
-From the Delphi Research Group: Jed Grabman, Kate Harwood, Chris Scott, Jacob Bien, Daniel McDonald, Logan Brooks, anyone else?
-
-
Our Mission
-
-The goal of the Forecast Evaluation Working Group is to provide a robust set of tools and methods for evaluating the
-performance of COVID-19 forecasting models to help epidemiological researchers gain insights into the models' performance,
-and ultimately lead to more accurate forecasting of COVID-19 and other diseases. TODO: obviously this needs work.
-
About the Data
-
Sources
-
Observed values are from the
-
COVID-19 Data Repository
-by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.
-
Forecaster predictions are drawn from the Forecast Hub. TODO is there a good link this should go to? Git repo?
-
Data for the dashboard is pulled once a week from these sources, on Tuesdays.
-
-
Terms
-
- Forecaster
-
A model producing quantile predictions
-- Forecast
-
A set of data that, for all locales in a geo type,
-includes predictions for a target variable for each of a certain number of quantiles
-for each of a certain number of horizons
-- Target Variable
-
What the forecast is predicting, ie: “weekly incident cases”
-- Horizon
-
1 epi-week, some number of epi-weeks ahead of the current week
-- Epi-week
-
Week that starts on a Sunday. If it is Sunday or Monday,
-the next epi-week is the week that starts on that Sunday (going back a day if it is Monday).
-If it is Tuesday-Saturday, it is the week that starts on the subsequent Sunday.
-- Point Forecast
-
The value that each forecaster picks as their “most important” prediction.
-For many forecasters this is the 50% quantile prediction.
-- Geo Type
-
States or U.S. as a nation
-
Dashboard Inclusion Criteria
-
-- Includes only weekly deaths incidence and weekly case incidence target variables
-- Includes only horizon < 5 weeks ahead
-- Includes only geo values that are 2 characters (states / territories / nation)
-- Includes only non-NA target dates (if the date is not in yyyy/mm/dd, the prediction will not be included)
-- Includes only predictions with at least 3 quantile values
-- Includes only one file per forecaster per week (according to forecast date). That file must be from a Sunday or Monday. If both are present, we keep the Monday data.
-- If a forecaster updates a file, we do not include the new predictions
-
-
Notes on the Data
-
-- When totaling over all locations, these locations include states and territories and do not include nationwide forecasts.
-- We do include revisions of observed values, meaning the scores for forecasts made in the past can change.
-Scores change as our understanding of the truth changes.
-- TODO: Is there anything else missing here?
-
-
-
Explanation of Scoring Methods
-
-
Weighted Interval Score", wisExplanation,
-"
-
Absolute Error", aeExplanation,
-"
-
Coverage", coverageExplanation,
-"
")
-
+aboutPageText = includeMarkdown("about.md")
ui <- fluidPage(
useShinyjs(),
@@ -214,11 +138,12 @@ ui <- fluidPage(
),
tags$hr(),
),
- tags$div(HTML("This app was conceived and built by the Forecast Evaluation Working Group, a collaboration between
- the Reich Lab's Forecast Hub and
- Carnegie Mellon's Delphi Research Group.
+ tags$div(HTML("This app was conceived and built by the Forecast Evaluation Research Collaborative,
+ a collaboration between the UMass-Amherst Reich Lab's
+ COVID-19 Forecast Hub
+ and Carnegie Mellon's Delphi Research Group.
- This data can also be viewed in a weekly report on the Forecast Hub site.")),
+ This data can also be viewed in a weekly report on the Forecast Hub site. TODO need link")),
a("View Weekly Report", href = "#"),
width=3,
),
@@ -228,28 +153,35 @@ ui <- fluidPage(
tabsetPanel(id = "tabset",
selected = "evaluations",
tabPanel("About",
- tags$div(HTML("
", aboutPageText))),
+ fluidRow(column(9,offset=1,
+ aboutPageText,
+ h3("Explanation of Scoring Methods"),
+ h4("Weighted Interval Score"),
+ wisExplanation,
+ h4("Absolute Error"),
+ aeExplanation,
+ h4("Coverage Plot"),
+ coverageExplanation
+ )),
+ ),
tabPanel("Evaluation Plots", value = "evaluations",
- textOutput('renderWarningText'),
- plotlyOutput(outputId = "summaryPlot"),
- dataTableOutput('renderTable'),
- tags$br(),tags$br(),tags$br(),tags$br(),tags$br(),
- HTML(''),
- textOutput('renderLocationText'),
- textOutput('renderAggregateText'),
- textOutput('renderLocations'),
- HTML('
'),
-
- actionLink("scoreExplanation",
- h4(tags$div(style = "color: black; padding-left:40px;", HTML("Explanation Of Score"),
- icon("arrow-circle-down")))),
- hidden(div(id='explainScore',
- tags$div(style = "width: 90%", HTML("")))),
- actionLink("truthValues",
- h4(tags$div(style = "color: black; padding-left:40px;", HTML("Observed Values"),
- icon("arrow-circle-down")))),
- hidden(div(id="truthSection", hidden(div(id='truthPlot', plotlyOutput(outputId = "truthPlot"))))),
- tags$br(),tags$br()
+ fluidRow(column(9, offset=1, textOutput('renderWarningText'))),
+ plotlyOutput(outputId = "summaryPlot", height="auto"),
+ fluidRow(
+ column(9, offset=1,
+ hidden(div(id = "wisExplanation", wisExplanation)),
+ hidden(div(id = "aeExplanation", aeExplanation)),
+ hidden(div(id = "coverageExplanation", coverageExplanation))
+ )
+ ),
+ plotlyOutput(outputId = "truthPlot", height="auto"),
+ fluidRow(
+ column(9,offset=1,
+ textOutput('renderLocationText'),
+ textOutput('renderAggregateText'),
+ textOutput('renderLocations')
+ )
+ )
)
),
),
@@ -389,6 +321,7 @@ server <- function(input, output, session) {
scoreDf <- scoreDf %>%
group_by(Date) %>% summarize(Incidence = actual)
+ output$renderObservedValueDisclaimer = renderText(observedValueDisclaimer)
return (ggplotly(ggplot(scoreDf, aes(x = Date, y = Incidence)) +
geom_line() +
geom_point() +
@@ -455,13 +388,19 @@ server <- function(input, output, session) {
updateForecasterChoices(session, df, input$forecasters, input$scoreType)
if (input$scoreType == "wis") {
- html("explainScore", paste0(wisExplanation))
+ show("wisExplanation")
+ hide("aeExplanation")
+ hide("coverageExplanation")
}
if (input$scoreType == "ae") {
- html("explainScore", paste0(aeExplanation))
+ hide("wisExplanation")
+ show("aeExplanation")
+ hide("coverageExplanation")
}
if (input$scoreType == "coverage") {
- html("explainScore", paste0(coverageExplanation))
+ hide("wisExplanation")
+ hide("aeExplanation")
+ show("coverageExplanation")
}
})
diff --git a/dashboard/coverageplot.md b/dashboard/coverageplot.md
new file mode 100644
index 0000000..863a1fa
--- /dev/null
+++ b/dashboard/coverageplot.md
@@ -0,0 +1,3 @@
+The **coverage plot** shows how well a forecaster's confidence intervals performed in a given week, across all locations. The horizontal black line marks the selected confidence level, and the y-values are the percentage of locations in which the observed value of the target variable fell within that confidence interval. A forecaster that is perfect on this measure would follow the black line.
+
+For example, a well-calibrated forecaster's 50% confidence interval should contain the observed value in 50% of locations for the given week. If the y-value is above the horizontal line, the observed values fell within the forecaster's 50% CI more than 50% of the time, i.e., the forecaster's 50% CI was under-confident (too wide) that week. Conversely, if the y-value is below the line, the forecaster's 50% CI was over-confident (too narrow) that week.
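
The y-value for one week can be computed as an empirical coverage rate. The sketch below is an illustration under assumed data structures, not the dashboard's code:

```python
def empirical_coverage(intervals, observed):
    """Fraction of locations whose observed value fell inside the
    forecaster's central interval for a given week. `intervals` maps
    location -> (lower, upper); `observed` maps location -> value."""
    hits = sum(
        1 for loc, (lo, hi) in intervals.items()
        if lo <= observed[loc] <= hi
    )
    return hits / len(intervals)
```

Plotted against the nominal level (e.g. 0.5), values above the line indicate under-confidence and values below it indicate over-confidence.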
\ No newline at end of file
diff --git a/dashboard/wis.md b/dashboard/wis.md
new file mode 100644
index 0000000..017e26f
--- /dev/null
+++ b/dashboard/wis.md
@@ -0,0 +1 @@
+The **weighted interval score** (WIS) is a proper score that combines a set of interval scores. See [this article](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008618) about the WIS method for a more in-depth explanation. The WIS accounts for both the sharpness of the prediction intervals and their calibration (their coverage of the actual observations).
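
Concretely, per the cited definition, the interval score of a central (1 − α) interval [l, u] is its width (u − l) plus a penalty of (2/α) times the distance by which the observation falls outside the interval; the WIS averages these with weights α/2, together with weight 1/2 on the absolute error of the median. The sketch below illustrates that formula and is not the dashboard's scoring code:

```python
def interval_score(alpha, lower, upper, y):
    """Interval score for a central (1 - alpha) prediction interval:
    width plus penalties proportional to how far y falls outside."""
    return (
        (upper - lower)
        + (2 / alpha) * max(lower - y, 0)  # y below the interval
        + (2 / alpha) * max(y - upper, 0)  # y above the interval
    )

def weighted_interval_score(median, intervals, y):
    """WIS: `intervals` maps alpha -> (lower, upper) for each central
    (1 - alpha) interval; weights are alpha / 2, with weight 1/2 on
    the absolute error of the median."""
    total = 0.5 * abs(y - median)
    for alpha, (lo, hi) in intervals.items():
        total += (alpha / 2) * interval_score(alpha, lo, hi, y)
    return total / (len(intervals) + 0.5)
```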
\ No newline at end of file