Skip to content

Gentle_walkthrough

Ben Tupper edited this page Jul 9, 2025 · 1 revision

Local Archive Walk-Through

Local copernicus data

We have built local databases of copernicus datasets for two regions: Northwest Atlantic (nwa) and the world (world). databse contains one or more product suites. These are rasters stored one-variable-one-period-per-file (geotiff format).

world

Contains 0.25 degree biogeochemical datasets for hindcasts (stable), interim hindcasts (meta-stable) and forecasts (unstable). Here stability refers to our own perspective of reliability and usability.

nwa

Contains 0.083 degree physical datasets following the same, stable, meta-stable and unstable paradigm.

Access

First we establish a product path for a given region. [N.B. be sure you have set the root data path to the copernicus data directory - see the copernicus package]

suppressPackageStartupMessages({
  library(copernicus)
  library(andreas)
  library(dplyr)
  library(stars)
  library(sf)
  library(ggplot2)
  library(ggokabeito) # for colorblind qualitative palette
})

nwapath = copernicus_path("nwa", "GLOBAL_MULTIYEAR_PHY_001_030")

Retrieve the tabulation of local data

Next we read a table that catalogs the records in the database.

DB = read_database(nwapath) |>
  glimpse()
## Rows: 81,040
## Columns: 7
## $ id        <chr> "cmems_mod_glo_phy_my_0.083deg_P1D-m", "cmems_mod_glo_phy_my…
## $ date      <date> 1993-01-01, 1993-01-01, 1993-01-01, 1993-01-01, 1993-01-01,…
## $ time      <chr> "000000", "000000", "000000", "000000", "000000", "000000", …
## $ depth     <chr> "bot", "mld", "sur", "sur", "sur", "sur", "zos", "bot", "mld…
## $ period    <chr> "day", "day", "day", "day", "day", "day", "day", "day", "day…
## $ variable  <chr> "bottomT", "mlotst", "so", "thetao", "uo", "vo", "zos", "bot…
## $ treatment <chr> "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw…

Here id is the official dataset_id reference into the Copernicus universe. Currently we only have 4 depths: bot, mld, sur, zos. treatment is a reserved for future local manipulations (such as rolling means, etc), but for now is simply “raw” everywhere.

count(DB, id, variable)
## # A tibble: 13 × 3
##    id                                     variable     n
##    <chr>                                  <chr>    <int>
##  1 cmems_mod_glo_phy_my_0.083deg_P1D-m    bottomT  10408
##  2 cmems_mod_glo_phy_my_0.083deg_P1D-m    mlotst   10408
##  3 cmems_mod_glo_phy_my_0.083deg_P1D-m    so       10408
##  4 cmems_mod_glo_phy_my_0.083deg_P1D-m    thetao   10408
##  5 cmems_mod_glo_phy_my_0.083deg_P1D-m    uo       10408
##  6 cmems_mod_glo_phy_my_0.083deg_P1D-m    vo       10408
##  7 cmems_mod_glo_phy_my_0.083deg_P1D-m    zos      10408
##  8 cmems_mod_glo_phy_myint_0.083deg_P1D-m bottomT   1364
##  9 cmems_mod_glo_phy_myint_0.083deg_P1D-m mlotst    1364
## 10 cmems_mod_glo_phy_myint_0.083deg_P1D-m thetao    1364
## 11 cmems_mod_glo_phy_myint_0.083deg_P1D-m uo        1364
## 12 cmems_mod_glo_phy_myint_0.083deg_P1D-m vo        1364
## 13 cmems_mod_glo_phy_myint_0.083deg_P1D-m zos       1364

Filter the database

We might have an interest in just a few variables over a short time period. For that we simply filter to a small database.

db = filter(DB,
            between(.data$date, as.Date("2015-06-01"), as.Date("2015-06-05")),
            variable %in% c("thetao", "so", "mlotst")) |>
  glimpse()
## Rows: 15
## Columns: 7
## $ id        <chr> "cmems_mod_glo_phy_my_0.083deg_P1D-m", "cmems_mod_glo_phy_my…
## $ date      <date> 2015-06-01, 2015-06-01, 2015-06-01, 2015-06-02, 2015-06-02,…
## $ time      <chr> "000000", "000000", "000000", "000000", "000000", "000000", …
## $ depth     <chr> "mld", "sur", "sur", "mld", "sur", "sur", "mld", "sur", "sur…
## $ period    <chr> "day", "day", "day", "day", "day", "day", "day", "day", "day…
## $ variable  <chr> "mlotst", "so", "thetao", "mlotst", "so", "thetao", "mlotst"…
## $ treatment <chr> "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw…

Read that subset of rasters

x = read_andreas(db, nwapath)
x
## stars object with 3 dimensions and 3 attributes
## attribute(s):
##              Min.   1st Qu.    Median     Mean  3rd Qu.      Max.   NA's
## mlotst   0.000000 10.528886 10.528886 13.41818 12.97037 757.92725 171240
## so       4.657125 32.311470 34.342480 33.82587 36.17054  37.06168 171240
## thetao  -2.014863  4.758782  9.193731 12.11081 21.12818  28.12375 171240
## dimension(s):
##      from  to         offset    delta  refsys point x/y
## x       1 415         -77.04  0.08333  WGS 84 FALSE [x]
## y       1 243          56.71 -0.08333  WGS 84 FALSE [y]
## time    1   5 2015-06-01 UTC   1 days POSIXct    NA

Extracting point data

We provide some example buoy locations for the Gulf of Maine; we’ll use these as an example.

buoys = read_buoys() |>
  glimpse()
## Rows: 8
## Columns: 4
## $ name     <chr> "wms", "cms", "pb", "ems", "jb", "nec", "bgr", "mht"
## $ longname <chr> "Western Maine Shelf", "Central Maine Shelf", "Penobscot Bay"…
## $ id       <chr> "B01", "E01", "F01", "I01", "M01", "N01", "BGR", "MHT"
## $ geometry <POINT [°]> POINT (-70.4277 43.18065), POINT (-69.3578 43.7148), POINT (-…

Extracting point data is easy.

v = extract_points(x, buoys) |>
  glimpse()
## Rows: 120
## Columns: 4
## $ point <chr> "p1", "p1", "p1", "p1", "p1", "p2", "p2", "p2", "p2", "p2", "p3"…
## $ name  <chr> "mlotst", "mlotst", "mlotst", "mlotst", "mlotst", "mlotst", "mlo…
## $ time  <dttm> 2015-06-01, 2015-06-02, 2015-06-03, 2015-06-04, 2015-06-05, 201…
## $ value <dbl> 10.52889, 10.52889, 10.52889, 10.52889, 10.52889, 10.52889, 10.5…

You can also request wide-format if that is more suitable to your needs.

v = extract_points(x, buoys, form = "wide") |>
  glimpse()
## Rows: 40
## Columns: 5
## $ point  <chr> "p1", "p1", "p1", "p1", "p1", "p2", "p2", "p2", "p2", "p2", "p3…
## $ time   <dttm> 2015-06-01, 2015-06-02, 2015-06-03, 2015-06-04, 2015-06-05, 20…
## $ mlotst <dbl> 10.52889, 10.52889, 10.52889, 10.52889, 10.52889, 10.52889, 10.…
## $ so     <dbl> 31.96814, 31.87353, 31.85980, 31.87963, 31.85217, 31.89337, 31.…
## $ thetao <dbl> 8.202002, 8.757195, 9.005493, 9.387829, 9.807520, 8.393903, 9.8…
Clone this wiki locally