Description
Closes #5165
User story
- As an: Iris user loading NetCDF files written according to the CF Conventions.
- I want: Iris to capture malformed CF information during loading, instead of crashing or discarding it.
- So that: I can fix CF problems within my Iris script - avoiding the complexity of multiple scripts/tools.
Why this is hard
There are many ways Iris crashes when it encounters bad CF. It is tempting to think of these crashes as deliberate - with easy-to-modify code blocks for each rule - and there are a few of these, but Iris is not a CF-checker. Instead we have used CF to make assumptions so that the code can be simpler/smaller; barely any of the crashes are raised from dedicated lines, and they are often hard to predict.
Any 'fix' must therefore be a form of generic error handling, which cannot have knowledge of what precisely might go wrong.
The architecture
- `iris.LOAD_PROBLEMS` - a global object where Iris can capture objects that could not be loaded, and the stack trace error that was raised.
- `build_raw_cube()` - a routine that will represent any `CFVariable` as a very basic `Cube`, with as little interpretation as possible.
- Separate building objects versus adding them to the `Cube` being loaded.
- Ensure all objects - including names, units, etcetera - are contained within their own building routine.
- `_add_or_capture()`:
  - Can't build the object? Use `build_raw_cube()` and store in `iris.LOAD_PROBLEMS`.
  - Object built but can't add to the `Cube`? Store in `iris.LOAD_PROBLEMS`.
- Issue a warning at the end of loading if anything is found within `iris.LOAD_PROBLEMS`.
- Make it easier to convert `Cube`s - output by `raw_cube_from_cf_var()` - into other objects e.g. `DimCoord`s? Documentation at the very least.
Note that I have checked `cf.py` and believe it can remain unchanged. It already has a defensive philosophy: it checks whether each variable can be interpreted as one of the more specific types, and represents everything that remains as a `CFDataVariable`, so a fallback is already in place. Anything here that is not formatted correctly just shows up as extra `Cube`(s) in the loaded `CubeList`.
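The capture flow described in the architecture list can be sketched as below. This is only an illustration under stated assumptions, not the Iris implementation: `LOAD_PROBLEMS` here is a plain module-level dictionary standing in for `iris.LOAD_PROBLEMS`, `build_raw_cube()` returns a dictionary rather than a real `Cube`, and the signatures of `add_or_capture()` and `warn_if_load_problems()` are hypothetical.

```python
import traceback
import warnings
from collections import defaultdict

# Hypothetical stand-in for iris.LOAD_PROBLEMS:
# file path -> list of (problem_object, stack_trace) pairs.
LOAD_PROBLEMS = defaultdict(list)


def build_raw_cube(cf_var):
    """Stand-in fallback: keep the variable with as little interpretation as possible."""
    return {"raw_variable": cf_var}


def add_or_capture(build_func, add_func, cf_var, filename):
    """Build an object and add it to the cube, capturing any failure generically."""
    try:
        built = build_func(cf_var)
    except Exception:
        # Could not build: store a minimally interpreted fallback plus the stack trace.
        LOAD_PROBLEMS[filename].append((build_raw_cube(cf_var), traceback.format_exc()))
        return None
    try:
        add_func(built)
    except Exception:
        # Built, but could not be added: store the built object itself.
        LOAD_PROBLEMS[filename].append((built, traceback.format_exc()))
        return None
    return built


def warn_if_load_problems():
    """Issue a single warning at the end of loading if anything was captured."""
    count = sum(len(problems) for problems in LOAD_PROBLEMS.values())
    if count:
        warnings.warn(f"{count} object(s) could not be loaded; inspect LOAD_PROBLEMS.")
```

Catching the broad `Exception` is deliberate in this sketch: as noted under "Why this is hard", the handler cannot know in advance which error a given piece of bad CF will trigger.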
More specifics on implementation
For reference when writing #6318 and #6319
- ✔ Iris: `__init__.py`
  - Create the `LOAD_PROBLEMS` object
  - Example structure: `{"file/path/1": [(problem_object_1, error_or_stacktrace), (problem_object_1, error_or_stacktrace)]}`
- `helpers.py`:
  - ✔ Introduce a new function - `_add_or_capture()` - that:
    - Attempts to call a `build_` routine (passed as an argument) in a `try-except`
      - On failure: falls back on `build_raw_cube()`.
    - Attempts to add successfully built objects (e.g. `DimCoord`) to a `Cube` (passed as an argument) in a `try-except`
      - On failure: adds the built objects to `iris.LOAD_PROBLEMS` instead.
  - ✔ Create the `build_raw_cube` function
  - Separate as much as possible into `build_` routines. We already have many, but even getting hold of standard names etcetera should be separated in this way.
  - Refactor `build_` routines to only perform the building - returning the built object rather than adding it to the `Cube`.
  - Make the build routines private - `_build...` - and call them from `build_and_add...` routines, which prepare the necessary arguments for `_add_or_capture()`. Here are examples that have already been completed:
    - `build_and_add_dimension_coordinate`
    - `build_and_add_names`
- ✔ `actions.py`:
  - Three `action_` routines have success criteria and failure information. These should be refactored so that a failure falls back to `build_raw_cube()`. The failure reason (already being recorded) should be captured in `iris.LOAD_PROBLEMS`. See "Tolerant handling of `standard_name` and dimension coordinate loading" (#6338) for how this was done with dimension coordinates.
  - ALL `action_` routines should be refactored to call the new `build_and_add` routines.
- Tests
- Confirm this can fix known cases (Common agreement on loading CF non-compliant NetCDF files #5165)
- Docstrings
- What's New
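The split between private `_build...` routines and public `build_and_add...` wrappers in the checklist above might look roughly like this. It is a minimal sketch, not the code in `helpers.py`: `SketchCube`, the argument lists, and the use of the raw input in place of `build_raw_cube()` are all assumptions made for illustration, and only the routine names come from the list above.

```python
from functools import partial


class SketchCube:
    """Hypothetical stand-in for a Cube; holds one coordinate per dimension."""

    def __init__(self):
        self.dim_coords = {}

    def add_dim_coord(self, coord, dim):
        if dim in self.dim_coords:
            raise ValueError(f"dimension {dim} already has a coordinate")
        self.dim_coords[dim] = coord


def _build_dimension_coordinate(points):
    """Private build routine: only builds and returns, never touches the cube."""
    if any(b <= a for a, b in zip(points, points[1:])):
        raise ValueError("points are not strictly increasing")
    return tuple(points)


def _add_or_capture(build_func, add_method, raw, problems, filename):
    """Try to build, then try to add; record (object, error) on either failure."""
    try:
        built = build_func(raw)
    except Exception as error:
        # Assumption: the raw input stands in for the build_raw_cube() fallback.
        problems.setdefault(filename, []).append((raw, error))
        return None
    try:
        add_method(built)
    except Exception as error:
        problems.setdefault(filename, []).append((built, error))
        return None
    return built


def build_and_add_dimension_coordinate(cube, points, dim, problems, filename):
    """Public wrapper: prepares the arguments and delegates to _add_or_capture()."""
    return _add_or_capture(
        build_func=_build_dimension_coordinate,
        add_method=partial(cube.add_dim_coord, dim=dim),
        raw=points,
        problems=problems,
        filename=filename,
    )
```

Here `problems` follows the example structure above (file path mapped to a list of pairs), so a script could loop over `problems[filename]` after loading and repair each captured object in place.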