Description
What happened?
When I write my dataset to netCDF with ds.to_netcdf() and read it back with xr.open_dataset(...), xarray produces nan values where there were previously numeric (float32) values.
The issue seems related to the encoding inherited from the original dataset, which causes the data to be stored as short. During loading, some of the stored values then collide with _FillValue, so those numbers are interpreted as nan.
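To make the suspected mechanism concrete, here is a minimal pure-Python sketch of CF-style packing arithmetic. It is not xarray's actual implementation, and the scale_factor, add_offset and _FillValue numbers are hypothetical values chosen for illustration. It shows how a perfectly valid float can be packed to the same integer as the fill value and then come back as nan.

```python
import math

# Hypothetical encoding values, chosen only for illustration.
SCALE_FACTOR = 0.5
ADD_OFFSET = 10.0
FILL_VALUE = -32767  # integer sentinel that marks missing data


def pack(value):
    """Pack a float into an integer: round((value - add_offset) / scale_factor)."""
    if math.isnan(value):
        return FILL_VALUE
    return round((value - ADD_OFFSET) / SCALE_FACTOR)


def unpack(packed):
    """Unpack an integer; anything equal to the fill value is treated as missing."""
    if packed == FILL_VALUE:
        return math.nan
    return packed * SCALE_FACTOR + ADD_OFFSET


# The last value is a real number that happens to pack to exactly FILL_VALUE.
colliding = ADD_OFFSET + FILL_VALUE * SCALE_FACTOR
values = [10.0, 12.5, colliding]
roundtrip = [unpack(pack(v)) for v in values]

for before, after in zip(values, roundtrip):
    print(before, "->", after)
```

The first two values survive the round trip unchanged, while the third is indistinguishable from a missing value once packed, which matches the behaviour described above.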
What did you expect to happen?
Values after saving & loading should be the same as before saving.
Minimal Complete Verifiable Example
We had a back-and-forth on SO about this, I hope it's fine to just refer to it here:
https://stackoverflow.com/a/75806771/11318472
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
See the SO link above.
Anything else we need to know?
I'm not sure whether this should be considered a bug or just a combination of conflicting features. My current workaround is resetting the encoding and letting xarray decide to store as float instead of short (cf. #7686).
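For reference, the workaround looks roughly like the sketch below. The dataset, the variable name "t" and the encoding values are hypothetical stand-ins for a dataset opened from a packed file; the only step that matters is clearing the .encoding dict on each variable before writing.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset standing in for one opened from a packed netCDF file.
ds = xr.Dataset({"t": ("x", np.array([1.5, 2.5, 3.5], dtype="float32"))})
ds["t"].encoding = {
    "dtype": "int16",
    "scale_factor": 0.5,
    "add_offset": 10.0,
    "_FillValue": -32767,
}

# Drop the inherited encoding on every variable so that no packing
# parameters are carried over to the next write.
for var in ds.variables.values():
    var.encoding = {}

# Writing would now use the in-memory dtype (float32 here):
# ds.to_netcdf("out.nc")  # hypothetical output path
```

This trades file size for fidelity: the values are written as floats instead of scaled shorts, so the collision with _FillValue cannot occur.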
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.90.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.11.0
pandas: 1.5.2
numpy: 1.23.5
scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.13.6
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.3
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: 3.6.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.1
pip: 22.3.1
conda: None
pytest: 7.2.0
IPython: 8.11.0
sphinx: None