Releases: NVIDIA/physicsnemo

v1.1.1

16 Jun 19:59
ccb2b89

PhysicsNeMo (Core) General Release v1.1.1 (patch to v1.1.0)

Fixed

  • Fixed an inadvertent change to the deterministic sampler 2nd order correction

v1.1.0

10 Jun 20:36
7a798a3

PhysicsNeMo (Core) General Release v1.1.0

Added

  • Added ReGen score-based data assimilation example
  • General purpose patching API for patch-based diffusion
  • New positional embedding selection strategy for CorrDiff SongUNet models
  • Added Multi-Storage Client to allow checkpointing to/from Object Storage

Changed

  • Simplified CorrDiff config files, updated default values
  • Refactored CorrDiff losses and samplers to use the patching API
  • Support for non-square images and patches in patch-based diffusion
  • ERA5 download example updated to use the current file format convention and
    to restrict global statistics computation to the training set
  • Support for training custom StormCast models and various other improvements for StormCast
  • Updated CorrDiff training code to support multiple patch iterations to amortize
    regression cost and usage of torch.compile
  • Refactored physicsnemo/models/diffusion/layers.py to optimize data type
    casting workflow, avoiding unnecessary casting under autocast mode
  • Refactored Conv2d to enable fusion of conv2d with bias addition
  • Refactored GroupNorm, UNetBlock, SongUNet, SongUNetPosEmbd to support usage of
    Apex GroupNorm, fusion of activation with GroupNorm, and AMP workflow.
  • Updated SongUNetPosEmbd to avoid unnecessary HtoD Memcpy of pos_embd
  • Updated from_checkpoint to accommodate conversion between Apex-optimized and
    non-optimized checkpoints
  • Refactored CorrDiff NVTX annotation workflow to be configurable
  • Refactored ResidualLoss to support patch-accumulating training for
    amortizing regression costs
  • Explicit handling of Warp device for ball query and sdf
  • Merged SongUNetPosLtEmb with SongUNetPosEmb, add support for batch>1
  • Add lead time embedding support for positional_embedding_selector. Enable
    arbitrary positioning of probabilistic variables
  • Enable lead time aware regression without CE loss
  • Bumped minimum PyTorch version from 2.0.0 to 2.4.0, to minimize
    support surface for physicsnemo.distributed functionality.

Dependencies

  • Made nvidia.dali an optional dependency
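
The PyTorch minimum bump and the optional DALI dependency listed above can be checked up front before importing the affected modules. The following is a minimal sketch and not part of PhysicsNeMo: `meets_minimum` and `has_optional` are hypothetical helper names, and the version parsing assumes a dotted-integer prefix (pre-release and local suffixes such as `+cu121` are ignored).

```python
import importlib.util
import re


def meets_minimum(installed: str, minimum: str = "2.4.0") -> bool:
    """Return True if `installed` is at least `minimum` (numeric prefix only)."""

    def parse(version: str) -> tuple:
        # Keep only the leading dotted-integer part: "2.4.0+cu121" -> (2, 4, 0).
        match = re.match(r"\d+(?:\.\d+)*", version)
        if match is None:
            raise ValueError(f"unrecognized version string: {version!r}")
        parts = tuple(int(part) for part in match.group(0).split("."))
        # Pad short versions so that "2.4" compares equal to "2.4.0".
        return parts + (0,) * max(0, 3 - len(parts))

    return parse(installed) >= parse(minimum)


def has_optional(module_name: str) -> bool:
    """Return True if an optional module (e.g. "nvidia.dali") can be located."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises instead of returning None.
        return False
```

In practice the first helper would be called with `torch.__version__`, and the second with the name of whichever optional package a given code path needs.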

v1.0.1

25 Mar 23:57
51c931f

PhysicsNeMo (Core) General Release v1.0.1

Added

  • Added version checks to ensure compatibility with older PyTorch for distributed utilities and ShardTensor

Fixed

  • EntryPoint error that occurred during physicsnemo checkpoint loading

v1.0.0

18 Mar 20:24

PhysicsNeMo (Core) General Release v1.0.0

Added

  • DoMINO model architecture, datapipe and training recipe
  • Added matrix decomposition scheme to improve graph partitioning
  • DrivAerML dataset support in FIGConvNet example.
  • Retraining recipe for DoMINO from a pretrained model checkpoint
  • Prototype support for domain parallelism using ShardTensor (new).
  • Enable DeviceMesh initialization via DistributedManager.
  • Added Datacenter CFD use case.
  • Add leave-in profiling utilities to physicsnemo, to easily enable torch/python/nsight
    profiling in all aspects of the codebase.

Changed

  • Refactored StormCast training example
  • Enhancements and bug fixes to DoMINO model and training example
  • Enhancement to parameterize DoMINO model with inlet velocity
  • Moved non-dimensionalization out of domino datapipe to datapipe in domino example
  • Updated utils in physicsnemo.launch.logging to avoid unnecessary wandb and mlflow
    imports
  • Moved to experiment-based Hydra config in Lagrangian-MGN example
  • Make data caching optional in MeshDatapipe
  • The use of older importlib_metadata library is removed

Deprecated

  • ProcessGroupConfig is tagged for future deprecation in favor of DeviceMesh.

Fixed

  • Update pytests to skip when the required dependencies are not present
  • Bug in data processing script in domino training example
  • Fixed NCCL_ASYNC_ERROR_HANDLING deprecation warning

Dependencies

  • Remove the numpy dependency upper bound
  • Moved pytz and nvtx to optional
  • Update the base image for the Dockerfile
  • Introduce Multi-Storage Client (MSC) as an optional dependency.
  • Introduce wrapt as an optional dependency, needed when using
    ShardTensor's automatic domain parallelism

v0.9.0

27 Nov 19:04
5bc7702

Modulus (core) general release v0.9.0

Added

  • FIGConvUNet model and example.
  • The Transolver model.
  • The XAeroNet model.
  • Incorporated CorrDiff-GEFS-HRRR model into CorrDiff, with lead-time aware SongUNet and
    cross entropy loss.

Changed

  • Refactored EDMPrecondSRV2 preconditioner and fixed the bug related to the metadata
  • Extended the checkpointing utility to store metadata.
  • Corrected missing export of logging function used by transolver model

v0.8.0

24 Sep 17:10
e5844cc

Modulus (core) general release v0.8.0

Added

  • Graph Transformer processor for GraphCast/GenCast.
  • Utility to generate STL from Signed Distance Field.
  • Metrics for CAE and CFD domain such as integrals, drag, and turbulence invariances and
    spectrum.
  • Added gradient clipping to StaticCapture utilities.
  • Bistride Multiscale MeshGraphNet example.

Changed

  • Refactored CorrDiff training recipe for improved usability
  • Fixed timezone calculation in datapipe cosine zenith utility.

v0.7.0

23 Jul 23:25
336cc94

Modulus (core) general release v0.7.0

Added

  • Code logging for CorrDiff via Wandb.
  • Augmentation pipeline for CorrDiff.
  • Regression output as additional conditioning for CorrDiff.
  • Learnable positional embedding for CorrDiff.
  • Support for patch-based CorrDiff training and generation (stochastic sampling only)
  • Enable CorrDiff multi-gpu generation
  • Diffusion model for fluid data super-resolution (CMU contribution).
  • The Virtual Foundry GraphNet.
  • A synthetic dataloader for global weather prediction models, demonstrated on GraphCast.
  • Sorted Empirical CDF CRPS algorithm
  • Support for history, cos zenith, and downscaling/upscaling in the ERA5 HDF5 dataloader.
  • An example showing how to train a "tensor-parallel" version of GraphCast on a
    Shallow-Water-Equation example.
  • 3D UNet
  • AeroGraphNet example of training of MeshGraphNet on Ahmed body and DrivAerNet datasets.
  • Warp SDF routine
  • DLWP HEALPix model
  • Pangu Weather model
  • Fengwu model
  • SwinRNN model
  • Modulated AFNO model
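
To illustrate the "Sorted Empirical CDF CRPS algorithm" entry in the list above, here is a minimal pure-Python sketch of the standard sort-based CRPS estimator for a finite ensemble. It is an assumption that this matches the variant shipped in Modulus; the function name `crps_sorted` is hypothetical. Sorting the members lets the pairwise spread term be computed in O(m log m) instead of O(m^2).

```python
def crps_sorted(ensemble, observation):
    """CRPS of an ensemble's empirical CDF against one scalar observation.

    Uses CRPS = mean|x_i - y| - (1 / (2 m^2)) * sum_ij |x_i - x_j|, where the
    double sum equals 2 * sum_i (2i - m - 1) * x_(i) over the sorted members
    (1-based i), so no explicit pairwise loop is needed.
    """
    m = len(ensemble)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    xs = sorted(ensemble)
    # Mean absolute error between each member and the observation.
    abs_err = sum(abs(x - observation) for x in xs) / m
    # With a 0-based index the weight (2i - m - 1) becomes (2i - m + 1).
    spread = sum((2 * i - m + 1) * x for i, x in enumerate(xs)) / (m * m)
    return abs_err - spread
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error.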

Changed

  • Raise ModulusUndefinedGroupError when querying undefined process groups
  • Corrected indexing error in examples/cfd/swe_nonlinear_pino for the modulus loss function
  • Safeguarding against uninitialized usage of DistributedManager

Removed

  • Remove mlflow from deployment image

Fixed

  • Fixed bug in the partitioning logic for distributing graph structures
    intended for distributed message-passing.
  • Fixed bugs for corrdiff diffusion training of EDMv1 and EDMv2

Dependencies

  • Update DALI to CUDA 12 compatible version.
  • Update minimum python version to 3.10

v0.6.0

17 Apr 22:45
eff54e6

Modulus (core) general release v0.6.0

Added

  • Added citation file
  • Link to the CWA dataset
  • ClimateDatapipe: an improved datapipe for HDF5/NetCDF4 formatted climate data
  • Performance optimizations to CorrDiff
  • Physics-Informed Nonlinear Shallow Water Equations example
  • Warp neighbor search routine with a minimal example
  • Strict option for loading Modulus checkpoints
  • Regression only or diffusion only inference for CorrDiff
  • Support for organization level model files on NGC file system
  • Physics-Informed Magnetohydrodynamics example

Changed

  • Updated Ahmed Body and Vortex Shedding examples to use Hydra config
  • Added more config options to FCN AFNO example
  • Moved positional embedding in CorrDiff from the dataloader to network architecture

Deprecated

  • modulus.models.diffusion.preconditioning.EDMPrecondSR. Use EDMPrecondSRV2 instead

Removed

  • Pickle dependency for CorrDiff

Fixed

  • Consistent handling of single GPU runs in DistributedManager
  • Output location of objects downloaded with NGC file system
  • Bug in scaling the conditional input in CorrDiff deterministic sampler

Dependencies

  • Updated DGL build in Dockerfile
  • Updated default base image
  • Moved Onnx from optional to required dependencies
  • Optional Makani dependency required for SFNO model

v0.5.0

26 Jan 01:13
24bee5c

Modulus (core) general release v0.5.0

Added

  • Distributed process group configuration mechanism.
  • DistributedManager utility to instantiate process groups based on a process group config.
  • Helper functions to facilitate distributed training with shared parameters.
  • Brain anomaly detection example.
  • Updated Frechet Inception Distance to use Wasserstein 2-norm with improved stability.
  • Molecular Dynamics example.
  • Improved usage of GraphPartition, added more flexible ways of defining a partitioned graph.
  • Physics-Informed Stokes Flow example.

Changed

  • MLFlow logging so that only process 0 logs to MLFlow.
  • FNO given separate methods for constructing lift and spectral encoder layers.

Removed

  • The experimental SFNO

Dependencies

  • Removed experimental SFNO dependencies
  • Added CorrDiff dependencies (cftime, einops, pyspng)
  • Made tqdm a required dependency

v0.4.0

20 Nov 18:48
b9608e4

Modulus (core) general release v0.4.0

Added

  • Added Stokes flow dataset
  • An experimental version of SFNO to be used in unified training recipe for weather models.
  • Added distributed FFT utility.
  • Added ruff as a linting tool.
  • Ported utilities from Modulus Launch to main package.
  • EDM diffusion models and recipes for training and sampling.
  • NGC model registry download integration into package/filesystem.
  • Added distributed process group configuration mechanism.
  • Added DistributedManager utility to instantiate process groups based on their process group config.

Changed

  • Renamed the AFNO input argument img_size to inp_shape.
  • Integrated the network architecture layers from Modulus-Sym.
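
Callers that still pass the old AFNO keyword can bridge the `img_size` to `inp_shape` change listed above with a small shim. This is an illustrative sketch only: `rename_afno_kwargs` is a hypothetical helper, not a Modulus function, and it assumes the model is constructed from a keyword dictionary.

```python
import warnings


def rename_afno_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with the old `img_size` key renamed to `inp_shape`."""
    updated = dict(kwargs)  # copy so the caller's dictionary is not mutated
    if "img_size" in updated:
        if "inp_shape" in updated:
            raise TypeError("pass either img_size or inp_shape, not both")
        warnings.warn(
            "img_size was renamed to inp_shape",
            DeprecationWarning,
            stacklevel=2,
        )
        updated["inp_shape"] = updated.pop("img_size")
    return updated
```

The returned dictionary can then be unpacked into the model constructor; all other keys pass through unchanged.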

Fixed

  • Fixed modulus.Module from_checkpoint to work from custom model classes.

Security

  • Updated the base container to PyTorch 23.10.