
Releases: Lightning-AI/pytorch-lightning

2.0.1 patch release

30 Mar 14:45


App

No changes


Fabric

Changed

  • Generalized Optimizer validation to accommodate both FSDP 1.x and 2.x (#16733)

PyTorch

Changed

  • Pickling the LightningModule no longer pickles the Trainer (#17133)
  • Generalized Optimizer validation to accommodate both FSDP 1.x and 2.x (#16733)
  • Disable torch.inference_mode with torch.compile in PyTorch 2.0 (#17215)

Fixed

  • Fixed issue where pickling the module instance would fail with a DataLoader error (#17130)
  • Fixed WandbLogger not showing "best" aliases for model checkpoints when ModelCheckpoint(save_top_k>0) is used (#17121)
  • Fixed the availability check for rich that prevented Lightning from being imported in Google Colab (#17156)
  • Fixed parsing the precision config for inference in DeepSpeedStrategy (#16973)
  • Fixed issue where torch.compile would fail when logging to WandB (#17216)

Contributors

@Borda @williamFalcon @lightningforever @adamjstewart @carmocca @tshu-w @saryazdi @parambharat @awaelchli @colehawkins @woqidaideshi @md-121 @yhl48 @gkroiz @idc9 @speediedan

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Lightning 2.0: Fast, Flexible, Stable

15 Mar 12:58
01834c8


Lightning AI is excited to announce the release of Lightning 2.0 ⚡

Over the last couple of years, PyTorch Lightning has become the preferred deep learning framework for researchers and ML developers around the world, with close to 50 million downloads and 18k OSS projects, from top universities to leading labs.

With the help of over 800 contributors, we have added many features and functionalities to make it the most complete research toolkit possible, but some of these changes also introduced issues:

  • API changes to the trainer
  • Trainer code became harder to follow
  • Many integrations made Lightning appear bloated
  • The Trainer became harder to customize and took away control over details users needed to tweak

To make the research experience better, we are introducing 2.0:

  • No API changes - We commit to backward compatibility in the 2.0 series
  • Simplified abstraction layers, removed legacy functionality, and moved integrations out of the main repo. This improves the project's readability and debugging experience.
  • Introducing Fabric. Scale any PyTorch model with just a few lines of code. Read on!

Highlights

PyTorch 2.0 and torch.compile

Lightning 2.0 is best friends with PyTorch 2.0. You can torch.compile your LightningModules now!

import torch
import lightning as L

# LitModel is your LightningModule subclass
model = LitModel()

# This will compile `forward` and the {training,validation,test,predict}_step hooks
compiled_model = torch.compile(model)

trainer = L.Trainer()
trainer.fit(compiled_model)

PyTorch reports that on average, "models runs 43% faster in training on an NVIDIA A100 GPU. At Float32 precision, it runs 21% faster on average and at AMP Precision it runs 51% faster on average" (source). If you want to learn more about torch.compile and how such speedups can be achieved, read the official PyTorch 2.0 blog post.

Automatic accelerator selection (#16847)

The Trainer now chooses accelerator="auto", strategy="auto", devices="auto" as defaults. This automatically detects the best hardware on your system (TPUs, GPUs, Apple Silicon, etc.) and chooses as many devices as are available.

import lightning as L

# Selects accelerator, devices and strategy automatically!
trainer = L.Trainer()

# Same as:
trainer = L.Trainer(accelerator="auto", strategy="auto", devices="auto")

For example, on an 8-GPU server, this will implicitly select Trainer(accelerator="cuda", strategy="ddp", devices=8).

Support for arbitrary iterables (#16726)

Previously, the Trainer only supported DataLoader-like iterables. With this release, users can work with any object that follows the Python iterable protocol. This includes custom data structures, such as user-defined classes and generators, as well as built-in Python objects.

To use this new feature, return any iterable (or collection of iterables) from the dataloader hooks.

def train_dataloader(self):
    # pick ONE of the following return values:

    # a single DataLoader
    return DataLoader(...)

    # or any other iterable, e.g. a plain list
    return list(range(1000))

    # or pass loaders as a dict. This will create batches like this:
    # {'a': batch_from_loader_a, 'b': batch_from_loader_b}
    return {"a": DataLoader(...), "b": DataLoader(...)}

    # or pass loaders as a list. This will create batches like this:
    # [batch_from_dl_1, batch_from_dl_2]
    return [DataLoader(...), DataLoader(...)]

    # arbitrary nesting is supported, producing batches like this:
    # {'a': [batch_from_dl_1, batch_from_dl_2], 'b': [batch_from_dl_3, batch_from_dl_4]}
    return {"a": [dl1, dl2], "b": [dl3, dl4]}

Read our data section for more information.
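
Since plain Python generators now count as valid iterables, a dataloader hook can also yield batches on the fly. Here is a minimal sketch, not taken from the release notes; the batch size, tensor shapes, and random data are invented for illustration:

import torch
import lightning as L

class LitModel(L.LightningModule):
    ...

    def train_dataloader(self):
        # any generator of batches is a valid training iterable in 2.0
        def random_batches():
            for _ in range(100):
                yield torch.randn(32, 10), torch.randint(0, 2, (32,))
        return random_batches()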

Redesigned multi-dataloader support (#16743, #16784, #16939)

Lightning automatically collates the batches from multiple iterables based on a "mode". This is done with our newly revamped CombinedLoader class.

import lightning as L
from lightning.pytorch.utilities import CombinedLoader

iterables = {"a": DataLoader(...), "b": DataLoader(...)}
# Lightning uses this under the hood, but this way you can change the "mode"
combined_loader = CombinedLoader(iterables, mode="min_size")

model = ...
trainer = L.Trainer()
trainer.fit(model, combined_loader)

The following modes are supported:

  • min_size: stops after the shortest iterable (the one with the lowest number of items) is done.
  • max_size_cycle: stops after the longest iterable (the one with most items) is done, while cycling through the rest of the iterables.
  • max_size: stops after the longest iterable (the one with most items) is done, while returning None for the exhausted iterables.
  • sequential: completely consumes each iterable sequentially and returns a triplet (data, idx, iterable_idx)

If you need a different "mode", feel free to open a feature request! Adding new modes is now much simpler. These improvements also allowed us to simplify the trainer's loops by abstracting this logic inside the CombinedLoader.
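
For instance, the triplet returned in sequential mode can be inspected by iterating a CombinedLoader directly. Below is a minimal sketch with toy range-based loaders, assuming the triplet order described above; it is an illustration, not an excerpt from the docs:

from lightning.pytorch.utilities import CombinedLoader
from torch.utils.data import DataLoader

iterables = {"a": DataLoader(range(6), batch_size=2), "b": DataLoader(range(10), batch_size=2)}
combined_loader = CombinedLoader(iterables, mode="sequential")

for data, idx, iterable_idx in combined_loader:
    # idx counts batches within the current iterable; iterable_idx says which loader produced the batch
    print(data, idx, iterable_idx)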

Barebones Trainer mode (#16854)

A new Trainer argument Trainer(barebones=...) was added (default is False) to disable all features that may impact the raw speed of the training loop. This allows users to quickly and fairly compare the runtime of a Lightning script with a raw PyTorch script.

This is how you enable it:

import lightning as L

# Default: False
trainer = L.Trainer(barebones=True)

A message informs about the changed settings:

You are running in `Trainer(barebones=True)` mode. All features that may impact raw speed have been disabled to facilitate analyzing the Trainer overhead. Specifically, the following features are deactivated:
 - Checkpointing: `Trainer(enable_checkpointing=True)`
 - Progress bar: `Trainer(enable_progress_bar=True)`
 - Model summary: `Trainer(enable_model_summary=True)`
 - Logging: `Trainer(logger=True)`, `Trainer(log_every_n_steps>0)`, `LightningModule.log(...)`, `LightningModule.log_dict(...)`
 - Sanity checking: `Trainer(num_sanity_val_steps>0)`
 - Development run: `Trainer(fast_dev_run=True)`
 - Anomaly detection: `Trainer(detect_anomaly=True)`
 - Profiling: `Trainer(profiler=...)`

Tip: This feature is also very useful for unit testing!
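
As a sketch of that tip (not an official recipe): a unit test can run a couple of optimization steps with a barebones Trainer to smoke-test a model quickly. MyLitModel is a stand-in for whatever LightningModule you want to test:

import lightning as L

def test_model_runs_two_steps():
    model = MyLitModel()  # hypothetical LightningModule under test
    trainer = L.Trainer(barebones=True, max_steps=2, accelerator="cpu", devices=1)
    trainer.fit(model)
    # assumes the model's train_dataloader yields at least two batches
    assert trainer.global_step == 2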

Better progress bar (#16695)

Based on feedback from users, we decided to separate the training progress bar from the validation bar. This greatly improves the time estimates (since validation is usually faster) and resolves confusion around the total batches being processed in an epoch.

This is how the bar looked in versions before 2.0:

Epoch 3:  21%|██        | 28/128 [00:36<01:32, 23.12it/s, loss=0.163]
Validation DataLoader 0:  38%|███      | 12/32 [00:12<00:20,  1.01s/it]

Note how the total batch count (128) is the sum of the training batches (32) and the three validation runs (3 x 32). And this is how the progress bar looks now:

Epoch 3:  50%|█████     | 16/32 [00:36<01:32, 23.12it/s]
Validation DataLoader 0:  38%|███      | 12/32 [00:12<00:20,  1.01s/it]

Note how the batch counts are now separate. The training progress bar pauses until validation is completed.

Lightning Fabric

Lightning 2.0 is the official release for Lightning Fabric 🎉

Fabric spans a large spectrum, from raw PyTorch all the way to high-level PyTorch Lightning.

Fabric is the fast and lightweight way to scale PyTorch models without boilerplate code.

  • Easily switch from running on CPU to GPU (Apple Silicon, CUDA, ...), TPU, multi-GPU or even multi-node training
  • State-of-the-art distributed training strategies (DDP, FSDP, DeepSpeed) and mixed precision out of the box
  • Handles all the boilerplate device logic for you
  • Brings useful tools to help you build a trainer (callbacks, logging, checkpoints, ...)
  • Designed with multi-billion parameter models in mind

📖 Go to Fabric documentation 📖

  import torch
  import torch.nn as nn
  from torch.utils.data import DataLoader, Dataset

+ from lightning.fabric import Fabric

  class PyTorchModel(nn.Module):
      ...

  class PyTorchDataset(Dataset):
      ...

+ fabric = Fabric(accelerator="cuda", devices=8, strategy="ddp")
+ fabric.launch()

- device = "cuda" if torch.cuda.is_available() else "cpu"
  model = PyTorchModel(...)
  optimizer = torch.optim.SGD(model.parameters())
+ model, optimizer = fabric.setup(model, optimizer)
  dataloader = DataLoader(PyTorchDataset(...), ...)
+ dataloader = fabric.setup_dataloaders(dataloader)
  model.train()

  for epoch in range(num_epochs):
      for ba...

Weekly patch release

01 Mar 13:54
3bee819


App

Removed

  • Removed implicit ui testing with testing.run_app_in_cloud in favor of headless login and app selection (#16741)

Fabric

Added

  • Added Fabric(strategy="auto") support (#16916)

Fixed

  • Fixed edge cases in parsing device ids using NVML (#16795)
  • Fixed DDP spawn hang on TPU Pods (#16844)
  • Fixed an error when passing find_usable_cuda_devices(num_devices=-1) (#16866)

PyTorch

Added

  • Added Fabric(strategy="auto") support. It will choose DDP over DDP-spawn, contrary to strategy=None (default) (#16916)

Fixed

  • Fixed DDP spawn hang on TPU Pods (#16844)
  • Fixed edge cases in parsing device ids using NVML (#16795)
  • Fixed backwards compatibility for lightning.pytorch.utilities.parsing.get_init_args (#16851)

Contributors

@ethanwharris, @carmocca, @awaelchli, @justusschock , @dtuit, @Liyang90

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Lightning 2.0 Release Candidate

23 Feb 18:56
0130273


Pre-release

Full Changelog: 1.9.0...2.0.0rc0

Weekly patch release

21 Feb 20:39


App

Fixed

  • Fixed lightning open command and improved redirects (#16794)

Fabric

Fixed

  • Fixed an issue causing a wrong environment plugin to be selected when accelerator=tpu and devices > 1 (#16806)
  • Fixed parsing of defaults for --accelerator and --precision in Fabric CLI when accelerator and precision are set to non-default values in the code (#16818)

PyTorch

Fixed

  • Fixed an issue causing a wrong environment plugin to be selected when accelerator=tpu and devices > 1 (#16806)

Contributors

@ethanwharris, @carmocca, @awaelchli, @Borda, @tchaton, @yurijmikhalevich

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Weekly patch release

15 Feb 15:23
c5b836a


App

Added

  • Added Storage Commands (#16740)
    • rm: Delete files from your Cloud Platform Filesystem
  • Added lightning connect data to register data connection to private s3 buckets (#16738)

Fabric

Fixed

  • Fixed an attribute error and improved input validation for invalid strategy types being passed to Fabric (#16693)

PyTorch

Changed

  • Disabled strict loading in multiprocessing launcher ("ddp_spawn", etc.) when loading weights back into the main process (#16365)

Fixed

  • Fixed an attribute error and improved input validation for invalid strategy types being passed to Trainer (#16693)
  • Fixed early stopping triggering extra validation runs after reaching min_epochs or min_steps (#16719)

Contributors

@akihironitta, @awaelchli, @Borda, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Weekly patch release

10 Feb 16:57
c24b4bb


App

Added

  • Added lightning open command (#16482)
  • Added experimental support for interruptable GPU in the cloud (#16399)
  • Added FileSystem abstraction to easily manipulate files (#16581)
  • Added Storage Commands (#16606)
    • ls: List files from your Cloud Platform Filesystem
    • cd: Change the current directory within your Cloud Platform filesystem (terminal session based)
    • pwd: Return the current folder in your Cloud Platform Filesystem
    • cp: Copy files between your Cloud Platform Filesystem and local filesystem
  • Prevented cd into non-existent folders (#16645)
  • Enabled cp (upload) at project level (#16631)
  • Enabled ls and cp (download) at project level (#16622)
  • Added lightning connect data to register data connection to s3 buckets (#16670)
  • Added support for running with multiprocessing in the cloud (#16624)
  • Initial plugin server (#16523)
  • Connect and Disconnect node (#16700)

Changed

  • Changed the default LightningClient(retry=False) to retry=True (#16382)
  • Add support for async predict method in PythonServer and remove torch context (#16453)
  • Renamed lightning.app.components.LiteMultiNode to lightning.app.components.FabricMultiNode (#16505)
  • Changed the command lightning connect to lightning connect app for consistency (#16670)
  • Refactor cloud dispatch and update to new API (#16456)
  • Updated app URLs to the latest format (#16568)

Fixed

  • Fixed a deadlock causing apps not to exit properly when running locally (#16623)
  • Fixed the Drive root_folder not parsed properly (#16454)
  • Fixed malformed path when downloading files using lightning cp (#16626)
  • Fixed app name in URL (#16575)

Fabric

Fixed

  • Fixed error handling for accelerator="mps" and ddp strategy pairing (#16455)
  • Fixed strict availability check for torch_xla requirement (#16476)
  • Fixed an issue where PL would wrap DataLoaders with XLA's MpDeviceLoader more than once (#16571)
  • Fixed the batch_sampler reference for DataLoaders wrapped with XLA's MpDeviceLoader (#16571)
  • Fixed an import error when torch.distributed is not available (#16658)

PyTorch

Fixed

  • Fixed an unintended limitation for calling save_hyperparameters on mixin classes that don't subclass LightningModule/LightningDataModule (#16369)
  • Fixed an issue with MLFlowLogger logging the wrong keys with .log_hyperparams() (#16418)
  • Fixed logging more than 100 parameters with MLFlowLogger; long values are now truncated (#16451)
  • Fixed strict availability check for torch_xla requirement (#16476)
  • Fixed an issue where PL would wrap DataLoaders with XLA's MpDeviceLoader more than once (#16571)
  • Fixed the batch_sampler reference for DataLoaders wrapped with XLA's MpDeviceLoader (#16571)
  • Fixed an import error when torch.distributed is not available (#16658)

Contributors

@akihironitta, @awaelchli, @Borda, @BrianPulfer, @ethanwharris, @hhsecond, @justusschock, @Liyang90, @ruro, @senarvi, @shenoynikhil, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Stability and additional improvements

17 Jan 17:26
fc195b9


App

Added

  • Added a possibility to set up basic authentication for Lightning apps (#16105)

Changed

  • The LoadBalancer now uses internal ip + port instead of URL exposed (#16119)
  • Added support for logging in different trainer stages with DeviceStatsMonitor (#16002)
  • Changed lightning_app.components.serve.gradio to lightning_app.components.serve.gradio_server (#16201)
  • Made cluster creation/deletion async by default (#16185)

Fixed

  • Fixed not being able to run multiple lightning apps locally due to port collision (#15819)
  • Avoid relpath bug on Windows (#16164)
  • Avoid using the deprecated LooseVersion (#16162)
  • Porting fixes to autoscaler component (#16249)
  • Fixed a bug where lightning login with env variables would not correctly save the credentials (#16339)

Fabric

Added

  • Added Fabric.launch() to programmatically launch processes (e.g. in Jupyter notebook) (#14992)
  • Added the option to launch Fabric scripts from the CLI, without the need to wrap the code into the run method (#14992)
  • Added Fabric.setup_module() and Fabric.setup_optimizers() to support strategies that need to set up the model before an optimizer can be created (#15185)
  • Added support for Fully Sharded Data Parallel (FSDP) training in Lightning Lite (#14967)
  • Added lightning_fabric.accelerators.find_usable_cuda_devices utility function (#16147)
  • Added basic support for LightningModules (#16048)
  • Added support for managing callbacks via Fabric(callbacks=...) and emitting events through Fabric.call() (#16074); see the sketch after this list
  • Added Logger support (#16121)
    • Added Fabric(loggers=...) to support different Logger frameworks in Fabric
    • Added Fabric.log for logging scalars using multiple loggers
    • Added Fabric.log_dict for logging a dictionary of multiple metrics at once
    • Added Fabric.loggers and Fabric.logger attributes to access the individual logger instances
    • Added support for calling self.log and self.log_dict in a LightningModule when using Fabric
    • Added access to self.logger and self.loggers in a LightningModule when using Fabric
  • Added lightning_fabric.loggers.TensorBoardLogger (#16121)
  • Added lightning_fabric.loggers.CSVLogger (#16346)
  • Added support for a consistent .zero_grad(set_to_none=...) on the wrapped optimizer regardless of which strategy is used (#16275)
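
A minimal sketch of how the callback and logger APIs listed above fit together, assuming the unified lightning.fabric import path; the callback class and metric values are invented for illustration:

from lightning.fabric import Fabric
from lightning.fabric.loggers import CSVLogger, TensorBoardLogger

class PrintingCallback:
    # any object with a matching method name receives the event
    def on_train_epoch_end(self, loss):
        print(f"epoch finished, loss={loss:.3f}")

fabric = Fabric(loggers=[CSVLogger("logs"), TensorBoardLogger("logs")], callbacks=[PrintingCallback()])
fabric.launch()

fabric.log("train/loss", 0.123)                  # sent to every configured logger
fabric.log_dict({"lr": 1e-3, "train/acc": 0.9})
fabric.call("on_train_epoch_end", loss=0.123)    # emits the event to all registered callbacks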

Changed

  • Renamed the class LightningLite to Fabric (#15932, #15938)
  • The Fabric.run() method is no longer abstract (#14992)
  • The XLAStrategy now inherits from ParallelStrategy instead of DDPSpawnStrategy (#15838)
  • Merged the implementation of DDPSpawnStrategy into DDPStrategy and removed DDPSpawnStrategy (#14952)
  • The dataloader wrapper returned from .setup_dataloaders() now calls .set_epoch() on the distributed sampler if one is used (#16101)
  • Renamed Strategy.reduce to Strategy.all_reduce in all strategies (#16370)
  • When using multiple devices, the strategy now defaults to "ddp" instead of "ddp_spawn" when none is set (#16388)

Removed

  • Removed support for FairScale's sharded training (strategy='ddp_sharded'|'ddp_sharded_spawn'). Use Fully-Sharded Data Parallel instead (strategy='fsdp') (#16329)

Fixed

  • Restored sampling parity between PyTorch and Fabric dataloaders when using the DistributedSampler (#16101)
  • Fixed an issue where the error message wouldn't tell the user the real value that was passed through the CLI (#16334)

PyTorch

Added

  • Added support for native logging of MetricCollection with enabled compute groups (#15580)
  • Added support for custom artifact names in pl.loggers.WandbLogger (#16173)
  • Added support for DDP with LRFinder (#15304)
  • Added utilities to migrate checkpoints from one Lightning version to another (#15237)
  • Added support to upgrade all checkpoints in a folder using the pl.utilities.upgrade_checkpoint script (#15333)
  • Add an axes argument ax to the .lr_find().plot() to enable writing to a user-defined axes in a matplotlib figure (#15652)
  • Added log_model parameter to MLFlowLogger (#9187)
  • Added a check to validate that wrapped FSDP models are used while initializing optimizers (#15301)
  • Added a warning when self.log(..., logger=True) is called without a configured logger (#15814)
  • Added support for colossalai 0.1.11 (#15888)
  • Added LightningCLI support for optimizer and learning schedulers via callable type dependency injection (#15869)
  • Added support for activation checkpointing for the DDPFullyShardedNativeStrategy strategy (#15826)
  • Added the option to set DDPFullyShardedNativeStrategy(cpu_offload=True|False) via bool instead of needing to pass a configuration object (#15832)
  • Added info message for Ampere CUDA GPU users to enable tf32 matmul precision (#16037)
  • Added support for returning optimizer-like classes in LightningModule.configure_optimizers (#16189)

Changed

  • Switch from tensorboard to tensorboardx in TensorBoardLogger (#15728)
  • From now on, Lightning Trainer and LightningModule.load_from_checkpoint automatically upgrade the loaded checkpoint if it was produced in an old version of Lightning (#15237)
  • Trainer.{validate,test,predict}(ckpt_path=...) no longer restores the Trainer.global_step and trainer.current_epoch value from the checkpoints - From now on, only Trainer.fit will restore this value (#15532)
  • The ModelCheckpoint.save_on_train_epoch_end attribute is now computed dynamically every epoch, accounting for changes to the validation dataloaders (#15300)
  • The Trainer now raises an error if it is given multiple stateful callbacks of the same type with colliding state keys (#15634)
  • MLFlowLogger now logs hyperparameters and metrics in batched API calls (#15915)
  • Overriding the on_train_batch_{start,end} hooks in conjunction with taking a dataloader_iter in the training_step no longer errors out and instead shows a warning (#16062)
  • Move tensorboardX to extra dependencies. Use the CSVLogger by default (#16349)
  • Drop PyTorch 1.9 support (#15347)

Deprecated

  • Deprecated description, env_prefix and env_parse parameters in LightningCLI.__init__ in favour of giving them through parser_kwargs (#15651)
  • Deprecated pytorch_lightning.profiler in favor of pytorch_lightning.profilers (#16059)
  • Deprecated Trainer(auto_select_gpus=...) in favor of pytorch_lightning.accelerators.find_usable_cuda_devices (#16147)
  • Deprecated pytorch_lightning.tuner.auto_gpu_select.{pick_single_gpu,pick_multiple_gpus} in favor of pytorch_lightning.accelerators.find_usable_cuda_devices (#16147)
  • nvidia/apex deprecation (#16039)
    • Deprecated pytorch_lightning.plugins.NativeMixedPrecisionPlugin in favor of pytorch_lightning.plugins.MixedPrecisionPlugin
    • Deprecated the LightningModule.optimizer_step(using_native_amp=...) argument
    • Deprecated the Trainer(amp_backend=...) argument
    • Deprecated the Trainer.amp_backend property
    • Deprecated the Trainer(amp_level=...) argument
    • Deprecated the pytorch_lightning.plugins.ApexMixedPrecisionPlugin class
    • Deprecated the pytorch_lightning.utilities.enums.AMPType enum
    • Deprecated the DeepSpeedPrecisionPlugin(amp_type=..., amp_level=...) arguments
  • horovod deprecation (#16141)
    • Deprecated Trainer(strategy="horovod")
    • Deprecated the HorovodStrategy class
  • Deprecated pytorch_lightning.lite.LightningLite in favor of lightning.fabric.Fabric (#16314)
  • FairScale deprecation (in favor of PyTorch's FSDP implementation) (#16353)
    • Deprecated the pytorch_lightning.overrides.fairscale.LightningShardedDataParallel class
    • Deprecated the pytorch_lightning.plugins.precision.fully_sharded_native_amp.FullyShardedNativeMixedPrecisionPlugin class
    • Deprecated the pytorch_lightning.plugins.precision.sharded_native_amp.ShardedNativeMixedPrecisionPlugin class
    • Deprecated the pytorch_lightning.strategies.fully_sharded.DDPFullyShardedStrategy class
    • Deprecated the pytorch_lightning.strategies.sharded.DDPShardedStrategy class
    • Deprecated the pytorch_lightning.strategies.sharded_spawn.DDPSpawnShardedStrategy class

Removed

  • Removed deprecated pytorch_lightning.utilities.memory.get_gpu_memory_map in favor of pytorch_lightning.accelerators.cuda.get_nvidia_gpu_stats (#15617)
  • Temporarily removed support for Hydra multi-run (#15737)
  • Removed deprecated pytorch_lightning.profiler.base.AbstractProfiler in favor of pytorch_lightning.profilers.profiler.Profiler (#15637)
  • Removed deprecated pytorch_lightning.profiler.base.BaseProfiler in favor of pytorch_lightning.profilers.profiler.Profiler (#15637)
  • Removed deprecated code in pytorch_lightning.utilities.meta (#16038)
  • Removed the deprecated LightningDeepSpeedModule (#16041)
  • Removed the deprecated pytorch_lightning.accelerators.GPUAccelerator in favor of pytorch_lightning.accelerators.CUDAAccelerator (#16050)
  • Removed the deprecated pytorch_lightning.profiler.* classes in favor of pytorch_lightning.profilers (#16059)
  • Removed the deprecated pytorch_lightning.utilities.cli module in favor of pytorch_lightning.cli (#16116)
  • Removed the deprecated pytorch_lightning.loggers.base module in favor of pytorch_lightning.loggers.logger (#16120)
  • Removed the deprecated pytorch_lightning.loops.base module in favor of pytorch_lightning.loops.loop (#16142)
  • Removed the deprecated pytorch_lightning.core.lightning module in favor of pytorch_lightning.core.module (#16318)
  • Removed the deprecated pytorch_lightning.callbacks.base module in favor of pytorch_lightning.callbacks.callback (#16319)
  • Removed the deprecated Trainer.reset_train_val_dataloaders() in favor of Trainer.reset_{train,val}_dataloader (#16131)
  • Removed support for `LightningCLI(seed_ever...

Weekly patch release

21 Dec 18:35
caa3329


App

Added

  • Added partial support for fastapi Request annotation in configure_api handlers (#16047)
  • Added a nicer UI with URL and examples for the autoscaler component (#16063)
  • Enabled users to have more control over scaling out/in intervals (#16093)
  • Added more datatypes to the serving component (#16018)
  • Added work.delete method to delete the work (#16103)
  • Added display_name property to LightningWork for the cloud (#16095)
  • Added ColdStartProxy to the AutoScaler (#16094)
  • Added status endpoint, enable ready (#16075)
  • Implemented ready for components (#16129)

Changed

  • The default start_method for creating Work processes locally on macOS is now 'spawn' (previously 'fork') (#16089)
  • The utility lightning.app.utilities.cloud.is_running_in_cloud now returns True during the loading of the app locally when running with --cloud (#16045)
  • Updated Multinode Warning (#16091)
  • Updated app testing (#16000)
  • Changed overwrite to True (#16009)
  • Simplified messaging in cloud dispatch (#16160)
  • Added annotations endpoint (#16159)

Fixed

  • Fixed PythonServer messaging "Your app has started" (#15989)
  • Fixed auto-batching to enable batching for requests that arrive after the batch interval but are still in the queue (#16110)
  • Fixed a bug where AutoScaler would fail with min_replica=0 (#16092)
  • Fixed a non-thread safe deepcopy in the scheduler (#16114)
  • Fixed HTTP Queue sleeping for 1 sec by default if no delta was found (#16114)
  • Fixed the endpoint info tab not showing up in the AutoScaler UI (#16128)
  • Fixed an issue where an exception would be raised in the logs when using a recent version of streamlit (#16139)
  • Fixed e2e tests (#16146)

Full Changelog: 1.8.5.post0...1.8.6

Minor patch release

16 Dec 14:12
a8a3519


App

  • Fixed install/upgrade - removing single quote (#16079)
  • Fixed bug where components that are re-instantiated several times failed to initialize if they were modifying self.lightningignore (#16080)
  • Fixed a bug where apps that had previously been deleted could not be run again from the CLI (#16082)

PyTorch

  • Add function to remove checkpoint to allow override for extended classes (#16067)

Full Changelog: 1.8.5...1.8.5.post0