Add KV Cache for Autoregressive Inference #12600

@DN6

Description

Autoregressive diffusion techniques such as Self Forcing rely on a rolling KV cache across video frame chunks to carry information from past context frames to the frames currently being denoised.

This rolling KV cache design (or variants of it) is likely to show up in other types of long video generation and world models, so it would be good to see if we can support it natively in Diffusers.

Tasks

  • Implement the rolling KV cache used in Self Forcing with Diffusers' cache hooks design.
  • Add a Modular Block to Wan Modular Pipelines that uses this rolling KV cache to perform autoregressive inference.
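
To make the rolling-cache idea from the description concrete, here is a minimal, framework-free sketch. It is not the Self Forcing implementation and does not use any Diffusers API: `RollingKVCache`, `generate`, and their parameters are hypothetical names, and tuples stand in for real key/value tensors. It assumes the cache is bounded by a number of chunks and is updated once per finished chunk.

```python
from collections import deque


class RollingKVCache:
    """Hypothetical per-layer rolling KV cache (illustrative, not a Diffusers API)."""

    def __init__(self, max_chunks):
        # deque(maxlen=...) silently drops the oldest chunk when a new one
        # is appended to a full cache, which gives the "rolling" behaviour.
        self.keys = deque(maxlen=max_chunks)
        self.values = deque(maxlen=max_chunks)

    def append(self, chunk_keys, chunk_values):
        """Store the keys/values of one finished frame chunk."""
        self.keys.append(list(chunk_keys))
        self.values.append(list(chunk_values))

    def context(self):
        """Return cached keys and values flattened in temporal order."""
        keys = [k for chunk in self.keys for k in chunk]
        values = [v for chunk in self.values for v in chunk]
        return keys, values


def generate(num_chunks, tokens_per_chunk, max_chunks):
    """Toy autoregressive loop that records how many keys each chunk attends to."""
    cache = RollingKVCache(max_chunks)
    context_lengths = []
    for chunk_idx in range(num_chunks):
        past_keys, _ = cache.context()
        # Stand-ins for the keys/values a real model would compute
        # for the chunk currently being denoised.
        new_keys = [(chunk_idx, t) for t in range(tokens_per_chunk)]
        new_values = [(chunk_idx, t) for t in range(tokens_per_chunk)]
        # The current chunk attends to the cached context plus itself.
        context_lengths.append(len(past_keys) + len(new_keys))
        # Assumption: the cache is updated once the chunk is finished.
        cache.append(new_keys, new_values)
    return context_lengths, cache
```

Calling `generate(5, 3, 2)` yields context lengths that grow until the window is full and then stay constant, which is the property that keeps memory bounded for long videos. A real implementation would hold tensors per attention layer and would likely bound the window in tokens or frames rather than chunks.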
