Closed
Labels
status: tracking (Tracking work in progress)
Description
Overview
There have been many great suggestions from the community regarding loading and caching model weights. This tracker issue compiles the suggestions and keeps track of the progress.
Action Items
-
C0: Make ArtifactCache (https://github.com/apache/tvm/blob/main/web/src/runtime.ts#L991) an interface ArtifactCache in a new file, artifact_cache.ts
- Provide implementation ArtifactCacheBasic (our existing approach)
- Provide parallel download methods in the same file (C1)
- Optionally, allow injecting additional ArtifactCache implementations that are backed by other means.
- [WEB] Move ArtifactCache to Interface, Support Cache delete and Batch Delete, Remove typo apache/tvm#16525
-
C1: Parallelize weight shards download on tvmjs side
-
C2: Add a helper function to delete cache storage (part of C0)
- Add a helper function to delete data from Cache Storage #267
- This applies to both the model library and the weights. It matters most when we update the model libraries: because the names stay the same, it can be tricky to pick up the newest version.
- TVM-side support for deleting model weights is added via [WEB] Move ArtifactCache to Interface, Support Cache delete and Batch Delete, Remove typo apache/tvm#16525
- The WebLLM-side API is still missing
-
C3: Switch to IndexedDB for caching
- Error Not Working QuotaExceededError: Failed to execute 'add' on 'Cache': Quota exceeded. #144
- "caches is not defined" #257
- Currently, in some environments, the Cache Storage quota may be too small to hold the entire set of weights
- This should be an implementation of the C0 interface, namely ArtifactCacheIndexDB
-
C4: Allow using local models
- Loading models from disk #282
- Downloading model weights is arguably the largest overhead of our project; providing alternatives would be helpful.
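To make C0 through C2 more concrete, here is a minimal sketch of how an injectable cache interface, a parallel shard download, and a batch delete could fit together. Every name in it (`ArtifactCacheSketch`, `InMemoryArtifactCache`, `Fetcher`, `fetchShardsInParallel`, `deleteShards`) is hypothetical and chosen for illustration; none of this is the actual tvmjs or WebLLM API, and the in-memory `Map` only stands in for a real backend such as Cache Storage or IndexedDB.

```typescript
// Hypothetical sketch: names and signatures are assumptions, not the tvmjs API.

/** Minimal cache contract that different storage backends could implement (C0). */
interface ArtifactCacheSketch {
  has(url: string): Promise<boolean>;
  get(url: string): Promise<Uint8Array | undefined>;
  put(url: string, data: Uint8Array): Promise<void>;
  delete(url: string): Promise<boolean>;
}

/** In-memory stand-in for a real backend (Cache Storage, IndexedDB, ...). */
class InMemoryArtifactCache implements ArtifactCacheSketch {
  private store = new Map<string, Uint8Array>();
  async has(url: string): Promise<boolean> { return this.store.has(url); }
  async get(url: string): Promise<Uint8Array | undefined> { return this.store.get(url); }
  async put(url: string, data: Uint8Array): Promise<void> { this.store.set(url, data); }
  async delete(url: string): Promise<boolean> { return this.store.delete(url); }
}

/** Injected download function, so the sketch does not depend on a global fetch. */
type Fetcher = (url: string) => Promise<Uint8Array>;

/**
 * Download every URL that is not cached yet, with at most `concurrency`
 * requests in flight (C1). Returns the number of URLs actually fetched.
 */
async function fetchShardsInParallel(
  cache: ArtifactCacheSketch,
  urls: string[],
  fetcher: Fetcher,
  concurrency = 4,
): Promise<number> {
  let next = 0;
  let fetched = 0;
  // Each worker repeatedly claims the next unclaimed index until none remain.
  const worker = async (): Promise<void> => {
    while (next < urls.length) {
      const url = urls[next++];
      if (await cache.has(url)) continue; // already cached, skip the download
      await cache.put(url, await fetcher(url));
      fetched++;
    }
  };
  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return fetched;
}

/** Batch delete (C2): remove the given URLs and report how many were present. */
async function deleteShards(cache: ArtifactCacheSketch, urls: string[]): Promise<number> {
  const results = await Promise.all(urls.map((u) => cache.delete(u)));
  return results.filter(Boolean).length;
}
```

Bounding the number of workers instead of firing one `Promise.all` over every shard keeps the number of simultaneous requests predictable; whether a limit is needed in practice, and what value suits it, is left open here.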