CPU Feature Detection #65

martindevans · 2023-07-25T23:00:12Z

Motivation

During testing I noticed that the avx512 libllama.dll which I was using seemed to be much faster than the default CPU backend.

Proposed Change

This draft PR is a demonstration of a new way to load the native dependencies which we could use. This system performs feature detection on the CPU and then loads up the best DLL that the CPU can run. It is only written for Windows at the moment, but it could easily be extended to other platforms.
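As a rough illustration of the idea (not the code in this PR), the detection step could be sketched as below. It assumes .NET 8 or later, where `Avx512F.IsSupported` is available. The class and method names are hypothetical, and the folder names are the ones proposed further down. Checking only `Avx512F` is a simplification, since AVX-512 is a family of extensions.

```csharp
using System.Collections.Generic;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper, for illustration only.
public static class CpuFeatureSketch
{
    // Given which instruction sets are available, list the folders to try,
    // best first. The empty string stands for the default (root) library.
    public static List<string> PreferredFolders(bool avx512, bool avx2, bool avx)
    {
        var folders = new List<string>();
        if (avx512) folders.Add("win-avx512");
        if (avx2) folders.Add("win-avx2");
        if (avx) folders.Add("win-avx");
        folders.Add("");
        return folders;
    }

    // Queries the current CPU through the .NET hardware intrinsics API.
    public static List<string> PreferredFoldersForThisCpu()
    {
        return PreferredFolders(Avx512F.IsSupported, Avx2.IsSupported, Avx.IsSupported);
    }
}
```

Keeping the ordering in a pure function makes the preference list easy to read and to test without needing a particular CPU.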

Why Is This A Draft?

Using this would require some modifications to the backend packages, which I don't know how to do:

Modify LLamaSharp.Backend.Cpu to contain three folders, plus a default library at the root:

- win-avx512/libllama.dll
- win-avx2/libllama.dll
- win-avx/libllama.dll
- libllama.dll - whatever the most basic default should be (no AVX support at all?)

Modify LLamaSharp.Backend.Cuda11 to contain:

- win-cuda11/libllama.dll

Modify LLamaSharp.Backend.Cuda12 to contain:

- win-cuda12/libllama.dll

SanftMonster · 2023-08-05T00:53:58Z

It's a great feature. Since many people don't have an efficient GPU, CPU performance is significant. BTW, would you like to be a committer of this project? I noticed that you're familiar with this area and have completed many good features. Recently I've been too busy with my work to maintain this project well. Even if I get away from my damn work after some time, I'd still be happy if you could develop this project together with me. :)

martindevans · 2023-08-05T01:09:45Z

> BTW, would you like to be a committer of this project?

That would be great, I'll be happy to help if I can take some of the maintenance weight off you! Thank you for asking :)

SanftMonster · 2023-08-05T01:17:22Z

> That would be great, I'll be happy to help if I can help take some of maintenance weight off you! Thankyou for asking :)

I've added you to the group with write access. :) If there's any need to publish a release, please contact me to push the packages. Thank you a lot for all your contributions, and I hope we have a good time developing together!

martindevans · 2023-08-05T01:25:11Z

By the way, if you're adding new DLLs since you merged #64 you may want to put them into folders with appropriate names for this PR at the same time?

That way you won't need to do another release of the runtime packages when this feature gets added.

SanftMonster · 2023-08-09T15:16:46Z

> By the way, if you're adding new DLLs since you merged #64 you may want to put them into folders with appropriate names for this PR at the same time?
>
> That way you won't need to do another release of the runtime packages when this feature gets added.

Thank you for the reminder, but I didn't notice this comment before 😶‍🌫️. I'll make a new release after this PR is merged :) v0.4.2 is only a pre-release.

martindevans · 2023-08-09T15:25:28Z

I've cleaned this up a bit:

- It's now usable on any platform (not just Windows).
  - It doesn't do anything on the other platforms yet, but there are obvious blocks waiting to be filled in by a future PR.
- Rearranged the code a bit so the list of preferred dependencies is much more readable.

So I think it's ready for review now.

You'll need to rearrange the DLLs so that they're in separate folders (see the top comment).

SanftMonster · 2023-09-02T04:17:56Z

LLama/Native/NativeApi.cs

        static NativeApi()
        {
+            // Try to load a preferred library, based on CPU feature detection
+            TryLoadLibrary();


The return value is ignored here, does the library loading still work?

martindevans · 2023-09-02

That should be fine.

The DllImport methods still work as normal. So if this fails to load anything they will just try to load the DLL as normal when called.

This is actually really important because on MacOS and Linux the TryLoadLibrary method does nothing! It always "fails" and falls back to the normal behaviour.
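A minimal sketch of that "try, then fall back" pattern is shown below. It assumes .NET Core 3.0 or later for `NativeLibrary.TryLoad`; `LoaderSketch` and `TryLoadFirst` are hypothetical names, not the PR's actual implementation. The point is only that a `false` result is a normal outcome rather than an error, so the caller can ignore it and leave the usual DllImport resolution in charge.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class LoaderSketch
{
    // Tries each candidate path in order and stops at the first that loads.
    // Returns false when nothing could be loaded; that is not an error.
    public static bool TryLoadFirst(IEnumerable<string> candidatePaths, out IntPtr handle)
    {
        foreach (var path in candidatePaths)
        {
            if (NativeLibrary.TryLoad(path, out handle))
                return true;
        }

        handle = IntPtr.Zero;
        return false;
    }
}
```

On a platform where no candidates are supplied, the loop body never runs and the method simply reports `false`.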

martindevans · 2023-09-02T13:32:00Z

I've rebased this onto master, so it's fully up to date with all the GGUF changes. I think the only thing left for this PR is to rearrange the native deps (which I'm not completely sure how to do).

The required layout is:

-- All this stuff is distributed in the "CPU runtimes" nuget package
    /libllama.so (this is the noavx version)
    /libllama.dll (this is the noavx version)
    /libllama.dylib (??? see below)
    /avx
        /libllama.so
        /libllama.dll
    /avx2
        /libllama.so
        /libllama.dll
    /avx512
        /libllama.so
        /libllama.dll

-- This is distributed in the cu12 runtime package
    /cu12.1.0
        /libllama.so
        /libllama.dll
        
-- This is distributed in the cu11 runtime package
    /cu11.7.1
        /libllama.so
        /libllama.dll

MacOS?

At the moment I'm not sure exactly how MacOS should be handled. The basic CPU libllama.dylib should probably be in the folder where I've shown it, but I don't know how Metal works.
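To make the layout above concrete, here is a small sketch of how a candidate path could be built from a feature folder and the platform. `LayoutSketch`, `FileNameFor` and `CandidatePath` are hypothetical names, and the base directory is whatever the application chooses. Only the folder and file names come from the layout above.

```csharp
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class LayoutSketch
{
    // Picks the library file name used in the layout for each platform.
    public static string FileNameFor(OSPlatform platform)
    {
        if (platform == OSPlatform.Windows) return "libllama.dll";
        if (platform == OSPlatform.OSX) return "libllama.dylib";
        return "libllama.so";
    }

    // An empty folder means the default (noavx) library at the root.
    public static string CandidatePath(string baseDir, string folder, OSPlatform platform)
    {
        var file = FileNameFor(platform);
        return string.IsNullOrEmpty(folder)
            ? Path.Combine(baseDir, file)
            : Path.Combine(baseDir, folder, file);
    }
}
```

For example, `LayoutSketch.CandidatePath("native", "avx2", OSPlatform.Linux)` yields the `avx2/libllama.so` entry under a hypothetical `native` directory.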

SanftMonster · 2023-09-02T14:06:49Z

Thank you for this work. It seems that this could let us integrate all the DLLs into one backend package. I'll try to add the avx binaries and test it.

martindevans · 2023-09-02T14:45:57Z

I'm not sure how the CUDA stuff will work if it's all in one package. If it's all in one package I think the CUDA binaries would load (because they exist) but then fail at runtime if CUDA isn't supported. If that's true we'd either need to add a runtime check for CUDA compatibility, or keep them in a separate package.
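One very rough way such a runtime check might look is to probe for the NVIDIA driver library before preferring a CUDA build. This is only a sketch under assumptions: it relies on the usual driver library names (`nvcuda.dll` on Windows, `libcuda.so.1` on Linux), `CudaProbeSketch` is a hypothetical name, and finding the driver does not prove that a usable GPU or a matching CUDA 11/12 runtime is present.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class CudaProbeSketch
{
    // Conventional file name of the NVIDIA driver library on this OS.
    public static string DriverLibraryName()
    {
        return RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? "nvcuda.dll"
            : "libcuda.so.1";
    }

    // Returns true if the driver library can be loaded by name.
    // A false result means "prefer a CPU build", not a hard failure.
    public static bool DriverLooksPresent()
    {
        if (NativeLibrary.TryLoad(DriverLibraryName(), out IntPtr handle))
        {
            NativeLibrary.Free(handle);
            return true;
        }
        return false;
    }
}
```

A loader could call `DriverLooksPresent()` first and skip the CUDA folders when it returns `false`.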

SanftMonster · 2023-09-05T18:07:57Z

Hey Martin, I really appreciate this good work, but I'm afraid we'll have to delay this feature to the next version. I tested it on my PC and found some strange behaviour. For example:

- TryLoadLibrary cannot load any of the libraries, even though I think I placed the files in the correct structure.
- Though my computer supports avx2, it loaded the dll in the avx folder instead. (Maybe it just searches for the file under the directory in name order, and avx comes before avx2.)

I think we should publish the new version quickly since the file format has been changed from ggml to gguf. I believe we'll resolve the problems above before the next release. 😊
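The name-order guess in the second observation is unconfirmed, but it is easy to show why it would produce that symptom. In the sketch below (hypothetical names, not the PR's code), an alphabetical sort puts `avx` ahead of `avx2` and `avx512`, while an explicit preference list keeps the best build first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class OrderingSketch
{
    // Explicit preference, best first.
    private static readonly string[] Preference = { "avx512", "avx2", "avx" };

    // What a plain name-ordered directory listing would give.
    public static List<string> Alphabetical(IEnumerable<string> folders)
    {
        return folders.OrderBy(f => f, StringComparer.Ordinal).ToList();
    }

    // Keeps only the folders that exist, in preference order.
    public static List<string> ByPreference(IEnumerable<string> folders)
    {
        var present = new HashSet<string>(folders, StringComparer.Ordinal);
        return Preference.Where(f => present.Contains(f)).ToList();
    }
}
```

With the folders `avx2`, `avx512` and `avx`, `Alphabetical` starts with `avx`, whereas `ByPreference` starts with `avx512`.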

martindevans · 2023-09-05T18:11:42Z

That's absolutely fine! GGUF is a really critical feature to get out ASAP, whereas this can wait.

martindevans · 2023-09-17T20:47:46Z

We'll need to test this extensively on different hardware before the next release!

martindevans mentioned this pull request Aug 5, 2023

feat: update the llama backends. #78

Merged

martindevans force-pushed the alternative_dependency_loading branch from 309cd3c to 44fe261 Compare August 9, 2023 14:47

martindevans marked this pull request as ready for review August 9, 2023 15:25

martindevans requested a review from SanftMonster August 11, 2023 01:04

martindevans mentioned this pull request Aug 13, 2023

Will it be possible to load 70b models soon? #98

Closed

martindevans changed the title ~~Proposal: CPU Feature Detection~~ CPU Feature Detection Aug 14, 2023

SanftMonster approved these changes Aug 16, 2023


This was referenced Aug 20, 2023

No LLamaSharp backend was installed #58

Closed

GGUF #122

Merged

SanftMonster reviewed Sep 2, 2023


Added a new way to load dependencies, performing CPU feature detection

756a1ad

martindevans force-pushed the alternative_dependency_loading branch from 765f5ab to 756a1ad Compare September 2, 2023 13:03

martindevans added 2 commits September 2, 2023 14:10

Changed paths to match what the GitHub build action produces

dd49574

Added Linux dependency loading

8f58a40

AshD mentioned this pull request Sep 6, 2023

Auto loaded correct LlamaSharp backend for WPF app #154

Closed

Oceania2018 approved these changes Sep 17, 2023


Oceania2018 merged commit 10678a8 into SciSharp:master Sep 17, 2023

martindevans deleted the alternative_dependency_loading branch September 17, 2023 20:47

martindevans mentioned this pull request Nov 5, 2023

Todo: Runtime Feature Detection #251

Closed
