Skip to content

Conversation

ggerganov
Copy link
Member

target #9707

Adapt the Metal backend to the new registry and device interfaces.

@github-actions github-actions bot added ggml changes relating to the ggml tensor library for machine learning Apple Metal https://en.wikipedia.org/wiki/Metal_(API) labels Oct 2, 2024
@ggerganov ggerganov changed the title ggml-backend : add device and backend reg interfaces ggml : add metal backend registry / devic Oct 2, 2024
@ggerganov ggerganov changed the title ggml : add metal backend registry / devic ggml : add metal backend registry / device Oct 2, 2024
@slaren
Copy link
Member

slaren commented Oct 2, 2024

ggml_backend_metal_buffer_type() also needs to be updated to set the device field.

@slaren
Copy link
Member

slaren commented Oct 3, 2024

There have been a few minor changes to the interfaces:

  • The get_backend_reg function of the device interface has been removed, instead a pointer is stored directly in ggml_backend_device: d0c4954
  • Some functions have been renamed: cfef355

@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch from 37de34c to a62ea59 Compare October 4, 2024 11:11
@github-actions github-actions bot added script Script related testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment Vulkan Issues specific to the Vulkan backend examples python python script changes devops improvements to build systems and github actions server SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Kompute https://github.com/KomputeProject/kompute/ labels Oct 4, 2024
@ggerganov ggerganov changed the base branch from sl/backend-registry-2 to master October 4, 2024 11:11
@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch 2 times, most recently from 058430f to ae56ec2 Compare October 4, 2024 12:15
Copy link

@mmtmn mmtmn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@slaren
Copy link
Member

slaren commented Oct 5, 2024

This seems to be working now.

@slaren slaren force-pushed the sl/backend-registry-2-add-metal branch 2 times, most recently from 7e8d2a9 to 84c3b2a Compare October 5, 2024 22:47
@ggerganov ggerganov force-pushed the sl/backend-registry-2-add-metal branch from 84c3b2a to 6dcb899 Compare October 6, 2024 10:16
@ggerganov ggerganov marked this pull request as ready for review October 6, 2024 10:16
@ggerganov ggerganov requested a review from slaren October 6, 2024 10:17
@ggerganov
Copy link
Member Author


ggml_backend_t ggml_backend_reg_metal_init(const char * params, void * user_data) {
static const char * ggml_backend_metal_device_get_description(ggml_backend_dev_t dev) {
return [[g_state.mtl_device name] UTF8String];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a guarantee that mtl_device is initialized here, it probably needs a call to ggml_backend_metal_get_device/ggml_backend_metal_free_device like in ggml_backend_metal_device_get_memory. However, I imagine that could cause issues with the lifetime of the string returned by MTLDevice, so it may be necessary to keep a copy of the string in the context instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed now.

I also reworked the implementation to avoid accessing g_state when we can get the device context locally. Should be much cleaner now and easier to add multi-GPU support in the future if needed.

Comment on lines 496 to 502
#if TARGET_OS_OSX || (TARGET_OS_IOS && __clang_major__ >= 15)
if (@available(macOS 10.12, iOS 16.0, *)) {
GGML_LOG_INFO("%s: recommendedMaxWorkingSetSize = %8.2f MB\n", __func__, ctx->device.recommendedMaxWorkingSetSize / 1e6);
GGML_LOG_INFO("%s: recommendedMaxWorkingSetSize = %8.2f MB\n", __func__, device.recommendedMaxWorkingSetSize / 1e6);
}
#elif TARGET_OS_OSX
if (ctx->device.maxTransferRate != 0) {
GGML_LOG_INFO("%s: maxTransferRate = %8.2f MB/s\n", __func__, ctx->device.maxTransferRate / 1e6);
if (device.maxTransferRate != 0) {
GGML_LOG_INFO("%s: maxTransferRate = %8.2f MB/s\n", __func__, device.maxTransferRate / 1e6);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this #if/#elif is correct, this will never be printed.

@slaren
Copy link
Member

slaren commented Oct 7, 2024

Should we put a deprecate notice for these API calls?

I think it may be too early for that, it's probably better to wait a bit until all the backends and the ggml examples are updated.

@ggerganov ggerganov merged commit d5ac8cf into master Oct 7, 2024
53 checks passed
@ggerganov ggerganov deleted the sl/backend-registry-2-add-metal branch October 7, 2024 15:27
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* ggml : add metal backend registry / device

ggml-ci

* metal : fix names [no ci]

* metal : global registry and device instances

ggml-ci

* cont : alternative initialization of global objects

ggml-ci

* llama : adapt to backend changes

ggml-ci

* fixes

* metal : fix indent

* metal : fix build when MTLGPUFamilyApple3 is not available

ggml-ci

* fix merge

* metal : avoid unnecessary singleton accesses

ggml-ci

* metal : minor fix [no ci]

* metal : g_state -> g_ggml_ctx_dev_main [no ci]

* metal : avoid reference of device context in the backend context

ggml-ci

* metal : minor [no ci]

* metal : fix maxTransferRate check

* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* ggml : add metal backend registry / device

ggml-ci

* metal : fix names [no ci]

* metal : global registry and device instances

ggml-ci

* cont : alternative initialization of global objects

ggml-ci

* llama : adapt to backend changes

ggml-ci

* fixes

* metal : fix indent

* metal : fix build when MTLGPUFamilyApple3 is not available

ggml-ci

* fix merge

* metal : avoid unnecessary singleton accesses

ggml-ci

* metal : minor fix [no ci]

* metal : g_state -> g_ggml_ctx_dev_main [no ci]

* metal : avoid reference of device context in the backend context

ggml-ci

* metal : minor [no ci]

* metal : fix maxTransferRate check

* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
* ggml : add metal backend registry / device

ggml-ci

* metal : fix names [no ci]

* metal : global registry and device instances

ggml-ci

* cont : alternative initialization of global objects

ggml-ci

* llama : adapt to backend changes

ggml-ci

* fixes

* metal : fix indent

* metal : fix build when MTLGPUFamilyApple3 is not available

ggml-ci

* fix merge

* metal : avoid unnecessary singleton accesses

ggml-ci

* metal : minor fix [no ci]

* metal : g_state -> g_ggml_ctx_dev_main [no ci]

* metal : avoid reference of device context in the backend context

ggml-ci

* metal : minor [no ci]

* metal : fix maxTransferRate check

* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Apple Metal https://en.wikipedia.org/wiki/Metal_(API) devops improvements to build systems and github actions examples ggml changes relating to the ggml tensor library for machine learning Kompute https://github.com/KomputeProject/kompute/ nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment Nvidia GPU Issues specific to Nvidia GPUs python python script changes script Script related server SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language testing Everything test related Vulkan Issues specific to the Vulkan backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants