Skip to content

Conversation

lslusarczyk
Copy link
Contributor

@lslusarczyk lslusarczyk commented Sep 11, 2025

@lslusarczyk lslusarczyk requested a review from a team as a code owner September 11, 2025 12:30
@bratpiorka bratpiorka requested review from ldorau and lplewa September 11, 2025 12:53
Copy link
Contributor

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

14 / 16 files reviewed (will be continued)

@lplewa
Copy link
Contributor

lplewa commented Sep 19, 2025

I still oppose against adding provider specific code to ops interface.
If you don't want to do it thru CTL just implement it as umfLevelZeroMemoryProviderResidentDeviceChange() in provider_level_zero.h which do not go throught OPS. This function will act as an "level zero specific api extension".

note: remember that you will have to call umfMemoryProviderGetPriv() in this function to retrieve ze_memory_provider_t from provider handle


static void init_ze_global_state(void) {

char *lib_name = getenv("UMF_ZE_LOADER_LIB_NAME");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do this thru CTL please.

We created CTL to limit new envvariables.

Copy link
Contributor Author

@lslusarczyk lslusarczyk Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for tests only. Env variables are simple to write and to understand for future maintainers. Could you please accept it as is in this PR? But I am OK if anyone wants to change it to CTL after this PR is merged.


// TODO: add assertions to UMF and change it to be an assertion
if (info->props.base != (void *)key) {
LOG_ERR("key:%p is different than base:%p", (void *)key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOG_FATAL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, applied

Comment on lines 27 to 45
#define LOG_DEBUG(...) utils_log(LOG_DEBUG, __func__, __VA_ARGS__);
#define LOG_INFO(...) utils_log(LOG_INFO, __func__, __VA_ARGS__);
#define LOG_WARN(...) utils_log(LOG_WARNING, __func__, __VA_ARGS__);
#define LOG_ERR(...) utils_log(LOG_ERROR, __func__, __VA_ARGS__);
#define LOG_FATAL(...) utils_log(LOG_FATAL, __func__, __VA_ARGS__);

#define LOG_PDEBUG(...) utils_plog(LOG_DEBUG, __func__, __VA_ARGS__);
#define LOG_PINFO(...) utils_plog(LOG_INFO, __func__, __VA_ARGS__);
#define LOG_PWARN(...) utils_plog(LOG_WARNING, __func__, __VA_ARGS__);
#define LOG_PERR(...) utils_plog(LOG_ERROR, __func__, __VA_ARGS__);
#define LOG_PFATAL(...) utils_plog(LOG_FATAL, __func__, __VA_ARGS__);
#ifdef UMF_DEVELOPER_MODE
#define UMF_STRINGIFY(x) #x
#define UMF_TOSTRING(x) UMF_STRINGIFY(x)
#define UMF_FUNC_DESC() __FILE__ ":" UMF_TOSTRING(__LINE__)
#else
#define UMF_FUNC_DESC() __func__
#endif

#define LOG_DEBUG(...) utils_log(LOG_DEBUG, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_INFO(...) utils_log(LOG_INFO, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_WARN(...) utils_log(LOG_WARNING, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_ERR(...) utils_log(LOG_ERROR, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_FATAL(...) utils_log(LOG_FATAL, UMF_FUNC_DESC(), __VA_ARGS__);

#define LOG_PDEBUG(...) utils_plog(LOG_DEBUG, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_PINFO(...) utils_plog(LOG_INFO, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_PWARN(...) utils_plog(LOG_WARNING, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_PERR(...) utils_plog(LOG_ERROR, UMF_FUNC_DESC(), __VA_ARGS__);
#define LOG_PFATAL(...) utils_plog(LOG_FATAL, UMF_FUNC_DESC(), __VA_ARGS__);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a separate PR.

There are comments which i whould make - but i strongly believe that this discussion should happen in the separate PR to do not block this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// found
if (is_adding) {
utils_write_unlock(&ze_provider->resident_device_rwlock);
// impossible for UR, should be an assertion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho we should not assume that user will not do something then is should be an assertion. If it comes it for any kind of user it should be validated an correct error should be returned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, comment removed


uint32_t existing_peer_index = 0;
utils_write_lock(&ze_provider->resident_device_rwlock);
while (existing_peer_index < ze_provider->resident_device_count &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: this code whould be much easier to read if it whould be a for loop

for (index = 0; index < count; index++) if (device[index] == index) break;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need for loop index to be available outside the loop. I don't think it is more readable. Left as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do the incrementation within the while loop and have extra && to mark the stop, I believe I agree with Łukasz - this would look better as for.

// you obviously don't have to init anything in for (e.g. for(; index < count; index++)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i noticed that you need it outside of loop - this is why i write
for(index = 0; index < count; index++) if (device[index] == index) break;
not
for (int index = 0; index < count; index++) if (device[index] == index) break;

If something is for loop we should use for not while. If you need access to loop counter outside of function you can declare it before loop - this is a standard approach which should not confuse reader.

Comment on lines 1062 to 1071
const uint32_t new_capacity =
ze_provider->resident_device_capacity * 2 +
1; // +1 to work also with old capacity == 0
ze_device_handle_t *new_handles =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do this code is performance critical, or will be called multiple times

If not i would prefer to have this array just always reallocated by one,
If yes can we abstract this vector like structure to separate module, or even better just use list structure with is included in umf.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, not critical, applied

@lslusarczyk
Copy link
Contributor Author

I still oppose against adding provider specific code to ops interface. If you don't want to do it thru CTL just implement it as umfLevelZeroMemoryProviderResidentDeviceChange() in provider_level_zero.h which do not go throught OPS. This function will act as an "level zero specific api extension".

note: remember that you will have to call umfMemoryProviderGetPriv() in this function to retrieve ze_memory_provider_t from provider handle

Great! Using umfLevelZeroMemoryProviderResidentDeviceChange simplified code even more than using ops. Applied with pleasure.

@lslusarczyk
Copy link
Contributor Author

@lukaszstolarczuk , @bratpiorka , @ldorau ,
CI passed, please check if I applied your comments correctly, resolve discussions which can be resolved and give approve if there is nothing left to do with respect to your comments

@lplewa , please do the same wrt your comments when you're back


uint32_t existing_peer_index = 0;
utils_write_lock(&ze_provider->resident_device_rwlock);
while (existing_peer_index < ze_provider->resident_device_count &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do the incrementation within the while loop and have extra && to mark the stop, I believe I agree with Łukasz - this would look better as for.

// you obviously don't have to init anything in for (e.g. for(; index < count; index++)

Copy link
Contributor

@lplewa lplewa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refactor commits structure.

commits in PR should be split to functional parts, with proper commit message.
Each commit should work (at least it should build - but we do not check it so exception happens). You should not split implementation, tests, documentation to separate commits.

If i were you i would create(This list is an inspiration for proper commits structure, you do not have split your PR to exactly this commits) : commit for adding iterate over critnib, commit for changes in asserts, maybe commit for this mock library, commit for allowing ovveride lib_name, commit for change in setResidentDevides, commit for main functionality of this PR....


uint32_t existing_peer_index = 0;
utils_write_lock(&ze_provider->resident_device_rwlock);
while (existing_peer_index < ze_provider->resident_device_count &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i noticed that you need it outside of loop - this is why i write
for(index = 0; index < count; index++) if (device[index] == index) break;
not
for (int index = 0; index < count; index++) if (device[index] == index) break;

If something is for loop we should use for not while. If you need access to loop counter outside of function you can declare it before loop - this is a standard approach which should not confuse reader.

@lslusarczyk
Copy link
Contributor Author

Please refactor commits structure.

commits in PR should be split to functional parts, with proper commit message. Each commit should work (at least it should build - but we do not check it so exception happens). You should not split implementation, tests, documentation to separate commits.

If i were you i would create(This list is an inspiration for proper commits structure, you do not have split your PR to exactly this commits) : commit for adding iterate over critnib, commit for changes in asserts, maybe commit for this mock library, commit for allowing ovveride lib_name, commit for change in setResidentDevides, commit for main functionality of this PR....

Refactor will be squashing all commits to one during merge.
Each commit will be working :)

@lplewa
Copy link
Contributor

lplewa commented Oct 6, 2025

Please refactor commits structure.
commits in PR should be split to functional parts, with proper commit message. Each commit should work (at least it should build - but we do not check it so exception happens). You should not split implementation, tests, documentation to separate commits.
If i were you i would create(This list is an inspiration for proper commits structure, you do not have split your PR to exactly this commits) : commit for adding iterate over critnib, commit for changes in asserts, maybe commit for this mock library, commit for allowing ovveride lib_name, commit for change in setResidentDevides, commit for main functionality of this PR....

Refactor will be squashing all commits to one during merge. Each commit will be working :)

We do not do squash merge in this project

void critnib_iter_all(critnib *c,
int (*func)(uintptr_t key, void *value, void *privdata),
void *privdata) {
critnib_iter(c, 0, (uintptr_t)-1, func, privdata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is UINTPTR_MAX instead of integer overflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. However integer overflow is a common pattern used in UMF.

# git grep \)-1
examples/custom_file_provider/custom_file_provider.c:#define ALIGN_UP(value, align) (((value) + (align)-1) & ~((align)-1))
src/critnib/critnib.c:        if (key == (uintptr_t)-1) {
src/critnib/critnib.c:    critnib_iter(c, 0, (uintptr_t)-1, func, privdata);
src/uthash/uthash.h:        bkt = ((hashv) & ((num_bkts)-1U));                                     \
src/utils/utils_common.h:#define IS_POWER_OF_2(value) ((value) != 0 && ((value) & ((value)-1)) == 0)
src/utils/utils_common.h:    ((align == 0 || (((value) & ((align)-1)) == 0)))
src/utils/utils_common.h:    ((align != 0 && (((value) & ((align)-1)) != 0)))
src/utils/utils_common.h:#define ALIGN_UP(value, align) (((value) + (align)-1) & ~((align)-1))
src/utils/utils_common.h:         : (((value) + (align)-1) < (value) ? 0 : ALIGN_UP((value), (align))))
src/utils/utils_common.h:#define ALIGN_DOWN(value, align) ((value) & ~((align)-1))
test/provider_fixed_memory.cpp:    test_alloc_failure((size_t)-1, 0, UMF_RESULT_ERROR_OUT_OF_HOST_MEMORY, 0);
test/providers/provider_cuda.cpp:        create_cuda_prov_params(ctx, (CUdevice)-1, UMF_MEMORY_TYPE_HOST, 0);

... so I guess -1 can be an accepted solution here too. Let's leave as is.

}

ze_memory_provider_t *ze_provider = provider;
umf_result_t ret;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lack of initialization will trigger Coverity warning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, UMF lives with it quite well:

$ git grep "umf_result_t ret;"
examples/memspace_hmat/memspace_hmat.c:    umf_result_t ret;
examples/memspace_numa/memspace_numa.c:    umf_result_t ret;
src/ctl/ctl.c:    optional_umf_result_t ret;
src/memspace.c:    umf_result_t ret;
src/pool/pool_jemalloc.c:    umf_result_t ret;
src/pool/pool_jemalloc.c:    umf_result_t ret;
src/pool/pool_jemalloc.c:    umf_result_t ret;
src/provider/provider_devdax_memory.c:    umf_result_t ret;
src/provider/provider_file_memory.c:    umf_result_t ret;
src/provider/provider_fixed_memory.c:    umf_result_t ret;
src/provider/provider_level_zero.c:    umf_result_t ret;
src/provider/provider_os_memory.c:    umf_result_t ret;
src/provider/provider_tracking.c:    umf_result_t ret;
test/ctl/ctl_api.cpp:        umf_result_t ret;
test/ipcFixtures.hpp:        umf_result_t ret;
test/ipcFixtures.hpp:        umf_result_t ret;
test/ipc_negative.cpp:        umf_result_t ret;
test/poolFixtures.hpp:    umf_result_t ret;

Leaving as is.

Comment on lines +13 to +14
#include <gmock/gmock.h>
#include <vector>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different include group
gmock -> 3rd party
vector -> system

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let's don't be too pedantic. I see no value for Intel in changing this particular order.

Comment on lines +83 to +85

void *lib_handle;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alone data member+new lines, move to the end of declaration under private: specifier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A space after "TearDown", then "private:", then a new field is also a three lines.
I understand some people want always to see "private:" keyword, but it is just a test. Again please let's don't focus on so insignificant details.

@lslusarczyk lslusarczyk merged commit 1209db2 into oneapi-src:main Oct 9, 2025
148 of 149 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants