-
Notifications
You must be signed in to change notification settings - Fork 40
Add resident device change call #1517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d53a23f
to
9d7e12c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14 / 16 files reviewed (will be continued)
88f672b
to
53ec753
Compare
I still oppose against adding provider specific code to ops interface. note: remember that you will have to call umfMemoryProviderGetPriv() in this function to retrieve ze_memory_provider_t from provider handle |
src/provider/provider_level_zero.c
Outdated
|
||
static void init_ze_global_state(void) { | ||
|
||
char *lib_name = getenv("UMF_ZE_LOADER_LIB_NAME"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do this thru CTL please.
We created CTL to limit new envvariables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for tests only. Env variables are simple to write and to understand for future maintainers. Could you please accept it as is in this PR? But I am OK if anyone wants to change it to CTL after this PR is merged.
src/provider/provider_level_zero.c
Outdated
|
||
// TODO: add assertions to UMF and change it to be an assertion | ||
if (info->props.base != (void *)key) { | ||
LOG_ERR("key:%p is different than base:%p", (void *)key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG_FATAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, applied
src/utils/utils_log.h
Outdated
#define LOG_DEBUG(...) utils_log(LOG_DEBUG, __func__, __VA_ARGS__); | ||
#define LOG_INFO(...) utils_log(LOG_INFO, __func__, __VA_ARGS__); | ||
#define LOG_WARN(...) utils_log(LOG_WARNING, __func__, __VA_ARGS__); | ||
#define LOG_ERR(...) utils_log(LOG_ERROR, __func__, __VA_ARGS__); | ||
#define LOG_FATAL(...) utils_log(LOG_FATAL, __func__, __VA_ARGS__); | ||
|
||
#define LOG_PDEBUG(...) utils_plog(LOG_DEBUG, __func__, __VA_ARGS__); | ||
#define LOG_PINFO(...) utils_plog(LOG_INFO, __func__, __VA_ARGS__); | ||
#define LOG_PWARN(...) utils_plog(LOG_WARNING, __func__, __VA_ARGS__); | ||
#define LOG_PERR(...) utils_plog(LOG_ERROR, __func__, __VA_ARGS__); | ||
#define LOG_PFATAL(...) utils_plog(LOG_FATAL, __func__, __VA_ARGS__); | ||
#ifdef UMF_DEVELOPER_MODE | ||
#define UMF_STRINGIFY(x) #x | ||
#define UMF_TOSTRING(x) UMF_STRINGIFY(x) | ||
#define UMF_FUNC_DESC() __FILE__ ":" UMF_TOSTRING(__LINE__) | ||
#else | ||
#define UMF_FUNC_DESC() __func__ | ||
#endif | ||
|
||
#define LOG_DEBUG(...) utils_log(LOG_DEBUG, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_INFO(...) utils_log(LOG_INFO, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_WARN(...) utils_log(LOG_WARNING, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_ERR(...) utils_log(LOG_ERROR, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_FATAL(...) utils_log(LOG_FATAL, UMF_FUNC_DESC(), __VA_ARGS__); | ||
|
||
#define LOG_PDEBUG(...) utils_plog(LOG_DEBUG, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_PINFO(...) utils_plog(LOG_INFO, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_PWARN(...) utils_plog(LOG_WARNING, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_PERR(...) utils_plog(LOG_ERROR, UMF_FUNC_DESC(), __VA_ARGS__); | ||
#define LOG_PFATAL(...) utils_plog(LOG_FATAL, UMF_FUNC_DESC(), __VA_ARGS__); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a separate PR.
There are comments which i whould make - but i strongly believe that this discussion should happen in the separate PR to do not block this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/provider/provider_level_zero.c
Outdated
// found | ||
if (is_adding) { | ||
utils_write_unlock(&ze_provider->resident_device_rwlock); | ||
// impossible for UR, should be an assertion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imho we should not assume that user will not do something then is should be an assertion. If it comes it for any kind of user it should be validated an correct error should be returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, comment removed
src/provider/provider_level_zero.c
Outdated
|
||
uint32_t existing_peer_index = 0; | ||
utils_write_lock(&ze_provider->resident_device_rwlock); | ||
while (existing_peer_index < ze_provider->resident_device_count && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: this code whould be much easier to read if it whould be a for loop
for (index = 0; index < count; index++) if (device[index] == index) break;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need for loop index to be available outside the loop. I don't think it is more readable. Left as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you do the incrementation within the while loop and have extra &&
to mark the stop, I believe I agree with Łukasz - this would look better as for
.
// you obviously don't have to init anything in for (e.g. for(; index < count; index++)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes i noticed that you need it outside of loop - this is why i write
for(index = 0; index < count; index++) if (device[index] == index) break;
not
for (int index = 0; index < count; index++) if (device[index] == index) break;
If something is for loop we should use for not while. If you need access to loop counter outside of function you can declare it before loop - this is a standard approach which should not confuse reader.
const uint32_t new_capacity = | ||
ze_provider->resident_device_capacity * 2 + | ||
1; // +1 to work also with old capacity == 0 | ||
ze_device_handle_t *new_handles = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do this code is performance critical, or will be called multiple times
If not i would prefer to have this array just always reallocated by one,
If yes can we abstract this vector like structure to separate module, or even better just use list structure with is included in umf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, not critical, applied
Great! Using umfLevelZeroMemoryProviderResidentDeviceChange simplified code even more than using ops. Applied with pleasure. |
@lukaszstolarczuk , @bratpiorka , @ldorau , @lplewa , please do the same wrt your comments when you're back |
src/provider/provider_level_zero.c
Outdated
|
||
uint32_t existing_peer_index = 0; | ||
utils_write_lock(&ze_provider->resident_device_rwlock); | ||
while (existing_peer_index < ze_provider->resident_device_count && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you do the incrementation within the while loop and have extra &&
to mark the stop, I believe I agree with Łukasz - this would look better as for
.
// you obviously don't have to init anything in for (e.g. for(; index < count; index++)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refactor commits structure.
commits in PR should be split to functional parts, with proper commit message.
Each commit should work (at least it should build - but we do not check it so exception happens). You should not split implementation, tests, documentation to separate commits.
If i were you i would create(This list is an inspiration for proper commits structure, you do not have split your PR to exactly this commits) : commit for adding iterate over critnib, commit for changes in asserts, maybe commit for this mock library, commit for allowing ovveride lib_name, commit for change in setResidentDevides, commit for main functionality of this PR....
src/provider/provider_level_zero.c
Outdated
|
||
uint32_t existing_peer_index = 0; | ||
utils_write_lock(&ze_provider->resident_device_rwlock); | ||
while (existing_peer_index < ze_provider->resident_device_count && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes i noticed that you need it outside of loop - this is why i write
for(index = 0; index < count; index++) if (device[index] == index) break;
not
for (int index = 0; index < count; index++) if (device[index] == index) break;
If something is for loop we should use for not while. If you need access to loop counter outside of function you can declare it before loop - this is a standard approach which should not confuse reader.
Refactor will be squashing all commits to one during merge. |
We do not do squash merge in this project |
ee516a4
to
dbe3b88
Compare
e2b49a4
to
96e69c4
Compare
void critnib_iter_all(critnib *c, | ||
int (*func)(uintptr_t key, void *value, void *privdata), | ||
void *privdata) { | ||
critnib_iter(c, 0, (uintptr_t)-1, func, privdata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is UINTPTR_MAX
instead of integer overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. However integer overflow is a common pattern used in UMF.
# git grep \)-1
examples/custom_file_provider/custom_file_provider.c:#define ALIGN_UP(value, align) (((value) + (align)-1) & ~((align)-1))
src/critnib/critnib.c: if (key == (uintptr_t)-1) {
src/critnib/critnib.c: critnib_iter(c, 0, (uintptr_t)-1, func, privdata);
src/uthash/uthash.h: bkt = ((hashv) & ((num_bkts)-1U)); \
src/utils/utils_common.h:#define IS_POWER_OF_2(value) ((value) != 0 && ((value) & ((value)-1)) == 0)
src/utils/utils_common.h: ((align == 0 || (((value) & ((align)-1)) == 0)))
src/utils/utils_common.h: ((align != 0 && (((value) & ((align)-1)) != 0)))
src/utils/utils_common.h:#define ALIGN_UP(value, align) (((value) + (align)-1) & ~((align)-1))
src/utils/utils_common.h: : (((value) + (align)-1) < (value) ? 0 : ALIGN_UP((value), (align))))
src/utils/utils_common.h:#define ALIGN_DOWN(value, align) ((value) & ~((align)-1))
test/provider_fixed_memory.cpp: test_alloc_failure((size_t)-1, 0, UMF_RESULT_ERROR_OUT_OF_HOST_MEMORY, 0);
test/providers/provider_cuda.cpp: create_cuda_prov_params(ctx, (CUdevice)-1, UMF_MEMORY_TYPE_HOST, 0);
... so I guess -1
can be an accepted solution here too. Let's leave as is.
} | ||
|
||
ze_memory_provider_t *ze_provider = provider; | ||
umf_result_t ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lack of initialization will trigger Coverity warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, UMF lives with it quite well:
$ git grep "umf_result_t ret;"
examples/memspace_hmat/memspace_hmat.c: umf_result_t ret;
examples/memspace_numa/memspace_numa.c: umf_result_t ret;
src/ctl/ctl.c: optional_umf_result_t ret;
src/memspace.c: umf_result_t ret;
src/pool/pool_jemalloc.c: umf_result_t ret;
src/pool/pool_jemalloc.c: umf_result_t ret;
src/pool/pool_jemalloc.c: umf_result_t ret;
src/provider/provider_devdax_memory.c: umf_result_t ret;
src/provider/provider_file_memory.c: umf_result_t ret;
src/provider/provider_fixed_memory.c: umf_result_t ret;
src/provider/provider_level_zero.c: umf_result_t ret;
src/provider/provider_os_memory.c: umf_result_t ret;
src/provider/provider_tracking.c: umf_result_t ret;
test/ctl/ctl_api.cpp: umf_result_t ret;
test/ipcFixtures.hpp: umf_result_t ret;
test/ipcFixtures.hpp: umf_result_t ret;
test/ipc_negative.cpp: umf_result_t ret;
test/poolFixtures.hpp: umf_result_t ret;
Leaving as is.
#include <gmock/gmock.h> | ||
#include <vector> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
different include group
gmock
-> 3rd party
vector
-> system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let's don't be too pedantic. I see no value for Intel in changing this particular order.
|
||
void *lib_handle; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alone data member+new lines, move to the end of declaration under private:
specifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A space after "TearDown", then "private:", then a new field is also a three lines.
I understand some people want always to see "private:" keyword, but it is just a test. Again please let's don't focus on so insignificant details.
UMF part of https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_peer_access.asciidoc feature.
UR/llvm part is in intel/llvm#19257