Skip to content

Conversation

@hlu1
Copy link
Contributor

@hlu1 hlu1 commented Aug 19, 2019

Two optimizations:

  • set_input: use std::unordered_map to reduce lookup time, which could be significant where the number of inputs is large
  • set_input_zero_copy: use a vector to track the corresponding input arg dltensor* so we don't need to iterate through the inputs of all notes to look them up every single time. This is a followup to [Runtime] Enable set_input_zero_copy in GraphRuntime #3416

@yinghai, @tqchen please review

@hlu1 hlu1 force-pushed the minimize_set_input_overhead branch from 2567891 to e335da7 Compare August 20, 2019 19:44
Copy link

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

this->SetupOpExecs();
for (size_t i = 0; i < input_nodes_.size(); i++) {
uint32_t nid = input_nodes_[i];
std::string& name = nodes_[nid].name;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const or just inline this into next line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@hlu1 hlu1 force-pushed the minimize_set_input_overhead branch from e335da7 to 43693b7 Compare August 20, 2019 23:19
@hlu1
Copy link
Contributor Author

hlu1 commented Aug 21, 2019

@ZihengJiang, could you also take a look please?

@tqchen tqchen merged commit 137bf5f into apache:master Aug 29, 2019
@tqchen
Copy link
Member

tqchen commented Aug 29, 2019

Thanks @hlu @yinghai this PR is now merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants