@junrushao
Member

@junrushao junrushao commented Sep 29, 2021

In unit tests, we set up a "faked" RPC tracker/runner locally, but we forgot to wait until the server process is fully set up, which causes flakiness on mainline.

https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/1815/pipeline

Thanks @vinx13 for reporting! CC @zxybazh

@shingjan shingjan left a comment

Overall LGTM. Would 0.5s be enough?

@junrushao
Member Author

@shingjan Yes it's far more than enough as I experimented with @zxybazh weeks ago, but I did some refactoring when upstreaming the codebase, which accidentally dropped this line...

@zxybazh
Member

zxybazh commented Sep 29, 2021

LGTM. Thanks for the fix.

@tqchen
Member

tqchen commented Sep 29, 2021

Interesting. Under the popen implementation, we should already wait until we get the related fields (at which point the socket is already bound): https://github.com/apache/tvm/blob/main/python/tvm/rpc/tracker.py#L450

The same holds for the server, so I wonder why the wait is still needed.

@tqchen
Member

tqchen commented Sep 29, 2021

OK, answering my own question: this might be needed for the server to connect to the tracker.

@junrushao
Member Author

@tqchen Right, the server needs some time to talk to the tracker.
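
To illustrate the race described here, the following is a minimal sketch using only the Python standard library. `FakeTracker` and `start_fake_server` are hypothetical stand-ins invented for this example, not TVM's RPC API, and the 0.05 s registration delay is an assumed value. The sketch only shows why a query issued right after start-up can miss a server that has not yet registered, and how a short wait avoids that.

```python
import threading
import time


class FakeTracker:
    """Hypothetical stand-in for a tracker: it only records registered keys."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = set()

    def register(self, key):
        with self._lock:
            self._keys.add(key)

    def has(self, key):
        with self._lock:
            return key in self._keys


def start_fake_server(tracker, key, delay=0.05):
    """Start a 'server' thread that registers with the tracker after `delay` seconds."""

    def _run():
        time.sleep(delay)  # simulates the time the server spends connecting to the tracker
        tracker.register(key)

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread


tracker = FakeTracker()
thread = start_fake_server(tracker, "test-key")

# Querying right away typically misses the server: registration is still in flight.
seen_immediately = tracker.has("test-key")

time.sleep(0.5)  # the fixed wait discussed in this thread
seen_after_wait = tracker.has("test-key")
thread.join()
```

A fixed sleep is the simplest remedy; it is adequate only as long as registration reliably completes well within the chosen wait.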

@junrushao junrushao merged commit 677f2d4 into apache:main Sep 30, 2021
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Sep 30, 2021
* main: (80 commits)
  Introduce centralised name transformation functions (apache#9088)
  [OpenCL] Add vectorization to cuda conv2d_nhwc schedule (apache#8636)
  [6/6] Arm(R) Ethos(TM)-U NPU codegen integration with `tvmc` (apache#8854)
  [microTVM] Add wrapper for creating project using a MLF (apache#9090)
  Fix typo (apache#9156)
  [Hotfix][Testing] Wait for RPCServer to be established (apache#9150)
  Update find cublas so it search default path if needed. (apache#9149)
  [TIR][LowerMatchBuffer] Fix lowering strides when source region has higher dimension than the buffer (apache#9145)
  Fix flaky NMS test by making sure scores are unique (apache#9140)
  [Relay] Merge analysis/context_analysis.cc and transforms/device_annotation.cc (apache#9038)
  [LLVM] Make changes needed for opaque pointers (apache#9138)
  Arm(R) Ethos(TM)-U NPU codegen integration (apache#8849)
  [CI] Split Integration tests out of first phase of pipeline (apache#9128)
  [Meta Schedule][M3b] Runner (apache#9111)
  Fix Google Mock differences between Ubuntu 18.04 and 16.04 (apache#9141)
  [TIR] add loop partition hint pragma (apache#9121)
  fix things (apache#9146)
  [Meta Schedule][M3a] SearchStrategy (apache#9132)
  [Frontend][PyTorch] support for quantized conv_transpose2d op (apache#9133)
  [UnitTest] Parametrized test_conv2d_int8_intrinsics (apache#9143)
  ...
@areusch
Contributor

areusch commented Oct 4, 2021

It would be great if, in the future, fixes like this came with a comment explaining why we're adding a sleep :).
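
As a rough sketch of what such a comment could look like, assuming the 0.5 s wait and the reasoning given earlier in this conversation: `wait_for_server_registration` and `SERVER_REGISTRATION_WAIT_SEC` are hypothetical names chosen for illustration and are not taken from the merged change.

```python
import time

# Assumed value: 0.5 s is the wait discussed in this conversation.
SERVER_REGISTRATION_WAIT_SEC = 0.5


def wait_for_server_registration(wait_sec=SERVER_REGISTRATION_WAIT_SEC):
    """Hypothetical helper: pause so a freshly started RPC server can reach the tracker."""
    # Why sleep: once the server process has started, its socket is bound, but the
    # server still needs a short while to connect to the tracker. Without this wait,
    # a test that queries the tracker immediately can fail intermittently.
    time.sleep(wait_sec)
    return wait_sec
```

Keeping the rationale next to the `time.sleep` call makes it less likely that a later refactoring drops the line by accident.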


6 participants