90 | 90 | # =============== |
91 | 91 | # ** File: worker/patch_common/patch_metrics.py ** |
92 | 92 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
93 | | -# 1. `vllm.spec_decode.metrics.AsyncMetricsCollector.maybe_collect_rejsample_metrics` |
| 93 | +# 1. `vllm.spec_decode.metrics.AsyncMetricsCollector._copy_rejsample_metrics_async` |
94 | 94 | # Why: |
95 | 95 | # There are cuda hard code (current_platform.is_cuda_alike()) in |
96 | | -# `AsyncMetricsCollector.maybe_collect_rejsample_metrics` |
| 96 | +# `AsyncMetricsCollector._copy_rejsample_metrics_async` |
97 | 97 | # How: |
98 | 98 | # Change to use `current_platform.Event` to determine whether to return None |
99 | | -# Related PR (if no, explain why): 1. refused by vllm. 2. vllm doesn't support 3. prepare to submit.... |
100 | | -# https://github.com/vllm-project/vllm/pull/14411 |
| 99 | +# Related PR (if no, explain why): |
| 100 | +# Need a PR to vllm to fix the issue. |
101 | 101 | # Future Plan: |
102 | 102 | # Revert it when the related pr is merged in vllm. |
103 | 103 | # |
|
110 | 110 | # However float32 is not supported in cann rope op, thus we keep this patch |
111 | 111 | # How: |
112 | 112 | # Removed the dtype convert operations in forward |
113 | | -# Related PR (if no, explain why): 1. refused by vllm. 2. vllm doesn't support 3. prepare to submit.... |
| 113 | +# Related PR (if no, explain why): |
114 | 114 | # NO, only for npu due to rope op. |
115 | 115 | # Future Plan: |
116 | 116 | # Keep this patch in vllm-ascend. |
|
126 | 126 | # - support attention metadata register to the set supported spec decode |
127 | 127 | # - offer a api in platform to determine whether spec decode is supported, |
128 | 128 | # and deprecate is_cuda_alike in it. |
129 | | -# Related PR (if no, explain why): 1. refused by vllm. 2. vllm doesn't support 3. prepare to submit.... |
| 129 | +# Related PR (if no, explain why): |
130 | 130 | # - https://github.com/vllm-project/vllm/pull/15195 |
131 | 131 | # - https://github.com/vllm-project/vllm-ascend/pull/395 |
132 | 132 | # Future Plan: |
|
138 | 138 | # vLLM `Remove Sampler from Model Code` so vllm-ascend needs adapt to this change. |
139 | 139 | # How: |
140 | 140 | # Use vLLM 0.8.4 method to patch it. |
141 | | -# Related PR (if no, explain why): 1. refused by vllm. 2. vllm doesn't support 3. prepare to submit.... |
| 141 | +# Related PR (if no, explain why): |
142 | 142 | # - https://github.com/vllm-project/vllm/pull/15195 |
143 | 143 | # - https://github.com/vllm-project/vllm-ascend/pull/395 |
144 | 144 | # Future Plan: |
|
153 | 153 | # `FlashAttentionMetadata` |
154 | 154 | # How: |
155 | 155 | # ditto |
156 | | -# Related PR (if no, explain why): 1. refused by vllm. 2. vllm doesn't support 3. prepare to submit.... |
| 156 | +# Related PR (if no, explain why): |
157 | 157 | # - https://github.com/vllm-project/vllm/pull/15195 |
158 | 158 | # - https://github.com/vllm-project/vllm-ascend/pull/395 |
159 | 159 | # Future Plan: |
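
The "How" note in the first hunk (use `current_platform.Event` to decide whether to return None, rather than branching on `current_platform.is_cuda_alike()`) could look roughly like the sketch below. This is only an illustration under stated assumptions: `FakeEvent`, `FakePlatform`, and `copy_metrics_async_sketch` are hypothetical stand-ins invented here, not vLLM or vllm-ascend APIs, and the actual patch body is not shown in this diff.

```python
from typing import Optional


class FakeEvent:
    """Hypothetical stand-in for a device event type (not a vLLM class)."""

    def record(self) -> None:
        # A real event would be recorded on a device stream; here we only flag it.
        self.recorded = True


class FakePlatform:
    """Hypothetical platform object; `Event` is None when events are unsupported."""

    def __init__(self, event_cls=None):
        self.Event = event_cls


def copy_metrics_async_sketch(platform) -> Optional[FakeEvent]:
    # Assumption: capability is detected by looking for an Event type on the
    # platform instead of asking whether the platform is CUDA-like.
    event_cls = getattr(platform, "Event", None)
    if event_cls is None:
        return None
    event = event_cls()
    event.record()
    return event
```

With this shape, a platform that exposes an event type gets a recorded event back, while one without it falls through to `None`, and no CUDA-specific check is needed.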
|