`docs/source/en/features/shardformer.md` (71 additions, 3 deletions)
In addition, xFormers's `cutlass_op` can serve as a backup for flash attention.
Enabling `Shardformer` through `Booster` initialized with `HybridParallelPlugin` is the recommended way to awaken the power of Shardformer.
The main reason is that pipeline parallelism cannot work without calling the `execute_pipeline` method of `Booster`. Besides, `HybridParallelPlugin` makes it possible to combine the features of `Shardformer` with other useful features, such as mixed precision training or ZeRO.
More details about this usage can be found in the chapters [Booster API](../basics/booster_api.md) and [Booster Plugins](../basics/booster_plugins.md).
[Here](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/bert) is an example of how to trigger `Shardformer` through `HybridParallelPlugin`. Move to the root directory of this example and execute the launch command it provides; then you can start finetuning a BERT model wrapped by `Shardformer`. The wrapping itself is performed by `HybridParallelPlugin`.
Let's delve into the code of `finetune.py`:
In the `main` function, the plugin is created through the following code:
```python
...
elif args.plugin == "hybrid_parallel":
    # modify the param accordingly for finetuning test cases
    plugin = HybridParallelPlugin(
        tp_size=1,
        pp_size=2,
        num_microbatches=None,
        microbatch_size=1,
        enable_all_optimization=True,
        zero_stage=1,
        precision="fp16",
        initial_scale=1,
    )
```
Here you can change the plugin configuration by setting `tp_size`, `pp_size`, or `zero_stage` to other values. More details about plugin configuration can be found in [Booster Plugins Doc](../basics/booster_plugins.md).
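
Once created, the plugin is handed to `Booster`, which performs the wrapping. A minimal sketch of this handoff (assuming `model`, `optimizer`, `criterion`, `train_dataloader`, and `lr_scheduler` have been built beforehand; these names are placeholders rather than the exact variables of `finetune.py`):

```python
from colossalai.booster import Booster

booster = Booster(plugin=plugin)
# boost returns the wrapped model, optimizer, criterion, dataloader and lr_scheduler
model, optimizer, criterion, train_dataloader, lr_scheduler = booster.boost(
    model, optimizer, criterion=criterion, dataloader=train_dataloader, lr_scheduler=lr_scheduler
)
```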
If pipeline parallelism is not enabled, just do the training in the same way as with other booster plugins (first boost the model with `Booster`, then run forward and backward passes in the normal way).
However, if pipeline parallelism is enabled, there are several usages that differ from the normal case:
1. Before doing forward or backward, the criterion function (loss function) is processed to meet the argument requirements of running a pipeline:
```python
def _criterion(outputs, inputs):
    outputs = output_transform_fn(outputs)
    loss = criterion(outputs)
    return loss
```
2. In the `train_epoch` function, the dataloader is converted into an `Iterator` before running the pipeline:
```python
train_dataloader_iter = iter(train_dataloader)
```
3. Do the forward and backward passes by calling the `Booster.execute_pipeline` method:
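```python
# A sketch of the call, reusing the iterator and criterion wrapper from the
# previous steps; with return_loss=True the returned dict carries the loss.
# Assumption: the exact keyword flags may vary across ColossalAI versions.
outputs = booster.execute_pipeline(
    train_dataloader_iter, model, _criterion, optimizer, return_loss=True
)
```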
Backward passing is completed inside this method, so there is no need to call `loss.backward()` after it returns.
More details about `Booster.execute_pipeline` can be found in [Booster API Doc](../basics/booster_api.md).
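
Putting these pieces together, one training epoch with pipeline parallelism can be sketched as follows. This is a minimal sketch under the assumptions above; the real `train_epoch` adds bookkeeping such as progress reporting.

```python
model.train()
train_dataloader_iter = iter(train_dataloader)
for _ in range(len(train_dataloader)):
    # forward and backward both happen inside execute_pipeline
    outputs = booster.execute_pipeline(
        train_dataloader_iter, model, _criterion, optimizer, return_loss=True
    )
    optimizer.step()
    optimizer.zero_grad()
    lr_scheduler.step()
```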
#### 2. Enabling Shardformer Through Shardformer APIs (Not Recommended)
You can also use Shardformer by manually calling Shardformer APIs. However, this usage is not recommended, since pipeline parallelism can't run without `Booster`.
There is also an example of how to trigger `Shardformer` by calling Shardformer APIs directly. In the `train` function of the example code, the model is wrapped by `Shardformer` through the following few lines:
```python
...
if dist.get_world_size() > 1:
    tp_group = dist.new_group(backend="nccl")

    # First create configuration for Shardformer
    shard_config = ShardConfig(
        tensor_parallel_process_group=tp_group,
        enable_tensor_parallelism=True,
        enable_all_optimization=True
    )

    # Then create ShardFormer object with created config
    shard_former = ShardFormer(shard_config=shard_config)

    # Finally shard the model by calling ShardFormer.optimize
    model, shared_params = shard_former.optimize(model)
...
```
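
After `optimize` returns, the sharded model is still a regular `torch.nn.Module`: every rank calls forward and backward in the usual way, while each rank holds only its shard of the tensor-parallel layers.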