Pull requests: NVIDIA/Megatron-LM
- #1705 Fix a typo on README git checkout [module: documentation] · opened Jul 24, 2025 by GindaChen
- #1703 BugFix: FP8 Communication Mismatch with --first-last-layers-bf16 in tp-comm-overlap [bug, module: transformer engine] · opened Jul 24, 2025 by xiaomin-D
- #1692 Align import to existing module [module: data pipeline] · opened Jul 15, 2025 by AlexanderLavelle
- #1684 fix(mtp logging): Correctly accumulate MTP loss for logging when log_interval > 1 [module: moe] · opened Jul 11, 2025 by Luowaterbi
- #1682 Update pretrain_mamba.py [bug, module: documentation] · opened Jul 11, 2025 by vignesh1507
- #1681 [feat, moe] Add support for global aux loss [module: moe] · opened Jul 11, 2025 by Victarry
- #1673 Issue 1672 fix: initializing the current pointed with int64 to avoid … [bug] · opened Jul 7, 2025 by sharanmayank
- #1662 Speed up model parallel initialization [module: distributed] · opened Jul 2, 2025 by alexqdh
- #1654 bug fixed: wandb artifact requires the tracker file [module: debugging] · opened Jun 27, 2025 by yezhengmao1
- #1651 Apply roll operation to position_ids in MTP [module: moe] · opened Jun 26, 2025 by iansheng
- #1645 fix twice allgather in moe distrib optimizer [module: moe] · opened Jun 23, 2025 by irobot2013-why
- #1631 Fix log-timer-to-tensorboard on logging [module: debugging] · opened Jun 13, 2025 by wplf
- #1626 Fix typos: vritual → virtual and decoeder → decoder [module: documentation] · opened Jun 11, 2025 by EricLabile
- #1624 Fix: Apply q_layernorm consistently in MLA LoRA path [module: fine-tuning] · opened Jun 11, 2025 by Flink-ddd
- #1622 fix: when using moe parallel folding feature and set etp > 1 && ep == 1, the grad sync is incorrect and the loss curve is bad [bug, module: moe] · opened Jun 10, 2025 by Louis-J
- #1610 use a cpu set to cache cuda tensor finished_request_ids [module: inference] · opened Jun 5, 2025 by ladyrick
- #1605 Add DistTrain, Allow Encoder to Have Different DP Size [module: multimodal] · opened May 30, 2025 by zidanehuang001
- #1604 add node_rank argument for example scripts [module: training] · opened May 30, 2025 by xylllllllll