
Megatron-LM Update

13 Mar 2025

  • paper

This paper extends the tensor-parallel scheme introduced by Megatron-LM with two additional techniques:

  • Sequence parallelism
  • Selective activation recomputation
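
The core idea of sequence parallelism is that activations in the LayerNorm and dropout regions, which tensor parallelism leaves replicated on every rank, can instead be partitioned along the sequence dimension, with an all-gather restoring the full sequence before the tensor-parallel GEMMs. A minimal toy sketch of that layout, using only shapes and plain Python (the rank count, sizes, and helper names here are illustrative, not from the paper):

```python
# Toy simulation of the sequence-parallel activation layout.
# SEQ/HIDDEN/TP values and helper names are illustrative assumptions.

SEQ, HIDDEN, TP = 8, 4, 2  # sequence length, hidden size, parallel group size

def split_along_seq(x, ranks):
    """Scatter along the sequence dim: each rank holds SEQ/ranks rows."""
    chunk = len(x) // ranks
    return [x[r * chunk:(r + 1) * chunk] for r in range(ranks)]

def all_gather_seq(shards):
    """All-gather along the sequence dim before the tensor-parallel GEMMs."""
    full = [row for shard in shards for row in shard]
    return [full for _ in shards]  # every rank now holds the full sequence

# Full activation: SEQ rows of HIDDEN features.
x = [[float(i)] * HIDDEN for i in range(SEQ)]

# Sequence-parallel region (LayerNorm / dropout): each rank stores only
# 1/TP of the sequence, so these activations are no longer replicated.
shards = split_along_seq(x, TP)
assert all(len(s) == SEQ // TP for s in shards)

# Entering the tensor-parallel region, an all-gather restores the full
# sequence on every rank (a reduce-scatter reverses this on the way out).
gathered = all_gather_seq(shards)
assert all(len(g) == SEQ for g in gathered)
```

In a real implementation these collectives are fused into the forward and backward passes so the extra communication replaces, rather than adds to, the all-reduces that tensor parallelism already performs.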

References

  • Reducing Activation Recomputation in Large Transformer Models
