microddp
Principles
Syllabus
Intro
What is distributed training?
Manual
Manually split the batch across two “GPUs” and average gradients.
Sandbox
Play with all-reduce ops.
All-Reduce
Various implementations.
Lab
Performance Analysis
When is DDP worth it, and how well does it scale?
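
As a taste of the Manual unit, here is a minimal sketch of splitting a batch across two simulated "GPUs" and averaging their gradients. It is not the course's own code: the toy linear model, the data, and the helper name `grad_mse` are all assumptions made for illustration, and plain Python lists stand in for tensors and devices.

```python
# Hypothetical toy setup: a one-parameter linear model y = w * x
# trained with mean squared error. Nothing here comes from the course code.

def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Split the batch into two equal shards, one per simulated "GPU".
half = len(xs) // 2
shards = [(xs[:half], ys[:half]), (xs[half:], ys[half:])]

# Each worker computes a gradient using only its own shard.
local_grads = [grad_mse(w, sx, sy) for sx, sy in shards]

# Averaging the local gradients plays the role of an all-reduce (mean).
avg_grad = sum(local_grads) / len(local_grads)

# Reference: the gradient computed on the whole batch at once.
full_grad = grad_mse(w, xs, ys)

print(avg_grad, full_grad)
```

Because the two shards are the same size, the averaged gradient matches the full-batch gradient. With unequal shards, a plain average would weight samples unevenly, so the local gradients would need to be weighted by shard size.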