tensor parallelism

Jan 30, 2025

technical

A technique for fitting a large model across multiple GPUs when it is too big for one.

Each GPU holds and processes only a slice of a tensor; the full tensor is gathered across GPUs only for the operations that require it.
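
As a rough illustration of the idea, the sketch below simulates a column-wise split of a weight matrix in plain Python, with nested lists standing in for tensors and list entries standing in for GPUs. The helper names (`matmul`, `split_columns`, `column_parallel_matmul`) are hypothetical and not taken from any library. It also assumes the number of columns divides evenly by the number of devices. A real setup would replace the final concatenation with a collective such as an all-gather.

```python
def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def split_columns(w, num_devices):
    # Give each simulated device an equal, contiguous block of columns.
    # Assumption: the column count is divisible by num_devices.
    width = len(w[0]) // num_devices
    return [[row[d * width:(d + 1) * width] for row in w]
            for d in range(num_devices)]


def column_parallel_matmul(x, w, num_devices):
    shards = split_columns(w, num_devices)
    # Each device multiplies the full input by its own weight slice only.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: stitch the per-device output slices back together.
    full = [sum((p[i] for p in partials), []) for i in range(len(x))]
    return full, partials


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
full, partials = column_parallel_matmul(x, w, num_devices=2)
```

Here `partials[0]` and `partials[1]` are the output slices each simulated device would hold, and `full` equals `matmul(x, w)` computed without any split. Between layers the slices can often stay distributed, so the gather is only paid for when a later operation needs the whole tensor.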
