Capability
Distributed Collective Operations And Tensor Utilities
2 artifacts provide this capability.
Easy distributed training: abstracts PyTorch distributed, DeepSpeed, and FSDP behind a simple API (see the training-loop sketch after this listing).
Unique: Abstracts backend-specific collective-operation APIs (DDP's all_reduce, FSDP's scatter_full_optim_state_dict, DeepSpeed's communication hooks) behind a unified interface, and handles tensor dtypes automatically (e.g., converting to float32 for an all-reduce when needed); see the collective-ops sketch below.
vs others: More convenient than raw PyTorch distributed calls and more portable than coding against any single backend's API; also provides RNG synchronization utilities that raw PyTorch does not.
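As a concrete illustration of the "simple API" claim, here is a minimal training-loop sketch. The listing does not name the artifact; the sketch assumes it is Hugging Face Accelerate, so the Accelerator, prepare, and backward calls below come from that library rather than from this page.

```python
# Minimal sketch, assuming the artifact is Hugging Face Accelerate (not named in this listing).
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # picks up DDP, FSDP, or DeepSpeed from the launch configuration

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = torch.utils.data.TensorDataset(torch.randn(1024, 128), torch.randint(0, 10, (1024,)))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)

# prepare() wraps each object for whichever distributed backend is active.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    accelerator.backward(loss)  # used instead of loss.backward() so scaling/sharding is handled
    optimizer.step()
```

The same loop is launched unchanged under any of the supported backends; only the launch configuration changes.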
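And a short sketch of the unified collective operations and RNG utilities described under "Unique" and "vs others", under the same Hugging Face Accelerate assumption; reduce, gather, set_seed, and synchronize_rng_states are helpers from that library, not names taken from this listing.

```python
# Sketch of unified collectives and RNG sync, assuming Hugging Face Accelerate.
import torch
from accelerate import Accelerator
from accelerate.utils import set_seed, synchronize_rng_states

accelerator = Accelerator()
set_seed(42)  # seed every process identically

# Per-process metric; per the listing above, backend-specific collective details
# (including any dtype handling) sit behind this single call.
local_loss = torch.tensor(0.5, device=accelerator.device, dtype=torch.float16)
mean_loss = accelerator.reduce(local_loss, reduction="mean")  # same call under DDP/FSDP/DeepSpeed
all_preds = accelerator.gather(torch.randn(4, 10, device=accelerator.device))

# Re-align the torch RNG state across processes, e.g. before a synchronized shuffle.
synchronize_rng_states(["torch"])

accelerator.print(f"mean loss across ranks: {mean_loss.item():.4f}")
```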