v0.17.0.post1
Bugfix: make horovodrun work with ABI-incompatible frameworks
-
v0.17.0
MPI-less Horovod alpha, TensorFlow 2.0 support, MXNet 1.5.0 support, g++ autodetection, RDMA in Dockerfile
-
v0.16.4
Improved optimizer.skip_synchronize() API for NVIDIA AMP
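For context, the usage pattern this API supports (per Horovod's AMP documentation of that era) looks roughly like the sketch below. The model, data, and loss are placeholder toys; running it assumes NVIDIA apex is installed, a CUDA GPU is available per process, and the script is launched with horovodrun or mpirun.

```python
import torch
import horovod.torch as hvd
from apex import amp  # NVIDIA apex, assumed installed

hvd.init()
torch.cuda.set_device(hvd.local_rank())

# Placeholder model, optimizer, and data for illustration only.
model = torch.nn.Linear(16, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

data = torch.randn(32, 16).cuda()
target = torch.randn(32, 1).cuda()

loss = torch.nn.functional.mse_loss(model(data), target)
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
    # Finish Horovod's allreduce while the gradients are still scaled...
    optimizer.synchronize()
# ...then let step() run without triggering a second synchronization.
with optimizer.skip_synchronize():
    optimizer.step()
```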
-
v0.16.3
Intel MLSL support, improvements to gradient clipping & NVIDIA AMP performance, bugfixes
-
v0.16.2
Apache MXNet 1.4.1 compatibility, improvements to coordination performance at ultra-large scale, stall message improvements, bugfixes
-
v0.16.0
PySpark, Apache MXNet, autotuning, TensorFlow eager execution, mixed-precision & embedding improvements
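To illustrate the eager-execution support, gradients can be averaged across workers by wrapping the tape in hvd.DistributedGradientTape. This is a minimal sketch assuming TensorFlow 1.x with eager mode enabled and a toy one-parameter regression; launch one process per worker as usual.

```python
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()
tf.enable_eager_execution()  # TF 1.x eager mode

# Toy parameter and data for illustration only.
w = tf.Variable([[2.0]])
x = tf.constant([[1.0], [2.0], [3.0]])
y = tf.constant([[3.0], [6.0], [9.0]])

opt = tf.train.GradientDescentOptimizer(0.01 * hvd.size())

for _ in range(10):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))
    # Wrap the tape so gradient() returns allreduced gradients.
    tape = hvd.DistributedGradientTape(tape)
    grads = tape.gradient(loss, [w])
    opt.apply_gradients(zip(grads, [w]))
```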
-
v0.15.2
Force allreduce of all gradients in step(), bugfixes
-
-
v0.13.11
Add compatibility with PyTorch 0.4.1
-
v0.13.10
Support for IBM PowerAI DDL & APIs to restore optimizer state
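The optimizer-state restore referred to here is done with hvd.broadcast_optimizer_state, alongside the existing hvd.broadcast_parameters. A minimal resume-from-checkpoint sketch follows; the checkpoint path and dictionary layout are hypothetical.

```python
import torch
import horovod.torch as hvd

hvd.init()
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Load the checkpoint on rank 0 only ('checkpoint.pt' is a hypothetical path).
if hvd.rank() == 0:
    state = torch.load('checkpoint.pt')
    model.load_state_dict(state['model'])
    optimizer.load_state_dict(state['optimizer'])

# Broadcast the restored weights and optimizer state to all other ranks.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)
```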
-
v0.13.8
Critical Bugfix: PyTorch must wait for GPU data before allreduce
-
v0.13.7
Critical Bugfix: non-fused allreduce produces incorrect results
-
v0.13.5
Fix PyTorch master break: use proper THTensor_storage() API
-
v0.13.4
mpi4py bugfix: create a private MPI communicator
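The practical effect: with its collectives running on a private duplicate of the world communicator rather than on MPI_COMM_WORLD itself, Horovod can coexist with application-level mpi4py calls. A minimal sketch, assuming the PyTorch build of Horovod:

```python
from mpi4py import MPI
import horovod.torch as hvd

# Horovod now issues its collectives on a private duplicate of
# MPI_COMM_WORLD, so the application may keep using mpi4py's
# world communicator at the same time.
hvd.init()
comm = MPI.COMM_WORLD
assert comm.Get_rank() == hvd.rank()

# Application-level MPI traffic does not interfere with Horovod's.
total = comm.allreduce(hvd.rank(), op=MPI.SUM)
```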
-
v0.13.3
Collective control plane & other low latency improvements