v0.4.0

@alculquicondor released this 05 Apr 20:54
· 42 commits to master since this release
c77dfcf

Changes since 0.3.0

  • Breaking changes
    • Removed the v1 operator. To keep using MPIJob v1, use the training-operator.
  • Support for suspend semantics. Third-party controllers can use the suspend field to implement queuing and preemption for an MPIJob.
  • Support for the coscheduling plugin of scheduler-plugins.
  • The operator supports multiple architectures (amd64, aarch64, and ppc64le).
  • Bug fixes
    • Fix support for elastic Horovod.
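
As a rough illustration of the suspend support listed above, a third-party queueing controller could hold and release an MPIJob by toggling the suspend flag. The sketch below only builds the patch bodies in Python and does not talk to a cluster. The field path spec.runPolicy.suspend, the kubeflow.org/v2beta1 API version, the job name, and the helper names are assumptions that these notes do not state, so check them against the installed CRD before relying on them.

```python
import json

# Assumed location of the suspend flag in an MPIJob spec (not stated in
# these release notes; verify against the installed CRD).
SUSPEND_PATH = ("spec", "runPolicy", "suspend")


def suspend_patch(suspend: bool) -> dict:
    """Build a JSON merge patch body that sets the suspend flag."""
    patch: dict = {}
    node = patch
    for key in SUSPEND_PATH[:-1]:
        node = node.setdefault(key, {})
    node[SUSPEND_PATH[-1]] = suspend
    return patch


def is_suspended(mpijob: dict) -> bool:
    """Return True if the flag is set; a missing field counts as False."""
    node = mpijob
    for key in SUSPEND_PATH:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is True


# Hypothetical job created in a suspended state so a queue can admit it later.
job = {
    "apiVersion": "kubeflow.org/v2beta1",  # assumed API version
    "kind": "MPIJob",
    "metadata": {"name": "example-job"},   # hypothetical name
    "spec": suspend_patch(True)["spec"],
}

# Body a controller might send to release the job once capacity is free.
print(json.dumps(suspend_patch(False)))
```

A controller would send the suspend_patch(False) body as a merge patch (for example with kubectl patch --type=merge) to admit the job, and the suspend_patch(True) body to preempt it.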

Acknowledgements

Special thanks to @tenzen-y for multiple contributions.
Thank you to all the contributors (in no particular order): @mimowo @adilhusain-s @davidLif @ArangoGutierrez @shaowei-su @ggaaooppeenngg @pugangxa @HeGaoYuan @Dimss @alculquicondor @terrytangyuan