WeeklyTelcon_20230307

Geoffrey Paulsen edited this page Mar 7, 2023 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (Amazon)
  • Edgar Gabriel (AMD)
  • Geoffrey Paulsen (IBM)
  • Howard Pritchard (LANL)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart (UTK)
  • Josh Fisher (Cornelis Networks)
  • Luke Robison (Amazon)
  • Quincey
  • Todd Kordenbrock (Sandia)
  • Tomislav Janjusic (NVIDIA)
  • William Zhang (AWS)

Not here today, but keep here for easy cut-n-paste for future.

  • Austen Lauria (IBM)
  • Christoph Nietham
  • David Bernholdt
  • Josh Hursey (IBM)
  • Matthew Dosanjh (Sandia)
  • Thomas Naughton (ORNL)

New Items

  • New Blocker v5.0 issue: https://github.com/open-mpi/ompi/issues/11471

    • Can't launch more than the default RADIX when using a non-ssh launcher.
    • Should be fixed soon.
    • Need a new issue for v5/main: ORTE RADIX
      • Please add a test that sets RADIX to 1 and turns off tree spawn.
      • This isn't viable for the Jenkins tests, at least, since they all run on one node today.
      • Add another MTT config instead.
  • New issue #11448: CUDA/HAN collective infinite loop (v5.0 blocker)

  • MPIR Shim (https://github.com/openpmix/mpir-to-pmix-guide) went away.

    • Howard grabbed a fork of it.
    • There may also be ecosystems where tools (older? current?) only support MPIR, not the newer PMIx_Tool API.
    • There are a ton of tools that support MPIR, but have they converted to and released a PMIx_Tools interface?
    • This is a new feature for Open MPI v5.0.0 (MPIR is gone; now use PMIx_Tools-based debuggers/profilers).
    • The shim is not packaged with the Open MPI release.
    • It feels like people will complain loudly as soon as we announce that MPIR is going away.
    • Open MPI has CI that tests MPIR-shim with PMIx and PRRTE.
    • A customer does have interest in MPIR-Shim for STAT and FLUX and other tools.
      • If no one else wants to host it, we'll probably host it somewhere and hook up whatever testing can be done with GitHub Actions.
      • The customer will want to use it.
    • Some docs now have broken links because it was removed.
    • What happened to the MPIR-shim wiki?
      • Don't know what was on it; it might just be gone.

v4.1.x

  • Released v4.1.5.

v5.0.x

  • Will need another RC after we get the submodules for PMIx/PRRTE.
  • How is the documentation going?
    • Perhaps 40% of the way through the FAQ.
  • Issue #11347: Versioning is wrong in v5.0.x
  • Runtime docs stuff should be doable by end of the month.
  • We had talked about supplying some docs on why HAN is great and why we're enabling it by default for v5.0.0.
    • We'd also like to include instructions for users on how to reproduce.

Main branch

Administration Topics
