
WeeklyTelcon_20220531

Geoffrey Paulsen edited this page May 31, 2022 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (AWS)
  • Edgar Gabriel (UoH)
  • Geoffrey Paulsen (IBM)
  • Hessam Mirsadeghi (UCX/nVidia)
  • Howard Pritchard (LANL)
  • Josh Fisher (Cornelis Networks)
  • Josh Hursey (IBM)
  • Matthew Dosanjh (Sandia)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)
  • Tommy Janjusic (nVidia)

not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (nVidia)
  • Aurelien Bouteiller (UTK)
  • Austen Lauria (IBM)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Christoph Niethammer (HLRS)
  • David Bernholdt (ORNL)
  • Erik Zeiske
  • Geoffroy Vallee (ARM)
  • George Bosilca (UTK)
  • Harumi Kuno (HPE)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart
  • Joshua Ladd (nVidia)
  • Marisa Roman (Cornelis Networks)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Michael Heinz (Cornelis Networks)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Raghu Raja (AWS)
  • Ralph Castain (Intel)
  • Sam Gutierrez (LLNL)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • William Zhang (AWS)
  • Xin Zhao (nVidia)

v4.1.x

  • v4.1.4 Released!
    • A dozen bugfixes
    • UCC backported
  • v4.1.5 - Soft Schedule targeting Novemberish
    • No driver on schedule yet.

v5.0.x

  • Schedule:
    • Blockers are still the same.
    • PRRTE blocker -
    • Right now the release is looking like late summer, because we do not have a PRRTE release for packagers to package.
      • Call for help - If anyone has resources to help, we can move this release date much sooner.
      • Requires investment from us.
    • Blockers are listed; some are in the PRRTE project.
    • Any Alternatives?
      • The problem for Open MPI is not that PRRTE isn't ready to release. The parts we use work great, but other parts still have issues (namely DVM).
      • The difficulty is that we install PMIx and PRRTE as if they came from their own tarballs.
        • This leaves packagers no good way to distribute Open MPI.
      • How do we install PMIx and PRRTE in open-mpi/lib instead and get all of the rpaths correct?
      • This might be the best bet (aside from fixing the PRRTE sources, of course).
  • Several Backported PRs
  • New issue opened on performance when oversubscribed.
  • New issue opened on topology problems when mapping by L3 cache.

Main branch

  • Please HELP!
    • Performance test the default selection of Tuned vs. HAN.
    • Brian hasn't had time (and might not for a while) to send out instructions on how to test.
      • Can anyone send out these instructions?
    • Call for folks to performance test at 16 nodes, and at whatever "makes sense" for them.
  • The accelerator work that William is doing should be able to come out of draft.
    • Edgar has been working on the ROCm component of the framework.
    • Post v5.0.0? The original position was that it shouldn't go into v5.0.0 since the release was close, but if the release slips to the end of summer, we'll see.
  • Can anyone who understands packaging review: https://github.com/open-mpi/ompi/pull/10386 ?
  • Automate 3rd-party minimum version checks by moving them into a common txt file.
    • Both configure and the docs could read from this file.
    • conf.py runs at the beginning of Sphinx and could read in files, etc.
    • Still iterating on.
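
One possible shape for the common-file idea, as a rough sketch only: the notes do not say what the file would look like, so the `name=version` line format, the key names, and the version numbers below are all hypothetical placeholders, not the project's real minimums. The parsing helper is the kind of thing a Sphinx conf.py could call at startup, with configure reading the same file by its own means.

```python
# Sketch only: the file format, key names, and version numbers are
# hypothetical placeholders, not Open MPI's actual minimum versions.

def parse_min_versions(text):
    """Parse 'name=version' lines, skipping blank lines and '#' comments."""
    versions = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        versions[name.strip()] = value.strip()
    return versions


def version_tuple(version):
    """Turn a dotted numeric version such as '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(found, minimum):
    """Compare numerically, so that 1.10.0 counts as newer than 1.9.5."""
    return version_tuple(found) >= version_tuple(minimum)


# Placeholder contents standing in for the shared txt file.
SAMPLE = """\
# hypothetical minimum versions
pmix_min_version=1.2.3
prrte_min_version=4.5.6   # placeholder value
"""

MIN_VERSIONS = parse_min_versions(SAMPLE)
```

In a conf.py, the parsed dictionary could then be used to fill in substitutions for the docs, so the documented minimums cannot drift from the ones configure enforces.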

MTT

Face-to-face
