
WeeklyTelcon_20230725

Geoffrey Paulsen edited this page Jul 25, 2023 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Tommy Janjusic (NVIDIA)
  • Geoff Paulsen (IBM)
  • Jeff Squyres (Cisco)
  • Howard Pritchard (LANL)
  • Luke Robison (AWS)
  • Edgar Gabriel (AMD)
  • Quincey Koziol (AWS)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)
  • Tommy J (nVidia)

V4.1

  • Readying release. A change was needed for external PMIx: the Open MPI release that's out there (v4.1.5) does not support PMIx version XX, so a change had to be made to support that. A small number of other changes and bug fixes made it to the branch; low-impact release.

V5.0

Old/New Issues:

  • https://github.com/open-mpi/ompi/issues/11532

    • Got good discussion on what needs to be done.
    • Jeff wrote his notes at the bottom.
  • Update mpirun man page #11730 - Quincey / Jeff, need first part to be completed by Jeff

    • Continues making good progress, iterating with Ralph.
  • https://github.com/open-mpi/ompi/issues/10099 Edgar - going to push the first part of the patch; the second part (Fortran stuff) will need help from Jeff.

    • Done - v5.0.x PR - Dying

    • Geoff will follow up on the IBM CI.

    • IMPORTANT part of ABI claim: ABI for C. We support the Fortran ABI, except for 3 minor cases, and assuming you're using the Fortran compilers.

  • https://github.com/open-mpi/ompi/pull/11818/ - need to see what George says; if he nixes the current proposal, we're going to open up a PR to address the structure issues.

    • All sorted out. Done.
    • Needs to go back to v5.0.x
  • https://github.com/open-mpi/ompi/issues/7668 - review the current output and updated changes

    • This is WHY we need to document many of the new behaviors in v5.0.x
    • Behavior has changed, and need to document the new behavior.
  • https://github.com/open-mpi/ompi/issues/11734 - need to review and close (could be addressed); the big one is how to build with ompi v5.0

  • https://github.com/open-mpi/ompi/issues/11831

    • Change to cuda component causes some issues.
    • Delayed init DID work, but when mixed with the cuda component disqualifying itself, it causes issues.
      • May be a blocker???
  • https://github.com/open-mpi/ompi/issues/11798

    • Open MPI v4.1 CUDA memcpy issue: 11798 (Tommy)
    • OMPIO should be able to write from device memory directly into a file.
    • This used to work.
    • Reported this against v4.1.x
  • https://github.com/open-mpi/ompi/issues/10657

    • Open table in Docs
    • If you're going to launch with mpirun/PRTE - requires PRTE v3
      • Because then we're locked into PMIx v4.1.2
    • Earlier we'd said we'd really like to support earlier versions of PMIx?
      • Potentially pre-installed on the system.
    • Direct Launch - you could go all the way back to PMIx v3 but would have to configure without sessions and without ULFM.
      • ULFM needs a newer PMIx than sessions needs.
    • Don't like to make decisions based on versions, want to make decisions based on capabilities
    • Keep the table the same for both direct launch and mpirun launch (unless we need to)
      • Meeting on this tomorrow.
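
A minimal sketch of the "capabilities, not versions" idea from the last item, for illustration only. Everything here is hypothetical: the capability names (`group_construct`, `fault_events`), the requirements table, and both helper functions are made up for this example and are not PMIx, PRRTE, or Open MPI identifiers. The only thing taken from the notes is that ULFM needs more from the runtime than sessions does.

```python
# Hypothetical sketch: none of these capability names come from PMIx,
# PRRTE, or Open MPI. They only illustrate gating features on what the
# runtime can do rather than on its version number.

# Capabilities each optional feature is ASSUMED to need (illustrative only).
# ULFM is given a superset of what sessions needs, per the discussion.
FEATURE_REQUIREMENTS = {
    "sessions": {"group_construct"},
    "ulfm": {"group_construct", "fault_events"},
}


def enabled_features(available_caps):
    """Return the features whose required capabilities are all present."""
    caps = set(available_caps)
    return sorted(
        name
        for name, needed in FEATURE_REQUIREMENTS.items()
        if needed <= caps  # subset test: every needed capability is available
    )


def missing_capabilities(feature, available_caps):
    """Return the capabilities a feature still lacks, for diagnostics."""
    return sorted(FEATURE_REQUIREMENTS[feature] - set(available_caps))
```

With a table like this, an older pre-installed runtime that happens to provide a capability would still enable the matching feature, and a configure-time or run-time message could list exactly what is missing instead of only naming a minimum version.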