WeeklyTelcon_20220322

Geoffrey Paulsen edited this page Mar 22, 2022 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Austen Lauria (IBM)
  • David Bernhold (ORNL)
  • Edgar Gabriel (UoH)
  • Geoffrey Paulsen (IBM)
  • Harumi Kuno (HPE)
  • Howard Pritchard (LANL)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart
  • Josh Hursey (IBM)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)
  • William Zhang (AWS)

not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (nVidia)
  • Aurelien Bouteiller (UTK)
  • Brandon Yates (Intel)
  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (AWS)
  • Charles Shereda (LLNL)
  • Christoph Niethammer (HLRS)
  • Erik Zeiske
  • Geoffroy Vallee (ARM)
  • George Bosilca (UTK)
  • Hessam Mirsadeghi (UCX/nVidia)
  • Joshua Ladd (nVidia)
  • Marisa Roman (Cornelis Networks)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Matthew Dosanjh (Sandia)
  • Michael Heinz (Cornelis Networks)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Raghu Raja (AWS)
  • Ralph Castain (Intel)
  • Sam Gutierrez (LLNL)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Tomislav Janjusic (nVidia)
  • Xin Zhao (nVidia)

v4.1.x

  • Schedule: Shooting for v4.1.3 end of March/Q1.
    • Goal v4.1.3rc2 Today.
  • No other update.
  • New docs landing page on read-the-docs to say docs are NOT there.
  • Fortran Elemental - https://github.com/open-mpi/ompi/pull/10119
    • New MPI Errata
    • Better for users, and does NOT affect ABI.
    • Only in v4.1.3 and v5.0.0+

Read the Docs

  • merged to master last night.

  • Jeff sent email, and will either put into the wiki or in docs themselves

  • Jeff shared https://docs.open-mpi.org/

    • This will be for v5.0.0 and later
    • Links to older docs for v4.1.x and earlier
  • Also a mobile rendering

  • Think the docs configury is done.

    • If issues, slack or email devel
  • Tons of stuff that is ready, but lots of places

  • Thanks to Harumi for converting all of the man-pages

    • They look great!
    • They are cross-referenced now.
  • For developers: when you git clone, you'll now get a docs/ directory

    • There's an RST guide under developer's guide.
  • Now when you push a PR, there's a details link under the Read-the-docs CI under your PR, for you to preview that PR.

  • This is true for master and will be in a few weeks for v5.0.x release branches.

  • Going to let this soak a while on master, but hope to bring to v5.0.x after a bit more testing on master.

  • Read the Docs uses branch names in its URLs, so this may be a driver to change master to main.

    • The branch name is in the URL, so we might want to do this sooner, before others cache URLs.
  • Official Tarballs will have html and man-pages pre-built

  • Developers will need to install sphinx to generate html and man-pages.

    • Open MPI Developers guide has a page on how to install sphinx.
    • uses sphinx-build, and (like Make) it's stateful and only rebuilds changes.
  • Just open build/index.html locally in browser.

  • When you git clone, there is NO build directory. It can take 3-5 minutes to generate the build directory.

    • But if you want you can just rm -rf build.
  • When opening a code-change PR against master, it's good to include both the doc updates AND the code changes in the same PR,

    • But if you
  • What's the behavior if you don't have sphinx installed?

    • Configure will just skip building the docs.
    • BUT in this state, you won't be able to do make dist
  • Amazon is calling make dist in CI.

    • so CI should be covered.
    • This will exercise the docs build and fail CI if there is an error in the docs.
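
The missing-Sphinx behavior noted above can be pictured with a small Python sketch. This is an illustration only, not the actual configure logic: the function name `docs_build_plan` and the returned keys are hypothetical, and it assumes the check amounts to finding `sphinx-build` on the PATH.

```python
import shutil


def docs_build_plan(find_tool=shutil.which):
    """Hypothetical summary of what a git-clone build can do, per the notes.

    find_tool is injectable so the behavior can be exercised without
    installing or uninstalling Sphinx.
    """
    have_sphinx = find_tool("sphinx-build") is not None
    return {
        # Per the notes: configure just skips building the docs without Sphinx.
        "build_docs": have_sphinx,
        # Per the notes: in that state, make dist is not possible.
        "can_make_dist": have_sphinx,
    }


if __name__ == "__main__":
    print(docs_build_plan())
```

Running it on a developer machine simply reports whether a docs build and make dist would be possible under these assumptions.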

v5.0.x

  • Schedule: v5.0.0rc4 Friday 3/25.
    • Need a pmix/prrte submodule pointer update.
    • Issues with --version and help file.
  • Could do cherry-pick of all the docs to v5.0.x
  • POSIX-style command line options use double dashes, but we also accept single-dash options such as -np (for -n).
    • There's a PR in PRRTE to silently convert all of the single dash options to double dashes.
    • We do this conversion, then just call getopt_long()
      • No need for warning, since we don't think we want to drop single dash options EVER.
    • All of this code lives in the ompi schizo, so we can do this.
    • Do we document these single dash options?
      • Need to document some of them, because it'd be very weird to not document the MPI specified ones.
    • Could just have a single line saying we do this, since we hope to maintain this long term
    • If we do mpirun -n in docs... but then also say -np is a synonym for -n.
  • Renewed interest in not breaking ABI/API/CLI backwards compatibility for 'C'
    • mca parameters,
    • Only documenting the double-dash options in docs, but still support single dash
      • TODO
    • We did break ABI for Fortran bindings, but this was deemed acceptable.
  • https://docs.open-mpi.org
    • When we push a v5.0.0 tag, a regular expression will match it and make this version available on the website.
  • Brian's trying to get to the point where static builds work correctly with prte and pmix.
    • Had to rewrite the check_package macro (will port to OMPI) to use pkg-config properly.
    • If we have a pkg-config file, we will use that rather than grepping around for this.
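
The single-dash conversion discussed in this section (silently rewrite single-dash options to double dashes, then hand the result to a long-option parser) can be sketched roughly as below. This is a Python illustration, not the PRRTE code: Python's `getopt` module stands in for `getopt_long()`, and the option table `KNOWN` plus the helper names `normalize` and `parse` are hypothetical.

```python
import getopt

# Hypothetical option table for illustration: name -> takes a value?
KNOWN = {"np": True, "n": True, "host": True, "version": False}


def normalize(argv):
    """Rewrite known multi-letter single-dash options (-np) to --np.

    Stops at "--" or the first non-option word, so the application's
    own arguments are left untouched.
    """
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--" or not arg.startswith("-"):
            break
        name = arg.lstrip("-").split("=", 1)[0]
        if not arg.startswith("--") and len(name) > 1 and name in KNOWN:
            arg = "-" + arg  # silent conversion, no warning
        out.append(arg)
        i += 1
        # Copy the option's value as-is so it is never mistaken for an option.
        if KNOWN.get(name, False) and "=" not in arg and i < len(argv):
            out.append(argv[i])
            i += 1
    return out + argv[i:]


def parse(argv):
    """Normalize, then parse with long options; -np is a synonym for -n."""
    longopts = [name + ("=" if takes else "") for name, takes in KNOWN.items()]
    opts, rest = getopt.getopt(normalize(argv), "n:", longopts)
    parsed = {}
    for flag, value in opts:
        key = flag.lstrip("-")
        if key == "np":
            key = "n"
        parsed[key] = value
    return parsed, rest


if __name__ == "__main__":
    print(parse(["-np", "4", "./a.out", "-np"]))
```

Doing the rewrite before parsing keeps the parser itself simple: it only ever sees double-dash options, while users can keep typing the single-dash forms.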

Master

  • Some email on devel-core suggested that renaming master to main might be painful for the Spack community.
    • Most people will be using 4.x
    • Most Spack community users won't be using master, except for some
    • The Spack version of the branch is named devel.
    • Should not be impactful at all.
    • Howard will open a PR against Spack, and should also give a heads up to EasyBuild. It should be very obvious, and should be a quick fix for repackagers.
    • If we give people time to change, they won't change until it breaks.
    • George mentioned that GitHub has branch aliasing, but thinks it's an individual developer's setting in git
  • If you go to https://docs.open-mpi.org/ you'll notice that the name of the branch is in the URL.
    • Friday is the date of the change.
    • Community likes name main.
    • When we make the change, probably delete name master.
  • PRRTE seems to be busted when using SLURM launcher.

4.0.x

  • Dropping v4.0.x discussion from weekly meeting unless something exceptional.
  • Schedule: No schedule for v4.0.8 yet
  • Winding down v4.0.x; will stop after v5.0.x is released.
  • Really only want small changes reported by users.
  • New docs landing page on read-the-docs to say docs are NOT there.

MTT

  • Some Cisco Build Failures, haven't looked at yet.
  • A fix is pending to work around the IBM XL MTT build failure (compiler abort)
  • Issue 9919 - Thinks this common component should still be built.
    • Commons get built when it's likely there is a dependency.
    • Commons self-select if they should be built or not.
  • UCX one sided fixes merged to master/v5.0.x and many working now.

Face-to-face
