WeeklyTelcon_20191112

Open MPI Weekly Telecon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Geoffrey Paulsen (IBM)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)
  • Noah Evans (Sandia)
  • Artem Polyakov (Mellanox)
  • Austen Lauria (IBM)
  • Howard Pritchard (LANL)
  • Edgar Gabriel (UH)
  • Josh Hursey (IBM)
  • Matthew Dosanjh (Sandia)
  • William Zhang (AWS)
  • Jeff Squyres (Cisco)
  • Brendan Cunningham (Intel)
  • Brian Barrett (AWS)

Not there today (I keep this list for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • George Bosilca (UTK)
  • Michael Heinz (Intel)
  • David Bernhold (ORNL)
  • Harumi Kuno (HPE)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Erik Zeiske
  • Joshua Ladd (Mellanox)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Nathan Hjelm (Google)
  • Ralph Castain (Intel)
  • Xin Zhao (Mellanox)
  • mohan (AWS)

Agenda/New Business

Supercomputing next week:

  • Open call: if you want 1 or 2 slides in the BOF for your organization, they are needed by Thursday evening.
  • No CALL next week (11/19/2019)

New PRRTE launcher proposal on mailing list.

Thomas took a look and made some high-level observations:

  • A comment about stability / testing.
  • There is no explicit testing for ORTE, but it gets tested via Open MPI CI / MTT.
  • PRRTE has less testing, because it's not directly tested.
  • PRRTE will be needed by the PMIx community.
    • Probably the same community, and it would be a shame to double the effort.
  • Binding options are the same and are there in PRRTE.
  • Singleton and ??? frameworks are not in PRRTE, because they're not needed in PMIx.
    • A ticket is open on singletons / PMIx_Spawn() (see the sketch after this list).
  • George's code for reliable connections will get pushed upstream when it's ready.
  • DVM has switched over to PRRTE.
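
As a point of reference for the singleton ticket above, a spawn through the standard PMIx_Spawn() client API might look roughly like the following; spawn_one() and the error handling are invented for illustration, not Open MPI code.

```c
/* Hedged sketch of a singleton spawning one process via PMIx_Spawn();
 * spawn_one() is an invented name, not an Open MPI symbol. */
#include <pmix.h>
#include <string.h>

int spawn_one(const char *cmd)
{
    pmix_app_t *app;
    char nspace[PMIX_MAX_NSLEN + 1];

    PMIX_APP_CREATE(app, 1);     /* allocate one app description */
    app->cmd = strdup(cmd);      /* executable to launch */
    app->maxprocs = 1;           /* a single process */

    pmix_status_t rc = PMIx_Spawn(NULL, 0, app, 1, nspace);
    PMIX_APP_FREE(app, 1);
    return (PMIX_SUCCESS == rc) ? 0 : -1;
}
```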

PRRTE / ORTE Discussion

  • Concerns about supporting another project.
    • It adds another level of overhead coordinating / synchronizing with the PRRTE/PMIx community.
  • It is valuable to have a runtime system that's divorced from MPI.
    • Don't know how to balance these two concerns, because neither is wrong.
  • We need a launcher, but don't want to support more than one.
    • But we support many launchers, like Slurm, LSF, Flux, etc.
    • Those other launchers have companies and organizations behind them, and they support them through Open MPI.
  • A compromise between the two would be to create an ORTE with all MPI removed.
    • easy to make a dist tarball
    • would skirt many political issues
  • Shouldn't support both ORTE and PRRTE.
    • Need to use the high level interfaces for PMIx, so we can move from version to version.
    • So this gets back to making PMIx the first class citizen.
      • This has to happen if we are moving away from ORTE.

OLD Discussion from previous weeks:

  • All of this is in the context of v5.0.

  • Intel is no longer driving PRRTE work, and Ralph won't be available for PRRTE much either.

  • PRRTE will be a good PMIx development environment, but it is no longer focused on being a scalable and robust launcher.

  • The OMPI community could come into PRRTE and put in production / scalability testing, features, etc.

  • Given that we have not been good at contributing to PRRTE (other than Ralph), there's another proposal:

    • ORTE and PRRTE have drifted apart, so transitioning is risky.
  • Step 1. Make PMIx a first-class citizen.

    • Still good to keep PMIx as a static framework (no more glue; still under orte/mca/pmix, but it basically just passes through and calls the PMIx_ APIs directly).
    • Allows us to still have an internal backup PMIx if no external PMIx is found (a rough sketch of the pass-through idea follows).
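
A rough sketch of what the pass-through could look like, assuming direct use of the PMIx client API; ompi_rte_init_sketch() is an invented name, not an actual Open MPI symbol.

```c
/* Hedged sketch (not actual Open MPI code) of the "pass-through" idea:
 * the orte/mca/pmix glue shrinks to direct PMIx_ client calls. */
#include <pmix.h>

static pmix_proc_t myproc;

int ompi_rte_init_sketch(void)
{
    /* No translation layer: call the PMIx client API directly.  If the
     * build found no external PMIx, the internal copy provides the same
     * symbols, so this code is unchanged either way. */
    if (PMIX_SUCCESS != PMIx_Init(&myproc, NULL, 0)) {
        return -1;   /* real code would map to an OMPI error code */
    }
    return 0;
}
```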
  • Step 2. We can whittle down ORTE, since PMIx does much of this.

  • Two things PRRTE won't care about are scale and all binding patterns.

  • Only recent versions of SLURM have PMIx

  • Need to continue to support ssh.

    • Not just core PMIx, still need daemons for SSH to work, but they're not part of PMIx.
    • This is the part of ORTE that we wouldn't be deleting.
  • What do Altair PbsPro and open-source PbsPro do?

    • Torque is different from PbsPro.
  • Are there old systems that we currently support but no longer care about, whose support we could discontinue in v5.x?

    • Who supports PMIx, and who doesn't?
  • If PMIx becomes a first class citizen and rest of code base just makes PMIx calls, how do we support these things?

    • mpirun would still have to launch orteds via plm.
    • srun wouldn't need to.
    • But this is how it works today: Torque doesn't support PMIx at all, but TM just launches ORTEDs.
    • ALPS - aprun ./a.out - requires a.out to connect up to ALPS daemons.
      • Cray still supports PMI - someone would need to write a PMI -> PMIx adapter (see the sketch below).
    • ORTE does not have the concept of persistent daemons.
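
For reference, a PMI -> PMIx adapter of the kind mentioned above could, in the simplest hedged sketch, implement the PMI-1 entry points on top of the PMIx client library; only two calls are shown and the error mapping is simplified.

```c
/* Hedged sketch of a PMI-1 -> PMIx shim. */
#include <pmix.h>

static pmix_proc_t myproc;

int PMI_Init(int *spawned)
{
    if (PMIX_SUCCESS != PMIx_Init(&myproc, NULL, 0)) {
        return 1;                /* PMI_FAIL */
    }
    *spawned = 0;                /* real code would query PMIX_SPAWNED */
    return 0;                    /* PMI_SUCCESS */
}

int PMI_KVS_Put(const char kvsname[], const char key[], const char value[])
{
    pmix_value_t val;
    (void)kvsname;               /* PMIx scopes keys by namespace already */
    PMIX_VALUE_LOAD(&val, (void *)value, PMIX_STRING);
    return (PMIX_SUCCESS == PMIx_Put(PMIX_GLOBAL, key, &val)) ? 0 : 1;
}
```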
  • Is there a situation where we might have a launcher launching ORTEDs where we'd need to relay PMIx calls to the correct PMIx server layer?

    • Generally we won't have that situation, since the launcher won't launch ORTEds.
  • George's work currently depends on PRRTE

    • If ORTEDs provided PMIx events, would that be enough (see the event-handler sketch below)?
      • No; George needs PRRTE's fault-tolerant overlay network.
      • George will scope the effort to port that feature from PRRTE to ORTE.
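
For context, registering for events through the standard PMIx event API looks roughly like the sketch below; whether this alone would be enough (versus PRRTE's overlay network) is exactly the open question. The handler and the chosen status code are illustrative.

```c
/* Hedged sketch: register for process-failure notifications via the
 * standard PMIx event API. */
#include <pmix.h>
#include <stdio.h>

static void evhandler(size_t evhdlr_id, pmix_status_t status,
                      const pmix_proc_t *source,
                      pmix_info_t info[], size_t ninfo,
                      pmix_info_t results[], size_t nresults,
                      pmix_event_notification_cbfunc_fn_t cbfunc,
                      void *cbdata)
{
    fprintf(stderr, "PMIx event %d from %s:%u\n",
            status, source->nspace, source->rank);
    if (NULL != cbfunc) {        /* tell PMIx we are done with the event */
        cbfunc(PMIX_EVENT_ACTION_COMPLETE, NULL, 0, NULL, NULL, cbdata);
    }
}

void register_proc_aborted_handler(void)
{
    pmix_status_t code = PMIX_ERR_PROC_ABORTED;
    PMIx_Register_event_handler(&code, 1, NULL, 0, evhandler, NULL, NULL);
}
```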
  • ACTION: Please gather a list of resource managers and tools that we care about supporting in Open MPI v5.0.x.

  • Today - Howard

    • Summary - make PMIx a first class citizen.
    • Then whittle away ORTE as much as possible.
    • We think the only one who uses PMI1 and PMI2 might be Cray.
      • Howard doesn't think Cray is even going to go that direction; they might be adopting PMIx going forward. Good Supercomputing question.
      • Most places will do whatever SLURM does.
      • What will MPICH do? Suspect PMIx.
    • Howard thinks that by the time Open-MPI v5 gets out
    • Is SLURM + PMIx dead? No, it's supported, just not all of the
  • George looked into scoping the amount of work to bring the reliable overlay network over from PRRTE.

    • PRRTE frameworks that are not in ORTE.
  • Howard also brought up that Sessions only works with PRRTE right now, so we would need to backport this as well.

  • The only things that depend on PRRTE are Sessions, reliable connections, and resource allocation support (the thing Geoffroy Vallee was working on before). Howard will investigate.

  • William Zhang has not yet committed some graph code for reachability, similar to usnic's.
    • Brian/William will get with Josh Hursey to potentially test some more.
    • Not sure what the desired behavior of the netlink reachability component is.
    • Wasn't detecting Mark's
    • Linux is always going to give you the local route before localhost. This is one place where using the reachability framework changes behavior.
    • Options: where do we want to fix this?
    1. Report truthfully (even if you use 192.168.x.x, Linux will route over the localhost device).
      • Can say that if the two addresses are the same, it's always "reachable".
      • Could put this down in the framework itself.
      • This is what netmasks does today: they're the same, so report high reachability.
    2. Handle it in the higher layers.
    3. Users could always specify localhost, or we could specify it for them.
      • You don't choose a device; you give the OS a hint about which device it should use.
    • What does usnic do?
      • usnic ONLY uses reachability for remote hosts.
    • We encourage customers to specify if_include localhost.
      • This path has worked for years, so we should probably keep it working.
  • What happens if I have 3 devices: loopback and 2 Ethernet devices not wired together?
    • What if I say my source is the 192.x address, but the destination is on the 10.x path?
    • Regardless of what reachability tells us, will this actually work?
      • netlink will say it WON'T work, but the OS will just make it work by routing over the loopback device.
  • Probably the right answer is to special-case this in the netlink module, to return a "reachable" result when the two addresses match (see the sketch below).
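
A hedged sketch of that special case, assuming a weight-style reachability result; the function name and weight constant are invented for illustration.

```c
/* Hedged sketch: if the source and destination IPv4 addresses are
 * identical, the kernel will route over the loopback device, so report
 * full "reachability" no matter what the netlink route lookup said. */
#include <netinet/in.h>
#include <stdint.h>

#define WEIGHT_MAX 0xFFFF        /* placeholder for "best possible" */

static uint16_t netlink_weight_sketch(const struct sockaddr_in *src,
                                      const struct sockaddr_in *dst,
                                      uint16_t netlink_answer)
{
    if (src->sin_addr.s_addr == dst->sin_addr.s_addr) {
        /* Same address on both ends: Linux loops it back locally. */
        return WEIGHT_MAX;
    }
    return netlink_answer;       /* otherwise trust the route lookup */
}
```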

Face to face

  • It's official! Portland Oregon, Feb 17, 2020.
    • Safe to begin booking travel now.
  • Please register on the wiki page, since Jeff has to register you.
  • Date looks good: Feb 17th, right before the MPI Forum.
    • 2pm Monday, and maybe most of Tuesday.
    • Cisco has a Portland facility and is happy to host.
    • But willing to step aside if others want to host.
    • About a 20-30 minute drive from the MPI Forum; attendees will probably need a car.

Infrastructure

Submodule prototype

  • No update 11/12

  • Can we just turn on locbot / probot until we can get the AWS bot online?

  • OMPI has been waiting for some git submodule work in Jenkins on AWS.

    • Need someone to figure out why Jenkins doesn't like Jeff's PR.
      • Anyone with a GitHub account on the ompi team should have access.
      • PR 6821
      • Apparently Jenkins isn't behaving as it should.
    • Three pieces: Jenkins, CI, bot.
      • AWS has a libfabric setup like this for testing.
      • The issue is that they're reworking the design, and will roll it out for both libfabric and Open MPI.
    • William Zhang talked to Brian
      • Not something the AWS team will work on, but Brian will work on it.
    • Jeff will talk to Brian as well.
  • Howard and Jeff have access to Jenkins on AWS. Part of the problem is that we don't have much expertise on Jenkins/AWS.

    • William will probably be administering the Jenkins/AWS setup or communicating with those who will.
  • Merged --recurse-submodules update into ompi-scripts Jenkins script as first step. Let's see if that works.

  • Modular thread re-write (Noah)

    • UGNI and Vader BTLs were getting better performance, not sure why.
    • For a modular threading library, it might be interesting to decide at compile time or at runtime (see the sketch below).
    • Previously, similar effects seemed to be related to the ICACHE.
    • Howard will look at it.
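
For context on the compile-time versus runtime question, a minimal sketch of the trade-off (all names invented): runtime selection adds one indirection but allows switching without a rebuild, which is convenient for performance experiments like the UGNI/Vader comparison above.

```c
/* Hedged illustration of compile-time vs. run-time selection of a
 * threading module; every name here is invented for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    /* ... function pointers for create/join/yield would live here ... */
} thread_module_t;

static thread_module_t pthreads_module = { "pthreads" };
static thread_module_t qthreads_module = { "qthreads" };

/* Run-time selection: one indirection per call, switchable without a
 * rebuild via a hypothetical environment knob. */
static thread_module_t *select_thread_module(void)
{
    const char *req = getenv("OMPI_THREAD_MODULE");
    if (NULL != req && 0 == strcmp(req, "qthreads")) {
        return &qthreads_module;
    }
    return &pthreads_module;     /* default */
}

int main(void)
{
    /* Compile-time selection would instead hard-wire one module with
     * #ifdef, trading flexibility for zero dispatch overhead. */
    printf("using %s\n", select_thread_module()->name);
    return 0;
}
```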

Release Branches

Review v3.0.x Milestones v3.0.4

Review v3.1.x Milestones v3.1.4

  • Release v3.0.5 and v3.1.5 tomorrow.

Review v4.0.x Milestones v4.0.2

  • v4.0.3 in the works.
    • Schedule: Originally end of January.
      • PR 1752 may drive an earlier release if UCX is released sooner.
  • PR 7151 - enhancement -
  • UCX 1.7 release schedule: there was an RC1.
    • Artem can check.
    • There's a problem in Open MPI v4.0.2 that packagers will hit with UCX 1.7.

v5.0.0

  • Schedule: April 2020?
    • Wiki: go look at the items; we should discuss them a bit in weekly calls.
    • Some items:
      • Removal of deprecated MPI-1 stuff.

Review Master Pull Requests

CI status

  • IBM's PGI test has NEVER worked. Is it a real issue or local to IBM?
    • Austen is looking into it.
  • Absoft 32-bit Fortran failures.

Dependencies

PMIx Update

ORTE/PRRTE

  • No discussion this week.

MTT


Back to WeeklyTelcon-2019
