WeeklyTelcon_20190319

Geoffrey Paulsen edited this page Mar 19, 2019 · 1 revision

Open MPI Weekly Telecon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Geoff Paulsen
  • Jeff Squyres
  • Akshay Venkatesh (NVIDIA)
  • Edgar Gabriel
  • Josh Hursey
  • Ralph Castain
  • Todd Kordenbrock
  • Thomas Naughton
  • Michael Heinz (Intel)
  • Noah Evans (Sandia)

not there today (I keep this for easy cut-n-paste for future notes)

  • Dan Topa
  • David Bernholdt
  • Brian Barrett
  • Mike Heinz (Intel)
  • Jake Hemstad
  • Matthew Dosanjh
  • Howard Pritchard
  • Xin Zhao
  • Nathan Hjelm
  • Geoffroy Vallee
  • Joshua Ladd
  • Matias Cabral
  • George
  • Aravind Gopalakrishnan (Intel)
  • Dan Topa (LANL)
  • Arm (UTK)
  • Peter Gottesman (Cisco)
  • mohan

Agenda/New Business

  • NEW Ask George about Issue: Overlapping Vector Datatype https://github.com/open-mpi/ompi/issues/5540

    • This is important. George is working on a patch.
    • If you're using complicated data types for real things, it's important.
    • Should it be backported to release branches? Perhaps not, since only one customer has hit it.
    • Is this a blocker for v4.0.1? Or v4.0.2? Not a blocker.
  • NEW Jeff will open a PR about StaleBot - https://github.com/apps/stale

    • https://github.com/open-mpi/ompi/pull/6495
    • We have a lot of OLD Issues, but they're so low priority that they won't get done.
      1. real bug, but no one will fix.
      2. bug WAS fixed, but no one noticed so should be closed.
      3. Work was abandoned for whatever reason.
    • Could just run the hosted Stalebot on GitHub, but if we do, we can't tweak its behavior if we don't like it.
      • To tweak it we'd have to host our own, but we don't have anyone with Node.js experience.
      • It does have a config file with lots of tweakable options.
      • This would clear out our giant backlog.
      • It puts a label on the issues marking them as closed for inactivity.
  • NEW Host Ordering fix to v3.0.x, v3.1.x, v4.0.x https://github.com/open-mpi/ompi/issues/6501

    • With --host on the command line, the order of the hosts as specified was not honored.
    • This fix went into master. Do we want to bring it back to the release branches?
    • Does this apply to the hostfile as well? - Yes. That seems like a common use case.
    • Is this a backwards compatibility issue? - No, since a specified ordering is a subset of a random ordering.
    • Seems like not too many people noticed, or they just lived with this behavior.
    • Everyone on the call liked PRing this to the release branches, but we want to see what Brian and Howard think.
  • NEW Gilles openib issue: https://github.com/open-mpi/ompi/pull/6152

    • No one had any thoughts on it.
    • Would like Mellanox to chime in and let us know if it's needed in v4.0.x
  • NEW George, did you get any follow-up on Season of Documentation?

    • Some. Making man pages better and finding a way to link man pages with examples would be really awesome.
    • In libfabric they make man pages in Markdown, and then in make dist, they convert it to nroff.
    • For user-facing APIs, they use Sphinx - convert Markdown in comments to user-facing HTML man pages.
  • Agenda for Face to face needs to be updated. Please put items on there.

    • The agenda is very light, and perhaps we should cancel this one.
    • Let's talk about this next week. If the agenda is still light, we may consider canceling this.
  • We should reconsider and discuss whether we want to break ORTE out (move to PRTE).

    • A lot of risk, but a lot of people think it merits discussion. It's the one thing on the face to face agenda that's critical.
    • If we DON'T do it NOW it may never get done (move to PRTE), since Ralph is retiring and has much of both sides of this in his head.
    • ULFM has invested in this
    • MPI_Sessions is tightly coupled.
    • We will discuss again next week.
    • It's on the face to face agenda, but needs to be discussed and decided.
  • Nathan Hjelm's day job will no longer involve Open MPI, so if you want him to review something, please check with him first.

  • Next face to face is April 23-25 at Cisco in San Jose.

Minutes

Review v3.0.x Milestones v3.0.3

  • Brian is giving a talk at the OpenFabrics conference. This week?

Review v3.1.x Milestones v3.1.0

  • Brian is giving a talk at the OpenFabrics conference. This week?

Review v4.0.x Milestones v4.0.1

  • Looks like RC this week.
  • v4.0.1rc2 went out yesterday.
  • We forgot the NEWS update in rc2, so if we're happy with rc2 we'll post an rc3 and then release.

v5.0.0

  • Schedule: Delaying until after summer ***
  • Discussion of schedule depends on scope discussion.
    • Do we want to separate ORTE out for that? If so, it would be a bit past summer.
    • Gilles has a prototype of PRTE replacing ORTE.
  • Want to open up release-manager elections.
    • Now that we're delaying, will decide at the face to face.
  • A v4.1 release from master is now a possibility.
    • If we instead do a v4.1, we'd need some things fixed on master.
  • Will discuss more at the face to face.

Master

PMIx

  • Take a look at Gilles's PRTE work. He may have done SOME of that. He should have done all of it in the PRTE layer, so maybe only some MPI-layer work remains.

MTT

  • IBM still has a 10% failure rate and a build issue. Please fix!!!

New topics

  • MPI Forum - nothing too substantial. MPI_Sessions is getting a lot of traction. The goal is to get it done by the next meeting. It needs a reading, then a vote, and another vote and another, so MPI Next would be in 2020. Also language bindings, and some crazy proposals.
  • Read MPI Forum link here: https://www.mpi-forum.org/

face to face -

  • How do we get more participation and make MTT more meaningful?

Review Master Pull Requests

  • Didn't discuss today.

Oldest PR

Oldest Issue


Status Updates:

Status Update Rotation

  1. Mellanox, Sandia, Intel
  2. LANL, Houston, IBM, Fujitsu
  3. Amazon
  4. Cisco, ORNL, UTK, NVIDIA

Back to 2018 WeeklyTelcon-2018
