Commit
Update data redistribution process doc.
cindytsai committed Jul 7, 2023
1 parent 3d3c33b commit f80eabf
Showing 2 changed files with 21 additions and 1 deletion.
18 changes: 17 additions & 1 deletion doc/HowItWorks.md
@@ -75,4 +75,20 @@ The changes made will be brought to the following round of analysis.
todo

## Data Redistribution Process
todo

Each MPI process runs one instance of the simulation code and one Python instance.
Each Python instance has direct access only to the data on its local computing node.
During in situ Python analysis, workloads may be decomposed and rebalanced according
to the algorithm used in the Python packages.
This decomposition does not need to align with how the data is distributed in the simulation.
Furthermore, in the general case, `libyt` has no way of knowing what communication pattern a Python script needs, and it is difficult to schedule point-to-point communications that fit every algorithm and every number of MPI processes.
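
To make the mismatch concrete, here is a minimal toy sketch in plain Python (no MPI involved). It assumes, purely for illustration, that the simulation stores grids in contiguous blocks per rank while the analysis deals grids out round-robin. The function names `block_owner`, `round_robin_worker`, and `remote_grids` are hypothetical and are not part of `libyt`.

```python
# Toy model only: both layouts below are assumptions made for illustration.

def block_owner(grid_id, num_grids, num_ranks):
    """Assumed simulation layout: contiguous blocks of grids per rank."""
    per_rank = -(-num_grids // num_ranks)  # ceiling division
    return grid_id // per_rank


def round_robin_worker(grid_id, num_ranks):
    """Assumed analysis layout: grids dealt out round-robin."""
    return grid_id % num_ranks


def remote_grids(num_grids, num_ranks):
    """Map each rank to the grids it must analyze but does not hold locally."""
    remote = {rank: [] for rank in range(num_ranks)}
    for gid in range(num_grids):
        worker = round_robin_worker(gid, num_ranks)
        if block_owner(gid, num_grids, num_ranks) != worker:
            remote[worker].append(gid)
    return remote


# With 8 grids on 2 ranks, half of the grids end up needed on the other rank.
print(remote_grids(num_grids=8, num_ranks=2))
```

Changing either layout changes which grids are remote, which is why a fixed, pre-scheduled communication pattern is hard to get right in general.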

`libyt` uses one-sided communication in MPI, also known as Remote Memory Access (RMA), so there is no need to explicitly specify senders and receivers.
`libyt` first collects which data each process needs, and the processes that hold the requested data prepare it.
It then opens an RMA epoch, which all MPI processes enter, and each process can fetch the data
located on other processes without explicitly waiting for the remote process to respond.
A process only needs to know which MPI process holds the data it wants.
The caveat is that data redistribution in `libyt` is a collective operation that requires every
MPI process to participate; otherwise, the participating processes will hang while waiting for the others.
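
The bookkeeping in the first step (collect requests, then decide what each process prepares and where each process fetches from) can be sketched in plain Python. This is a toy model under stated assumptions, not `libyt`'s implementation: it runs in a single process without MPI, and `plan_redistribution`, `requests`, and `owner_of` are hypothetical names. In a real run, the fetch step would be carried out with MPI one-sided calls inside the RMA epoch.

```python
def plan_redistribution(requests, owner_of):
    """Build a redistribution plan from per-rank requests.

    requests: {rank: [grid_id, ...]}, the data each rank needs. Every rank
              must have an entry, even an empty one, mirroring the fact
              that the operation is collective.
    owner_of: {grid_id: rank}, where each grid currently lives.

    Returns (to_prepare, to_fetch):
      to_prepare[rank] is the set of grids that rank must expose to others;
      to_fetch[rank] is a list of (owner_rank, grid_id) pairs to fetch.
    """
    to_prepare = {rank: set() for rank in requests}
    to_fetch = {rank: [] for rank in requests}
    for rank, grid_ids in requests.items():
        for gid in grid_ids:
            owner = owner_of[gid]
            if owner == rank:
                continue  # already local, nothing to move
            to_prepare[owner].add(gid)
            to_fetch[rank].append((owner, gid))
    return to_prepare, to_fetch


# Hypothetical example: rank 2 requests nothing but still takes part.
requests = {0: [4, 6], 1: [1, 3], 2: []}
owner_of = {1: 0, 3: 0, 4: 1, 6: 2}
to_prepare, to_fetch = plan_redistribution(requests, owner_of)
print(to_prepare)
print(to_fetch)
```

Note that each fetching rank only needs the owner's rank for every grid it wants; it never has to coordinate a matching send on the owner's side.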

![](./assets/svgs/RMA.svg)
4 changes: 4 additions & 0 deletions doc/assets/svgs/RMA.svg