MPI: work distribution and shared memory

elphmod.MPI.distribute(size, bounds=False, comm=<mpi4py.MPI.Intracomm object>, chunks=None)

Distribute work among processes.
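
For illustration, a minimal sketch of how distribute might be used; it assumes that the function returns an array with the number of tasks per process and, with bounds=True, additionally the cumulative task bounds:

import numpy as np
from elphmod.MPI import comm, distribute

# Divide 10 tasks among all processes. "sizes" is assumed to hold the number
# of tasks per rank and "bounds" the cumulative offsets (length comm.size + 1):
sizes, bounds = distribute(10, bounds=True)

# Each process works on its own slice of the global task range:
my_tasks = np.arange(bounds[comm.rank], bounds[comm.rank + 1])
print('Rank %d handles tasks %s' % (comm.rank, my_tasks))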

elphmod.MPI.info(message, error=False, comm=<mpi4py.MPI.Intracomm object>)

Print status message from first process.
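
A minimal usage sketch (the message text is arbitrary):

from elphmod.MPI import info

# Print a status message once, from the first process only, instead of once
# per MPI rank:
info('Setting up model...')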

elphmod.MPI.load(filename, comm=<mpi4py.MPI.Intracomm object>)

Read and broadcast NumPy data.
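
A minimal sketch, assuming that load returns the broadcast array; the file name is hypothetical:

from elphmod.MPI import load

# Read the NumPy file on one process and broadcast its contents to all
# processes of the communicator:
data = load('data.npy')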

elphmod.MPI.matrix(size, comm=<mpi4py.MPI.Intracomm object>)

Create sub-communicators.
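
The return value of matrix is not documented here. The following sketch only illustrates, with plain mpi4py, what splitting a communicator into sub-communicators means; it is not the implementation of elphmod.MPI.matrix:

from mpi4py import MPI

comm = MPI.COMM_WORLD

# Split the global communicator into two sub-communicators (an arbitrary
# choice for illustration); Split creates one communicator per "color":
color = comm.rank % 2
sub = comm.Split(color, key=comm.rank)

print('Global rank %d is rank %d of sub-communicator %d'
    % (comm.rank, sub.rank, color))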

elphmod.MPI.shared_array(shape, dtype=<class 'float'>, shared_memory=True, single_memory=False, comm=<mpi4py.MPI.Intracomm object>)

Create array whose memory is shared among all processes on same node.

With shared_memory=False a conventional array is created on each processor; with single_memory=True it is created on only one processor. Either way, the same broadcasting syntax shown below can still be used.

Example:

import numpy as np
from elphmod.MPI import comm, shared_array

# Set up huge array:
node, images, array = shared_array(2 ** 30, dtype=np.uint8)

# Write data on one node:
if comm.rank == 0:
    array[:] = 0

# Broadcast data to other nodes:
if node.rank == 0:
    images.Bcast(array)

# Wait until the data is available on all nodes:
comm.Barrier()
     ______ ______
    |_0,_0_|_4,_0_|
    |_1,_1_|_5,_1_|
    |_2,_2_|_6,_2_|
    |_3,_3_|_7,_3_|
     node 1 node 2

Figure: comm.rank and node.rank on a machine with 2 nodes with 4 processors each.

Because the sub-communicators node and images are split with the sorting key=comm.rank, comm.rank == 0 is equivalent to node.rank == images.rank == 0.