A singleton class that deals with all aspects of parallelization within Alpine3D (MPI, OpenMP, PETSc) Any class that wishes to utilize the MPI/OpenMP/PETSc capabilities should include this header and make sure that the respective cmake compilation options (called MPI, OPENMP, PETSC) are activated. The methods are designed in a transparent way also returning meaningful values if the compilation options were not activated, e. g. bool master() will return true if MPI was not activated and if it was activated then it will return false on all processes that are not master and true solely for the master process. More...

#include <MPIControl.h>

Public Member Functions
bool	openmp () const

size_t	thread () const

size_t	max_threads () const

bool	master () const

size_t	master_rank () const

size_t	rank () const

size_t	size () const

std::string	name () const

void	barrier () const

void	gather (const int &send_value, std::vector< int > &receive_vector, const size_t &root=0)

template<class T >
void	allreduce_sum (T &obj)
	Adds up the values of type T from all processes and distributes the sum back to all processes. More...

template<class T >
void	broadcast (T &obj, const size_t &root=0)
	Broadcast an object via MPI or in case MPI is not activated don't do anything. More...

template<class T >
void	broadcast (std::vector< T > &vec_obj, const size_t &root=0)
	Broadcast a vector of class T objects via MPI or in case MPI is not activated don't do anything. More...

template<class T >
void	scatter (std::vector< T *> &vec_local, const size_t &root=0)
	Scatter the objects pointed to by vector<T*> by slices to all preocesses. Internally no MPI_Scatterv or the like is used, because the size of the strings may become extremely large. Thusly internally blocking send and receive calls are utilized. In case MPI is not activated vec_local is not changed in any way. More...

template<class T >
void	send (const std::vector< T *> &vec_local, const size_t &destination, const int &tag=0)
	Send the objects pointed to by vector<T*> to process #destination. More...

template<class T >
void	send (const std::vector< T > &vec_local, const size_t &destination, const int &tag=0)
	Send the objects pointed to by vector<T*> to process #destination. More...

template<class T >
void	receive (std::vector< T *> &vec_local, const size_t &source, const int &tag=0)
	Receive vector of objects from process #source. More...

template<class T >
void	receive (std::vector< T > &vec_local, const size_t &source, const int &tag=0)
	Receive vector of objects from process #source. More...

template<class T >
void	gather (std::vector< T *> &vec_local, const size_t &root=0)
	Gathers the objects pointed to by vector<T> from all processes into a vector<T> on the root node. Internally no MPI_Gatherv or the like is used, because the size of the strings may become extremely large. Thusly internally blocking send and receive calls are utilized. In case MPI is not activated vec_local is not changed in any way. More...

void	getArraySliceParams (const size_t &dimx, size_t &startx_sub, size_t &nx_sub) const
	Helper to split an array for domain decomposition. Given the overall size of an array calculate a start index and block length for the subarray on the basis of the current MPI rank. More...

void	getArraySliceParams (const size_t &dimx, const size_t &idx_wk, size_t &startx_sub, size_t &nx_sub) const

void	getArraySliceParamsOptim (const size_t &dimx, size_t &startx_sub, size_t &nx_sub, const mio::DEMObject &dem, const mio::Grid2DObject &landuse)

void	getArraySliceParamsOptim (const size_t &dimx, const size_t &idx_wk, size_t &startx_sub, size_t &nx_sub, const mio::DEMObject &dem, const mio::Grid2DObject &landuse)
	Split the domain for MPI, balance the laod between MPI instances taking into account cells where no comutation are required. More...


void	allreduce_max (double &value)

void	allreduce_min (double &value)

void	allreduce_sum (double &value)

void	allreduce_sum (int &value)

Static Public Member Functions
static MPIControl &	instance ()

template<class T >
static void	deserialize (const void *in, const size_t &len, T &obj)

template<class T >
static size_t	serialize (void **out, const T &obj, const bool alloc=false)

template<class T >
static void	op_sum_func (void in, void out, int , MPI_Datatype datatype)

static void	getArraySliceParams (const size_t &dimx, const size_t &nbworkers, const size_t &idx_wk, size_t &startx_sub, size_t &nx_sub)
	Returns the parameters for splitting an array in several, balanced sub-arrays. This is mostly usefull for parallel calculations, where an array will be split and sent to different workers. More...

static void	getArraySliceParams (const size_t &dimx, const size_t &nbworkers, std::vector< size_t > &offset, std::vector< size_t > &nx)
	Returns the parameters for splitting an array in several, balanced sub-arrays. This is mostly usefull for parallel calculations, where an array will be split and sent to different workers. More...

Detailed Description

A singleton class that deals with all aspects of parallelization within Alpine3D (MPI, OpenMP, PETSc) Any class that wishes to utilize the MPI/OpenMP/PETSc capabilities should include this header and make sure that the respective cmake compilation options (called MPI, OPENMP, PETSC) are activated. The methods are designed in a transparent way also returning meaningful values if the compilation options were not activated, e. g. bool master() will return true if MPI was not activated and if it was activated then it will return false on all processes that are not master and true solely for the master process.

Author: Thomas Egger

Date: 2014-07-15

Member Function Documentation

◆ allreduce_max()

void MPIControl::allreduce_max ( double & value )

Combines values from all processes and distributes the result back to all processes.

allreduce_max distributes the maximum of all processes
allreduce_min distributes the minimum of all processes
allreduce_sum distributes the sum of all processes
Parameters

[in,out] value The value that is used to perform the reduction and to hold the result

◆ allreduce_min()

void MPIControl::allreduce_min ( double & value )

◆ allreduce_sum() [1/3]

void MPIControl::allreduce_sum ( double & value )

◆ allreduce_sum() [2/3]

void MPIControl::allreduce_sum ( int & value )

◆ allreduce_sum() [3/3]

template<class T >

void MPIControl::allreduce_sum ( T & obj )

inline

Adds up the values of type T from all processes and distributes the sum back to all processes.

Parameters

obj	The obj that is a summand of the global sum and will hold the result

Note: class T must support the extraction and insertion operators, >> and << and furthermore support the operator+

◆ barrier()

void MPIControl::barrier ( ) const

This method allows to synchronize all MPI processes. It blocks the calling process until all processes have called barrier()

◆ broadcast() [1/2]

template<class T >

void MPIControl::broadcast	(	T &	obj,
		const size_t &	root = `0`
	)

inline

Broadcast an object via MPI or in case MPI is not activated don't do anything.

Parameters

	obj	that is broadcasted from process root to all processes
[in]	root	The process rank that will commit the broadcast value, all others receive only

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ broadcast() [2/2]

template<class T >

void MPIControl::broadcast	(	std::vector< T > &	vec_obj,
		const size_t &	root = `0`
	)

inline

Broadcast a vector of class T objects via MPI or in case MPI is not activated don't do anything.

Parameters

	vec_obj	A vector of class T objects that shall be broadcasted, or if not root the vector that will hold the pointers to the objects broadcasted
[in]	root	The process rank that will commit the broadcast value, all others receive only

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ deserialize()

template<class T >

static void MPIControl::deserialize	(	const void *	in,
		const size_t &	len,
		T &	obj
	)

inlinestatic

This method is used when deserializing a class T from a void* representing a char*, instantiating an object from a string

Parameters

[in]	in	The void, which is actually a char
[in]	len	The length of the string
[out]	obj	The newly instantiated object

Note: class T must support the extraction and insertion operators, >> and <<

◆ gather() [1/2]

void MPIControl::gather	(	const int &	send_value,
		std::vector< int > &	receive_vector,
		const size_t &	root = `0`
	)

Gathers the integer send_values from all processes into a vector of integers on the root node. Index 5 of the receive_vector represents the send_value of process #5.

Parameters

[in]	send_value	The int value a process sends
[out]	receive_vector	The buffer for the root process to hold all the send_values
[in]	root	The process rank that will gather the send_values

◆ gather() [2/2]

template<class T >

void MPIControl::gather	(	std::vector< T *> &	vec_local,
		const size_t &	root = `0`
	)

inline

Gathers the objects pointed to by vector<T*> from all processes into a vector<T*> on the root node. Internally no MPI_Gatherv or the like is used, because the size of the strings may become extremely large. Thusly internally blocking send and receive calls are utilized. In case MPI is not activated vec_local is not changed in any way.

Parameters

[in,out]	vec_local	A vector of T* pointers to objects that shall be transmitted to root; if root then this vector will also hold the pointers to the gathered objects
[in]	root	The process rank that will commit the broadcast value, all others receive only

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ getArraySliceParams() [1/4]

void MPIControl::getArraySliceParams	(	const size_t &	dimx,
		size_t &	startx_sub,
		size_t &	nx_sub
	)		const

inline

Helper to split an array for domain decomposition. Given the overall size of an array calculate a start index and block length for the subarray on the basis of the current MPI rank.

Parameters

[in]	dimx	The overall size of the array to split up
[out]	startx_sub	The start of the subarray for the given processor rank
[out]	nx_sub	The block length of the subarray for the given processor rank

◆ getArraySliceParams() [2/4]

void MPIControl::getArraySliceParams	(	const size_t &	dimx,
		const size_t &	idx_wk,
		size_t &	startx_sub,
		size_t &	nx_sub
	)		const

inline

◆ getArraySliceParams() [3/4]

void MPIControl::getArraySliceParams	(	const size_t &	dimx,
		const size_t &	nbworkers,
		const size_t &	idx_wk,
		size_t &	startx_sub,
		size_t &	nx_sub
	)

static

Returns the parameters for splitting an array in several, balanced sub-arrays. This is mostly usefull for parallel calculations, where an array will be split and sent to different workers.

Parameters

[in]	dimx	number of cells in the desired dimension
[in]	nbworkers	total number of slices
[in]	idx_wk	current slice index (starting at 0)
[out]	startx_sub	calculated start index for the current slice
[out]	nx_sub	calculated number of cells (in the desired dimension) of the current slice

◆ getArraySliceParams() [4/4]

static void MPIControl::getArraySliceParams	(	const size_t &	dimx,
		const size_t &	nbworkers,
		std::vector< size_t > &	offset,
		std::vector< size_t > &	nx
	)

static

Returns the parameters for splitting an array in several, balanced sub-arrays. This is mostly usefull for parallel calculations, where an array will be split and sent to different workers.

Parameters

[in]	dimx	number of cells in the desired dimension
[in]	nbworkers	total number of slices
[out]	offset	calculated start index for all slices
[out]	nx	calculated number of cells (in the desired dimension) of all slice

◆ getArraySliceParamsOptim() [1/2]

void MPIControl::getArraySliceParamsOptim	(	const size_t &	dimx,
		size_t &	startx_sub,
		size_t &	nx_sub,
		const mio::DEMObject &	dem,
		const mio::Grid2DObject &	landuse
	)

inline

◆ getArraySliceParamsOptim() [2/2]

void MPIControl::getArraySliceParamsOptim	(	const size_t &	dimx,
		const size_t &	idx_wk,
		size_t &	startx_sub,
		size_t &	nx_sub,
		const mio::DEMObject &	dem,
		const mio::Grid2DObject &	landuse
	)

Split the domain for MPI, balance the laod between MPI instances taking into account cells where no comutation are required.

Parameters

[in]	dimx	size in x of the DEM (not necessary anymore, but retained to have simmilar call than getArraySliceParams)
[in]	idx_wk	MPI id of the current instance
[in]	startx_sub	Where the number of columns to compute for this instance will be stored
[in]	nx_sub	Where the starting point for this instance will be stored
[in]	dem	DEM used in the model
[in]	landuse	Landuse grid used in the model, nodata in landuse should indicate where no computaiton is required

◆ instance()

MPIControl & MPIControl::instance ( )

static

Returns the single instance of this class. If an instance wasn't initialized yet, it is instantiated.

Returns: a reference to the single instance

Note: MPI_Init and PetscInitialize are called entirely with NULL arguments

◆ master()

bool MPIControl::master ( ) const

Returns whether the calling process is the master process or not

Returns: true if caller is master, false otherwise

◆ master_rank()

size_t MPIControl::master_rank ( ) const

Returns the rank of the master process

Returns: rank of the master process

◆ max_threads()

size_t MPIControl::max_threads ( ) const

Returns the total number of OpenMP threads or 1 if OpenMP has not been activated

Returns: number of threads

◆ name()

std::string MPIControl::name ( ) const

Returns the name of the processor of the calling process or "local" if MPI was not activated

Returns: string name of the processor

◆ op_sum_func()

template<class T >

static void MPIControl::op_sum_func	(	void *	in,
		void *	out,
		int *	,
		MPI_Datatype *	datatype
	)

inlinestatic

For custom objects of class T a custom sum function is required MPI_Op_create expects exactly this interface, thus it cannot be changed

Parameters

[in]	in	The void* representing a summand of the sum operation
[out]	out	The void* representing the result of the sum operation
[in]	datatype	A pointer to the custom datatype previously committed

Note: class T must support the extraction and insertion operators, >> and << and furthermore support the operator+

◆ openmp()

bool MPIControl::openmp ( ) const

Returns whether or not OpenMP has been activated

Returns: true if _OPENMP has been defined, false otherwise

◆ rank()

size_t MPIControl::rank ( ) const

Returns the rank of the calling process or 0 if MPI was not activated

Returns: rank number of the calling process

◆ receive() [1/2]

template<class T >

void MPIControl::receive	(	std::vector< T *> &	vec_local,
		const size_t &	source,
		const int &	tag = `0`
	)

Receive vector of objects from process #source.

Parameters

[in]	vec_local	A vector of T* pointers to receive the object pointers
[in]	source	The process rank that will send the objects
[in]	tag	Arbitrary non-negative integer assigned to uniquely identify a message

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ receive() [2/2]

template<class T >

void MPIControl::receive	(	std::vector< T > &	vec_local,
		const size_t &	source,
		const int &	tag = `0`
	)

Receive vector of objects from process #source.

Parameters

[in]	vec_local	A vector of T* pointers to receive the object pointers
[in]	source	The process rank that will send the objects
[in]	tag	Arbitrary non-negative integer assigned to uniquely identify a message

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ scatter()

template<class T >

void MPIControl::scatter	(	std::vector< T *> &	vec_local,
		const size_t &	root = `0`
	)

inline

Scatter the objects pointed to by vector<T*> by slices to all preocesses. Internally no MPI_Scatterv or the like is used, because the size of the strings may become extremely large. Thusly internally blocking send and receive calls are utilized. In case MPI is not activated vec_local is not changed in any way.

Parameters

[in,out]	vec_local	A vector of T* pointers to objects that shall be scattered; if root then this vector will also hold the pointers to the scattered objects
[in]	root	The process rank that will scatter the values, all others receive only

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ send() [1/2]

template<class T >

void MPIControl::send	(	const std::vector< T *> &	vec_local,
		const size_t &	destination,
		const int &	tag = `0`
	)

Send the objects pointed to by vector<T*> to process #destination.

Parameters

[in]	vec_local	A vector of T* pointers to objects that shall be sent
[in]	destination	The process rank that will receive the values
[in]	tag	Arbitrary non-negative integer assigned to uniquely identify a message

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ send() [2/2]

template<class T >

void MPIControl::send	(	const std::vector< T > &	vec_local,
		const size_t &	destination,
		const int &	tag = `0`
	)

Send the objects pointed to by vector<T*> to process #destination.

Parameters

[in]	vec_local	A vector of T* pointers to objects that shall be sent
[in]	destination	The process rank that will receive the values
[in]	tag	Arbitrary non-negative integer assigned to uniquely identify a message

Note: Class T needs to have the serialize and deseralize operator << and >> implemented

◆ serialize()

template<class T >

static size_t MPIControl::serialize	(	void **	out,
		const T &	obj,
		const bool	alloc = `false`
	)

inlinestatic

This method is used when serializing a class T, turning an object into a string

Parameters

[out]	out	The void, which is actually a char
[in]	obj	The object to serialize
[in]	alloc	allocate memory?

Returns: The length of the string

Note: class T must support the extraction and insertion operators, >> and <<

◆ size()

size_t MPIControl::size ( ) const

Returns the size of the world in usage, 1 if MPI was not activated

Returns: the size of the MPI world

◆ thread()

size_t MPIControl::thread ( ) const

Returns the current OpenMP thread number or 0 if OpenMP has not been activated

Returns: current thread number

The documentation for this class was generated from the following files:

Public Member Functions

Static Public Member Functions

Detailed Description

Member Function Documentation

◆ allreduce_max()

◆ allreduce_min()

◆ allreduce_sum() [1/3]

◆ allreduce_sum() [2/3]

◆ allreduce_sum() [3/3]

◆ barrier()

◆ broadcast() [1/2]

◆ broadcast() [2/2]

◆ deserialize()

◆ gather() [1/2]

◆ gather() [2/2]

◆ getArraySliceParams() [1/4]

◆ getArraySliceParams() [2/4]

◆ getArraySliceParams() [3/4]

◆ getArraySliceParams() [4/4]

◆ getArraySliceParamsOptim() [1/2]

◆ getArraySliceParamsOptim() [2/2]

◆ instance()

◆ master()

◆ master_rank()

◆ max_threads()

◆ name()

◆ op_sum_func()

◆ openmp()

◆ rank()

◆ receive() [1/2]

◆ receive() [2/2]

◆ scatter()

◆ send() [1/2]

◆ send() [2/2]

◆ serialize()

◆ size()

◆ thread()