Skip to content

Latest commit

 

History

History
912 lines (574 loc) · 43.5 KB

File metadata and controls

912 lines (574 loc) · 43.5 KB

Collective Array Operations

Global Arrays provide functions for collective array operations, targeting both whole arrays and patches (portions of global arrays). Collective operations require all the processes to make the call. In the underlying implementation, each process deals with its local data. These functions include

  • basic array operations,
  • linear algebra operations, and
  • interfaces to third party software packages.

Basic Array Operations

Global Arrays provide several mechanisms to manipulate contents of the arrays. One can set all the elements in an array/patch to a specific value, or as a special case set to zero. Since GA does not explicitly initialize newly created arrays, these calls are useful for initialization of an array/patch. (To fill the array with different values for each element, one can choose the one sided operation putor each process can initialize its local portion of an array/patch like ordinary local memory). One can also scale the array/patch by a certain factor, or copy the contents of one array/patch to another.

Whole Arrays

These functions apply to the entire array. The function

  • Fortran subroutine: ga_zero(g_a)
  • C: void GA_Zero(int g_a)
  • C++: void GA::GlobalArray::zero()

sets all the elements in the array to zero.

To assign a single value to all the elements in an array, use the function

  • Fortran subroutine: ga_fill(g_a, val)
  • C: void GA_Fill(int g_a, void *val)
  • C++: void GA::GlobalArray::fill(void *val)

It sets all the elements in the array to the value val. The val must have the same data type as that of the array.

The function

  • Fortran subroutine: ga_scale(g_a, val)
  • C: void GA_scale(int g_a, void *val)
  • C++: void GA::GlobalArray::scale(void *val)

scales all the elements in the array by factor val. Again the val must be the same data type as that of the array itself.

The above three functions are dealing with one global array, to set values or change all the elements together. The following functions are for copying data between two arrays.

The function

  • Fortran subroutine: ga_copy(g_a, g_b)
  • C: void GA_copy(int g_a, int g_b)
    • C++: void GA::GlobalArray::copy(const GA::GlobalArray * g_a)

copies the contents of one array to another. The arrays must be of the same data type and have the same number of elements.

For global arrays containing ghost cells, the ghost cell data can be filled in with the corresponding data from neighboring processors using the function

  • n-D Fortran subroutine: ga_copy(g_a, g_b)
  • C: void GA_copy(int g_a, int g_b)
  • C++: void GA::GlobalArray::copy(const GA::GlobalArray * g_a)
  • n-D Fortran subroutine: ga_update_ghosts(g_a)
  • C: void ga_update_ghosts(int g_a)
  • C++: void GA::GlobalArray::updateGhosts()

This operation updates the ghost cell data by assuming periodic, or wrap-around, boundary conditions similar to those described for thenga_periodic_get operations described above. The wrap-around conditions are always applied, it is up to the individual application to decide whether or not the data in the ghost cells should be used. The update operation is illustrated below for a simple 4x2 global array distributed across two processors. The ghost cells are one element wide in each dimension.

images/ghost012.png
images/ghost015.png
  • n-d Fortran logical function: nga_update_ghosts_dir(g_a, dimension, idir, flag)
  • C: int NGA_Update_ghosts_dir(int g_a, int dimension, int idir, int cflag)
  • C++: int GA::GlobalArray::updateGhostsDir(int dimension, int idir, int cflag)

This function can be used to update the ghost cells along individual directions.

It is designed for algorithms that can overlap updates with computation. The variable dimension indicates which coordinate direction is to be updated (e.g. dimension = 1 would correspond to the y axis in a two or three dimensional system), the variable idir can take the values +/-1 and indicates whether the side that is to be updated lies in the positive or negative direction, and cflag indicates whether or not the corners on the side being updated are to be included in the update. The following calls would be equivalent to a call to GA_Update_ghosts for a 2-dimensional system:

status = NGA_Update_ghost_dir(g_a,0,-1,1);
status = NGA_Update_ghost_dir(g_a,0,1,1);
status = NGA_Update_ghost_dir(g_a,1,-1,0);
status = NGA_Update_ghost_dir(g_a,1,1,0);

The variable cflag is set equal to 1 (or non-zero) in the first two calls so that the corner ghost cells are update, it is set equal to 0 in the second two calls to avoid redundant updates of the corners. Note that updating the ghosts cells using several independent calls to the nga_update_ghost_dir functions is generally not as efficient as using GA_Update_ghosts unless the individual calls can be effectively overlapped with computation. This is a collective operation.

Patches

GA provides a set of operations on segments of the global arrays, namely patch operations. These functions are more general, in a sense they can apply to the entire array(s). As a matter of fact, many of the Global Array collective operations are based on the patch operations, for instance, the GA_Printis only a special case of NGA_Print_patch, called by setting the bounds of the patch to the entire global array. There are two interfaces for Fortran, one for two dimensional and the other for n-dimensional (one to seven). The (n-dimensional) interface can surely handle the two dimensional case as well. It is available for backward compatibility purposes. The functions dealing with n-dimensional patches use the ngaprefix and those dealing with two dimensional patches start with the ga prefix.

The function

  • Fortran subroutine: nga_zero_patchnga_zero_patch(g_a, alo, ahi)
  • C: void NGA_Zero_patch(int g_a, int lo[] int hi[])
  • C++: void GA::GlobalArray::zeroPatch(int lo[] int hi[])

is similar to ga_zero, except that instead of applying to entire array, it sets only the region defined by lo and hi to zero.

One can assign a single value to all the elements in a patch with the function:

  • n-D Fortran subroutine: nga_fill_patch(g_a, lo, hi, val)
  • 2-D Fortran subroutine: ga_fill_patch(g_a, ilo, ihi, jlo, jhi, val)
  • C: void NGA_Fill_patch(int g_a, int lo[], int hi[], void *val)
  • C++: void GA::GlobalArray::fillPatch(int lo[], int hi[], void *val)

Thelo and hi defines the patch and the val is the value to set.

The function

  • n-D Fortran subroutine: nga_scale_patch(g_a, lo, hi, val)
  • 2-D Fortran subroutine: ga_scale_patch(g_a, ilo, ihi, jlo, jhi, val)
  • C void NGA_Scale_patch(int g_a, int lo[] int hi[], void *val)
  • C++: void GA::GlobalArray::scalePatch(int lo[], int hi[], void *val)

scales the patch defined by loand hi by the factor val.

The copy patch operation is one of the fundamental and frequently used functions. The function

  • n-D Fortran subroutine: nga_copy_patch(trans, g_a, alo, ahi, g_b, blo, bhi)
  • 2-D Fortran subroutine: ga_copy_patch(trans, g_a, ailo, aihi, ajlo, ajhi, g_b, bilo, bihi, bjlo, bjhi)
  • C: void NGA_Copy_patch(char trans, int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[])
  • C++: voidGA::GlobalArray::copyPatch(char trans, const GA::GlobalArray* g_a, int alo[], int ahi[], int blo[], int bhi[])

copies one patch defined by alo and ahi in one global array g_ato another patch defined by blo and bhi in another global array g_b. The current implementation requires that the source patch and destination patch must be on different global arrays. They must also be the same data type. The patches may be of different shapes, but the number of elements must be the same. During the process of copying, the transpose operation can be performed by specifying trans.

Example: Assume that there two 8x6 Global Arrays, g_a and g_b,distributed on three processes. The operation of nag_copy_patch(Fortran notation), from

g_a:alo={2,2},ahi={4,5} to g_b:blo={3,4},bhi={6,6} and trans = 0

involves reshaping. It is illustrated in the following figure.

images/copy1.png

One step further, if one also want to perform the transpose operation during the copying, i.e. set trans = 1, it will look like:

images/copy2.png

If there is no reshaping or transpose, the operation can be fast (internally calling nga_put). Otherwise, it would be slow (internally calling nga_scatter, where extra time is spent on preparing the indices). Also note that extra memory is required to hold the indices if the operation involves reshaping or transpose.

Linear Algebra

Global arrays provide three linear algebra operations: addition, multiplication, and dot product. There are two sets of functions, one for the whole array and the other for the patches.

Whole Arrays

The function

  • Fortran subroutine: ga_add(alpha, g_a, beta, g_b, g_c)
  • C: void GA_Add(void *alpha, int g_a, void *beta,int g_b, int g_c)
  • C++ void GA::GlobalArray::add(void *alpha, const GA::GlobalArray* g_a, void *beta, const GA::GlobalArray* g_b)

adds two arrays, g_a and g_b, and saves the results to g_c. The two source arrays can be scaled by certain factors. This operation requires the two source arrays have the same number of elements and the same data types, but the arrays can have different shapes or distributions.g_c can also be g_a or g_b. It is encouraged to use this function when the two source arrays are identical in distributions and shapes, because of its efficiency. It would be less efficient if the two source arrays are different in distributions or shapes.

Matrix multiplication operates on two matrices, therefore the array must be two dimensional. The function

  • Fortran subroutine: ga_dgemm(transa, transb, m, n, k, alpha, g_a, g_b, beta, g_c )
  • C: void GA_Dgemm(char ta, char tb, int m, int n, int k, double alpha, int g_a, int g_b, double beta, int g_c )
  • C++: void GA::GlobalArray::dgemm(char ta, char tb, int m, int n, int k, double alpha, const GA::GlobalArray* g_a, const GA::GlobalArray* g_b, double beta)

Performs one of the matrix-matrix operations:

C := alpha*op( A )*op( B ) + beta*C,

where op( X ) is one of

op( X ) = X or op( X ) = X',

alpha and beta are scalars, and A, B, and C are matrices, with op( A ) an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.

On entry, transa specifies the form of op( A ) to be used in the matrix multiplication as follows:

ta = 'N' or 'n', op( A ) = A.

ta = 'T' or 't', op( A ) = A'.

The function

  • Fortran integer function: ga_idot(g_a, g_b)
    • double precision function ga_ddot(g_a, g_b)
    • double complex function ga_zdot(g_a, g_b)
  • C: long GA_Idot(int g_a, int g_b)
    • double GGA_DdotA_Ddot(int g_a, int g_b)
    • DoubleComplex GA_ZdotGA_Zdot(int g_a, int g_b)
  • C++: long GA::GlobalArray::idot(const GA::GlobalArray* g_a)
    • double GA::GlobalArray::ddot(const GA::GlobalArray* g_a)
    • DoubleComplex GA::GlobalArray::zdot(const GA::GlobalArray* g_a)

computes the element-wise dot product of two arrays. It is available as three separate functions, corresponding to integer, double precision and double complex data types.

The following functions apply to the 2-dimensional whole arrays only. There are no corresponding functions for patch operations.

The function

symmetrizes matrix A represented with handle g_a:A = .5 * (A+A').

The function

  • Fortran subroutine: ga_transpose(g_a, g_b)

  • C: void GA_Transpose(int g_a, int g_b)

  • C++: void GA::GlobalArray::transpose(const GA::GlobalArray* g_a)

    transposes a matrix: B = A'.

Patches

The functions

  • n-D Fortran subroutine: nga_add_patch(alpha, g_a, alo, ahi, beta, g_b, blo, bhi, g_c, clo, chi)
  • 2-D Fortran subroutine: ga_add_patch(alpha, g_a, ailo, aihi, ajlo, ajhi, beta, g_b, bilo, bihi, bjlo, bjhi, g_c, cilo, cihi, cjlo, cjhi)
  • C: void NGA_Add_patch(void *alpha, int g_a, int alo[], int ahi[], void *beta, int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[])
  • C++: void GA::GlobalArray::addPatch(void *alpha, const GA::GlobalArray* g_a,
    int alo[], int ahi[], void *beta, const GA::GlobalArray* g_b, int blo[], int bhi[], int clo[], int chi[])

add element-wise two patches and save the results into another patch. Even though it supports the addition of two patches with different distributions or different shapes (the number of elements must be the same), the operation can be expensive, because there can be extra copies which effect memory consumption. The two source patches can be scaled by a factor for the addition. The function is smart enough to detect the case that the patches are exactly the same but the global arrays are different in shapes. It handles the case as if for the arrays were identically distributed, thus the performance will not suffer.

The matrix multiplication is the only operation on array patches that is restricted to the two dimensional domain, because of its nature. It works for double and double complex data types. The prototype is

  • Fortran subroutine: ga_matmul_patch(transa, transb, alpha, beta, g_a, ailo, aihi, ajlo, ajhi, g_b, bilo, bihi, bjlo, bjhi, g_c, cilo, cihi, cjlo, cjhi)
  • C: void GA_Matmul_patch(char *transa, char* transb, void* alpha, void *beta, int g_a, int ailo, int aihi, int ajlo, int ajhi, int g_b, int bilo, int bihi, int bjlo, int bjhi, int g_c, int cilo, int cihi, int cjlo, int cjhi)
  • C++: void GA::GlobalArray::matmulPatch(char *transa, char* transb, void* alpha, void *beta, const GlobalArray * g_a, int ailo, int aihi, int ajlo, int ajhi, const GlobalArray * g_b, int bilo, int bihi, int bjlo, int bjhi, int cilo, int cihi, int cjlo, int cjhi)

It performs

C[cilo:cihi,cjlo:cjhi] := alpha* AA[ailo:aihi,ajlo:ajhi] * BB[bilo:bihi,bjlo:bjhi] + beta*C[cilo:cihi,cjlo:cjhi]

where AA = op(A), BB = op(B), and op( X ) is one of X or X',

Valid values for transpose argument: 'n', 'N', 't', 'T'.

The dot operation computes the element-wise dot product of two (possibly transposed) patches. It is implemented as three separate functions, corresponding to integer, double precision and double complex data types. They are:

  • n-D Fortran integer function: nga_idot_patch(g_a, ta, alo, ahi, g_b, tb, blo, bhi)
    • double precision function ga_ddot_patch (g_a, ta, alo, ahi, g_b, tb, blo, bhi)
    • double complex function ga_zdot_patch (g_a, ta, alo, ahi, g_b, tb, blo, bhi)
  • 2-D Fortran integer function: ga_idot_patch(g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi)
    • double precision function ga_ddot_patch (g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi)
    • double complex function ga_zdot_patch (g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi)
  • C: Integer NGA_Idot_patch(int g_a, char* ta, int alo[], int ahi[], int g_b, char* tb, int blo[], int bhi[])

    • double NGA_Ddot_patch(int g_a, char* ta, int alo[], int ahi[], int g_b, char* tb, int blo[], int bhi[])
    • DoubleComplex NGA_Zdot_patch(int g_a, char* ta, int alo[], int ahi[], int g_b, char* tb, int blo[], int bhi[])
  • C++: IntegerGA::GlobalArray::idotPatch(const GA::GlobalArray* g_a, char* ta, int alo[], int ahi[], char* tb, int blo[], int bhi[])

    • double GA::GlobalArray::ddotPatch(const GA::GlobalArray* g_a, char* ta, int alo[], int ahi[], char* tb, int blo[], int bhi[])
    • DoubleComplex GA::GlobalArray::zdotPatch(const GA::GlobalArray* g_a, char* ta, int alo[], int ahi[], char* tb, int blo[], int bhi[])

The patches should be of the same data types and have the same number of elements. Like the array addition, if the source patches have different distributions/shapes, or it requires transpose, the operation would be less efficient, because there could be extra copies and/or memory consumption.

Element-wise operations

These operations work on individual array elements rather than arrays as matrices in the sense of linear algebra operations. For example multiplication of elements stored in arrays is a completely different operation than matrix multiplication.

Take element-wise absolute value of the array.

Take element-wise absolute value of the patch.

Add the contant pointed by alpha to each element of the array.

Add the contant pointed by alpha to each element of the patch.

  • Fortran subroutine: ga_recip(g_a)
  • C: void GA_Recip(int g_a)
  • C++: void GA::GlobalArray::recip()

Take element-wise reciprocal of the array.

  • Fortran subroutine: ga_recip_patch(g_a, lo, hi)
  • C: void GA_Recip_patch(int g_a, int lo[], int hi[])
  • C++: void GA::GlobalArray::recipPatch(int lo[], int hi[])

Take element-wise reciprocal of the patch.

  • Fortran subroutine: ga_elem_multiply(g_a, g_b, g_c)
  • C: void GA_Elem_multiply(int g_a, int g_b, int g_c)
  • C++: void GA::GlobalArray::elemMultiply(const GA::GlobalArray * g_a, const GA::GlobalArray * g_b)

Computes the element-wise product of the two arrays which must be of the same types and same number of elements. For two-dimensional arrays,

c(i, j) = a(i,j)*b(i,j)

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_multiply_patch(g_a, alo, ahi, g_b, blo, bhi, g_c, clo,chi)
  • C: void GA_Elem_multiply_patch(int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[])
  • C++: void GA::GlobalArray::elemMultiplyPatch( const GA::GlobalArray * g_a, int alo[], int ahi[], const GA::GlobalArray * g_b, int blo[], int bhi[], int clo[], int chi[])

Computes the element-wise product of the two patches which must be of the same types and same number of elements. For two-dimensional arrays,

c(i, j) = a(i,j)*b(i,j)

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_divide(g_a, g_b, g_c)
  • C: void GA_Elem_divide(Integer g_a, Integer g_b, Integer g_c)
  • C++: void GA::GlobalArray::elemDivide(const GA::GlobalArray * g_a, const GA::GlobalArray * g_b)

Computes the element-wise quotient of the two arrays which must be of the same types and same number of elements. For two-dimensional arrays,

c(i, j) = a(i,j)/b(i,j)

The result (c) may replace one of the input arrays (a/b). If one of the elements of array g_b is zero, the quotient for the element of g_c will be set to GA_NEGATIVE_INFINITY.

  • Fortran subroutine: ga_elem_divide_patch(g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi)
  • C: void GA_Elem_divide_patch(int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[])
  • C++: void GA::GlobalArray::elemDividePatch( const GA::GlobalArray * g_a, int alo[], int ahi[], const GA::GlobalArray * g_b, int blo[], int bhi[], int clo[], int chi[])

Computes the element-wise quotient of the two patches which must be of the same types and same number of elements. For two-dimensional arrays,

c(i, j) = a(i,j)/b(i,j)

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_maximum(g_a, g_b, g_c)
  • C: void GA_Elem_maximum(Integer g_a, Integer g_b, Integer g_c)
  • C++: void GA::GlobalArray::elemMaximum(const GA::GlobalArray * g_a, const GA::GlobalArray * g_b)

Computes the element-wise maximum of the two arrays which must be of the same types and same number of elements. For two dimensional arrays,

c(i, j) = max{a(i,j), b(i,j)}

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_maximum_patch(g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi)
  • C: void GA_Elem_maximum_patch(int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[])
  • C++: void GA::GlobalArray::elemMaximumPatch(const GA::GlobalArray * g_a, int alo[], int ahi[], const GA::GlobalArray * g_b, int blo[], int bhi[], int clo[], int chi[])

Computes the element-wise maximum of the two patches which must be of the same types and same number of elements. For two-dimensional of noncomplex arrays,

c(i, j) = max{a(i,j), b(i,j)}

If the data type is complex, then c(i, j).real = max{ |a(i,j)|, |b(i,j)|} while c(i,j).image = 0.

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_minimum(g_a, g_b, g_c)
  • C: void GA_Elem_minimum(Integer g_a, Integer g_b, Integer g_c);
  • C++: void GA::GlobalArray::elemMinimum(const GA::GlobalArray * g_a, const GA::GlobalArray * g_b)

Computes the element-wise minimum of the two arrays which must be of the same types and same number of elements. For two dimensional arrays,

c(i, j) = min{a(i,j), b(i,j)}

The result (c) may replace one of the input arrays (a/b).

  • Fortran subroutine: ga_elem_minimum_patch(g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi)
  • C: void GA_Elem_minimum_patch(int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[])
  • C++: void GA::GlobalArray::elemMinimumPatch (const GA::GlobalArray * g_a, int alo[], int ahi[], const GA::GlobalArray * g_b, int blo[], int bhi[], int clo[], int chi[])

Computes the element-wise minimum of the two patches which must be of the same types and same number of elements. For two-dimensional of noncomplex arrays,

c(i, j) = min{a(i,j), b(i,j)}

If the data type is complex, then

c(i, j).real = min{ |a(i,j)|, |b(i,j)|} while c(i,j).image = 0.

The result (c) may replace one of the input arrays (a/b).

Adds this constant to the diagonal elements of the matrix.

  • Fortran subroutine: ga_set_diagonal(g_a, g_v)
  • C: void GA_Set_diagonal(int g_a, int g_v)
  • C++: void GA::GlobalArray::setDiagonal (const GA::GlobalArray * g_v)

Sets the diagonal elements of this matrix g_a with the elements of the vector g_v.

Sets the diagonal elements of this matrix g_a with zeros.

  • Fortran subroutine: ga_add_diagonal(g_a, g_v)
  • C: void GA_Add_diagonal(int g_a, int g_v)
  • C++: void GA::GlobalArray::addDiagonal(const GA::GlobalArray * g_v)

Adds the elements of the vector g_v to the diagonal of this matrix g_a.

  • Fortran subroutine: ga_get_diag(g_a, g_v)
  • C: void GA_Get_diag(int g_a, int g_v)
  • C++: void GA::GlobalArray::getDiagonal (const GA::GlobalArray * g_v)

Inserts the diagonal elements of this matrix g_a into the vector g_v.

  • Fortran subroutine: ga_scale_rows( g_a, g_v)
  • C: void GA_Scale_rows(int g_a, int g_v)
  • C++: void GA::GlobalArray::scaleRows (const GA::GlobalArray * g_v)

Scales the rows of this matrix g_a using the vector g_v.

  • Fortran subroutine: ga_scale_cols(g_a, g_v)
  • C: void GA_Scale_cols(int g_a, int g_v)
  • C++: void GA::GlobalArray::scaleCols (const GA::GlobalArray * g_v)

Scales the columns of this matrix g_a using the vector g_v.

  • Fortran subroutine: ga_norm1(g_a, nm)
  • C: void GA_Norm1(int g_a, double *nm)
  • C++: void GA::GlobalArray::norm1(double *nm)

Computes the 1-norm of the matrix or vector g_a.

Computes the 1-norm of the matrix or vector g_a.

  • Fortran subroutine: ga_median( g_a, g_b, g_c, g_m)
  • C: void GA_Median(int g_a, int g_b, int g_c, int g_m)
  • C++: void GA::GlobalArray::median(const GA::GlobalArray
    * g_a, const GA::GlobalArray * g_b, const GA::GlobalArray * g_c)

Computes the componentwise Median of three arrays g_a, g_b, and g_c, and stores the result in this array g_m. The result (m) may replace one of the input arrays (a/b/c).

  • Fortran subroutine: ga_median_patch(g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi, g_m,mlo, mhi)
  • C: void GA_Median_patch(int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[], int g_m, int mlo[],int mhi[])
  • C++: void GA::GlobalArray::medianPatch(const GA::GlobalArray* g_a, int alo[], int ahi[], const GA::GlobalArray * g_b, int blo[], int bhi[], const GA::GlobalArray * g_c, int clo[], int chi[], int mlo[],int mhi[])

Computes the componentwise Median of three patches g_a, g_b, and g_c, and stores the result in this patch g_m. The result (m) may replace one of the input patches (a/b/c).

  • Fortran subroutine: ga_step_max(g_a, g_b, step)
  • C: void GA_Step_max(int g_a, int g_b, double *step)
  • C++: void GA::GlobalArray::stepMax(const GA::GlobalArray *g_a, double *step)

Calculates the largest multiple of a vector g_b that can be added to this vector g_a while keeping each element of this vector nonnegative.

  • Fortran subroutine: ga_step_max2( g_xx, g_vv, g_xxll, g_xxuu, step2)
  • C: void GA_Step_max2(int g_xx, int g_vv, int g_xxll, int g_xxuu, double *step2)
  • C++: void GA::GlobalArray::stepMax2(const GA::GlobalArray *g_vv, const GA::GlobalArray *g_xxll, const GA::GlobalArray *g_xxuu, double *step2)

Calculates the largest step size that should be used in a projected bound line search.

  • Fortran subroutine: ga_step_max_patch(g_a, alo, ahi, g_b, blo, bhi, step)
  • C: void GA_Step_max_patch(int g_a, int *alo, int *ahi, int g_b, int *blo, int *bhi, double *step)
  • C++: void GA::GlobalArray::stepMaxPatch(int *alo, int *ahi, const GA::GlobalArray * g_b, int *blo, int *bhi, double *step)

Calculates the largest multiple of a vector g_b that can be added to this vector g_a while keeping each element of this vector nonnegative.

  • Fortran subroutine: ga_step_max2_patch( g_xx, xxlo, xxhi, g_vv,vvlo, vvhi, g_xxll, xxlllo, xxllhi, g_xxuu, xxuulo, xxuuhi, step2)
  • C: void GA_Step_max2_patch(int g_xx, int *xxlo, int *xxhi, int g_vv, int *vvlo, int *vvhi, int g_xxll, int *xxlllo, int *xxllhi, int g_xxuu, int *xxuulo, int *xxuuhi, double *step2)
  • C++: void GA::GlobalArray::stepMax2Patch(int *xxlo, int *xxhi, const GA::GlobalArray * g_vv, int *vvlo, int *vvhi,const GA::GlobalArray * g_xxll, int *xxlllo, int *xxllhi, const GA::GlobalArray * g_xxuu, int *xxuulo, int *xxuuhi, double *step2)

Calculates the largest step size that should be used in a projected bound line search.

Interfaces to Third Party Software Packages

There are many existing software packages designed for solving engineering problems. They are specialized in one or two problem domains, such as solving linear systems, eigen-vectors, and differential equations, etc. Global Arrays provide interfaces to several of these packages.

Scalapack

Scalapack is a well known software library for linear algebra computations on distributed memory computers. Global Arrays uses this library to solve systems of linear equations and also to invert matrices.

The function

  • Fortran integer function: ga_solve(g_a, g_b)
  • C: int GA_Solve(int g_a, int g_b)
  • C++: int GA::GlobalArray::solve(const GA::GlobalArray * g_a)

solves a system of linear equations A * X = B. It first will call the Cholesky factorization routine and, if successful, will solve the system with the Cholesky solver. If Cholesky is not able to factorizeA, then it will call the LU factorization routine and will solve the system with forward/backward substitution. On exit B will contain the solution X.

The function

  • Fortran integer function: ga_llt_solve(g_a, g_b)
  • C: int GA_Llt_solve(int g_a, int g_b)
  • C++: int GA::GlobalArray::lltSolve(const GA::GlobalArray * g_a)

also solves a system of linear equations A * X = B, using the Cholesky factorization of an NxN double precision symmetric positive definite matrix A (handle g_a). On successful exit B will contain the solution X.

The function

  • Fortran subroutine: ga_lu_solve(trans, g_a, g_b)
  • C: void GA_Lu_solve(char trans, int g_a, int g_b)
  • C++: void GA::GlobalArray::luSolve(char trans, const GA::GlobalArray * g_a)

solves the system of linear equations op(A)X = B based on the LU factorization. op(A) = A or A' depending on the parameter trans. Matrix A is a general real matrix. Matrix B contains possibly multiple rhs vectors. The array associated with the handle g_b is overwritten by the solution matrix X.

The function

computes the inverse of a double precision matrix using the Cholesky factorization of a NxN double precision symmetric positive definite matrix A stored in the global array represented by g_a. On successful exit, A will contain the inverse.

PeIGS

The PeIGS library contains subroutines for solving standard and generalized real symmetric eigensystems. All eigenvalues and eigenvectors can be computed. The library is implemented using a message-passing model and is portable across many platforms. Global Arrays use this library to solve eigenvalue problems.

The function

  • Fortran subroutine: ga_diag(ga_diag(g_a, g_s, g_v, eval)
  • C: void GA_Diag(int g_a, int g_s, int g_v, void *eval)
  • C++: void GA::GlobalArray::diag (const GA::GlobalArray*g_s, const GA::GlobalArray* g_v, void *eval)

solves the generalized eigenvalue problem returning all eigenvectors and values in ascending order. The input matrices are not overwritten or destroyed.

The function

  • Fortran subroutine: ga_diag_reuse (control, g_a, g_s, g_v, eval)
  • C: void GA_Diag_reuse (int control, int g_a, int g_s, int g_v,void *eval)
  • C++: void GA::GlobalArray::diagReuse (int control, const GA::GlobalArray* g_s, const GA::GlobalArray*g_v, void *eval)

solves the generalized eigen-value problem returning all eigenvectors and values in ascending order. Recommended for REPEATED calls if g_s is unchanged.

The function

  • Fortran subroutine: ga_diag_std(g_a, g_v, eval)
  • C: void GA_Diag_std(int g_a, int g_v, void *eval)
  • C++: void GA::GlobalArray::diagStd( const GA::GlobalArray* g_v, void *eval)

solves the standard (non-generalized) eigenvalue problem returning all eigenvectors and values in the ascending order. The input matrix is neither overwritten nor destroyed.

Interoperability with Others

Global Arrays are interoperable with several other libraries, but do not provide direct interfaces for them. For example, one can make calls to and link with these libraries:

PETSc (the Portable, Extensible Toolkit for Scientific Computation) is developed by Argonne National Laboratory. PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It employs the MPI standard for all message-passing communication, and is written in a data-structure-neutral manner to enable easy reuse and flexibility. Here are the instructionsfor using PETSc with GA.

CUMULVS(Collaborative User Migration User Library for Visualization and Steering) is developed by the Oak Ridge National Laboratory. CUMULVS is a software framework that enables programmers to incorporate fault-tolerance, interactive visualization and computational steering into existing parallel programs. Here are theinstructions for using CUMULVS with GA.

Synchronization Control in Collective Operations

GA collective array operations are implemented by exploiting locality information to minimize or even completely avoid interprocessor communication or data copying. Before each processor accesses its own portion of the GA data we must assure that the data is in a consistent state. That means that there are no outstanding communication operations targeting that given global array portion pending while the data owner is accessing it. To accomplish that the GA collective array operations have implicit synchronization points: at the beginning and at the end of the operation. However, in many cases when collective array operations are called back-to-back or if the user does an explicit sync just before a collective array operation, some of the internal synchronization points could be merged or even removed if user can guarantee that the global array data is in the consistent state. The library offers a call for the user to eliminate the redundant synchronization points based on his/her knowledge of the application.

The function

  • Fortran subroutine: ga_mask_sync(prior_sync_mask,post_sync_mask)
  • C: void GA_Mask_sync(int prior_sync_mask,int post_sync_mask)
  • C++: void GA::GlobalArray::maskSync(int prior_sync_mask,int post_sync_mask)

This operation should be used with a lot of care and only when the application code has been debugged and the user wishes to tune its performance. Making a call to this function with prior_sync_mask parameter set to false disables the synchronization done at the beginning of first collective array operation called after a call to this function. Similarly, making a call to this function by setting the post_sync_mask parameter to false disables the synchronization done at the ending of the first collective array operation called after a call to this function.