SafeMPI API Reference

The SafeMPI module provides distributed reference management for MPI-based parallel computing.

Core Types

SafePETSc.SafeMPI.DRef (Type)
DRef{T}

A distributed reference to an object of type T that is managed across MPI ranks.

When all ranks have released their references via garbage collection, the object is collectively destroyed on all ranks using the type's destroy_obj! method.

Constructor

DRef(obj::T; manager=default_manager[]) -> DRef{T}

Create a distributed reference to obj. The type T must opt in to distributed management by defining destroy_trait(::Type{T}) = CanDestroy() and implementing destroy_obj!(obj::T).

Finalizers automatically enqueue releases when the DRef is garbage collected. Call check_and_destroy!() to perform the actual collective destruction.

Example

# Define a type that can be managed
struct MyDistributedObject
    data::Vector{Float64}
end

SafeMPI.destroy_trait(::Type{MyDistributedObject}) = SafeMPI.CanDestroy()
SafeMPI.destroy_obj!(obj::MyDistributedObject) = println("Destroying object")

# Create a distributed reference
ref = DRef(MyDistributedObject([1.0, 2.0, 3.0]))
# ref.obj accesses the underlying object
# When ref is garbage collected and check_and_destroy!() is called, the object is destroyed

See also: DistributedRefManager, check_and_destroy!, destroy_trait

SafePETSc.SafeMPI.DistributedRefManager (Type)
DistributedRefManager

Manages reference counting and collective destruction of distributed objects across MPI ranks.

Every rank keeps an identical counter_pool/free_ids state and runs the same ID allocation algorithm simultaneously, so there is no special root role. Finalizers simply enqueue release IDs locally. At safe points (check_and_destroy!), ranks Allgather pending releases, update mirrored counters deterministically, and destroy ready objects together, pushing the released IDs back into free_ids on every rank for reuse.
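
Example

A minimal sketch of the safe-point pattern with an explicitly supplied manager. The zero-argument DistributedRefManager() constructor below is an assumption for illustration; typical code simply uses the default manager. MyDistributedObject is the example type defined under DRef above.

# Create a dedicated manager (assumed constructor) and register a DRef with it.
mgr = SafeMPI.DistributedRefManager()
ref = DRef(MyDistributedObject([1.0, 2.0]); manager=mgr)

# Dropping the reference lets the finalizer enqueue a release ID locally.
ref = nothing

# Safe point: ranks Allgather pending releases, update mirrored counters,
# and collectively destroy objects whose reference counts reached zero.
SafeMPI.check_and_destroy!(mgr)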

See also: DRef, check_and_destroy!, default_manager

Reference Management

SafePETSc.SafeMPI.check_and_destroy! (Function)
check_and_destroy!(manager=default_manager[]; max_check_count::Integer=1)

MPI Collective

Perform garbage collection and process pending object releases, destroying objects when all ranks have released their references.

This function must be called explicitly so that cleanup happens at well-defined points in the application. It performs a full garbage collection to trigger finalizers, then processes all pending release messages and collectively destroys objects that are ready.

The max_check_count parameter controls throttling: the function only performs cleanup every max_check_count calls. This reduces overhead in tight loops.

Example

SafeMPI.check_and_destroy!()  # Process releases immediately
SafeMPI.check_and_destroy!(max_check_count=10)  # Only cleanup every 10th call

See also: DRef, DistributedRefManager

SafePETSc.SafeMPI.destroy_obj! (Function)
destroy_obj!(obj)

Trait method called to collectively destroy an object when all ranks have released their references. Types that opt in to distributed reference management must implement this method.

Example

SafeMPI.destroy_obj!(obj::MyType) = begin
    # Perform collective cleanup (e.g., free MPI/PETSc resources)
    cleanup_resources(obj)
end

See also: DRef, destroy_trait

SafePETSc.SafeMPI.default_manager (Constant)
default_manager

The default DistributedRefManager instance used by all DRef objects unless explicitly overridden. Automatically initialized when the module loads.
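
Example

Given the DRef constructor signature above, the following two calls are equivalent; the second simply passes the default manager explicitly:

ref = DRef(MyDistributedObject([1.0]))
ref = DRef(MyDistributedObject([1.0]); manager=SafeMPI.default_manager[])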

SafePETSc.SafeMPI.default_check (Constant)
default_check

Reference to the default throttle count for check_and_destroy! calls. Set this to control how often automatic cleanup occurs during object creation. Default value is 10.

Example:

SafePETSc.default_check[] = 100  # Only cleanup every 100 object creations

Trait System

SafePETSc.SafeMPI.CanDestroy (Type)
CanDestroy <: DestroySupport

Trait indicating that a type can be managed by DRef and supports collective destruction. Types must opt in by defining destroy_trait(::Type{YourType}) = CanDestroy().

SafePETSc.SafeMPI.destroy_trait (Function)
destroy_trait(::Type) -> DestroySupport

Trait function determining whether a type can be managed by DRef.

Returns CanDestroy() for types that opt in to distributed reference management, or CannotDestroy() for types that do not support it (the default).

Example

# Opt in a custom type
SafeMPI.destroy_trait(::Type{MyType}) = SafeMPI.CanDestroy()

MPI Utilities

SafePETSc.SafeMPI.mpi_any (Function)
mpi_any(local_bool::Bool, comm=MPI.COMM_WORLD) -> Bool

MPI Collective

Collective logical OR reduction across all ranks in comm.

Returns true on all ranks if any rank has local_bool == true, otherwise returns false on all ranks. This is useful for checking whether any rank encountered an error or special condition.

Example

local_error = (x < 0)  # Some local condition
if SafeMPI.mpi_any(local_error)
    # At least one rank has an error, all ranks enter this branch
    error("Error detected on at least one rank")
end

See also: @mpiassert

SafePETSc.SafeMPI.mpi_uniform (Function)
mpi_uniform(A) -> Bool

MPI Collective

Checks whether the value A is identical across all MPI ranks.

Returns true on all ranks if all ranks have the same value for A, otherwise returns false on all ranks. This is useful for verifying that distributed data structures are properly synchronized or that configuration values are consistent across all ranks.

The comparison is done by computing a SHA-1 hash of the serialized object on each rank and broadcasting rank 0's hash to all other ranks for comparison.

Example

# Verify that a configuration matrix is the same on all ranks
config = [1.0 2.0; 3.0 4.0]
SafeMPI.@mpiassert mpi_uniform(config) "Configuration must be uniform across ranks"

# Safe to use as a uniform object
config_petsc = Mat_uniform(config)

See also: @mpiassert, mpi_any

SafePETSc.SafeMPI.mpierror (Function)
mpierror(msg::AbstractString, trace::Bool; comm=MPI.COMM_WORLD, code::Integer=1)

MPI Collective

Best-effort MPI-wide error terminator that avoids hangs:

  • Prints [rank N] ERROR: msg on each process that reaches it
  • If trace is true, prints a backtrace
  • If MPI is initialized, aborts the communicator to cleanly stop all ranks (avoids deadlocks if other ranks are not in the same code path)
  • Falls back to exit(code) if MPI is not initialized or already finalized
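
Example

A minimal sketch of coordinated termination. Here validate_inputs and data are hypothetical placeholders for some local check; mpi_any (documented above) ensures every rank reaches the call, so the abort is collective:

# true on all ranks if at least one rank fails the (hypothetical) local check
if SafeMPI.mpi_any(!validate_inputs(data))
    # Every rank reaches this call, so aborting the communicator cannot hang.
    SafeMPI.mpierror("invalid input data", false)
end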

SafePETSc.SafeMPI.@mpiassert (Macro)
@mpiassert cond [message]

MPI Collective

MPI-aware assertion that checks cond on all ranks and triggers collective error handling if any rank fails the assertion.

Each rank evaluates cond locally. If any rank has cond == false, all ranks are notified via mpi_any() and collectively enter error handling via mpierror(). Only ranks where the assertion failed will print a backtrace.

The assertion is skipped entirely if enable_assert[] == false (see set_assert).

Arguments

  • cond: Boolean expression to check (assertion passes when cond == true)
  • message: Optional custom error message (defaults to auto-generated message with file/line info)

Example

# Assert that all ranks have the same value
@mpiassert SafeMPI.mpi_uniform(A) "Matrix A must be uniform across ranks"

# Assert a local condition that must hold on all ranks
@mpiassert n > 0 "Array size must be positive"

See also: mpi_any, mpierror, set_assert

Configuration

SafePETSc.SafeMPI.set_assert (Function)
set_assert(x::Bool) -> nothing

MPI Non-Collective

Enable (true) or disable (false) MPI assertion checks via @mpiassert.

Example

SafeMPI.set_assert(false)  # Disable assertions
SafeMPI.set_assert(true)   # Re-enable assertions