SafeMPI API Reference
The SafeMPI module provides distributed reference management for MPI-based parallel computing.
Core Types
SafePETSc.SafeMPI.DRef — Type
DRef{T}

A distributed reference to an object of type T that is managed across MPI ranks.
When all ranks have released their references via garbage collection, the object is collectively destroyed on all ranks using the type's destroy_obj! method.
Constructor
DRef(obj::T; manager=default_manager[]) -> DRef{T}

Create a distributed reference to obj. The type T must opt in to distributed management by defining destroy_trait(::Type{T}) = CanDestroy() and implementing destroy_obj!(obj::T).
Finalizers automatically enqueue releases when the DRef is garbage collected. Call check_and_destroy!() to perform the actual collective destruction.
Example
# Define a type that can be managed
struct MyDistributedObject
    data::Vector{Float64}
end
SafeMPI.destroy_trait(::Type{MyDistributedObject}) = SafeMPI.CanDestroy()
SafeMPI.destroy_obj!(obj::MyDistributedObject) = println("Destroying object")
# Create a distributed reference
ref = DRef(MyDistributedObject([1.0, 2.0, 3.0]))
# ref.obj accesses the underlying object
# When ref is garbage collected and check_and_destroy!() is called,
# the object is destroyed

See also: DistributedRefManager, check_and_destroy!, destroy_trait
SafePETSc.SafeMPI.DistributedRefManager — Type
DistributedRefManager

Manages reference counting and collective destruction of distributed objects across MPI ranks.
Every rank keeps an identical counter_pool/free_ids state and runs the same ID allocation algorithm simultaneously, so there is no special root role. Finalizers simply enqueue release IDs locally. At safe points (check_and_destroy!), ranks Allgather pending releases, update mirrored counters deterministically, and destroy ready objects together, pushing the released IDs back into free_ids on every rank for reuse.
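Example

A minimal sketch reusing the global default manager, since constructing a custom DistributedRefManager is not covered in this reference; MyDistributedObject is the example type defined under DRef:

# Pass the manager explicitly instead of relying on the keyword default
mgr = SafeMPI.default_manager[]
ref = DRef(MyDistributedObject([1.0, 2.0]); manager=mgr)

# Later, at a safe point reached by all ranks, process pending releases for this manager
SafeMPI.check_and_destroy!(mgr)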
See also: DRef, check_and_destroy!, default_manager
Reference Management
SafePETSc.SafeMPI.check_and_destroy! — Function
check_and_destroy!(manager=default_manager[]; max_check_count::Integer=1)

MPI Collective
Perform garbage collection and process pending object releases, destroying objects when all ranks have released their references.
This function must be called explicitly to allow controlled cleanup points in the application. It performs a full garbage collection to trigger finalizers, then processes all pending release messages and collectively destroys objects that are ready.
The max_check_count parameter controls throttling: the function performs cleanup only once every max_check_count calls. This reduces overhead in tight loops.
Example
SafeMPI.check_and_destroy!() # Process releases immediately
SafeMPI.check_and_destroy!(max_check_count=10)  # Only cleanup every 10th call

See also: DRef, DistributedRefManager
SafePETSc.SafeMPI.destroy_obj! — Function
destroy_obj!(obj)

Trait method called to collectively destroy an object when all ranks have released their references. Types that opt in to distributed reference management must implement this method.
Example
SafeMPI.destroy_obj!(obj::MyType) = begin
    # Perform collective cleanup (e.g., free MPI/PETSc resources)
    cleanup_resources(obj)
end

See also: DRef, destroy_trait
SafePETSc.SafeMPI.default_manager — Constant
default_manager

The default DistributedRefManager instance used by all DRef objects unless explicitly overridden. Automatically initialized when the module loads.
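Example

A hedged sketch of relying on the default versus overriding it; obj and my_manager are hypothetical placeholders for a managed object and an alternative DistributedRefManager:

ref = DRef(obj)                                     # uses default_manager[]
ref = DRef(obj; manager=SafeMPI.default_manager[])  # equivalent, explicit
ref = DRef(obj; manager=my_manager)                 # override with another manager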
SafePETSc.SafeMPI.default_check — Constant
default_check

Reference to the default throttle count for check_and_destroy! calls. Set this to control how often automatic cleanup occurs during object creation. The default value is 10.
Example:
SafePETSc.default_check[] = 100  # Only cleanup every 100 object creations

Trait System
SafePETSc.SafeMPI.DestroySupport — Type
DestroySupport

Abstract type for the trait system controlling which types can be managed by DRef. See CanDestroy and CannotDestroy.
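Example

An illustrative sketch of dispatching on the trait; can_manage is a hypothetical helper, and the CannotDestroy() fallback is the documented default of destroy_trait:

# true only for types that have opted in via destroy_trait
can_manage(::Type{T}) where {T} = SafeMPI.destroy_trait(T) isa SafeMPI.CanDestroy

can_manage(Vector{Float64})  # false: destroy_trait falls back to CannotDestroy()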
SafePETSc.SafeMPI.CanDestroy — Type
CanDestroy <: DestroySupport

Trait indicating that a type can be managed by DRef and supports collective destruction. Types must opt in by defining destroy_trait(::Type{YourType}) = CanDestroy().
SafePETSc.SafeMPI.CannotDestroy — Type
CannotDestroy <: DestroySupport

Trait indicating that a type cannot be managed by DRef (the default for all types).
SafePETSc.SafeMPI.destroy_trait — Function
destroy_trait(::Type) -> DestroySupport

Trait function determining whether a type can be managed by DRef.
Returns CanDestroy() for types that opt in to distributed reference management, or CannotDestroy() for types that don't support it (the default).
Example
# Opt-in a custom type
SafeMPI.destroy_trait(::Type{MyType}) = SafeMPI.CanDestroy()

MPI Utilities
SafePETSc.SafeMPI.mpi_any — Function
mpi_any(local_bool::Bool, comm=MPI.COMM_WORLD) -> Bool

MPI Collective
Collective logical OR reduction across all ranks in comm.
Returns true on all ranks if any rank has local_bool == true, otherwise returns false on all ranks. This is useful for checking whether any rank encountered an error or special condition.
Example
local_error = (x < 0) # Some local condition
if SafeMPI.mpi_any(local_error)
    # At least one rank has an error, so all ranks enter this branch
    error("Error detected on at least one rank")
end

See also: @mpiassert
SafePETSc.SafeMPI.mpi_uniform — Function
mpi_uniform(A) -> Bool

MPI Collective
Checks whether the value A is identical across all MPI ranks.
Returns true on all ranks if all ranks have the same value for A, otherwise returns false on all ranks. This is useful for verifying that distributed data structures are properly synchronized or that configuration values are consistent across all ranks.
The comparison is done by computing a SHA-1 hash of the serialized object on each rank and broadcasting rank 0's hash to all other ranks for comparison.
Example
# Verify that a configuration matrix is the same on all ranks
config = [1.0 2.0; 3.0 4.0]
SafeMPI.@mpiassert mpi_uniform(config) "Configuration must be uniform across ranks"
# Safe to use as a uniform object
config_petsc = Mat_uniform(config)

See also: @mpiassert, mpi_any
SafePETSc.SafeMPI.mpierror — Function
mpierror(msg::AbstractString, trace::Bool; comm=MPI.COMM_WORLD, code::Integer=1)

MPI Collective
Best-effort MPI-wide error terminator that avoids hangs:
- Prints [rank N] ERROR: msg on each process that reaches it
- If trace is true, prints a backtrace
- If MPI is initialized, aborts the communicator to cleanly stop all ranks (avoids deadlocks if other ranks are not in the same code path)
- Falls back to exit(code) if MPI is not initialized or already finalized
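Example

A hedged usage sketch; the rank-local file check is hypothetical, and the second argument (trace) requests a backtrace on the failing process:

if !isfile("config.toml")          # some rank-local failure condition
    # Stop all ranks without hanging peers that took a different code path
    SafeMPI.mpierror("config.toml missing on this rank", true)
end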
SafePETSc.SafeMPI.@mpiassert — Macro
@mpiassert cond [message]

MPI Collective
MPI-aware assertion that checks cond on all ranks and triggers collective error handling if any rank fails the assertion.
Each rank evaluates cond locally. If any rank has cond == false, all ranks are notified via mpi_any() and collectively enter error handling via mpierror(). Only ranks where the assertion failed will print a backtrace.
The assertion is skipped entirely if enable_assert[] == false (see set_assert).
Arguments
- cond: Boolean expression to check (the assertion passes when cond == true)
- message: Optional custom error message (defaults to an auto-generated message with file/line info)
Example
# Assert that all ranks have the same value
@mpiassert SafeMPI.mpi_uniform(A) "Matrix A must be uniform across ranks"
# Assert a local condition that must hold on all ranks
@mpiassert n > 0 "Array size must be positive"

See also: mpi_any, mpierror, set_assert
Configuration
SafePETSc.SafeMPI.enable_assert — Constant
enable_assert

Global flag controlling whether @mpiassert macros perform their checks. Set to false to disable all MPI assertions for performance. Default is true.
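Example

A small sketch, assuming enable_assert is read by indexing with [] as in the @mpiassert documentation; use set_assert to change it:

if SafeMPI.enable_assert[]
    # Assertions are active; @mpiassert will evaluate its condition
end
SafeMPI.set_assert(false)  # documented way to toggle the flag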
See also: set_assert, @mpiassert
SafePETSc.SafeMPI.set_assert — Function
set_assert(x::Bool) -> nothing

MPI Non-Collective
Enable (true) or disable (false) MPI assertion checks via @mpiassert.
Example
SafeMPI.set_assert(false) # Disable assertions
SafeMPI.set_assert(true) # Re-enable assertions