Distributed Reference Management
SafePETSc enables native Julia syntax for distributed linear algebra by implementing automatic distributed reference management via the SafeMPI module. This machinery ensures that distributed objects are cleaned up consistently across all MPI ranks, so users can write natural Julia expressions like A * B + C without manual memory management.
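For example, distributed matrices and vectors compose with ordinary operators (a sketch; the Mat_uniform constructor is assumed here to take the same dense matrix on every rank, mirroring the Vec_uniform constructor shown below):
using SafePETSc
A = Mat_uniform([2.0 0.0; 0.0 2.0])   # assumed constructor: same dense data on every rank
B = Mat_uniform([1.0 1.0; 0.0 1.0])
C = Mat_uniform([0.0 1.0; 1.0 0.0])
D = A * B + C                         # natural syntax; cleanup of A, B, C, D is automatic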
The Problem
In MPI-based parallel computing, objects like PETSc vectors and matrices exist on all ranks. Destroying such objects requires collective MPI calls—all ranks must participate. This creates challenges:
- Premature destruction: If one rank destroys an object while others still need it, the program crashes
- Memory leaks: If ranks don't coordinate cleanup, objects leak memory
- Complex coordination: Manual reference counting is error-prone
The Solution: DRef
SafePETSc uses DRef{T} (Distributed Reference) to automatically track object lifetimes:
using SafePETSc
# Create a distributed vector (returns a DRef{_Vec{Float64}})
v = Vec_uniform([1.0, 2.0, 3.0])
# Use it normally
y = v .+ 1.0
# When v goes out of scope and is garbage collected,
# SafePETSc coordinates cleanup across all ranks
How It Works
Reference Counting
- Mirrored Counters: Each rank runs the same deterministic ID allocation, keeping a mirrored counter_pool and shared free_ids stack; all ranks recycle IDs identically without a designated root
- Automatic Release: When a DRef is garbage collected, its finalizer enqueues the ID locally (no MPI in finalizers)
- Cleanup Points: At check_and_destroy! calls (automatically invoked at object creation), SafePETSc:
  - Periodically triggers partial garbage collection (GC.gc(false)) so finalizers run
  - Drains each rank's local release queue
  - Allgathers counts and then Allgathervs the release IDs so every rank sees the same global sequence (see the sketch after this list)
  - Each rank updates its mirrored counters identically and computes the same set of ready IDs
  - All ranks destroy ready objects simultaneously
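As an illustration, here is a minimal sketch of that release exchange using MPI.jl (hypothetical function name; not SafeMPI's actual internals):
using MPI

# Sketch: each rank contributes its locally queued release IDs; Allgather
# shares the per-rank counts, Allgatherv shares the IDs themselves, and every
# rank ends up with the same concatenated sequence (in rank order), so all
# ranks can update their mirrored counters identically.
function exchange_releases(local_ids::Vector{Int}, comm::MPI.Comm)
    counts = MPI.Allgather(Int32[length(local_ids)], comm)      # one count per rank
    recvbuf = MPI.VBuffer(Vector{Int}(undef, sum(counts)), counts)
    MPI.Allgatherv!(local_ids, recvbuf, comm)
    return recvbuf.data                                          # identical on all ranks
end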
Trait-Based Opt-In
Types must explicitly opt-in to distributed management:
# Define your distributed type
struct MyDistributedObject
data::Vector{Float64}
# ... MPI-based fields
end
# Opt-in to distributed management
SafeMPI.destroy_trait(::Type{MyDistributedObject}) = SafeMPI.CanDestroy()
# Implement cleanup
function SafeMPI.destroy_obj!(obj::MyDistributedObject)
# Perform collective cleanup (e.g., MPI_Free, PETSc destroy)
# This is called on ALL ranks simultaneously
cleanup_mpi_resources(obj)
end
# Now you can wrap it
ref = DRef(MyDistributedObject(...))
Controlling Cleanup
Automatic Cleanup and Pooling
Cleanup is triggered automatically at object creation. Every SafePETSc.default_check[] object creations (default: 10), a partial garbage collection (GC.gc(false)) runs to trigger finalizers, and pending releases are processed via MPI communication. This throttling reduces cleanup overhead.
For PETSc vectors, the default behavior is to return released vectors to a reuse pool instead of destroying them. Disable pooling with ENABLE_VEC_POOL[] = false or call clear_vec_pool!() to free pooled vectors.
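For example (a sketch; assuming these settings are accessible exactly as named above):
using SafePETSc
SafePETSc.default_check[] = 50   # trigger partial GC every 50 object creations
ENABLE_VEC_POOL[] = false        # destroy released PETSc vectors instead of pooling them
clear_vec_pool!()                # free any vectors already sitting in the pool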
To guarantee all objects are destroyed (not just those finalized by partial GC), call GC.gc(true) followed by SafeMPI.check_and_destroy!():
# Force complete cleanup
GC.gc(true) # Full garbage collection
SafeMPI.check_and_destroy!() # Process all finalizers
Explicit Cleanup
You can manually trigger cleanup:
# Trigger cleanup (does partial GC if throttle count reached)
SafeMPI.check_and_destroy!()
# With custom throttle count
SafeMPI.check_and_destroy!(max_check_count=10)
The max_check_count parameter controls how often partial GC runs: check_and_destroy! only calls GC.gc(false) every max_check_count invocations, but MPI communication to process pending releases happens on every call.
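Conceptually, the throttling behaves like this sketch (placeholder names; not SafeMPI's actual internals):
# Partial GC runs only every `max_check_count` calls; the MPI exchange of
# pending releases (stubbed out here) runs on every call.
const sketch_call_count = Ref(0)
sketch_exchange_pending!() = nothing   # stands in for the collective release exchange

function sketch_check_and_destroy!(; max_check_count::Int = 10)
    sketch_call_count[] += 1
    if sketch_call_count[] >= max_check_count
        GC.gc(false)                   # partial GC so finalizers enqueue release IDs
        sketch_call_count[] = 0
    end
    sketch_exchange_pending!()         # happens on every call
end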
Disabling Assertions
For performance in production:
SafeMPI.set_assert(false) # Disable @mpiassert checks
Best Practices
1. Let Scoping Work for You
function compute_something()
A = Mat_uniform(...)
b = Vec_uniform(...)
x = A \ b
# A, b, x cleaned up when function exits
return extract_result(x)
end
2. Explicit Cleanup in Long-Running Loops
for i in 1:1000000
v = Vec_uniform(data[i])
result[i] = compute(v)
# Periodic cleanup to avoid accumulation
if i % 100 == 0
SafeMPI.check_and_destroy!()
end
end
Debugging
Check Reference Counts
# Access the default manager
manager = SafeMPI.default_manager[]
# Inspect state (mirrored on all ranks)
println(io0(), "Active objects: ", length(manager.counter_pool))
println(io0(), "Free IDs: ", length(manager.free_ids))
Enable Verbose Assertions
# Assertions are enabled by default
SafeMPI.enable_assert[] # true
# Use @mpiassert for collective checks
@mpiassert all_data_valid "Data validation failed"
Performance Considerations
- Cleanup Cost: check_and_destroy! uses collective Allgather/Allgatherv operations and periodically triggers partial garbage collection
- Throttling: Adjust SafePETSc.default_check[] to control how often partial garbage collection (GC.gc(false)) is triggered in check_and_destroy!() (default: 10). Higher values reduce GC overhead but may delay object finalization (see the sketch below)
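A rough way to compare settings (a sketch; assumes SafePETSc.default_check[] is settable as described above and reuses the constructors from earlier sections):
using SafePETSc

data = fill(1.0, 100)                  # same data on every rank
for n in (1, 10, 100)
    SafePETSc.default_check[] = n      # assumed settable, per the description above
    t0 = time()
    for i in 1:10_000
        v = Vec_uniform(data)          # each creation may trigger a cleanup pass
    end
    GC.gc(true)                        # finalize everything before stopping the clock
    SafeMPI.check_and_destroy!()
    println(io0(), "default_check = $n: ", round(time() - t0; digits = 3), " s")
end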