# Matrices
SafePETSc provides distributed matrices through the `Mat{T}` type, which wraps PETSc's distributed matrix functionality with GPU-friendly operations and automatic memory management.
## Creating Matrices

### Uniform Distribution

Use `Mat_uniform` when all ranks have the same data:
```julia
# Create from a dense matrix
A = Mat_uniform([1.0 2.0; 3.0 4.0])

# With custom partitions
row_part = [1, 2, 3]  # 2 ranks
col_part = [1, 2, 3]
A = Mat_uniform(data; row_partition=row_part, col_partition=col_part)

# With a PETSc options prefix
A = Mat_uniform(data; prefix="my_mat_")
```
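A minimal end-to-end sketch (run under MPI; `SafePETSc.Init()` and the size query follow the examples later on this page, and every rank is assumed to pass identical data):

```julia
using SafePETSc
using MPI

SafePETSc.Init()

# Every rank passes the same dense data.
data = [1.0 2.0; 3.0 4.0]
A = Mat_uniform(data)
m, n = size(A)  # (2, 2)
```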
### Sum Distribution

Use `Mat_sum` when ranks contribute sparse entries:
```julia
using SparseArrays

# Each rank contributes different sparse entries;
# all contributions are summed.
rank = MPI.Comm_rank(MPI.COMM_WORLD)
I = [1, rank+1]
J = [1, rank+1]
V = [1.0, 2.0]
A = Mat_sum(sparse(I, J, V, 10, 10))

# Assert local ownership for validation
A = Mat_sum(sparse_local; own_rank_only=true)
```

## Matrix Operations
### Linear Algebra
```julia
# Matrix-vector multiplication
y = A * x

# In-place
mul!(y, A, x)

# Matrix-matrix multiplication
C = A * B

# In-place
mul!(C, A, B)

# Transpose
B = A'
B = Mat(A')  # Materialize the transpose

# In-place transpose (reuses B)
transpose!(B, A)
```

### Concatenation
```julia
# Vertical concatenation
C = vcat(A, B)
C = cat(A, B; dims=1)

# Horizontal concatenation
D = hcat(A, B)
D = cat(A, B; dims=2)

# Block diagonal
E = blockdiag(A, B, C)
```

### Sparse Diagonal Matrices
```julia
using SparseArrays

# Create diagonal vectors
diag_vec = Vec_uniform(ones(100))
upper_vec = Vec_uniform(ones(99))
lower_vec = Vec_uniform(ones(99))

# Tridiagonal matrix
A = spdiagm(-1 => lower_vec, 0 => diag_vec, 1 => upper_vec)

# Explicit dimensions
A = spdiagm(100, 100, 0 => diag_vec, 1 => upper_vec)
```

### Transpose Operations
SafePETSc uses PETSc's efficient transpose operations:
```julia
# Create a transpose (new matrix)
B = Mat(A')

# Reuse transpose storage
B = Mat(A')       # initial creation
# ... later, after A changes:
transpose!(B, A)  # reuses B's storage
```

Note: for `transpose!` to work correctly with PETSc's reuse mechanism, `B` must have been created as a transpose of `A` initially.
## Properties
```julia
# Element type
T = eltype(A)

# Size
m, n = size(A)
m = size(A, 1)
n = size(A, 2)

# Partition information
row_part = A.obj.row_partition
col_part = A.obj.col_partition
prefix = A.obj.prefix
```

## Partitioning
Matrices have both row and column partitions.
### Default Partitioning
```julia
m, n = 100, 80
nranks = MPI.Comm_size(MPI.COMM_WORLD)
row_part = default_row_partition(m, nranks)
col_part = default_row_partition(n, nranks)
```
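The partition vector follows the convention used in the assembly example below: entry `r` is the first global index owned by rank `r - 1`, with a final sentinel one past the last index. A sketch with illustrative values:

```julia
# Illustrative partition of 100 rows over 2 ranks (values assumed;
# default_row_partition may split uneven sizes differently).
rank = MPI.Comm_rank(MPI.COMM_WORLD)
row_part = [1, 51, 101]
lo = row_part[rank + 1]      # first row owned by this rank
hi = row_part[rank + 2] - 1  # last row owned by this rank
```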
### Requirements

- Row operations require matching row partitions
- Column operations require matching column partitions
- Matrix multiplication: `C = A * B` requires `A.col_partition == B.row_partition` (an explicit check is sketched below)
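A sketch of an explicit compatibility check before a product, using the partition fields shown under Properties:

```julia
# Guard a matrix product with an explicit partition check.
if A.obj.col_partition == B.obj.row_partition
    C = A * B
else
    error("A's column partition does not match B's row partition")
end
```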
## GPU-Friendly Operations
SafePETSc prioritizes PETSc's native GPU-compatible operations:
```julia
# Good: uses PETSc's MatTranspose (GPU-friendly)
B = Mat(A')

# Good: uses PETSc's MatMatMult (GPU-friendly)
C = A * B

# Good: uses PETSc's MatConvert (GPU-friendly)
# (internal to SafePETSc operations)
```

Avoid element-by-element access patterns that cause excessive GPU↔CPU transfers.
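For example, a GPU run might route matrices and vectors to PETSc's CUDA backends through global options. This is a sketch: `aijcusparse` and `cuda` are PETSc's CUDA types, but the right settings depend on your PETSc build and how SafePETSc is configured:

```julia
# Assumed GPU configuration; requires a CUDA-enabled PETSc build.
petsc_options_insert_string("-mat_type aijcusparse")
petsc_options_insert_string("-vec_type cuda")
```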
## Advanced Features

### Iterating Over Dense Matrix Rows

For `MATMPIDENSE` matrices:
```julia
# Iterate over local rows efficiently
for row in eachrow(A)
    # row is a view of the matrix row
    process(row)
end
```

This uses a single `MatDenseGetArrayRead` call for the entire iteration.
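For instance, a per-row reduction over the locally owned rows (a sketch; `A` is assumed to be dense as required above):

```julia
# Sum each locally owned row; eachrow yields views, so no copies are made.
local_row_sums = [sum(row) for row in eachrow(A)]
```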
### PETSc Options
Configure matrix behavior via options:
```julia
# Set global options
petsc_options_insert_string("-dense_mat_type mpidense")

# Use a prefix for specific matrices
A = Mat_uniform(data; prefix="my_mat_")
```
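Standard PETSc prefix semantics prepend the prefix to option names, so a prefixed matrix can be targeted individually. A sketch, assuming SafePETSc forwards the prefix to PETSc's options database (the option name is illustrative):

```julia
# Applies only to objects created with prefix="my_mat_".
petsc_options_insert_string("-my_mat_mat_type mpidense")
A = Mat_uniform(data; prefix="my_mat_")
```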
## Examples

### Assemble a Sparse Matrix
```julia
using SafePETSc
using SparseArrays
using MPI

SafePETSc.Init()

rank = MPI.Comm_rank(MPI.COMM_WORLD)
nranks = MPI.Comm_size(MPI.COMM_WORLD)
n = 100

# Each rank builds a local piece
row_part = default_row_partition(n, nranks)
lo = row_part[rank + 1]
hi = row_part[rank + 2] - 1

# Build a local sparse matrix (only owned rows)
I = Int[]
J = Int[]
V = Float64[]
for i in lo:hi
    # Diagonal
    push!(I, i)
    push!(J, i)
    push!(V, 2.0)
    # Off-diagonals
    if i > 1
        push!(I, i)
        push!(J, i-1)
        push!(V, -1.0)
    end
    if i < n
        push!(I, i)
        push!(J, i+1)
        push!(V, -1.0)
    end
end
local_sparse = sparse(I, J, V, n, n)

# Assemble the global matrix
A = Mat_sum(local_sparse; own_rank_only=true)
```

### Build Block Matrices
```julia
# Create blocks
A11 = Mat_uniform(...)
A12 = Mat_uniform(...)
A21 = Mat_uniform(...)
A22 = Mat_uniform(...)

# Assemble the block matrix
A = vcat(hcat(A11, A12), hcat(A21, A22))

# Or equivalently
A = [A11 A12; A21 A22]  # (if block-matrix syntax is supported)
```

### Tridiagonal System
```julia
n = 1000

# Create diagonal vectors
diag = Vec_uniform(2.0 * ones(n))
upper = Vec_uniform(-1.0 * ones(n-1))
lower = Vec_uniform(-1.0 * ones(n-1))

# Build the tridiagonal matrix
A = spdiagm(-1 => lower, 0 => diag, 1 => upper)

# Create the RHS
b = Vec_uniform(ones(n))

# Solve
x = A \ b
```

## Performance Tips
- Use Native Operations: Prefer PETSc operations over element access
- Batch Assembly: Build sparse matrices locally, then sum once
- Appropriate Matrix Type: Use dense vs. sparse based on structure
- Reuse Solver Objects: Create a `Solver` once and reuse it for multiple solves (see the sketch after this list)
- GPU Configuration: Set PETSc options for GPU matrices
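A hypothetical sketch of solver reuse; the `Solver` constructor and solve syntax shown here are assumptions, and the actual API may differ:

```julia
# Hypothetical API: set up once, then solve multiple right-hand sides.
solver = Solver(A)  # assumed constructor; setup happens once
x1 = solver \ b1    # assumed solve syntax
x2 = solver \ b2    # reuse avoids repeating the setup
```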
```julia
# Good: bulk assembly
local_matrix = sparse(I, J, V, m, n)
A = Mat_sum(local_matrix)

# Less good: element-by-element (if it were supported)
# A = Mat_sum(...)
# for each element
#     set_value(A, i, j, val)  # repeated MPI calls
```

## Compatibility Notes
- Transpose Reuse: `transpose!(B, A)` requires that `B` was created via `Mat(A')` or has a compatible precursor
- Matrix Multiplication Reuse: `mul!(C, A, B)` requires a pre-allocated `C` with correct partitions (see the sketch below)
- Dense Operations: Some operations (e.g., `\` with a matrix RHS) require dense matrices
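For example, the first product can allocate `C` with compatible partitions, after which `mul!` reuses its storage:

```julia
C = A * B      # allocates C with the required partitions
mul!(C, A, B)  # subsequent products reuse C's storage
```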