Matrix Representations¶
Storage-efficient matrix representations for use as strategies in linear maps.
A MatrixRep defines how to store a matrix as a flat parameter array and how to perform linear algebra (matvec, transpose, inverse, etc.) while respecting structural constraints. EmbeddedMap in linear.py plugs a rep into the manifold system; this module is purely about the matrix operations themselves.
The hierarchy from most to least general is:
Rectangular > Square > Symmetric > PositiveDefinite > Diagonal > Scale > Identity
Each level exploits additional structure for cheaper storage and operations:
| Representation | Storage | Matmul | Inverse/Det |
|---|---|---|---|
| Rectangular | \(O(n^2)\) | \(O(n^2)\) | \(O(n^3)\) |
| Symmetric | \(O(n^2/2)\) | \(O(n^2)\) | \(O(n^3)\) |
| Pos. Definite | \(O(n^2/2)\) | \(O(n^2)\) | \(O(n^3)\) (Cholesky) |
| Diagonal | \(O(n)\) | \(O(n)\) | \(O(n)\) |
| Scale | \(O(1)\) | \(O(n)\) | \(O(1)\) |
| Identity | \(O(1)\) | \(O(1)\) | \(O(1)\) |
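To make the storage column concrete, the sketch below counts the stored parameters for an \(n \times n\) matrix under each scheme. The helper `storage_size` and its string keys are hypothetical and exist only for illustration; in the library this information comes from each class's `num_params`. The packed count \(n(n+1)/2\) assumes the diagonal is stored together with the upper triangle.

```python
def storage_size(rep: str, n: int) -> int:
    """Number of stored parameters for an n x n matrix (illustrative only)."""
    counts = {
        "rectangular": n * n,
        "symmetric": n * (n + 1) // 2,          # upper triangle incl. diagonal
        "positive_definite": n * (n + 1) // 2,  # same packing as symmetric
        "diagonal": n,
        "scale": 1,
        "identity": 0,                          # fully determined by shape
    }
    return counts[rep]


for name in ("rectangular", "symmetric", "diagonal", "scale", "identity"):
    print(name, storage_size(name, 4))
# rectangular 16
# symmetric 10
# diagonal 4
# scale 1
# identity 0
```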
TODO: A Convolutional rep could fit naturally here. It would store a kernel and
implement matvec via convolution on a flat array (i.e. a compactly-stored Toeplitz
matrix), with shape = (output_len, input_len) preserving the existing contract.
Multi-channel and 2D structure would be handled in linear.py via BlockMap
(one block per channel pair) or embeddings that reshape between flat and spatial layouts.
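As a rough illustration of the idea in this TODO (not part of the module), a 1D kernel applied as a "valid" sliding window is the same linear map as a banded Toeplitz matrix of shape (output_len, input_len). The function names below are hypothetical, and NumPy stands in for the library's array type.

```python
import numpy as np


def conv_matvec(kernel: np.ndarray, vector: np.ndarray) -> np.ndarray:
    """Apply the kernel as a sliding window without materialising the matrix."""
    # "valid" mode gives output_len = input_len - len(kernel) + 1.
    return np.correlate(vector, kernel, mode="valid")


def conv_to_matrix(kernel: np.ndarray, input_len: int) -> np.ndarray:
    """Dense Toeplitz matrix equivalent to conv_matvec, for checking only."""
    k = kernel.shape[0]
    output_len = input_len - k + 1
    dense = np.zeros((output_len, input_len))
    for i in range(output_len):
        dense[i, i:i + k] = kernel  # each row is the kernel shifted by one
    return dense


kernel = np.array([1.0, 0.0, -1.0])
x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
y = conv_matvec(kernel, x)  # array([-3., -5., -7.])
```

Only the three kernel entries are stored, while the equivalent dense matrix here has 3 x 5 entries.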
Class Hierarchy¶
Base Matrix Classes¶
- class MatrixRep[source]¶
Bases: ABC
Strategy interface for matrix storage and operations.
Each subclass defines how to pack a matrix into a flat 1D parameter array and how to perform linear algebra (matvec, transpose, outer product, etc.) while preserving the structural constraint (symmetry, diagonal, etc.). Representations are stateless — two instances of the same class are equal.
The embed_params/project_params methods convert between representations by walking the linear inheritance chain (e.g. Diagonal -> PositiveDefinite -> Symmetric -> Square -> Rectangular).
- abstractmethod classmethod matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Matrix-vector multiplication.
- classmethod matmat(shape: tuple[int, int], params: Array, right_rep: MatrixRep, right_shape: tuple[int, int], right_params: Array) tuple[MatrixRep, tuple[int, int], Array][source]¶
Multiply matrices, returning optimal representation type and parameters.
- abstractmethod classmethod transpose(shape: tuple[int, int], params: Array) Array[source]¶
Transform parameters to represent the transposed matrix.
- abstractmethod classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- abstractmethod classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- abstractmethod classmethod from_matrix(matrix: Array) Array[source]¶
Convert dense matrix to 1D parameters.
- abstractmethod classmethod outer_product(v1: Array, v2: Array) Array[source]¶
Construct parameters from outer product \(v_1 \otimes v_2\).
- abstractmethod classmethod map_diagonal(shape: tuple[int, int], params: Array, f: Callable[[Array], Array]) Array[source]¶
Apply function f to diagonal elements while preserving matrix structure.
- classmethod get_diagonal(shape: tuple[int, int], params: Array) Array[source]¶
Extract diagonal elements from the matrix.
- classmethod embed_params(shape: tuple[int, int], params: Array, target_rep: MatrixRep) Array[source]¶
Recursively embed params into more complex representation.
- classmethod project_params(shape: tuple[int, int], params: Array, target_rep: MatrixRep) Array[source]¶
Recursively project params to simpler representation.
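The following sketch shows what embedding and projecting mean for one pair of representations, Diagonal and a dense row-major layout. It assumes NumPy arrays and uses hypothetical free functions rather than the library's `embed_params`/`project_params`, whose chain-walking logic is not reproduced here.

```python
import numpy as np


def embed_diagonal_to_dense(diag: np.ndarray) -> np.ndarray:
    """Embed n diagonal entries into row-major dense storage of length n*n."""
    return np.diag(diag).reshape(-1)


def project_dense_to_diagonal(n: int, params: np.ndarray) -> np.ndarray:
    """Project dense storage onto its diagonal, discarding off-diagonals."""
    return params.reshape(n, n).diagonal().copy()


d = np.array([1.0, 2.0, 3.0])
dense = embed_diagonal_to_dense(d)         # length 9, zeros off the diagonal
back = project_dense_to_diagonal(3, dense)  # recovers d exactly
```

Embedding followed by projection is lossless, whereas projecting a general dense matrix drops its off-diagonal entries.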
- class Rectangular[source]¶
Bases: MatrixRep
Full \(m \times n\) matrix, stored in row-major order. No structural constraints.
- classmethod matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Matrix-vector multiplication.
- classmethod transpose(shape: tuple[int, int], params: Array) Array[source]¶
Transform parameters to represent the transposed matrix.
- classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- classmethod outer_product(v1: Array, v2: Array) Array[source]¶
Create parameters from outer product.
- classmethod map_diagonal(shape: tuple[int, int], params: Array, f: Callable[[Array], Array]) Array[source]¶
Map function over diagonal elements of matrix.
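A minimal sketch of how row-major flat storage can support these operations, assuming NumPy arrays in place of the library's Array type. The `rect_*` functions are hypothetical stand-ins that mirror the method signatures above; they are not the library's implementation.

```python
import numpy as np


def rect_to_matrix(shape: tuple[int, int], params: np.ndarray) -> np.ndarray:
    """Row-major flat parameters -> dense (m, n) matrix."""
    return params.reshape(shape)


def rect_matvec(shape: tuple[int, int], params: np.ndarray,
                vector: np.ndarray) -> np.ndarray:
    """Matrix-vector product computed from the flat parameters."""
    return params.reshape(shape) @ vector


def rect_transpose(shape: tuple[int, int], params: np.ndarray) -> np.ndarray:
    """Flat row-major parameters of the (n, m) transpose."""
    return params.reshape(shape).T.reshape(-1)


def rect_outer_product(v1: np.ndarray, v2: np.ndarray) -> np.ndarray:
    """Flat row-major parameters of the outer product v1 v2^T."""
    return np.outer(v1, v2).reshape(-1)


shape = (2, 3)
params = np.arange(6.0)                                      # [[0, 1, 2], [3, 4, 5]]
y = rect_matvec(shape, params, np.array([1.0, 0.0, -1.0]))  # array([-2., -2.])
t = rect_transpose(shape, params)                            # [0, 3, 1, 4, 2, 5]
```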
- class Square[source]¶
Bases: Rectangular
Square \(n \times n\) matrix, adding inverse, determinant, and positive-definiteness checks.
- classmethod is_positive_definite(shape: tuple[int, int], params: Array) Array[source]¶
Check if symmetric matrix is positive definite using eigenvalues.
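An eigenvalue-based check of this kind might look like the sketch below. It assumes NumPy, a symmetric input, and row-major storage; the function name is hypothetical and the library's tolerance handling (if any) is not known from this page.

```python
import numpy as np


def is_positive_definite_eig(n: int, params: np.ndarray) -> bool:
    """True if the symmetric n x n matrix has only positive eigenvalues."""
    a = params.reshape(n, n)
    # eigvalsh assumes symmetry and reads a single triangle of the input.
    eigvals = np.linalg.eigvalsh(a)
    return bool(np.all(eigvals > 0))


pd_params = np.array([2.0, 1.0, 1.0, 2.0])          # eigenvalues 1 and 3
indefinite_params = np.array([1.0, 2.0, 2.0, 1.0])  # eigenvalues -1 and 3
print(is_positive_definite_eig(2, pd_params))          # True
print(is_positive_definite_eig(2, indefinite_params))  # False
```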
Structured Matrices¶
- class Symmetric[source]¶
Bases: Square
Symmetric matrix (\(A = A^T\)), stored as upper-triangular elements. Roughly half the storage of a full square matrix.
- classmethod transpose(shape: tuple[int, int], params: Array) Array[source]¶
Symmetric matrices are self-transpose.
- classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- classmethod get_diagonal(shape: tuple[int, int], params: Array) Array[source]¶
Extract diagonal from packed upper-triangular storage in O(n).
- classmethod map_diagonal(shape: tuple[int, int], params: Array, f: Callable[[Array], Array]) Array[source]¶
Apply f to diagonal elements in packed upper-triangular storage in O(n).
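The sketch below illustrates one way packed upper-triangular storage can work, assuming the triangle is laid out row by row (the order produced by `np.triu_indices`). The actual packing order used by the library is not stated on this page, so treat the layout and the `sym_*` names as assumptions.

```python
import numpy as np


def sym_to_matrix(n: int, params: np.ndarray) -> np.ndarray:
    """Unpack n(n+1)/2 upper-triangular entries into a dense symmetric matrix."""
    rows, cols = np.triu_indices(n)
    dense = np.zeros((n, n), dtype=params.dtype)
    dense[rows, cols] = params
    dense[cols, rows] = params  # mirror across the diagonal
    return dense


def sym_from_matrix(matrix: np.ndarray) -> np.ndarray:
    """Pack the upper triangle (including the diagonal) row by row."""
    return matrix[np.triu_indices(matrix.shape[0])]


def sym_diagonal_positions(n: int) -> np.ndarray:
    """Positions of the diagonal entries inside the packed array."""
    # Row i of the triangle holds n - i entries and starts at element (i, i),
    # so its offset is sum_{k<i} (n - k) = i*n - i*(i-1)/2.
    i = np.arange(n)
    return i * n - (i * (i - 1)) // 2


def sym_get_diagonal(n: int, params: np.ndarray) -> np.ndarray:
    """Read the diagonal in O(n) without unpacking the matrix."""
    return params[sym_diagonal_positions(n)]


def sym_map_diagonal(n: int, params: np.ndarray, f) -> np.ndarray:
    """Apply f to the diagonal entries only, leaving off-diagonals untouched."""
    out = params.copy()
    idx = sym_diagonal_positions(n)
    out[idx] = f(params[idx])
    return out


packed = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
dense = sym_to_matrix(3, packed)    # [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
diag = sym_get_diagonal(3, packed)  # array([1., 4., 6.])
```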
- class PositiveDefinite[source]¶
Bases: Symmetric
Symmetric positive-definite matrix, using Cholesky decomposition for stable inverse and log-determinant.
Mathematically, a symmetric matrix \(A\) is positive definite iff \(x^T A x > 0\) for all \(x \neq 0\), or equivalently iff it has a (unique) Cholesky factorization \(A = LL^T\) with \(L\) lower triangular and strictly positive on the diagonal.
- classmethod cholesky_matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Compute the Cholesky factor and apply it to a vector or a batch of vectors.
- classmethod cholesky_whiten(shape: tuple[int, int], mean1: Array, params1: Array, mean2: Array, params2: Array) tuple[Array, Array][source]¶
Whiten a distribution (mean1, params1) with respect to another (mean2, params2).
Transforms the first Gaussian into a coordinate system in which the second Gaussian is standard normal, N(0, I). This is useful for computing KL divergences and other relative measures between Gaussians.
Perform the transformation:
- new_mean = L^(-1) @ (mean1 - mean2)
- new_params = L^(-1) @ params1 @ L^(-T)
Where L is the Cholesky factor of params2 such that L @ L.T = params2.
- classmethod is_positive_definite(shape: tuple[int, int], params: Array) Array[source]¶
Check positive definiteness via Cholesky decomposition.
- classmethod inverse(shape: tuple[int, int], params: Array) Array[source]¶
Inverse via Cholesky decomposition.
- classmethod logdet(shape: tuple[int, int], params: Array) Array[source]¶
Log determinant via Cholesky.
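The Cholesky-based operations of this class can be sketched as follows. For brevity the functions take dense matrices rather than packed parameters, use NumPy, and call the general `np.linalg.solve` where a triangular solver would be cheaper; all `chol_*` names are hypothetical.

```python
import numpy as np


def chol_is_positive_definite(a: np.ndarray) -> bool:
    """Positive definite iff the Cholesky factorization succeeds."""
    try:
        np.linalg.cholesky(a)
    except np.linalg.LinAlgError:
        return False
    return True


def chol_logdet(a: np.ndarray) -> float:
    """log det(A) = 2 * sum(log(diag(L))) for A = L L^T."""
    l = np.linalg.cholesky(a)
    return float(2.0 * np.sum(np.log(np.diag(l))))


def chol_inverse(a: np.ndarray) -> np.ndarray:
    """A^{-1} = L^{-T} L^{-1}."""
    l = np.linalg.cholesky(a)
    l_inv = np.linalg.solve(l, np.eye(a.shape[0]))
    return l_inv.T @ l_inv


def chol_whiten(mean1, cov1, mean2, cov2):
    """Express (mean1, cov1) in coordinates where (mean2, cov2) is N(0, I)."""
    l = np.linalg.cholesky(cov2)
    new_mean = np.linalg.solve(l, mean1 - mean2)  # L^{-1} (mean1 - mean2)
    left = np.linalg.solve(l, cov1)               # L^{-1} cov1
    new_cov = np.linalg.solve(l, left.T).T        # L^{-1} cov1 L^{-T}
    return new_mean, new_cov


a = np.array([[4.0, 2.0], [2.0, 3.0]])  # det = 8
mu = np.array([1.0, -1.0])
white_mean, white_cov = chol_whiten(mu, a, mu, a)  # zero mean, identity covariance
```

Whitening a Gaussian against itself yields a zero mean and the identity, which is a convenient sanity check for the formulas.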
Specialized Structures¶
- class Diagonal[source]¶
Bases: PositiveDefinite
Diagonal matrix \(A = \text{diag}(a_1, \ldots, a_n)\), storing only the \(n\) diagonal entries.
All operations reduce to element-wise arithmetic, giving \(O(n)\) storage and compute.
- classmethod is_positive_definite(shape: tuple[int, int], params: Array) Array[source]¶
Check if all diagonal elements are positive.
- classmethod matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Matrix-vector multiplication.
- classmethod transpose(shape: tuple[int, int], params: Array) Array[source]¶
Symmetric matrices are self-transpose.
- classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- classmethod inverse(shape: tuple[int, int], params: Array) Array[source]¶
Inverse via Cholesky decomposition.
- classmethod logdet(shape: tuple[int, int], params: Array) Array[source]¶
Log determinant via Cholesky.
- classmethod outer_product(v1: Array, v2: Array) Array[source]¶
Create parameters from outer product, keeping only diagonal.
- classmethod cholesky_matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Compute the Cholesky factor and apply it to a vector or a batch of vectors.
- classmethod cholesky_whiten(shape: tuple[int, int], mean1: Array, params1: Array, mean2: Array, params2: Array) tuple[Array, Array][source]¶
Whiten a distribution (mean1, params1) with respect to another (mean2, params2).
Transforms the first Gaussian into a coordinate system in which the second Gaussian is standard normal, N(0, I). This is useful for computing KL divergences and other relative measures between Gaussians.
Perform the transformation:
- new_mean = L^(-1) @ (mean1 - mean2)
- new_params = L^(-1) @ params1 @ L^(-T)
Where L is the Cholesky factor of params2 such that L @ L.T = params2.
- classmethod map_diagonal(shape: tuple[int, int], params: Array, f: Callable[[Array], Array]) Array[source]¶
Apply f to diagonal elements in packed upper-triangular storage in O(n).
- classmethod get_diagonal(shape: tuple[int, int], params: Array) Array[source]¶
Extract diagonal from packed upper-triangular storage in O(n).
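Since a diagonal matrix is described entirely by its \(n\) entries, each operation can be written as element-wise arithmetic on the parameter vector. The sketch below assumes NumPy and uses hypothetical `diag_*` names; keeping only the diagonal of the outer product follows the `outer_product` description above.

```python
import numpy as np


def diag_matvec(params: np.ndarray, vector: np.ndarray) -> np.ndarray:
    """diag(a) @ v is an element-wise product."""
    return params * vector


def diag_inverse(params: np.ndarray) -> np.ndarray:
    """Inverse of diag(a) is diag(1 / a)."""
    return 1.0 / params


def diag_logdet(params: np.ndarray) -> float:
    """log det diag(a) = sum(log(a)), valid for positive entries."""
    return float(np.sum(np.log(params)))


def diag_outer_product(v1: np.ndarray, v2: np.ndarray) -> np.ndarray:
    """Diagonal of v1 v2^T, which is the element-wise product."""
    return v1 * v2


def diag_is_positive_definite(params: np.ndarray) -> bool:
    """Positive definite iff every diagonal entry is positive."""
    return bool(np.all(params > 0))


a = np.array([1.0, 2.0, 4.0])
v = np.array([3.0, 3.0, 3.0])
y = diag_matvec(a, v)  # array([ 3.,  6., 12.])
```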
- class Scale[source]¶
Bases: Diagonal
Scalar multiple of the identity, \(A = \alpha I\). Single parameter, \(O(1)\) storage.
- classmethod is_positive_definite(shape: tuple[int, int], params: Array) Array[source]¶
Check if scale factor is positive.
- classmethod matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Matrix-vector multiplication.
- classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- classmethod logdet(shape: tuple[int, int], params: Array) Array[source]¶
Log determinant via Cholesky.
- classmethod outer_product(v1: Array, v2: Array) Array[source]¶
Average outer product to single scale parameter.
- classmethod map_diagonal(shape: tuple[int, int], params: Array, f: Callable[[Array], Array]) Array[source]¶
Apply f to diagonal elements in packed upper-triangular storage in O(n).
- classmethod get_diagonal(shape: tuple[int, int], params: Array) Array[source]¶
Extract diagonal from packed upper-triangular storage in O(n).
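For \(A = \alpha I\) the operations collapse to scalar arithmetic, as in the sketch below. It assumes NumPy, and it interprets "average outer product" as the mean of the diagonal of \(v_1 v_2^T\); that interpretation and the `scale_*` names are assumptions, not confirmed by this page.

```python
import numpy as np


def scale_matvec(alpha: float, vector: np.ndarray) -> np.ndarray:
    """(alpha * I) @ v = alpha * v."""
    return alpha * vector


def scale_logdet(n: int, alpha: float) -> float:
    """log det(alpha * I_n) = n * log(alpha), for alpha > 0."""
    return float(n * np.log(alpha))


def scale_outer_product(v1: np.ndarray, v2: np.ndarray) -> float:
    """Mean of the diagonal of v1 v2^T (assumed meaning of 'average')."""
    return float(np.dot(v1, v2) / v1.shape[0])


def scale_to_matrix(n: int, alpha: float) -> np.ndarray:
    """Dense form, useful only for checking the shortcuts above."""
    return alpha * np.eye(n)


v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([3.0, 2.0, 1.0])
alpha = scale_outer_product(v1, v2)  # (3 + 4 + 3) / 3
```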
- class Identity[source]¶
Bases: Scale
The identity matrix \(A = I\). Zero parameters; fully determined by shape.
- classmethod matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Matrix-vector multiplication.
- classmethod is_positive_definite(shape: tuple[int, int], params: Array) Array[source]¶
Identity is always positive definite.
- classmethod to_matrix(shape: tuple[int, int], params: Array) Array[source]¶
Convert 1D parameters to dense matrix form.
- classmethod num_params(shape: tuple[int, int]) int[source]¶
Shape of 1D parameter array needed for matrix dimensions.
- classmethod inverse(shape: tuple[int, int], params: Array) Array[source]¶
Inverse via Cholesky decomposition.
- classmethod logdet(shape: tuple[int, int], params: Array) Array[source]¶
Log determinant via Cholesky.
- classmethod cholesky_matvec(shape: tuple[int, int], params: Array, vector: Array) Array[source]¶
Compute the Cholesky factor and apply it to a vector or a batch of vectors.
- classmethod cholesky_whiten(shape: tuple[int, int], mean1: Array, params1: Array, mean2: Array, params2: Array) tuple[Array, Array][source]¶
Whiten a distribution (mean1, params1) with respect to another (mean2, params2).
Transforms the first Gaussian into a coordinate system in which the second Gaussian is standard normal, N(0, I). This is useful for computing KL divergences and other relative measures between Gaussians.
Perform the transformation:
- new_mean = L^(-1) @ (mean1 - mean2)
- new_params = L^(-1) @ params1 @ L^(-T)
Where L is the Cholesky factor of params2 such that L @ L.T = params2.
- classmethod get_diagonal(shape: tuple[int, int], params: Array) Array[source]¶
Extract diagonal from packed upper-triangular storage in O(n).