Linear Gaussian Models¶
This module provides implementations of linear Gaussian models (LGMs), including factor analysis and principal component analysis. LGMs model linear, Gaussian relationships between observable and latent variables. The conjugacy of LGMs enables exact inference and EM.
Class Hierarchy¶
Generic LGM¶
- class LGM(obs_dim: int, obs_rep: ObsRep)[source]¶
Bases:
DifferentiableConjugated[Normal,PostGaussian,PriorGaussian],ABC,Generic
A linear Gaussian model (LGM) implemented as a harmonium with Gaussian latent variables.
Linear Gaussian Models represent a joint distribution over observable variables \(X\) and latent variables \(Z\) where both are Gaussian and the relationship between them is linear. In generative terms, this can be viewed as:
\[x = Az + \mu + \epsilon\]
where:
\(z\) is drawn from a multivariate normal (typically a standard normal),
\(A\) is the loading matrix mapping latent to observable space,
\(\mu\) is the observable bias term, and
\(\epsilon \sim \mathcal{N}(0, \Sigma)\) is Gaussian noise.
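The generative description above can be sketched in plain NumPy. This is only an illustration under assumed toy parameters: the names `loading`, `bias`, and `noise_cov` are hypothetical and the snippet does not use this module's API. It draws samples and compares their moments with the implied observable marginal \(\mathcal{N}(\mu, AA^\top + \Sigma)\).

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, lat_dim, n_samples = 3, 2, 200_000

# Hypothetical toy parameters (not taken from the library).
loading = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, -1.0]])  # A
bias = np.array([1.0, -2.0, 0.5])                           # mu
noise_cov = np.diag([0.3, 0.2, 0.1])                        # Sigma

# z ~ N(0, I), eps ~ N(0, Sigma), x = A z + mu + eps
z = rng.standard_normal((n_samples, lat_dim))
eps = rng.multivariate_normal(np.zeros(obs_dim), noise_cov, size=n_samples)
x = z @ loading.T + bias + eps

# Marginally x ~ N(mu, A A^T + Sigma); the sample moments should be close.
marginal_cov = loading @ loading.T + noise_cov
sample_mean = x.mean(axis=0)
sample_cov = np.cov(x, rowvar=False)
```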
Posterior vs Prior Structure: The posterior latent distribution (conditioned on observables) uses the PostGaussian parameterization, which may employ a restricted covariance structure (e.g., diagonal) for computational efficiency during frequent inference. The prior latent distribution uses the PriorGaussian parameterization, whose shape is dictated by the conjugation parameters. When PostGaussian is more restricted than PriorGaussian, the prior is constructed by embedding the restricted posterior covariance structure into the fuller prior structure, ensuring compatibility with the required conjugation parameter computation.
As a harmonium, the joint distribution takes the form
\[p(x,z) \propto \exp(\theta_X \cdot s_X(x) + \theta_Z \cdot s_Z(z) + x \cdot \Theta^m_{XZ} \cdot z),\]
where
\(s_X(x) = (x, \text{tril}(x \otimes x))\) is the sufficient statistic of the observable normal,
\(s_Z(z) = (z, \text{tril}(z \otimes z))\) is the sufficient statistic of the latent normal, and
\(\Theta^m_{XZ}\) are the first-order interaction terms between \(X\) and \(Z\).
The conjugation parameters are \(\rho = (\rho^m, P^{\sigma})\), where \(\Theta^m_{ZX}\) is the transpose of \(\Theta^m_{XZ}\) and
\(\rho^m = -\frac{1}{2} \Theta^m_{ZX} \cdot {\Theta_X^{\sigma}}^{-1} \cdot \theta^m_X\)
\(P^{\sigma} = -\frac{1}{4} \Theta^m_{ZX} \cdot {\Theta_X^{\sigma}}^{-1} \cdot \Theta^m_{XZ}\)
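As a numerical check of these formulas, the sketch below evaluates \(\rho^m\) and \(P^{\sigma}\) with NumPy and verifies the conjugation identity \(\psi_X(\theta_X + \Theta^m_{XZ} z) - \psi_X(\theta_X) = \rho^m \cdot z + z^\top P^{\sigma} z\). It assumes the natural coordinates \(\theta^m_X = \Sigma^{-1}\mu\) and \(\Theta^{\sigma}_X = -\frac{1}{2}\Sigma^{-1}\), which is a guess at the convention consistent with the factors above, and it stores the shape parameters as full matrices rather than in the library's own layout.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, lat_dim = 3, 2

# Assumed convention: theta^m = Sigma^{-1} mu and Theta^sigma = -1/2 Sigma^{-1}.
cov = np.array([[1.0, 0.2, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 0.5]])
mu = np.array([0.5, -1.0, 2.0])
theta_m = np.linalg.solve(cov, mu)
theta_sigma = -0.5 * np.linalg.inv(cov)
theta_xz = rng.standard_normal((obs_dim, lat_dim))  # interaction Theta_XZ

def log_partition(loc, prec):
    """Normal log-partition in the assumed coordinates, up to an additive constant."""
    _, logdet = np.linalg.slogdet(-2.0 * prec)
    return -0.25 * loc @ np.linalg.solve(prec, loc) - 0.5 * logdet

# Conjugation parameters; Theta_ZX is the transpose of Theta_XZ.
rho_m = -0.5 * theta_xz.T @ np.linalg.solve(theta_sigma, theta_m)
p_sigma = -0.25 * theta_xz.T @ np.linalg.solve(theta_sigma, theta_xz)

# Compare both sides of the conjugation identity at a random latent point.
z = rng.standard_normal(lat_dim)
lhs = log_partition(theta_m + theta_xz @ z, theta_sigma) - log_partition(theta_m, theta_sigma)
rhs = rho_m @ z + z @ p_sigma @ z
```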
- obs_rep: ObsRep¶
Covariance structure of the observable variables.
- property obs_man: Normal[ObsRep]¶
Override to construct directly from fields, avoiding circular dependency.
- property int_obs_emb: GeneralizedGaussianLocationEmbedding[Normal[ObsRep]]¶
- property int_pst_emb: LinearEmbedding[Euclidean, PostGaussian]¶
Embedding of the Euclidean location into the posterior latent manifold; this is general for all GeneralizedGaussians.
Normal LGM¶
- class NormalLGM(obs_dim: int, obs_rep: ObsRep, lat_dim: int, pst_rep: PstRep)[source]¶
Bases:
LGM[ObsRep,Normal,FullNormal],Generic
Differentiable Linear Gaussian Model with Normal latent variables.
Extends the abstract LGM with Normal-specific implementations for computing observable distributions and converting to joint Normal form.
- pst_rep: PstRep¶
- property pst_man: Normal[PstRep]¶
Override to construct directly from fields, avoiding circular dependency.
- property pst_prr_emb: NormalCovarianceEmbedding[PstRep, PositiveDefinite]¶
Embedding of posterior Normal into prior Normal via covariance structure.
- observable_distribution(params: Array) tuple[FullNormal, Array][source]¶
Returns the marginal normal distribution over observable variables.
- whiten_prior(means: Array) Array[source]¶
Reparameterize so latent prior is N(0,I) while preserving the observable marginal.
In mean coordinates:
- obs_means: unchanged (the observable marginal \(E[s_X(x)]\) is preserved),
- lat_means: set to standard_normal() (the mean coordinates of \(\mathcal{N}(0, I)\)),
- int_means: updated to \(WL\), where \(W \Sigma_z = E[x \otimes z] - E[x] \otimes E[z]\) and \(L = \text{chol}(\Sigma_z)\).
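The effect of whitening can be illustrated in the conventional (loading, bias, covariance) parameterization rather than in the library's mean coordinates. The sketch below is an assumption-laden toy example: all names are hypothetical, and the loading \(A\) plays the role of \(W\), since \(\text{Cov}(x, z) = A \Sigma_z\). It checks that replacing \(A\) with \(AL\) and absorbing the latent mean into the bias leaves the observable marginal unchanged.

```python
import numpy as np

# Hypothetical toy model: x = A z + mu + eps, z ~ N(m, Sigma_z), eps ~ N(0, Sigma).
loading = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, -1.0]])   # A
bias = np.array([1.0, -2.0, 0.5])                            # mu
noise_cov = np.diag([0.3, 0.2, 0.1])                         # Sigma
lat_mean = np.array([0.5, -1.0])                             # m
lat_cov = np.array([[2.0, 0.3], [0.3, 0.5]])                 # Sigma_z

def observable_marginal(loading, bias, noise_cov, lat_mean, lat_cov):
    """Mean and covariance of x after integrating out z."""
    mean = bias + loading @ lat_mean
    cov = loading @ lat_cov @ loading.T + noise_cov
    return mean, cov

# Whitening: write z = m + L z' with z' ~ N(0, I) and L = chol(Sigma_z),
# so the loading becomes A L and the latent mean is absorbed into the bias.
chol = np.linalg.cholesky(lat_cov)
white_loading = loading @ chol
white_bias = bias + loading @ lat_mean

before = observable_marginal(loading, bias, noise_cov, lat_mean, lat_cov)
after = observable_marginal(white_loading, white_bias, noise_cov,
                            np.zeros(2), np.eye(2))
```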
- class NormalAnalyticLGM(obs_dim: int, obs_rep: ObsRep, lat_dim: int)[source]¶
Bases:
AnalyticConjugated[Normal,FullNormal],NormalLGM[ObsRep,PositiveDefinite],Generic
Analytic Linear Gaussian Model that extends the differentiable LGM with full analytical tractability, adding conversions between mean and natural coordinates, and a closed-form implementation of EM.
- property lat_man: FullNormal¶
The latent manifold is a full Normal distribution.
- property pst_prr_emb: NormalCovarianceEmbedding[PositiveDefinite, PositiveDefinite]¶
Embedding of posterior Normal into prior Normal via covariance structure.
Boltzmann LGM¶
- class BoltzmannLGM(obs_dim: int, obs_rep: ObsRep, lat_dim: int)[source]¶
Bases:
SymmetricConjugated[Normal,Boltzmann],LGM[ObsRep,Boltzmann,Boltzmann],Generic
Differentiable Linear Gaussian Model with Boltzmann latent variables.
This model combines a Normal observable distribution with Boltzmann (binary) latent variables. The latent states are discrete binary vectors, making this suitable for discrete representation learning and binary latent factor models.
The observable distribution remains Gaussian (continuous), while the latent distribution is a Boltzmann machine (discrete binary). This enables learning discrete latent representations of continuous data.
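To make the combination of Gaussian observables and binary latents concrete, the following sketch computes the posterior over latent states by exhaustive enumeration for a two-unit toy model. Everything here is an assumption for illustration: the parameter names are hypothetical, the parameterization is the conventional one rather than the library's, and enumeration is only feasible for very small lat_dim.

```python
import itertools
import numpy as np

# Hypothetical toy parameters (not the library's parameterization).
loading = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # A
bias = np.array([0.5, -0.5, 0.0])                          # mu
noise_var = 0.01                                           # Sigma = noise_var * I
lat_bias = np.array([0.2, -0.1])                           # Boltzmann biases
lat_coupling = 0.3                                         # weight on z_0 * z_1

def latent_posterior(x):
    """Posterior over all binary latent states, by exhaustive enumeration."""
    states = np.array(list(itertools.product([0, 1], repeat=2)), dtype=float)
    log_prior = states @ lat_bias + lat_coupling * states[:, 0] * states[:, 1]
    resid = x - (states @ loading.T + bias)
    log_lik = -0.5 * np.sum(resid**2, axis=1) / noise_var
    log_post = log_prior + log_lik
    log_post -= log_post.max()          # stabilise before exponentiating
    probs = np.exp(log_post)
    return states, probs / probs.sum()

z_true = np.array([1.0, 0.0])
x = loading @ z_true + bias             # noiseless observation, for clarity
states, probs = latent_posterior(x)
map_state = states[np.argmax(probs)]
```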
- property pst_man: Boltzmann¶
Override to construct directly from fields, avoiding circular dependency.
- property pst_prr_emb: LinearEmbedding[Boltzmann, Boltzmann]¶
Embedding of posterior Boltzmann into prior Boltzmann.
For Boltzmann machines, the posterior and prior share the same manifold structure (there is no covariance simplification, unlike in the Normal case), so we use the identity embedding.
- property lat_man: Boltzmann¶
The latent manifold is a Boltzmann machine.
Embeddings¶
- class GeneralizedGaussianLocationEmbedding(gau_man: G)[source]¶
Bases:
LinearEmbedding[Euclidean,G],Generic
Embedding of the Euclidean location component into a GeneralizedGaussian distribution.
Projects a GeneralizedGaussian point in mean coordinates to its Euclidean location component, or embeds a location vector in natural coordinates into a GeneralizedGaussian with zero shape parameters.
- gau_man: G¶
The GeneralizedGaussian distribution.
- property amb_man: G¶
The ambient manifold.
- project(means: Array) Array[source]¶
Project to Euclidean location component.
Works on mean coordinates, extracting the location component from the full generalized Gaussian parameterization. If given a data point (size == data_dim), converts it to sufficient statistics (mean coordinates) first.
- Parameters:
means (Array) – Mean coordinate parameters or data point in GeneralizedGaussian space.
- Returns:
Mean parameters in Euclidean space (location only).
- Return type:
Array
- embed(params: Array) Array[source]¶
Embed Euclidean location into GeneralizedGaussian with zero shape.
- Parameters:
params (Array) – Natural parameters in Euclidean space.
- Returns:
Natural parameters in GeneralizedGaussian space.
- Return type:
Array
- translate(params: Array, delta: Array) Array[source]¶
Translate by adding Euclidean offset to location.
- Parameters:
params (Array) – Natural parameters in GeneralizedGaussian space.
delta (Array) – Euclidean offset to add.
- Returns:
Translated natural parameters in GeneralizedGaussian space.
- Return type:
Array
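The three operations of this embedding can be sketched on flat arrays. The layout used here, with the location block first followed by a lower-triangular shape block, is an assumption made for illustration and may differ from the library's storage; the functions below are stand-ins, not the class methods themselves. The final check shows that embed and project are adjoint with respect to the dot product.

```python
import numpy as np

dim = 3
n_shape = dim * (dim + 1) // 2   # assumed: lower-triangular shape block

def embed(location):
    """Natural coordinates: pad the location with zero shape parameters."""
    return np.concatenate([location, np.zeros(n_shape)])

def project(means):
    """Mean coordinates: keep only the location block."""
    return means[:dim]

def translate(params, delta):
    """Add a Euclidean offset to the location block only."""
    return params + embed(delta)

rng = np.random.default_rng(0)
location = rng.standard_normal(dim)
means = rng.standard_normal(dim + n_shape)
params = rng.standard_normal(dim + n_shape)
shifted = translate(params, location)

# Adjointness: <embed(location), means> equals <location, project(means)>.
pairing_ambient = embed(location) @ means
pairing_euclidean = location @ project(means)
```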
- class NormalCovarianceEmbedding(_sub_man: Normal[SubRep], _amb_man: Normal[AmbRep])[source]¶
Bases:
LinearEmbedding[Normal,Normal],Generic
Embedding of a normal distribution with a simpler covariance structure into a more complex one.
- project(means: Array) Array[source]¶
Project from ambient to sub-manifold representation.
- Parameters:
means (Array) – Mean parameters in ambient manifold.
- Returns:
Mean parameters in sub-manifold.
- Return type:
Array
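As a rough illustration for the case of a diagonal sub-manifold inside a full-covariance ambient manifold, the sketch below treats only the shape block and stores it as a full matrix. This is an assumed representation, not the library's packed layout, and the location block is taken to pass through unchanged. Projection keeps the diagonal of the second moments, embedding pads the natural parameters with zero off-diagonal entries, and the two maps are adjoint under the Frobenius inner product.

```python
import numpy as np

def project_shape(second_moment):
    """Ambient (full) mean coordinates to diagonal sub-manifold: keep the diagonal."""
    return np.diag(second_moment).copy()

def embed_shape(diag_precision):
    """Diagonal natural coordinates to a full matrix with zero off-diagonal entries."""
    return np.diag(diag_precision)

rng = np.random.default_rng(0)
samples = rng.standard_normal((500, 3))
second_moment = samples.T @ samples / len(samples)   # estimate of E[x x^T]
diag_precision = -0.5 / np.array([1.0, 2.0, 0.5])    # -1/2 * 1/variance

# The two maps are adjoint under the Frobenius inner product.
lhs = np.sum(embed_shape(diag_precision) * second_moment)
rhs = diag_precision @ project_shape(second_moment)
```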
Specializations¶
- class FactorAnalysis(obs_dim: int, lat_dim: int)[source]¶
Bases:
NormalAnalyticLGM[Diagonal]
A factor analysis model with Gaussian latent variables.
- expectation_maximization(params: Array, xs: Array) Array[source]¶
Perform a single iteration of the EM algorithm.
Without further constraints the latent Normal of FA is not identifiable, and so we hold it fixed at standard normal.
- Parameters:
params (Array) – Current natural parameters.
xs (Array) – Observation data.
- Returns:
Updated natural parameters.
- Return type:
Array
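For orientation, a single EM iteration for factor analysis with the latent prior held at a standard normal can be sketched in the conventional (loadings, diagonal noise) parameterization. This is the textbook update written in NumPy, not the library's implementation, which operates on natural parameters; the function names are hypothetical and the observation mean is fixed at the sample mean.

```python
import numpy as np

def fa_em_step(loadings, diags, xs):
    """One textbook EM step for factor analysis with z ~ N(0, I)."""
    n = xs.shape[0]
    lat_dim = loadings.shape[1]
    xc = xs - xs.mean(axis=0)
    # E-step: posterior covariance and means of z given each x.
    scaled = loadings / diags[:, None]                      # Psi^{-1} A
    post_cov = np.linalg.inv(np.eye(lat_dim) + loadings.T @ scaled)
    post_means = xc @ scaled @ post_cov                     # shape (n, lat_dim)
    sum_zz = n * post_cov + post_means.T @ post_means       # sum of E[z z^T | x]
    # M-step: regress x on the expected latents, then refit the noise.
    sum_xz = xc.T @ post_means                              # shape (obs_dim, lat_dim)
    new_loadings = np.linalg.solve(sum_zz, sum_xz.T).T
    new_diags = (np.sum(xc**2, axis=0) - np.sum(new_loadings * sum_xz, axis=1)) / n
    return new_loadings, new_diags

def fa_log_likelihood(loadings, diags, xs):
    """Log-likelihood of xs under N(sample mean, A A^T + diag(psi))."""
    n, d = xs.shape
    xc = xs - xs.mean(axis=0)
    cov = loadings @ loadings.T + np.diag(diags)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.sum(xc * np.linalg.solve(cov, xc.T).T)
    return -0.5 * (n * d * np.log(2 * np.pi) + n * logdet + quad)

rng = np.random.default_rng(0)
true_loadings = rng.standard_normal((5, 2))
xs = (rng.standard_normal((2000, 2)) @ true_loadings.T
      + rng.standard_normal((2000, 5)) * np.sqrt(0.5) + 1.0)

loadings, diags = rng.standard_normal((5, 2)), np.ones(5)
log_likelihoods = []
for _ in range(20):
    log_likelihoods.append(fa_log_likelihood(loadings, diags, xs))
    loadings, diags = fa_em_step(loadings, diags, xs)
```

Each iteration should not decrease the log-likelihood, which is a convenient sanity check when experimenting with variants of the update.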
- from_loadings(loadings: Array, means: Array, diags: Array) Array[source]¶
Convert standard factor analysis parameters to natural parameters.
- Parameters:
loadings (Array) – Loading matrix (obs_dim, lat_dim).
means (Array) – Observation means.
diags (Array) – Diagonal noise variances.
- Returns:
Natural parameters for the factor analysis model.
- Return type:
Array
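One way to see what such a conversion involves is to expand \(\log p(x \mid z) + \log p(z)\) for \(x \mid z \sim \mathcal{N}(Az + \mu, \Psi)\) and \(z \sim \mathcal{N}(0, I)\) and read off the harmonium terms. The sketch below does this with NumPy under the assumed convention that shape parameters are \(-\frac{1}{2}\) times a precision matrix; the function name and the tuple it returns are hypothetical and do not reflect the flat array that from_loadings returns. As a check, adding the conjugation parameters to the latent biases recovers the natural parameters of a standard normal prior.

```python
import numpy as np

def harmonium_from_loadings(loadings, means, diags):
    """Harmonium terms for z ~ N(0, I) and x | z ~ N(A z + mu, diag(psi))."""
    scaled = loadings / diags[:, None]                 # Psi^{-1} A
    obs_loc = means / diags                            # theta^m_X = Psi^{-1} mu
    obs_prec = -0.5 * np.diag(1.0 / diags)             # Theta^sigma_X = -1/2 Psi^{-1}
    interaction = scaled                               # Theta_XZ = Psi^{-1} A
    lat_loc = -scaled.T @ means                        # theta^m_Z = -A^T Psi^{-1} mu
    lat_prec = -0.5 * (np.eye(loadings.shape[1]) + loadings.T @ scaled)
    return obs_loc, obs_prec, interaction, lat_loc, lat_prec

loadings = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, -1.0]])
means = np.array([1.0, -2.0, 0.5])
diags = np.array([0.3, 0.2, 0.1])
obs_loc, obs_prec, interaction, lat_loc, lat_prec = harmonium_from_loadings(
    loadings, means, diags)

# Conjugation parameters, following the formulas in the LGM class description.
rho_m = -0.5 * interaction.T @ np.linalg.solve(obs_prec, obs_loc)
p_sigma = -0.25 * interaction.T @ np.linalg.solve(obs_prec, interaction)

# Prior natural parameters: latent biases plus conjugation parameters.
prior_loc = lat_loc + rho_m
prior_prec = lat_prec + p_sigma
```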
- class PrincipalComponentAnalysis(obs_dim: int, lat_dim: int)[source]¶
Bases:
NormalAnalyticLGM[Scale]
A principal component analysis model with Gaussian latent variables.
- expectation_maximization(params: Array, xs: Array) Array[source]¶
Perform a single iteration of the EM algorithm.
Without further constraints the latent Normal of PCA is not identifiable, and so we hold it fixed at standard normal.
- Parameters:
params (Array) – Current natural parameters.
xs (Array) – Observation data.
- Returns:
Updated natural parameters.
- Return type:
Array
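The Scale representation corresponds to isotropic observation noise, so the analogous textbook update is the EM step for probabilistic PCA. The sketch below is written in the conventional (loadings, scalar noise variance) parameterization with hypothetical function names; it is not the library's natural-parameter implementation. The only change from the diagonal-noise case is that a single variance is pooled across all observable dimensions.

```python
import numpy as np

def ppca_em_step(loadings, noise_var, xs):
    """One textbook EM step for probabilistic PCA with z ~ N(0, I)."""
    n, obs_dim = xs.shape
    lat_dim = loadings.shape[1]
    xc = xs - xs.mean(axis=0)
    # E-step: posterior covariance and means of z given each x.
    post_cov = noise_var * np.linalg.inv(
        noise_var * np.eye(lat_dim) + loadings.T @ loadings)
    post_means = xc @ loadings @ post_cov / noise_var
    sum_zz = n * post_cov + post_means.T @ post_means
    # M-step: update the loadings, then pool the residual into one variance.
    sum_xz = xc.T @ post_means
    new_loadings = np.linalg.solve(sum_zz, sum_xz.T).T
    new_noise_var = (np.sum(xc**2) - np.sum(new_loadings * sum_xz)) / (n * obs_dim)
    return new_loadings, new_noise_var

def ppca_log_likelihood(loadings, noise_var, xs):
    """Log-likelihood of xs under N(sample mean, A A^T + noise_var * I)."""
    n, d = xs.shape
    xc = xs - xs.mean(axis=0)
    cov = loadings @ loadings.T + noise_var * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.sum(xc * np.linalg.solve(cov, xc.T).T)
    return -0.5 * (n * d * np.log(2 * np.pi) + n * logdet + quad)

rng = np.random.default_rng(1)
true_loadings = rng.standard_normal((5, 2))
xs = (rng.standard_normal((2000, 2)) @ true_loadings.T
      + rng.standard_normal((2000, 5)) * np.sqrt(0.3))

loadings, noise_var = rng.standard_normal((5, 2)), 1.0
log_likelihoods = []
for _ in range(20):
    log_likelihoods.append(ppca_log_likelihood(loadings, noise_var, xs))
    loadings, noise_var = ppca_em_step(loadings, noise_var, xs)
```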