Poisson Distributions

Poisson distributions as exponential families.

This module provides two count models in the exponential family framework:

  1. Base Distributions:

     • Poisson: Standard Poisson distribution for count data

     • CoMPoisson: Conway-Maxwell-Poisson distribution for flexible dispersion

  2. Components:

     • CoMShape: Shape component for the COM-Poisson distribution

The Poisson distribution models count data with a single rate parameter, where the mean equals the variance. The Conway-Maxwell-Poisson extends this with a dispersion parameter, allowing for both over- and under-dispersed count data.

Class Hierarchy

Inheritance diagram of goal.models.base.poisson

Poisson

class Poisson[source]

Bases: Analytic

The Poisson distribution models counts and is defined by a single rate parameter \(\eta > 0\). The probability mass function at count \(k \in \mathbb{N}\) is given by

\[p(k; \eta) = \frac{\eta^k e^{-\eta}}{k!}\]

As an exponential family:

  • Natural parameter: \(\theta = \log(\eta)\)

  • Probability mass function: \(p(k; \theta) = e^{\theta k - \log(k!) - e^{\theta}}\)

  • Sufficient statistic: \(s(x) = x\)

  • Base measure: \(\mu(k) = 1/k!\), so that \(\log \mu(k) = -\log(k!)\)

  • Log-partition function: \(\psi(\theta) = e^{\theta}\)

  • Negative entropy: \(\phi(\eta) = \eta\log(\eta) - \eta\)

Properties:

  • Mean = Variance = \(\eta\)

  • Mode = \(\lfloor \eta \rfloor\)
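The exponential family form above can be checked numerically. The following is a minimal standalone sketch using only the Python standard library; the function names are illustrative and are not the methods of the `Poisson` class. It evaluates \(\log p(k; \theta) = \theta k + \log \mu(k) - \psi(\theta)\) and compares the result with the textbook formula \(\eta^k e^{-\eta} / k!\).

```python
import math


def log_base_measure(k: int) -> float:
    # log mu(k) = -log(k!), computed via lgamma to avoid overflow for large k
    return -math.lgamma(k + 1)


def log_partition(theta: float) -> float:
    # psi(theta) = exp(theta)
    return math.exp(theta)


def poisson_log_pmf(k: int, theta: float) -> float:
    # log p(k; theta) = theta * s(k) + log mu(k) - psi(theta), with s(k) = k
    return theta * k + log_base_measure(k) - log_partition(theta)


rate = 3.5
theta = math.log(rate)  # natural parameter
k = 4

direct = rate**k * math.exp(-rate) / math.factorial(k)
via_family = math.exp(poisson_log_pmf(k, theta))
print(direct, via_family)

# The probabilities should sum to (almost exactly) one.
total = sum(math.exp(poisson_log_pmf(j, theta)) for j in range(100))
print(total)
```

Working in log space with `lgamma` keeps the computation stable for large counts, where \(k!\) would overflow a float.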

property dim: int

Single rate parameter.

property data_dim: int

Dimension of the data space.

sufficient_statistic(x: Array) Array[source]

Compute the sufficient statistic \(\mathbf{s}(x)\) of an observation.

log_base_measure(x: Array) Array[source]

Compute \(\log \mu(x)\) for an observation.

log_partition_function(params: Array) Array[source]

Compute the log-partition function \(\psi\) at the given natural parameters.

negative_entropy(means: Array) Array[source]

Compute negative entropy \(\phi\) at the given mean parameters.
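The log-partition function and the negative entropy are Legendre duals: at matching parameters \(\theta = \log(\eta)\) they satisfy \(\psi(\theta) + \phi(\eta) = \theta\eta\), and the derivative of \(\psi\) recovers the mean parameter. A small standalone sketch of this relationship follows; the functions are plain scalar stand-ins, not the library methods.

```python
import math


def log_partition(theta: float) -> float:
    # psi(theta) = exp(theta)
    return math.exp(theta)


def negative_entropy(eta: float) -> float:
    # phi(eta) = eta * log(eta) - eta
    return eta * math.log(eta) - eta


eta = 2.7              # mean parameter (the rate)
theta = math.log(eta)  # matching natural parameter

# Legendre identity at dual points: psi(theta) + phi(eta) = theta * eta
lhs = log_partition(theta) + negative_entropy(eta)
rhs = theta * eta
print(lhs, rhs)

# The gradient of psi maps natural to mean parameters (central difference).
h = 1e-6
dpsi = (log_partition(theta + h) - log_partition(theta - h)) / (2 * h)
print(dpsi, eta)
```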

sample(key: Array, params: Array, n: int = 1) Array[source]

Draw n samples from the distribution with the given natural parameters.

statistical_mean(params: Array) Array[source]
statistical_covariance(params: Array) Array[source]
initialize_from_sample(key: Array, sample: Array, location: float = 0.0, shape: float = 0.1) Array[source]

Initialize Poisson parameters from sample data.

Handles the case where the sample mean is 0 (all observations are 0) by clipping the mean to a small positive value before converting to natural parameters, since \(\log(0)\) is undefined.

Parameters:
  • key – Random key

  • sample – Sample data (count values)

  • location – Mean of noise distribution

  • shape – Std dev of noise distribution

Returns:

Natural parameters (log rate).
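The initialization described above might look roughly like the sketch below. This is an assumption-laden illustration, not the library's implementation: the function name `init_poisson_natural`, the clipping threshold `min_mean`, the use of Python's `random` module in place of a JAX key, and the choice to add the noise to the natural parameter are all hypothetical.

```python
import math
import random


def init_poisson_natural(sample, location=0.0, shape=0.1, min_mean=1e-6, rng=None):
    """Return a noisy log-rate estimated from count data (illustrative only)."""
    rng = rng or random.Random(0)  # stand-in for the random key
    mean = sum(sample) / len(sample)
    # Clip so that an all-zero sample does not produce log(0) = -inf.
    clipped = max(mean, min_mean)
    # Assumed: Gaussian noise with mean `location` and std dev `shape`.
    noise = rng.gauss(location, shape)
    return math.log(clipped) + noise


print(init_poisson_natural([0, 0, 0]))           # finite despite zero mean
print(init_poisson_natural([2, 4, 6], shape=0))  # no noise: log of the sample mean
```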

Conway-Maxwell-Poisson

class CoMShape[source]

Bases: ExponentialFamily

Shape component of a CoMPoisson distribution. This represents the dispersion structure with sufficient statistic log(x!). It captures deviations from the standard Poisson variance-mean relationship.

The dispersion parameter \(\nu\) controls whether the distribution is:

  • Equidispersed (\(\nu = 1\)): Variance = Mean (standard Poisson)

  • Underdispersed (\(\nu > 1\)): Variance < Mean

  • Overdispersed (\(\nu < 1\)): Variance > Mean

property dim: int

The dimension of the manifold.

property data_dim: int

Dimension of the data space.

sufficient_statistic(x: Array) Array[source]

Compute the sufficient statistic \(\mathbf{s}(x)\) of an observation.

log_base_measure(x: Array) Array[source]

Compute \(\log \mu(x)\) for an observation.

class CoMPoisson(window_size: int = 200)[source]

Bases: LocationShape[Poisson, CoMShape], Differentiable

The Conway-Maxwell Poisson distribution is a generalization of the Poisson distribution that can model both over- and under-dispersed count data. Its probability mass function is:

\[p(x; \mu, \nu) = \frac{1}{Z(\mu, \nu)} \left(\frac{\mu^x}{x!}\right)^\nu\]

where:

  • \(\mu > 0\) determines the mode of the distribution, \(\lfloor \mu \rfloor\)

  • \(\nu > 0\) is the dispersion parameter (pseudo-precision)

  • \(Z(\mu, \nu)\) is the normalizing constant:

    \[Z(\mu, \nu) = \sum_{j=0}^{\infty} \left(\frac{\mu^j}{j!}\right)^\nu\]

Special cases:

  • When \(\nu = 1\): Standard Poisson distribution

  • When \(\nu < 1\): Over-dispersed (variance > mean)

  • When \(\nu > 1\): Under-dispersed (variance < mean)

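  • When \(\nu \to \infty\) (with the natural parameter \(\theta_1 = \nu\log(\mu)\) held fixed): Bernoulli distribution

  • In the limit \(\nu \to 0\) (with \(\theta_1 < 0\) held fixed): Geometric distribution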

As an exponential family:

  • Natural parameters: \(\theta_1 = \nu\log(\mu)\), \(\theta_2 = -\nu\)

  • Sufficient statistics: \(s(x) = (x, \log(x!))\)

  • Log-partition function: log of the normalizing constant \(Z\)
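To illustrate the exponential family form, the sketch below evaluates \(\log p(x) = \theta_1 x + \theta_2 \log(x!) - \psi(\theta)\) with a truncated series for \(\psi\), and checks that \(\nu = 1\) recovers the standard Poisson distribution. It is a standalone approximation with illustrative function names; for simplicity the series here starts at zero rather than being centered on the mode, which is adequate only for moderate \(\mu\).

```python
import math


def com_log_partition(theta1: float, theta2: float, n_terms: int = 200) -> float:
    # Truncated log-sum-exp over j = 0 .. n_terms - 1
    logs = [theta1 * j + theta2 * math.lgamma(j + 1) for j in range(n_terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def com_log_pmf(x: int, theta1: float, theta2: float) -> float:
    # Sufficient statistics are (x, log(x!)); lgamma(x + 1) = log(x!)
    return theta1 * x + theta2 * math.lgamma(x + 1) - com_log_partition(theta1, theta2)


mu = 4.0
theta1, theta2 = math.log(mu), -1.0  # nu = 1: should match Poisson(mu)
x = 3

com = math.exp(com_log_pmf(x, theta1, theta2))
poisson = mu**x * math.exp(-mu) / math.factorial(x)
print(com, poisson)
```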

window_size: int = 200

Fixed number of terms to evaluate in series expansions.

split_mode_dispersion(params: Array) tuple[Array, Array][source]

Convert from natural parameters to mode-shape parameters.

The COM-Poisson distribution can be parameterized by either natural parameters \((\theta_1, \theta_2)\) or by mode-shape parameters \((\mu, \nu)\). The conversion is given by:

\[\nu = -\theta_2\]
\[\mu = \exp(-\theta_1/\theta_2)\]

join_mode_dispersion(mu: Array, nu: Array) Array[source]

Convert from mode-shape parameters to natural parameters.

The COM-Poisson distribution can be parameterized by either mode-shape parameters \((\mu, \nu)\) or natural parameters \((\theta_1, \theta_2)\). The conversion is given by:

  • \(\theta_1 = \nu\log(\mu)\)

  • \(\theta_2 = -\nu\)
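The two conversions are inverses of each other. A scalar sketch of the round trip follows, using the formulas given above; `to_natural` and `to_mode_dispersion` are hypothetical names chosen for this example, and the real methods operate on arrays.

```python
import math


def to_natural(mu: float, nu: float) -> tuple[float, float]:
    # theta_1 = nu * log(mu), theta_2 = -nu
    return nu * math.log(mu), -nu


def to_mode_dispersion(theta1: float, theta2: float) -> tuple[float, float]:
    # nu = -theta_2, mu = exp(-theta_1 / theta_2)
    return math.exp(-theta1 / theta2), -theta2


mu, nu = 7.5, 0.8
theta1, theta2 = to_natural(mu, nu)
mu_back, nu_back = to_mode_dispersion(theta1, theta2)
print((theta1, theta2), (mu_back, nu_back))
```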

approximate_mean_variance(params: Array) tuple[Array, Array][source]

Compute approximate mean and variance of COM-Poisson distribution.

Given the mode \(\mu\) and shape \(\nu\) parameters, the approximations are:

\[E(X) \approx \mu + \frac{1}{2\nu} - \frac{1}{2}\]
\[\text{Var}(X) \approx \frac{\mu}{\nu}\]

numerical_mean_variance(params: Array) tuple[Array, Array][source]

Compute mean and variance by numerical summation.

Uses a window-based approach centered on the mode to approximate:

\[E[X] = \sum_{x=0}^\infty x p(x)\]
\[E[X^2] = \sum_{x=0}^\infty x^2 p(x)\]
\[\text{Var}(X) = E[X^2] - E[X]^2\]
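A rough standalone sketch of such a windowed computation is given below. It assumes the mode is \(\lfloor \mu \rfloor\) and that the window covers `window_size` consecutive counts around it, clipped at zero; the library's exact window placement may differ. The loop compares the numerical values with the closed-form approximations \(\mu + 1/(2\nu) - 1/2\) and \(\mu/\nu\) for an over-dispersed, an equidispersed, and an under-dispersed case.

```python
import math


def com_mean_variance(mu: float, nu: float, window_size: int = 200):
    theta1, theta2 = nu * math.log(mu), -nu
    mode = math.floor(mu)
    start = max(0, mode - window_size // 2)  # window cannot go below zero
    xs = list(range(start, start + window_size))
    # Unnormalized log-probabilities, shifted by their maximum for stability
    logs = [theta1 * x + theta2 * math.lgamma(x + 1) for x in xs]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    second = sum(x * x * w for x, w in zip(xs, weights)) / total
    return mean, second - mean**2


for nu in (0.5, 1.0, 2.0):
    mean, var = com_mean_variance(10.0, nu)
    approx_mean = 10.0 + 1.0 / (2.0 * nu) - 0.5
    approx_var = 10.0 / nu
    print(f"nu={nu}: mean={mean:.4f} (approx {approx_mean:.4f}), "
          f"var={var:.4f} (approx {approx_var:.4f})")
```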

statistical_mean(params: Array) Array[source]

Numerical approximation of the mean.

statistical_covariance(params: Array) Array[source]

Numerical approximation of the covariance.

property fst_man: Poisson

First component manifold.

property snd_man: CoMShape

Second component manifold.

log_base_measure(x: Array) Array[source]

Compute \(\log \mu(x)\) for an observation.

log_partition_function(params: Array) Array[source]

Compute log partition function using fixed-width window strategy.

Evaluates:

\[\psi(\theta) = \log\sum_{j=0}^{\infty} \exp(\theta_1 j + \theta_2 \log(j!))\]

using a fixed number of terms centered on the mode.

Parameters:

params – Array of natural parameters \((\theta_1, \theta_2)\)

Returns:

Value of log partition function \(\psi(\theta)\)
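The reason for centering the window on the mode can be seen in a small experiment. The sketch below is a standalone approximation, not the library code: it assumes a window of `window_size` consecutive terms around \(\lfloor \mu \rfloor\), and compares it against the same number of terms starting from zero. For \(\nu = 1\) the exact value is \(\psi(\theta) = e^{\theta_1}\), which gives a reference to compare against.

```python
import math


def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_terms(theta1, theta2, indices):
    # Each term of the series in log space: theta_1 * j + theta_2 * log(j!)
    return [theta1 * j + theta2 * math.lgamma(j + 1) for j in indices]


def windowed_log_partition(theta1, theta2, window_size=200):
    mu = math.exp(-theta1 / theta2)
    start = max(0, math.floor(mu) - window_size // 2)
    return log_sum_exp(log_terms(theta1, theta2, range(start, start + window_size)))


theta1, theta2 = math.log(1000.0), -1.0  # nu = 1, mode near 1000
exact = 1000.0                           # psi = exp(theta_1) when nu = 1

centered = windowed_log_partition(theta1, theta2)
from_zero = log_sum_exp(log_terms(theta1, theta2, range(200)))
print(exact, centered, from_zero)
```

With the same budget of 200 terms, the centered window lands close to the exact value, while the sum starting at zero misses almost all of the probability mass.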

sample(key: Array, params: Array, n: int = 1) Array[source]

Generate random COM-Poisson samples using Algorithm 2 from Benson & Friel (2021).

check_natural_parameters(params: Array) Array[source]

Check if natural parameters are valid for COM-Poisson.

For parameters \((\theta_1, \theta_2)\), the following conditions must hold:

  • \(\theta_1\) is finite

  • \(\theta_2 < 0\)
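As a scalar illustration of these conditions, a check could be written as below. The function name is hypothetical and the library method works on arrays. The requirement \(\theta_2 < 0\) corresponds to \(\nu > 0\), which ensures that the series defining the normalizing constant converges for any finite \(\theta_1\).

```python
import math


def is_valid_com_natural(theta1: float, theta2: float) -> bool:
    # theta_1 must be finite; theta_2 = -nu must be strictly negative
    return math.isfinite(theta1) and theta2 < 0


print(is_valid_com_natural(1.2, -0.5))       # valid
print(is_valid_com_natural(1.2, 0.0))        # invalid: nu = 0
print(is_valid_com_natural(math.inf, -1.0))  # invalid: theta_1 not finite
```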

initialize(key: Array, location: float = 0.0, shape: float = 0.1) Array[source]

Initialize COM-Poisson parameters.

initialize_from_sample(key: Array, sample: Array, location: float = 0.0, shape: float = 0.1) Array[source]

Initialize COM-Poisson parameters from sample.

Estimates mode and shape parameters using method of moments based on sample mean and variance, with added noise for regularization.
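One way to realize a method-of-moments estimate is to invert the approximations \(E(X) \approx \mu + 1/(2\nu) - 1/2\) and \(\text{Var}(X) \approx \mu/\nu\). Substituting \(\nu = \mu / \text{Var}(X)\) into the first gives the quadratic \(\mu^2 - (E(X) + 1/2)\mu + \text{Var}(X)/2 = 0\). The sketch below takes its larger root. This is an assumption about how such an estimator could work, not a description of the library's code: the function names and the `min_var` threshold are hypothetical, and the regularizing noise is omitted.

```python
import math


def solve_mode_dispersion(mean: float, var: float) -> tuple[float, float]:
    b = mean + 0.5
    # Clip the discriminant so heavily over-dispersed data still yields a real root.
    disc = max(b * b - 2.0 * var, 0.0)
    mu = 0.5 * (b + math.sqrt(disc))
    nu = mu / var
    return mu, nu


def natural_from_sample(sample, min_var: float = 1e-6) -> tuple[float, float]:
    n = len(sample)
    mean = sum(sample) / n
    var = max(sum((x - mean) ** 2 for x in sample) / n, min_var)
    mu, nu = solve_mode_dispersion(mean, var)
    # Convert to natural parameters: theta_1 = nu * log(mu), theta_2 = -nu
    return nu * math.log(mu), -nu


# Moments implied by mu = 10, nu = 2 under the approximations above
mu_hat, nu_hat = solve_mode_dispersion(9.75, 5.0)
print(mu_hat, nu_hat)
print(natural_from_sample([8, 9, 10, 10, 11, 12]))
```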