Poisson Distributions
Poisson distributions as exponential families.
This module provides two count models in the exponential family framework:
Base Distributions:
- Poisson: Standard Poisson distribution for count data
- CoMPoisson: Conway-Maxwell-Poisson distribution for flexible dispersion
Components:
- CoMShape: Shape component for the COM-Poisson distribution
The Poisson distribution models count data with a single rate parameter, where the mean equals the variance. The Conway-Maxwell-Poisson extends this with a dispersion parameter, allowing for both over- and under-dispersed count data.
Class Hierarchy
Poisson
- class Poisson
Bases: Analytic
The Poisson distribution models counts and is defined by a single rate parameter \(\eta > 0\). The probability mass function at count \(k \in \mathbb{N}\) is given by
\[p(k; \eta) = \frac{\eta^k e^{-\eta}}{k!}\]
As an exponential family:
Natural parameter: \(\theta = \log(\eta)\)
Probability mass function: \(p(k; \theta) = e^{\theta k - e^{\theta} - \log(k!)}\)
Sufficient statistic: \(s(x) = x\)
Base measure: \(\mu(k) = -\log(k!)\)
Log-partition function: \(\psi(\theta) = e^{\theta}\)
Negative entropy: \(\phi(\eta) = \eta\log(\eta) - \eta\)
Properties:
Mean = Variance = \(\eta\)
Mode = \(\lfloor \eta \rfloor\)
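The exponential-family form above can be checked numerically. The following is a minimal standalone sketch using only the Python standard library; the function names are illustrative and are not part of this module's API. It evaluates the log-pmf as \(\theta k - \psi(\theta) + \mu(k)\), compares it with the textbook rate form, and recovers the mean, variance, and mode by truncated summation.

```python
import math


def poisson_log_pmf_natural(k: int, theta: float) -> float:
    """Exponential-family form: theta * s(k) - psi(theta) + mu(k)."""
    log_partition = math.exp(theta)         # psi(theta) = e^theta
    log_base_measure = -math.lgamma(k + 1)  # mu(k) = -log(k!)
    return theta * k - log_partition + log_base_measure


def poisson_log_pmf_rate(k: int, eta: float) -> float:
    """Textbook form: log(eta^k * e^(-eta) / k!)."""
    return k * math.log(eta) - eta - math.lgamma(k + 1)


eta = 3.5
theta = math.log(eta)  # natural parameter

# Largest disagreement between the two forms over k = 0..99.
max_gap = max(
    abs(poisson_log_pmf_natural(k, theta) - poisson_log_pmf_rate(k, eta))
    for k in range(100)
)

# Truncated sums over k = 0..99; the tail beyond that is negligible for eta = 3.5.
probs = [math.exp(poisson_log_pmf_natural(k, theta)) for k in range(100)]
mean = sum(k * p for k, p in enumerate(probs))
variance = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
mode = max(range(100), key=lambda k: probs[k])
```

Here `mean` and `variance` both agree with `eta` up to floating-point error, and `mode` equals `math.floor(eta)`.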
- sufficient_statistic(x: Array) → Array
Compute the sufficient statistic \(\mathbf{s}(x)\) of an observation.
- log_partition_function(params: Array) → Array
Compute the log-partition function \(\psi\) at the given natural parameters.
- negative_entropy(means: Array) → Array
Compute negative entropy \(\phi\) at the given mean parameters.
- sample(key: Array, params: Array, n: int = 1) → Array
Draw n samples from the distribution with the given natural parameters.
- initialize_from_sample(key: Array, sample: Array, location: float = 0.0, shape: float = 0.1) → Array
Initialize Poisson parameters from sample data.
Handles the case where some observations are 0 by clipping the mean to a small positive value before converting to natural parameters.
- Parameters:
key – Random key
sample – Sample data (count values)
location – Mean of noise distribution
shape – Std dev of noise distribution
- Returns:
Natural parameters (log rate).
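The clipping step described above might look roughly like the sketch below. This is not the library's implementation: the helper name `init_log_rate`, the `eps` floor, the `seed` argument, and the choice to add Gaussian noise directly to the log rate are all assumptions made for illustration.

```python
import math
import random


def init_log_rate(sample, location=0.0, shape=0.1, seed=0, eps=1e-8):
    """Hypothetical helper: estimate a log rate from count data.

    Assumptions (not confirmed by the source): the sample mean is clipped
    to `eps`, and Gaussian noise with mean `location` and standard
    deviation `shape` is added to the resulting log rate.
    """
    rng = random.Random(seed)
    mean = sum(sample) / len(sample)
    clipped = max(mean, eps)  # avoids log(0) when every observation is 0
    noise = rng.gauss(location, shape)
    return math.log(clipped) + noise


all_zero = init_log_rate([0, 0, 0])                # finite despite a zero mean
noiseless = init_log_rate([2, 4, 3], shape=0.0)    # log of the sample mean
```

Without the clip, an all-zero sample would require `log(0)` and produce an invalid natural parameter.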
Conway-Maxwell-Poisson
- class CoMShape
Bases: ExponentialFamily
Shape component of a CoMPoisson distribution. This represents the dispersion structure with sufficient statistic log(x!). It captures deviations from the standard Poisson variance-mean relationship.
The dispersion parameter \(\nu\) controls whether the distribution is:
Equidispersed (\(\nu = 1\)): Variance = Mean (standard Poisson)
Underdispersed (\(\nu > 1\)): Variance < Mean
Overdispersed (\(\nu < 1\)): Variance > Mean
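The three regimes can be illustrated with a short standalone sketch. It assumes the natural parameters \(\theta_1 = \nu\log(\mu)\), \(\theta_2 = -\nu\) documented for CoMPoisson and approximates the infinite sums by their first `n_terms` terms; the function name and truncation length are illustrative choices, not part of the library.

```python
import math


def com_poisson_mean_variance(mu: float, nu: float, n_terms: int = 400):
    """Mean and variance by truncated summation over x = 0..n_terms-1."""
    theta1, theta2 = nu * math.log(mu), -nu
    # Unnormalized log-weights: theta1 * x + theta2 * log(x!).
    log_w = [theta1 * x + theta2 * math.lgamma(x + 1) for x in range(n_terms)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    mean = sum(x * w for x, w in enumerate(weights)) / total
    second = sum(x * x * w for x, w in enumerate(weights)) / total
    return mean, second - mean ** 2


# Variance-to-mean ratio for three values of the dispersion parameter.
ratios = {}
for nu in (0.5, 1.0, 2.0):
    mean, var = com_poisson_mean_variance(5.0, nu)
    ratios[nu] = var / mean
```

With these settings the ratio exceeds 1 for \(\nu = 0.5\), equals 1 up to floating-point error for \(\nu = 1\), and falls below 1 for \(\nu = 2\).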
- class CoMPoisson(window_size: int = 200)
Bases: LocationShape[Poisson, CoMShape], Differentiable
The Conway-Maxwell-Poisson distribution is a generalization of the Poisson distribution that can model both over- and under-dispersed count data. In the mode-dispersion parameterization used by this class, its probability mass function is:
\[p(x; \mu, \nu) = \frac{1}{Z(\mu, \nu)} \left(\frac{\mu^x}{x!}\right)^{\nu}\]
where:
\(\mu > 0\) is the mode parameter (the mode of the distribution is \(\lfloor \mu \rfloor\))
\(\nu > 0\) is the dispersion parameter (pseudo-precision)
\(Z(\mu, \nu)\) is the normalizing constant:
\[Z(\mu, \nu) = \sum_{j=0}^{\infty} \left(\frac{\mu^j}{j!}\right)^{\nu}\]
Special cases:
When \(\nu = 1\): Standard Poisson distribution
When \(\nu < 1\): Over-dispersed (variance > mean)
When \(\nu > 1\): Under-dispersed (variance < mean)
When \(\nu \to \infty\) with \(\theta_1 = \nu\log(\mu)\) held fixed: Bernoulli distribution
When \(\nu \to 0\) with \(\theta_1 = \nu\log(\mu) < 0\) held fixed: Geometric distribution
As an exponential family:
Natural parameters: \(\theta_1 = \nu\log(\mu)\), \(\theta_2 = -\nu\)
Sufficient statistics: \(s(x) = (x, \log(x!))\)
Log-partition function: log of the normalizing constant \(Z\)
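As a sanity check on the exponential-family form, the sketch below evaluates \(p(x) = \exp(\theta_1 x + \theta_2 \log(x!) - \psi(\theta))\) with a truncated normalizer and compares two special cases against closed forms. It is a standalone illustration, not the class's implementation; `com_poisson_pmf` and `n_terms` are hypothetical names, and the geometric case assumes \(\theta_1 < 0\) is held fixed while \(\theta_2\) is set to the boundary value 0.

```python
import math


def com_poisson_pmf(x: int, theta1: float, theta2: float, n_terms: int = 400) -> float:
    """pmf from natural parameters, normalizing over j = 0..n_terms-1."""
    log_w = [theta1 * j + theta2 * math.lgamma(j + 1) for j in range(n_terms)]
    shift = max(log_w)
    # Log of the truncated normalizing constant via log-sum-exp.
    log_z = shift + math.log(sum(math.exp(lw - shift) for lw in log_w))
    return math.exp(theta1 * x + theta2 * math.lgamma(x + 1) - log_z)


# nu = 1 (theta2 = -1): should match the Poisson pmf with rate mu.
mu = 4.2
poisson_gap = max(
    abs(
        com_poisson_pmf(k, math.log(mu), -1.0)
        - math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
    )
    for k in range(20)
)

# theta2 = 0 with theta1 = log(q) < 0: should match a geometric pmf (1 - q) q^k.
q = 0.5
geometric_gap = max(
    abs(com_poisson_pmf(k, math.log(q), 0.0) - (1 - q) * q ** k)
    for k in range(20)
)
```

Both gaps are at the level of floating-point error for these inputs.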
- split_mode_dispersion(params: Array) → tuple[Array, Array]
Convert from natural parameters to mode-shape parameters.
The COM-Poisson distribution can be parameterized by either natural parameters \((\theta_1, \theta_2)\) or by mode-shape parameters \((\mu, \nu)\). The conversion is given by:
\[\nu = -\theta_2\]
\[\mu = \exp(-\theta_1/\theta_2)\]
- join_mode_dispersion(mu: Array, nu: Array) → Array
Convert from mode-shape parameters to natural parameters.
The COM-Poisson distribution can be parameterized by either mode-shape parameters \((\mu, \nu)\) or natural parameters \((\theta_1, \theta_2)\). The conversion is given by:
\(\theta_1 = \nu\log(\mu)\)
\(\theta_2 = -\nu\)
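The two conversions are inverses of each other, which a short round trip makes concrete. The functions below are standalone scalar sketches that reuse the method names for readability; the actual methods operate on `Array` parameters and may differ in detail.

```python
import math


def join_mode_dispersion(mu: float, nu: float) -> tuple[float, float]:
    """(mu, nu) -> (theta1, theta2) = (nu * log(mu), -nu)."""
    return nu * math.log(mu), -nu


def split_mode_dispersion(theta1: float, theta2: float) -> tuple[float, float]:
    """(theta1, theta2) -> (mu, nu) = (exp(-theta1 / theta2), -theta2)."""
    nu = -theta2
    mu = math.exp(-theta1 / theta2)
    return mu, nu


theta1, theta2 = join_mode_dispersion(6.0, 1.5)
mu_back, nu_back = split_mode_dispersion(theta1, theta2)
```

The round trip is only defined for \(\theta_2 \neq 0\), which is consistent with the requirement \(\nu > 0\).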
- approximate_mean_variance(params: Array) → tuple[Array, Array]
Compute approximate mean and variance of COM-Poisson distribution.
Given mode \(\mu\) and shape \(\nu\) parameters, the approximations are:
\[E(X) \approx \mu + \frac{1}{2\nu} - \frac{1}{2}\]
\[\text{Var}(X) \approx \frac{\mu}{\nu}\]
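Written out on scalars, the approximation is a two-line computation. The sketch below takes \((\mu, \nu)\) directly rather than natural parameters, which is an illustrative simplification and not the method's actual signature.

```python
def approximate_mean_variance(mu: float, nu: float) -> tuple[float, float]:
    """Closed-form approximations: E(X) ~ mu + 1/(2 nu) - 1/2, Var(X) ~ mu / nu."""
    mean = mu + 1.0 / (2.0 * nu) - 0.5
    variance = mu / nu
    return mean, variance


poisson_case = approximate_mean_variance(10.0, 1.0)   # nu = 1
under_case = approximate_mean_variance(10.0, 2.0)     # nu > 1
```

For \(\nu = 1\) the correction term vanishes and both values reduce to \(\mu\), matching the Poisson identity Mean = Variance.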
- numerical_mean_variance(params: Array) → tuple[Array, Array]
Compute mean and variance by numerical summation.
Uses a window-based approach centered on the mode to compute:
\[E[X] = \sum_{x=0}^\infty x\, p(x)\]
\[E[X^2] = \sum_{x=0}^\infty x^2\, p(x)\]
\[\text{Var}(X) = E[X^2] - E[X]^2\]
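A windowed summation of this kind could be sketched as follows. How the library places its window is not stated here, so the placement rule (start at `floor(mu) - window_size // 2`, clipped at 0) is an assumption, and the function works on scalar \((\mu, \nu)\) for simplicity.

```python
import math


def numerical_mean_variance(mu: float, nu: float, window_size: int = 200):
    """Mean and variance from a fixed-width window of terms around the mode."""
    theta1, theta2 = nu * math.log(mu), -nu
    # Assumed window placement: centered on floor(mu), never below 0.
    start = max(0, math.floor(mu) - window_size // 2)
    xs = range(start, start + window_size)
    log_w = [theta1 * x + theta2 * math.lgamma(x + 1) for x in xs]
    shift = max(log_w)  # stabilizes the exponentials
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    second = sum(x * x * w for x, w in zip(xs, weights)) / total
    return mean, second - mean ** 2


mean, var = numerical_mean_variance(40.0, 2.0)

# Closed-form approximations for comparison.
approx_mean = 40.0 + 1.0 / (2.0 * 2.0) - 0.5
approx_var = 40.0 / 2.0
```

For moderately large \(\mu\) the windowed values and the closed-form approximations agree closely; the window result is accurate whenever the probability mass outside the window is negligible.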
- log_partition_function(params: Array) → Array
Compute the log-partition function using a fixed-width window strategy.
Evaluates:
\[\psi(\theta) = \log\sum_{j=0}^{\infty} \exp(\theta_1 j + \theta_2 \log(j!))\]
using a fixed number of terms centered on the mode.
- Parameters:
params – Array of natural parameters \((\theta_1, \theta_2)\)
- Returns:
Value of log partition function \(\psi(\theta)\)
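The fixed-width strategy can be sketched with a log-sum-exp over a window of terms. This is an illustration under assumptions, not the library's code: the window placement rule and the scalar signature are hypothetical, and only the default width of 200 is taken from the class signature.

```python
import math


def log_partition_function(theta1: float, theta2: float, window_size: int = 200) -> float:
    """Approximate psi(theta) using window_size terms around the mode."""
    mode = math.exp(-theta1 / theta2)  # mu recovered from natural parameters
    # Assumed window placement: centered on floor(mode), never below 0.
    start = max(0, math.floor(mode) - window_size // 2)
    log_terms = [
        theta1 * j + theta2 * math.lgamma(j + 1)
        for j in range(start, start + window_size)
    ]
    shift = max(log_terms)
    # log-sum-exp: factor out the largest term to avoid overflow.
    return shift + math.log(sum(math.exp(t - shift) for t in log_terms))


# With theta2 = -1 the model is Poisson, where psi equals the rate itself.
psi = log_partition_function(math.log(7.5), -1.0)
```

A fixed window keeps the cost constant regardless of the parameters, at the price of truncation error when the distribution is wider than the window.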
- sample(key: Array, params: Array, n: int = 1) → Array
Generate random COM-Poisson samples using Algorithm 2 from Benson & Friel (2021).
- check_natural_parameters(params: Array) → Array
Check if natural parameters are valid for COM-Poisson.
For parameters \((\theta_1, \theta_2)\), the following conditions must hold: \(\theta_1\) is finite and \(\theta_2 < 0\).
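On scalar inputs, the stated conditions amount to a two-part test. The sketch below is a standalone illustration returning a Python `bool`; the actual method takes and returns an `Array`, and the name `is_valid_com_poisson` is hypothetical.

```python
import math


def is_valid_com_poisson(theta1: float, theta2: float) -> bool:
    """True when theta1 is finite and theta2 is strictly negative."""
    # A NaN theta2 fails the comparison, so it is rejected as well.
    return math.isfinite(theta1) and theta2 < 0


valid = is_valid_com_poisson(1.0, -0.5)
infinite_location = is_valid_com_poisson(math.inf, -1.0)
zero_shape = is_valid_com_poisson(0.0, 0.0)
```

The constraint \(\theta_2 < 0\) corresponds to \(\nu > 0\), which keeps the factorial term in the denominator so that the normalizing sum converges for any finite \(\theta_1\).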