PH distributions

All PH distributions in this package subtype AbstractPHDist <: ContinuousUnivariateDistribution, so the full Distributions.jl interface works on every subtype. Specialized subtypes use closed-form implementations where available; the general PHDist falls back to matrix-exponential formulas.

using PhaseTypeDistributions, Distributions, Random, Statistics

Constructors

General PHDist(α, T)

ph = PHDist([0.6, 0.4], [-3.0 1.0; 0.0 -2.0])
PHDist(α=[0.6, 0.4], T=[-3.0 1.0; 0.0 -2.0])

HyperExponentialDist — mixture of exponentials

A diagonal sub-generator: SCV is always ≥ 1.

he  = HyperExponentialDist([0.4, 0.6], [2.0, 5.0])
he2 = HyperExponentialDist(2.0, 3.0)        # by mean and SCV
mean(he2), scv(he2)
(2.0, 3.0)

HypoExponentialDist — convolution of exponentials

A bidiagonal sub-generator with α = [1, 0, …, 0]: SCV is always ≤ 1. Repeated rates fall back to the matrix-exponential form (partial fractions diverge for equal rates).

ho  = HypoExponentialDist([3.0, 5.0])
ho2 = HypoExponentialDist(2.0, 0.5)         # by mean and SCV
mean(ho2), scv(ho2)
(2.0, 0.5)

ErlangPHDistk equal phases

er = ErlangPHDist(3, 2.0)
mean(er), var(er)
(1.5, 0.75)

CoxianDist — sequential phases with exit probabilities

cox = CoxianDist([3.0, 4.0, 5.0], [0.2, 0.3])
mean(cox), var(cox)
(0.6453333333333333, 0.21456711111111104)

From Distributions.jl

ph_exp = PHDist(Exponential(2.0))           # 1-phase
ph_erl = PHDist(Erlang(3, 0.5))             # 3-phase
mean(ph_exp), mean(ph_erl)
(2.0, 1.5)

Standard Distributions.jl interface

Every PH type supports the standard density / cdf / sampling API:

pdf(er, 1.0), logpdf(er, 1.0), cdf(er, 1.0), ccdf(er, 1.0)
(0.5413411329464507, -0.6137056388801095, 0.3233235838169365, 0.6766764161830635)

ccdf is the natively-computed quantity for PH distributions (α' exp(Tx) 𝟙) — cdf is derived from it — so tail precision is preserved into the deep tail:

ccdf(er, 50.0)        # ≪ 1, but still finite
1.8976107553682285e-40

Support is [0, ∞):

minimum(er), maximum(er), insupport(er, 1.0), insupport(er, -1.0)
(0.0, Inf, true, false)

Moments and shape

mean(he), var(he), std(he), scv(he)
(0.32, 0.1456, 0.38157568056677826, 1.421875)
skewness(he), kurtosis(he)      # excess kurtosis
(2.8125136580841357, 12.240550658133078)
kth_moment(he, 3), mgf(he, 0.5)
(0.3288, 1.2)

mgf extends Distributions.mgf. It is defined for t < min(-diag(T)); the linear solve diverges otherwise.

Quantile / median

quantile(er, 0.95), median(er)
(3.1478968109358902, 1.337030156861374)

quantile for the general AbstractPHDist uses bisection; specialized subtypes that are exactly Distributions.Erlang etc. could use closed forms but currently also bisect via the fallback.

Sampling

rng = Random.MersenneTwister(42)
rand(rng, er)
0.5840034701546643
xs = rand(rng, er, 1000);
length(xs), extrema(xs), Statistics.mean(xs)
(1000, (0.08957195116427663, 7.008321490323527), 1.5269136616230647)

Internally, sampling simulates the underlying CTMC until absorption. Each specialized subtype overrides rand with its natural recipe — e.g. HyperExponentialDist picks a component and draws one exponential; ErlangPHDist sums k independent exponentials.

Accessors — the underlying (α, T) representation

Every subtype exposes its parameters in the canonical PH form:

initial_prob(cox), subgenerator(cox), exit_rates(cox), nphases(cox)
([1.0, 0.0, 0.0], [-3.0 2.4000000000000004 0.0; 0.0 -4.0 2.8; 0.0 0.0 -5.0], [0.6000000000000001, 1.2, 5.0], 3)
params(cox)                     # natural parameters: (rates, exit_probs)
params(he)                      # (probs, rates)
params(er)                      # (shape, rate)
params(ph)                      # (α, T)
([0.6, 0.4], [-3.0 1.0; 0.0 -2.0])

Conversion to the general form

Any subtype converts to the general (α, T) form via PHDist(d):

ph_from_he = PHDist(he)
ph_from_er = PHDist(er)
nphases(ph_from_he), nphases(ph_from_er)
(2, 3)

PHDist(d::PHDist) is a no-op.

Comparison helpers for non-identifiable distributions

Two different (α, T) representations can describe the same distribution. The package provides comparison routines that work across representations:

moments_isapprox(he, ph_from_he)        # by moments
distribution_isapprox(he, ph_from_he)   # by CDF on an adaptive grid
moment_vector(he, 4)                    # [E[X], E[X²], E[X³], E[X⁴]]
4-element Vector{Float64}:
 0.32
 0.248
 0.3288
 0.62304

moments_isapprox matches a fixed number of moments (necessary but not sufficient); distribution_isapprox evaluates the CDF on a grid spanning out to several standard deviations and so is a stronger check.

Reference — types

PhaseTypeDistributions.AbstractPHDistType
AbstractPHDist <: ContinuousUnivariateDistribution

Supertype for every phase-type distribution in this package. Subtypes must implement initial_prob and subgenerator; generic implementations of the Distributions.jl interface (pdf, cdf, ccdf, mean, var, rand, …) are provided by fallback and can be overridden by specialized subtypes for efficiency.

source
PhaseTypeDistributions.PHDistType
PHDist(α, T; αpattern=nothing, Tpattern=nothing)

General phase-type distribution parameterized by an initial probability vector α (length m, non-negative, sums to 1) and a sub-generator matrix T (m × m, with non-positive diagonal and non-negative off-diagonal entries such that each row sum is non-positive). The exit-rate vector is t⁰ = -T · 1_m. These conditions on T are validated at construction.

α and T are stored as a FixedSparsityVector and a FixedSparsityMatrix: they carry a fixed sparsity pattern marking which entries are allowed to be nonzero, so the structural zeros of a PH family (e.g. a Coxian's α = [1, 0, …] and bidiagonal T) are preserved and cannot be filled in — important for structure-preserving fitting.

By default the pattern is inferred from the current nonzeros of α/T (and a fixed-sparsity input keeps its own pattern). Pass αpattern / Tpattern — boolean arrays of matching shape — to set it explicitly, e.g. to keep a position that happens to be zero now but should remain free to become nonzero.

PHDist is the most general PH type in the package; specialized subtypes (HyperExponentialDist, HypoExponentialDist, ErlangPHDist, CoxianDist) can be converted to it via PHDist(d), which carries each family's structural zeros into the pattern.

source
PhaseTypeDistributions.HypoExponentialDistType

Hypoexponential distribution — convolution of exponentials with distinct rates. Represents a PH distribution with bidiagonal sub-generator matrix T and α = [1, 0, ..., 0]. SCV is always ≤ 1 (when rates are distinct and positive).

source
PhaseTypeDistributions.ErlangPHDistType

Erlang PH distribution — k phases with equal rate λ. A special case of hypoexponential where all rates are identical. Equivalent to Gamma(k, 1/λ) or Distributions.Erlang(k, 1/λ).

source
PhaseTypeDistributions.CoxianDistType

Coxian PH distribution — a sequential phase-type distribution where at each phase the process either absorbs (with probability pᵢ) or advances to the next phase. The last phase always absorbs (pₖ = 1 is implied and not stored).

Generalizes hypoexponential (all exitprobs = 0 except last) and Erlang (all rates equal, all exitprobs = 0 except last).

source

Reference — accessors and moments

Reference — comparison helpers

PhaseTypeDistributions.moments_isapproxFunction
moments_isapprox(d1::AbstractPHDist, d2::AbstractPHDist;
                 order=4, atol=1e-6, rtol=1e-6) -> Bool

Compare two PH distributions by their moments up to the given order. Returns true if all moments (and derived quantities: mean, var, scv) are approximately equal.

Note: Two distributions can match in moments but still differ in shape (e.g., in the tails). Moment matching is necessary but not sufficient for distributional equality.

source
PhaseTypeDistributions.distribution_isapproxFunction
distribution_isapprox(d1::AbstractPHDist, d2::AbstractPHDist;
                      n_points=100, atol=1e-6) -> Bool

Compare two PH distributions by their CDF values on an adaptive grid. This is a stronger test than moment comparison — it checks the full distributional shape.

The grid spans from 0 to a point well into the tail, chosen based on the mean and standard deviation of both distributions.

source