Running Statistics
This module implements Welford's one-pass algorithm for computing the mean, variance, and standard deviation of a sample or a population. Because each data point is folded into the running state as it arrives, the data itself never needs to be stored.
The algorithm uses the recurrence formulas for the mean μ, the population variance σ², and the sample variance s²:
μₖ = μₖ₋₁ + (xₖ − μₖ₋₁)/k
σ²ₖ = σ²ₖ₋₁·(k − 1)/k + (xₖ − μₖ₋₁)(xₖ − μₖ)/k
s²ₖ = s²ₖ₋₁·(k − 2)/(k − 1) + (xₖ − μₖ₋₁)²/k
Rather than rescaling the variance at every step, Welford's algorithm tracks two running quantities, starting from M₀ = 0 and S₀ = 0:
Mₖ = Mₖ₋₁ + (xₖ − Mₖ₋₁)/k
Sₖ = Sₖ₋₁ + (xₖ − Mₖ₋₁)(xₖ − Mₖ)
Then μₖ = Mₖ, σ²ₖ = Sₖ/k, and s²ₖ = Sₖ/(k − 1) for k ≥ 2.
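As a quick check of the two running quantities, here is a hand trace for the three data points 2, 4, 6 (values chosen purely for illustration), starting from M₀ = 0 and S₀ = 0:

k = 1: M₁ = 0 + (2 − 0)/1 = 2, S₁ = 0 + (2 − 0)(2 − 2) = 0
k = 2: M₂ = 2 + (4 − 2)/2 = 3, S₂ = 0 + (4 − 2)(4 − 3) = 2
k = 3: M₃ = 3 + (6 − 3)/3 = 4, S₃ = 2 + (6 − 3)(6 − 4) = 8

This gives μ₃ = 4, σ²₃ = 8/3, and s²₃ = 8/2 = 4, which matches the direct computation: the squared deviations from the mean are 4, 0, and 4.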
Compute running statistics of a data stream using Welford's algorithm.
- init :: (
- count : Nat
Number of data points.
- mean : Float
Mean of data points.
- var : Float
Variance of data points times the number of data points, i.e. the running sum of squared deviations Sₖ.
- )
Add a new data point to running statistics.
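Since the update step is not spelled out here, the following is a minimal Lean 4 sketch of what one Welford update could look like. The names `RunningStats`, `push`, and `ofList` are hypothetical stand-ins chosen for this illustration; only the field names `count`, `mean`, and `var` come from the structure described above, and the module's actual definition may differ.

```lean
/-- Hypothetical stand-in for the module's running-statistics structure. -/
structure RunningStats where
  /-- Number of data points seen so far (k). -/
  count : Nat := 0
  /-- Running mean Mₖ. -/
  mean : Float := 0.0
  /-- Running sum of squared deviations Sₖ. -/
  var : Float := 0.0

/-- One Welford update step: fold the data point `x` into the state.
    (Assumed shape; not taken from the module's source.) -/
def RunningStats.push (s : RunningStats) (x : Float) : RunningStats :=
  let k := s.count + 1
  let delta := x - s.mean                     -- xₖ − Mₖ₋₁
  let newMean := s.mean + delta / k.toFloat   -- Mₖ
  { count := k, mean := newMean, var := s.var + delta * (x - newMean) }

/-- Fold a whole list through the update, starting from the empty state. -/
def RunningStats.ofList (xs : List Float) : RunningStats :=
  xs.foldl RunningStats.push ({} : RunningStats)

#eval (RunningStats.ofList [2.0, 4.0, 6.0]).mean
#eval (RunningStats.ofList [2.0, 4.0, 6.0]).var
```

For the data points 2, 4, 6 the two `#eval` lines should report a mean of 4 and a sum of squared deviations of 8. Because the mean is updated before S, the second factor `x - newMean` uses Mₖ rather than Mₖ₋₁, as the recurrence for Sₖ requires.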
Variance of running data stream.
Unbiased variance of running data stream.
Equations
- s.sampleVariance = if s.count ≤ 2 then 0.0 else s.var / (s.count - 1).toFloat
Sample standard deviation of running data stream, computed as the square root of the unbiased variance.
Equations
- s.standardDeviation = s.sampleVariance.sqrt
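To illustrate how the three derived statistics relate to the tracked quantities, here is a small stand-alone Lean 4 sketch that takes the count k and the running sum Sₖ directly. The function names and the zero-returning guards are assumptions made for this example, not the module's definitions.

```lean
/-- Population variance σ² = S/k (hypothetical helper; returns 0.0 for an empty stream). -/
def populationVariance (count : Nat) (s : Float) : Float :=
  if count == 0 then 0.0 else s / count.toFloat

/-- Unbiased sample variance s² = S/(k − 1) (hypothetical helper). -/
def sampleVariance (count : Nat) (s : Float) : Float :=
  if count < 2 then 0.0 else s / (count - 1).toFloat

/-- Sample standard deviation: square root of the unbiased variance. -/
def sampleStdDev (count : Nat) (s : Float) : Float :=
  (sampleVariance count s).sqrt

-- Three data points with S = 8, e.g. the stream 2, 4, 6
#eval populationVariance 3 8.0
#eval sampleVariance 3 8.0
#eval sampleStdDev 3 8.0
```

The `count < 2` guard is a choice made for this sketch, since Sₖ/(k − 1) is already defined for two data points; the equation rendered above returns 0.0 whenever `s.count ≤ 2`.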