From the power law to extreme value mixture distributions

Clement Lee (Newcastle University)



  • Networks \(\times\) extreme value theory
  • Extending the power law for degrees
  • Mixture distribution
    • even for data simulated from “true” random graph models
    • finite sample behaviour
  • Compare results for real & simulated data
    • make models more realistic?
  • Studying the evolution of real data


1. Introduction

Degree distribution


Why the power law?

  • Seemingly ubiquitous
  • Potentially generated from “nice” models
    • Preferential attachment (Barabási-Albert)
    • Generalised random graphs


Really a straight line?

  • Survival function / complementary CDF


  • “Curved” downwards
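
As a rough illustration of the quantity being plotted, the empirical survival function of a degree sample can be computed as below. This is a minimal sketch, not the code behind the slides: the function name `empirical_survival`, the toy degrees, and the convention \(S(x)=P(X\ge x)\) (chosen so that the largest observation still has a positive value on the log scale) are all assumptions made here.

```python
from collections import Counter


def empirical_survival(degrees):
    """Return sorted (x, S(x)) pairs, with S(x) = P(X >= x) estimated from the sample."""
    n = len(degrees)
    counts = Counter(degrees)
    points = []
    remaining = n  # number of observations that are >= the current x
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points


# Made-up toy degrees, for illustration only.
toy_degrees = [1, 1, 1, 1, 2, 2, 3, 5]
for x, s in empirical_survival(toy_degrees):
    print(x, s)
# prints:
# 1 1.0
# 2 0.5
# 3 0.25
# 5 0.125
```

Plotting these pairs with both axes on the log scale gives the kind of curve discussed here; under an exact power law the points would fall roughly on a straight line.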

Tackling various issues

2. Extreme value mixture distribution

Primer: relationships

Schematic, on log-log scale

  • Slope: \(-\alpha\), with \(-\alpha<-1\) (i.e. \(\alpha>1\))

  • Tail heaviness: \(1/(\alpha-1)\)
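
The link between the slope and the tail heaviness can be sketched as follows, assuming the Zipf\((\alpha)\) form \(p(x)=x^{-\alpha}/\zeta(\alpha)\) for \(x=1,2,\ldots\) (this explicit form and the symbol \(S\) for the survival function are assumptions made for this sketch). For large \(x\),

\[
S(x)=\sum_{k\ge x}\frac{k^{-\alpha}}{\zeta(\alpha)}\approx\frac{1}{\zeta(\alpha)}\int_x^\infty t^{-\alpha}\,dt=\frac{x^{-(\alpha-1)}}{(\alpha-1)\,\zeta(\alpha)} .
\]

So the probability mass function has slope \(-\alpha\) on the log-log scale, while the survival function has slope \(-(\alpha-1)\). A heavy tail with shape parameter \(\xi>0\) has a survival function decaying like \(x^{-1/\xi}\), and matching the two exponents gives

\[
\frac{1}{\xi}=\alpha-1\quad\Longleftrightarrow\quad\xi=\frac{1}{\alpha-1}.
\]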

  • \(\theta\in(0,1]\)

  • Reduced to Zipf\((\alpha)\) when \(\theta=1\)

  • Blue curve (fitted tail above \(u\)): tail heaviness \(\xi\)
  • Brown line (power law): tail heaviness \(1/(\alpha-1)\)
    • had the power law extended beyond the threshold \(u\)
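
One way to make the schematic concrete is the sketch below of a mixture probability mass function. It is written under stated assumptions and is not taken from the slides: the body weights \(x^{-\alpha}\theta^{x}\) on \(1,\ldots,u\), the discretised generalised Pareto form of the tail, the restriction to \(\xi>0\), and the names `sigma` (tail scale) and `phi` (probability of exceeding \(u\)) are all hypothetical choices for illustration.

```python
def mixture_pmf(x, alpha, theta, xi, sigma, u, phi):
    """Sketch of a mixture pmf: power-law-type body on 1..u, integer GP tail above u.

    Assumes theta in (0, 1], xi > 0, sigma > 0, integer u >= 1 and phi in (0, 1).
    """
    if x < 1:
        return 0.0

    if x <= u:
        # Body: weights k^(-alpha) * theta^k, normalised over k = 1..u.
        norm = sum(k ** (-alpha) * theta ** k for k in range(1, u + 1))
        return (1.0 - phi) * x ** (-alpha) * theta ** x / norm

    def gp_survival(y):
        # Survival function of a generalised Pareto with shape xi and scale sigma.
        return (1.0 + xi * y / sigma) ** (-1.0 / xi)

    # Tail: probability that a GP excess falls in (x - u - 1, x - u].
    return phi * (gp_survival(x - u - 1) - gp_survival(x - u))


# Hypothetical parameter values, for illustration only.
params = dict(alpha=2.0, theta=0.9, xi=0.5, sigma=2.0, u=10, phi=0.1)
total = sum(mixture_pmf(x, **params) for x in range(1, 20001))
print(round(total, 6))
```

With `theta=1` the body weights reduce to \(k^{-\alpha}\), i.e. a Zipf\((\alpha)\) truncated at \(u\), which matches the special case noted above. The tail terms telescope, so the probabilities above \(u\) sum to `phi`.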

Bayesian inference

CRAN dependencies


  • Mixture distribution improves fit
  • Better than alternatives

Simulated data


  • Not clear-cut even when simulated from true model
  • Uncertainty & finite sample behaviour
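
For readers who want to generate comparable simulated degrees, the following is a minimal sketch of a preferential attachment simulation using only the Python standard library. It is an assumption-laden illustration, not the simulation used for the slides: the function name `simulate_pa_degrees`, the star-shaped starting graph, and the rule that each new node attaches to `m` distinct existing nodes are choices made here.

```python
import random
from collections import Counter


def simulate_pa_degrees(n_nodes, m=2, seed=0):
    """Degrees of a simple preferential attachment graph (requires n_nodes >= m + 1)."""
    rng = random.Random(seed)
    degree = Counter()
    ends = []  # each node appears once per edge end, so sampling is degree-proportional

    def add_edge(a, b):
        degree[a] += 1
        degree[b] += 1
        ends.extend((a, b))

    # Starting graph (an assumption): node m joined to each of nodes 0..m-1.
    for target in range(m):
        add_edge(m, target)

    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            # Choose existing nodes with probability proportional to current degree.
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            add_edge(new, t)

    return [degree[v] for v in range(n_nodes)]


degrees = simulate_pa_degrees(1000, m=2, seed=1)
print(max(degrees), sum(degrees))
```

Repeating the simulation with different seeds and sizes gives a feel for the finite sample variability mentioned above.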

Tail heaviness: actual vs implied

  • Indication that the whole of the data could follow the power law
  • Uncertainty in \(\xi\) is still huge


  • Away from \(y=x\) line
  • Difficult to have sustained growth according to \(\alpha\)

3. Evolution over time

Tail heaviness relatively stable

So is the power law exponent

Tail steadily lighter than implied by power law


  • Degrees seemingly follow the power law
    • wholly or partially
  • Extreme value mixture distribution
    • power law \((\alpha)\) in the body
    • integer generalised Pareto \((\xi,\ldots)\) in the tail
  • Comparing \(\xi\) and \(1/(\alpha-1)\)
    • indicates whether the power law applies to the whole of the data
  • Application: CRAN dependencies
    • seeming stability over time
  • Next steps
    • evolution of the raw data
    • what model (modification) leads to such behaviour



