Technical documentation

Methodology

How the British Society Stress Index is constructed, what it measures, and what it does not.

What this index measures

The British Society Stress Index is a non-partisan data project measuring stress across key systems of national life. It does not assign blame to political parties. It tracks whether core systems are improving, deteriorating or entering dangerous levels of strain.

The index combines 11 domains, each measuring a different national system, weighted by their estimated contribution to systemic resilience. Each domain score is the median of its component metric stress scores. The national score is a weighted sum of domain medians.

Current release: June 2026 · National score: 52/100 · Band: Fragile

What this index does NOT measure

The index does not:

  • Attribute blame to any political party, government, or named individual.
  • Predict policy outcomes or future conditions.
  • Model cascading system failures or tipping points.
  • Provide individual-level data on any person.
  • Measure cultural, moral, or ideological conditions.
  • Claim to know at what score a society becomes ungovernable.

It tracks whether measured systems are under more or less stress than in previous periods. The interpretation of why conditions have changed is left to the reader.

What midnight means — and does not mean

Midnight does not mean apocalypse. It means multiple national systems are under severe stress at the same time, increasing the risk that failures become self-reinforcing.

The clock metaphor is a communication device, not a literal prediction. The 0–100 stress scale maps to minutes to midnight via the formula: minutes = 60 × (1 − score / 100). A score of 60 = 24 minutes to midnight; 90 = 6 minutes.
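The mapping above can be sketched in a few lines. The function name `minutes_to_midnight` is illustrative and is not taken from the project's code.

```python
def minutes_to_midnight(score: float) -> float:
    """Map a 0-100 stress score to minutes to midnight."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Algebraically identical to 60 * (1 - score / 100); written this way
    # to keep floating-point rounding out of round-number cases.
    return 60 * (100 - score) / 100


for example_score in (60, 90):
    print(example_score, minutes_to_midnight(example_score))
```

A higher score always means fewer minutes: a score of 0 gives 60 minutes and a score of 100 gives 0.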

Domain model and weights

The index currently comprises 11 domains. Weights sum to 100. Each domain is scored as the median of its component metric stress scores. Weights represent the estimated share of systemic national resilience carried by each domain. These weights are currently set by editorial judgement and will be reviewed formally.

Metric scoring (0.6 / 0.3 / 0.1 formula)

Each metric produces a stress score 0–100, composed of three sub-scores:

metric_stress = clamp(0.6 × level_score + 0.3 × trend_score + 0.1 × volatility_score, 0, 100)
  • Level score (60%): where the current value sits relative to the metric’s own historical distribution. Metrics with ≥ 8 verified historical points use historical-percentile normalisation: a value at the 90th percentile of its own history scores ~90; the median scores ~50. The percentile is oriented so that a higher score always means more stress, whether the metric worsens as it rises or as it falls. Shorter series fall back to provisional best/worst bounds, flagged in the metadata as the weaker method.
  • Trend score (30%): direction and velocity of recent change. trend = clamp(50 + 500 × worse_frac, 0, 100) where worse_frac is the signed fractional change (positive = worsening). No change scores 50; a 10% deterioration scores 100 and a 10% improvement scores 0, so changes of ±10% span the full 0–100 range.
  • Volatility score (10%): standard deviation (sd) of period-over-period fractional changes, scaled as volatility = clamp(sd × 800, 0, 100). Metrics with fewer than 4 history points default to 50. This component is marked as provisional.
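The three components above can be combined as in the following sketch. Several details are assumptions rather than documented behaviour: the function names are hypothetical, the standard deviation is taken as the sample standard deviation (`statistics.stdev`), fractional changes are computed as (current − previous) / previous, and history values are assumed to be non-zero. The level score is taken as an input here.

```python
from statistics import stdev


def clamp(x: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Restrict x to the range [lo, hi]."""
    return max(lo, min(hi, x))


def trend_score(worse_frac: float) -> float:
    """worse_frac is the signed fractional change; positive means worsening."""
    return clamp(50 + 500 * worse_frac)


def volatility_score(history: list[float]) -> float:
    """Scaled sd of period-over-period fractional changes; 50 if history is short."""
    if len(history) < 4:
        return 50.0  # documented default for fewer than 4 history points
    # Assumption: change = (current - previous) / previous, with previous != 0.
    changes = [(b - a) / a for a, b in zip(history, history[1:])]
    # Assumption: sample standard deviation; the source does not say which form.
    return clamp(stdev(changes) * 800)


def metric_stress(level: float, trend: float, volatility: float) -> float:
    """Weighted composite using the 0.6 / 0.3 / 0.1 formula."""
    return clamp(0.6 * level + 0.3 * trend + 0.1 * volatility)


# Illustrative values only: level 70, a 4% deterioration, three history points.
example = metric_stress(70, trend_score(0.04), volatility_score([10, 11, 12]))
```

In this illustration the 4% deterioration gives a trend score of 70 and the short history gives the default volatility of 50, so the composite is 0.6 × 70 + 0.3 × 70 + 0.1 × 50 = 68.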

Trend scoring and direction

trendDirection is derived from the trend score: >55 = deteriorating; <45 = improving; else stable.
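A minimal sketch of this classification follows; the function name is hypothetical, and the returned strings are the three labels given above.

```python
def trend_direction(trend_score: float) -> str:
    """Classify a 0-100 trend score using the 55 / 45 cut-offs."""
    if trend_score > 55:
        return "deteriorating"
    if trend_score < 45:
        return "improving"
    return "stable"
```

Because trend = 50 + 500 × worse_frac, the 55 and 45 cut-offs correspond to a fractional change of more than 1% in either direction; smaller movements are classed as stable.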

The retrospective national trend line shown on the dashboard uses a level-only method (no trend or volatility component) applied consistently back to 2018. This ensures the historical trajectory uses a single method end-to-end. The current headline composite includes all three components and may differ from the final plotted point; this difference is disclosed beneath the chart.

Confidence scoring and thresholds

Each metric carries a confidenceScore (0–1) reflecting source quality, update frequency, and known weaknesses. The index dataConfidence is the mean of all metric confidence scores.

Threshold    Level
≥ 0.85       High
≥ 0.70       Medium
< 0.70       Low

Current index data confidence: 0.82. Every colour-coded confidence signal includes a text label; colour is never the sole indicator.
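The thresholds and the mean can be sketched as follows. The function names are hypothetical, and the metric values in the example are invented for illustration only.

```python
from statistics import mean


def confidence_level(score: float) -> str:
    """Map a 0-1 confidence score to its text label."""
    if score >= 0.85:
        return "High"
    if score >= 0.70:
        return "Medium"
    return "Low"


def index_data_confidence(metric_confidences: list[float]) -> float:
    """Index-level confidence: the mean of all metric confidence scores."""
    return mean(metric_confidences)


# Invented metric confidence scores, not real index data.
illustrative = index_data_confidence([0.9, 0.8, 0.7])
print(round(illustrative, 2), confidence_level(illustrative))
```

Under these thresholds the current index value of 0.82 falls in the Medium level.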

Weighting model

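The national score is a weighted sum of domain medians. Weights are expressed as shares summing to 100 and are stored in the source data as decimals (e.g. 0.15 = 15%). With weights in points, the sum is divided by 100:

national_score = Σ(weight_d × domain_median_d) / 100

With the stored decimal weights, which sum to 1, the division is not needed.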

Higher-weighted domains have a proportionally larger influence on the headline figure. The Stress Contribution chart on the dashboard visualises this: each bar equals score × weight.
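The sketch below walks through the calculation with three placeholder domains. The domain names, weights and metric scores are invented for illustration and are not the index's real values; the function names are likewise hypothetical. Weights are taken in points summing to 100.

```python
from statistics import median


def domain_score(metric_scores: list[float]) -> float:
    """A domain score is the median of its metric stress scores."""
    return median(metric_scores)


def national_score(weights: dict[str, float], medians: dict[str, float]) -> float:
    """Weighted sum of domain medians, with weights in points summing to 100."""
    if abs(sum(weights.values()) - 100) > 1e-9:
        raise ValueError("weights must sum to 100")
    return sum(weights[d] * medians[d] for d in weights) / 100


def stress_contributions(weights: dict[str, float], medians: dict[str, float]) -> dict[str, float]:
    """Per-domain contribution: score times weight, with the weight as a share."""
    return {d: weights[d] * medians[d] / 100 for d in weights}


# Placeholder data for illustration only.
weights = {"domain_a": 50, "domain_b": 30, "domain_c": 20}
metrics = {"domain_a": [40, 50, 70], "domain_b": [30, 40, 45], "domain_c": [50, 50]}
medians = {d: domain_score(scores) for d, scores in metrics.items()}
headline = national_score(weights, medians)
```

Here the domain medians are 50, 40 and 50, so the headline is 25 + 12 + 10 = 47, and the three contributions sum to the same figure.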

Systemic risk penalty

When three or more domains simultaneously score 75 or above (the Critical or Near Midnight band), a systemic penalty is added to the national score to reflect the increased probability that concurrent failures become self-reinforcing:

  • 3–4 domains in Critical or Near Midnight: +5 points
  • 5 or more domains in Critical or Near Midnight: +10 points

Penalties are not cumulative. The national score is capped at 100. The applied penalty is recorded as systemicPenaltyApplied (0, 5, or 10). Current release: +0 pts.
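The penalty rule can be sketched as below. The function names are hypothetical, and the sketch assumes that any domain scoring 75 or above counts towards the total, whether its band is Critical or Near Midnight.

```python
def systemic_penalty(domain_scores: list[float]) -> int:
    """Return 0, 5 or 10 depending on how many domains score 75 or above."""
    severe = sum(1 for score in domain_scores if score >= 75)
    if severe >= 5:
        return 10
    if severe >= 3:
        return 5
    return 0


def national_score_with_penalty(national_score: float, domain_scores: list[float]) -> float:
    """Add the single applicable penalty and cap the result at 100."""
    return min(100.0, national_score + systemic_penalty(domain_scores))
```

Only one tier applies at a time, which matches the statement that penalties are not cumulative: five severe domains add 10 points, not 15.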

Source hierarchy and tiers

All metrics must be sourced. Sources are classified into five tiers reflecting data quality and methodological rigour:

Official Statistics

Published by national statistical authorities (ONS, DWP, DfE, NHS England, etc.) under the Code of Practice for Statistics. Highest confidence.

Official (In Development)

Officially published but methodology or coverage is still maturing (e.g. some NHS Digital series). Medium confidence.

Administrative

Operational records collected for administrative rather than statistical purposes (e.g. Environment Agency event monitoring). Medium confidence — may reflect reporting changes.

Survey

Probability or omnibus surveys (e.g. Crime Survey for England and Wales, Opinions and Lifestyle Survey). Confidence depends on response rates and survey design.

Independent Body

Published by independent public bodies with statutory remits (OBR, CCC, Bank of England). High authority; methodology may differ from ONS conventions.

Full source list: data sources page.

Revision policy

All releases are versioned and preserved. When source data is revised (as is common with ONS administrative series), the metric value is updated but the original release reading is retained in the metric history, labelled with its release period.

Methodology changes are disclosed in section 10 of the relevant release’s Monthly Review. Significant methodology changes trigger a re-calibration notice and a comparison table showing how scores would have differed under the old method.

Political neutrality rules

The following ten rules govern the construction of every metric, domain, and narrative in this index. They are reproduced verbatim from the project contracts.

  1. The index does not assign blame to parties.
  2. The index tracks conditions and trends.
  3. Every metric must have a source.
  4. Every metric must have a stated weakness.
  5. Weightings are transparent.
  6. Revisions are preserved.
  7. Survey/perception data is separated from hard administrative data.
  8. Culture-war claims are excluded from the core score.
  9. Immigration is only measured through neutral capacity-pressure indicators, not moral framing.
  10. The project distinguishes between deterioration, low baseline, and poor data quality.

Approved vocabulary: under pressure, elevated stress, fragile, critical, deteriorating, improving, material movement, data confidence, requires monitoring, stable, resilient.

Banned: “collapse”, “collapsing”, “broken”, any UK political party name, any named politician, “the government has failed”. The word “crisis” is allowed only inside a registered source’s own dataset name.

Known weaknesses

The following limitations are documented honestly. They do not invalidate the index but should be borne in mind when interpreting any specific reading.

  • England-only bias: several public services and housing metrics use England-only administrative data, reflecting devolved structures. Scottish, Welsh and Northern Irish equivalents are registered for future addition.
  • Survey lag: welfare and perception indicators (HBAI, CSEW, community surveys) carry a 12–18 month lag. Current readings may understate or overstate current conditions.
  • Short-history metrics: 27 of 58 metrics currently use provisional best/worst bounds rather than percentile normalisation; the other 31 (those with eight or more verified annual points) are scored as a historical-percentile rank. Each metric carries a normalisation field ("percentile" or "bounds") for transparency. Bounds-based metrics migrate to percentile as their series lengthen. Each migration makes the scoring more accurate, because more metrics are scored against their own record, but it can shift the headline score slightly as data is added.
  • Band boundaries are thresholds on a continuous score, not sharp categories: a national or domain reading within a point or two of a threshold should be read as sitting between bands rather than as a sharp category change.
  • Percentile interpretation: a metric that has been chronically stressed will score around its own historical median (~50) even if the absolute level is high, because percentile is relative to its own track record. The trend component (30% weight) separately captures whether stress is worsening or improving.
  • Domain weights: the eleven domain weights are currently set by editorial judgement, not empirical optimisation, and the V2 additions (Population Health, Environment & Climate) make a formal weighting review a priority. A future version will derive weights from a systematic expert elicitation.
  • Volatility score: the volatility component is marked provisional. With fewer than four history points per metric it defaults to 50, which contributes a fixed 5 points (0.1 × 50) to that metric’s stress score regardless of how variable the metric actually is.
  • Metric independence: some metrics within a domain (e.g. housing affordability and social-housing waiting lists) are positively correlated; using the median partially mitigates double-counting but does not eliminate it.
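The two level-score normalisation methods discussed in the list above (historical percentile and provisional bounds) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the percentile is computed as a mid-rank against the metric's history, the bounds method is a linear interpolation between best and worst, and the function name and the `higher_is_worse` flag are hypothetical. The eight-point threshold and the "percentile" / "bounds" labels come from the text.

```python
def level_score(
    value: float,
    history: list[float],
    best: float,
    worst: float,
    higher_is_worse: bool = True,
) -> tuple[float, str]:
    """Return (level score 0-100, normalisation method used)."""
    if len(history) >= 8:
        # Assumption: mid-rank percentile, so ties count as half below.
        below = sum(1 for h in history if h < value)
        equal = sum(1 for h in history if h == value)
        rank = 100 * (below + 0.5 * equal) / len(history)
        # Invert for metrics where a lower value means more stress.
        return (rank if higher_is_worse else 100 - rank), "percentile"
    # Assumption: linear interpolation between provisional bounds;
    # best and worst already encode the direction of the metric.
    score = 100 * (value - best) / (worst - best)
    return max(0.0, min(100.0, score)), "bounds"
```

With a nine-point history of 1 to 9, a value of 5 sits at the median and scores 50. This also illustrates the percentile caveat above: the score depends only on the metric's own record, not on how high the absolute level is.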

Future improvements

The following improvements are planned. They are listed in rough priority order.

  • Migrate the remaining 27 metrics to historical-percentile normalisation as verified historical series are accumulated.
  • Regional and sub-national breakdowns for all eleven domains.
  • Live data ingestion pipeline with automated source refresh.
  • Formal domain-weight elicitation exercise.
  • Independent methodology review by a panel of statisticians.
  • Confidence intervals on domain and national scores, not just point estimates.
  • Machine-readable API for the full index and metric time series.

Further technical documentation is available in the repository’s methodology/ directory. Source ingestion specifications are in CONTRACTS.md.

Risk band reference

Score range    Band id          Label
0–24           stable           Stable
25–49          strained         Strained
50–74          fragile          Fragile
75–89          critical         Critical
90–100         near_midnight    Near Midnight
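A band lookup based on this table might look like the sketch below. The table lists whole-number ranges, so the handling of fractional scores is an assumption: each band's lower bound is treated as inclusive, which places a score such as 24.5 in the Stable band. The names `BANDS` and `risk_band` are hypothetical.

```python
# (inclusive lower bound, band id, label), ordered from highest to lowest.
BANDS = [
    (90, "near_midnight", "Near Midnight"),
    (75, "critical", "Critical"),
    (50, "fragile", "Fragile"),
    (25, "strained", "Strained"),
    (0, "stable", "Stable"),
]


def risk_band(score: float) -> tuple[str, str]:
    """Return (band id, label) for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, band_id, label in BANDS:
        if score >= lower_bound:
            return band_id, label
    raise AssertionError("unreachable: the lowest bound is 0")


print(risk_band(52))
```

The current national score of 52 falls in the Fragile band, matching the release line at the top of this page.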