Methodology · Score components
The four-bar component breakdown. Why each weight is what it is.
Every Cipherwake report shows the DBR score as a sum of four weighted components. This page is a focused companion to the core DBR methodology — it documents only the visualizer, the weights, and how to read each component in isolation.
What the visualizer shows
Four horizontal bars on every report, each scored 0-10, each labeled with its weight. The total DBR score is:
DBR = (keyExchange × 0.50) + (certLifetime × 0.10) + (keyPersistence × 0.20) + (subdomainScale × 0.20)
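The formula above is a plain weighted sum, so it is easy to reproduce by hand. The sketch below is illustrative only: the `Components` type, the `WEIGHTS` table, and the `dbrScore` function are hypothetical names, not Cipherwake's code. Rounding to one decimal place is an assumption based on the one-decimal scores quoted in the changelog, and the raw values in the example are made up.

```typescript
// Hypothetical names throughout; only the four weights come from the formula above.
type Components = {
  keyExchange: number;
  certLifetime: number;
  keyPersistence: number;
  subdomainScale: number;
};

const WEIGHTS: Components = {
  keyExchange: 0.5,
  certLifetime: 0.1,
  keyPersistence: 0.2,
  subdomainScale: 0.2,
};

// Weighted sum of the four raw component scores (each 0-10).
// Rounding to one decimal is an assumption, not documented behavior.
function dbrScore(c: Components): number {
  const total =
    c.keyExchange * WEIGHTS.keyExchange +
    c.certLifetime * WEIGHTS.certLifetime +
    c.keyPersistence * WEIGHTS.keyPersistence +
    c.subdomainScale * WEIGHTS.subdomainScale;
  return Math.round(total * 10) / 10;
}

// Made-up raw values, for illustration only.
const example: Components = {
  keyExchange: 7,
  certLifetime: 4,
  keyPersistence: 6,
  subdomainScale: 5,
};
console.log(dbrScore(example)); // 6.1
console.log(dbrScore({ ...example, keyExchange: 5 })); // 5.1
```

Because keyExchange carries half the weight, moving it from raw=7 to raw=5 with the other three bars unchanged lowers the total by exactly 1.0.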
The four components
| Component | Weight | What low (0-3) means | What high (7-10) means |
| --- | --- | --- | --- |
| keyExchange | 50% | Hybrid post-quantum TLS (e.g., X25519MLKEM768) — handshake is quantum-resistant | RSA-only or RSA-fallback accepted — handshake decryptable post-CRQC |
| certLifetime | 10% | Aggressive rotation (≤30 days, e.g. ACM 14-day) — minimal classical-compromise window | Multi-year cert lifetime — long classical blast window |
| keyPersistence | 20% | Fresh keypair per cert renewal (CT-log verified) | Same public key reused across many years of cert rotations — single compromise unlocks years of harvested traffic |
| subdomainScale | 20% | Narrow hostname coverage, no wildcard | Wildcard SAN across many active subdomains, all sharing one key |
Why these weights
The weights reflect what each component actually measures under the harvest-now-decrypt-later (HNDL) threat model:
- keyExchange (50%): the gating factor and dominant mechanism. The literal HNDL question is "will harvested handshake material be decryptable when a cryptographically relevant quantum computer (CRQC) arrives?", and the key-exchange algorithm is the answer. Hybrid PQC means no; RSA/ECDHE means yes. Everything else is secondary. Note: for multi-CDN domains whose edges may have different cipher posture, the keyExchange score uses 7-day cross-scan hysteresis, so the worst-case observation persists. See the v1.1.4 changelog entry.
- keyPersistence (20%): measured directly from CT-log mining. Did this domain rotate to a fresh keypair on each cert renewal, or did the same key span multiple cert lifetimes? A reused key turns one classical compromise into multi-year retroactive exposure. We are not aware of any other ASM/TLS scanner that exposes cross-rotation reuse detection.
- subdomainScale (20%): the breadth multiplier. A wildcard cert shared across 47 subdomains turns one key compromise into exposure of the traffic to all 47.
- certLifetime (10%): kept as a lighter signal. It's a useful proxy for crypto-agility and operational maturity (a domain auto-rotating every 14 days is a domain that can adopt PQC quickly). But it's not a dominant HNDL mechanism on its own; with TLS 1.3 + forward secrecy, cert key compromise doesn't decrypt past traffic. Weighted accordingly.
HSTS, headers, email security, and other ASM signals are reported as findings on every report but do not contribute to the DBR score. They measure different threat models (downgrade defense, email spoofing, XSS) and conflating them with HNDL would create perverse incentives.
Methodology changelog
v1.1.4 — 2026-05-23 (cross-scan worst-case cipher-class hysteresis, scoring topology unchanged)
- Why we added this. The cipher-class probe sometimes flaps on multi-CDN domains: stripe.com / github.com / cloudflare.com and similar route different scan requests to different edges, and the edges may have different cipher posture. One scan hits a TLS-1.2-enabled edge → RSA-fallback detected → keyExchange raw=7 → score 6.2. The next scan hits a TLS-1.3-only edge → RSA never offered → ephemeral-only → raw=5 → score 5.2. Result: the same site oscillates between two scores day-to-day with no underlying posture change. A 14-day analysis of our audit corpus pre-fix showed stripe.com with stddev=0.481 across an exact 1.0 range — the alternating signature of bin-edge flapping.
- How the hysteresis works. When a scan finishes, before the score is computed, Cipherwake reads the last 7 days of score_history.key_exchange_score for that domain. If any prior scan in the window observed a worse cipher-class posture than this scan, the score is computed against the worst-seen classification. So a domain whose Tuesday scan saw rsa-fallback (raw=7) and whose Wednesday scan saw ephemeral-only (raw=5) is scored Wednesday at raw=7: the worst observation persists for 7 days.
- The rationale shown to customers. When the hysteresis fires, the keyExchange rationale field is prefixed with HYSTERESIS and explicitly names which prior scan triggered the override and when. Example: "HYSTERESIS — current scan saw ephemeral-only, but a scan within the last 7 days detected RSA-fallback support (at 2026-05-15T04:01:51Z). Multi-CDN domains can show different cipher posture per edge; Cipherwake commits to the worst-case observation across a 7-day rolling window." Customers never see a silent override: they are told why their score didn't improve immediately.
- Why "worst-case sticky" is the right call. Security improvements should be confirmed over time before scoring is loosened (consistent with the SSL Labs / Mozilla Observatory convention). Regressions, by contrast, should land immediately. A domain that genuinely upgrades from RSA-fallback to ephemeral-only takes up to 7 days for the score improvement to land — fair price for not punishing customers when their own infra flaps.
- What this does NOT change: the scoring formula, weights, bin thresholds, or any other component. cipherClassification is still produced by the probe per-scan; the hysteresis only OVERRIDES the value passed into computeBlastRadius when prior observations warrant. The probe itself, the moat records (observation_history), and the publicSurface.cipherClassification shown to API consumers all reflect what the probe ACTUALLY saw; no truth gets rewritten.
- Fail-open behavior. If the Supabase read fails (network hiccup, table unreachable), the hysteresis is skipped and the current scan's classification is used unchanged. A moat read can never block scoring.
- Operational read. The 7-day window matches the cert-rotation cadence of most Let's Encrypt-style automation. It's long enough to catch CDN-rotation flapping (which usually happens on the hour-to-day timescale) but short enough that a real configuration cleanup propagates within a week.
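The worst-case rule and its fail-open path can be sketched as follows. This is a minimal illustration under stated assumptions, not the production code: `KeyExchangeObservation`, `HysteresisResult`, `applyHysteresis`, and the synchronous `loadHistory` callback are hypothetical stand-ins for the read of score_history.key_exchange_score, and raw scores are compared directly (higher is worse) instead of cipher-class names.

```typescript
// Hypothetical shapes; the real rows live in score_history.key_exchange_score.
interface KeyExchangeObservation {
  scannedAt: Date;
  rawScore: number; // higher is worse, e.g. 5 = ephemeral-only, 7 = rsa-fallback
}

interface HysteresisResult {
  effectiveRaw: number;
  overriddenBy: KeyExchangeObservation | null; // null when no override fired
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7-day rolling window

function applyHysteresis(
  current: KeyExchangeObservation,
  loadHistory: () => KeyExchangeObservation[],
): HysteresisResult {
  let history: KeyExchangeObservation[];
  try {
    history = loadHistory();
  } catch {
    // Fail open: a failed history read must never block scoring.
    return { effectiveRaw: current.rawScore, overriddenBy: null };
  }

  const now = current.scannedAt.getTime();
  let worst: KeyExchangeObservation | null = null;
  for (const obs of history) {
    const t = obs.scannedAt.getTime();
    const inWindow = t >= now - WINDOW_MS && t <= now;
    // Only prior observations that are worse than this scan can override it.
    if (!inWindow || obs.rawScore <= current.rawScore) continue;
    if (worst === null || obs.rawScore > worst.rawScore) worst = obs;
  }

  return worst === null
    ? { effectiveRaw: current.rawScore, overriddenBy: null }
    : { effectiveRaw: worst.rawScore, overriddenBy: worst };
}

// Tuesday saw rsa-fallback (raw=7); Wednesday saw ephemeral-only (raw=5).
const tuesday = { scannedAt: new Date("2026-05-19T04:00:00Z"), rawScore: 7 };
const wednesday = { scannedAt: new Date("2026-05-20T04:00:00Z"), rawScore: 5 };
console.log(applyHysteresis(wednesday, () => [tuesday]).effectiveRaw); // 7
```

Returning the overriding observation alongside the effective score is one way to keep enough information to name the triggering scan and its timestamp in the rationale text.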
v1.1.3 — 2026-05-22 (recency-weighted median smoothing on subdomainScale, scoring topology unchanged)
- Why we smooth subdomainScale. Certificate Transparency mirrors and indexes update asynchronously. Consecutive crt.sh queries against the same domain can return slightly different result sets — usually within ±10% but enough to flip a count across the 20-subdomain or 100-subdomain bin edges, which would otherwise produce day-to-day score drift on stable, unchanged domains. A 14-day analysis of our audit corpus pre-fix showed 8 of 19 flagship domains (stripe.com, github.com, cloudflare.com, vercel.com, supabase.com, netlify.com, notion.so, nytimes.com) drifting with stddev ≥ 0.3 on a 0-10 score — drift the customer would see as "the scanner is broken." We now smooth this out at the scoring layer.
- How we smooth. Each successful crt.sh probe appends its totalSubdomains count to a 7-entry rolling history stored in ct_log_cache.result.recentCounts. The subdomainScale component scores against a recency-weighted median of that history, with weights oldest→newest of [1, 1, 1, 2, 2, 3, 3], so the most recent scan has 3× the influence of the oldest, but no single scan ever fully controls the score. We reviewed three alternatives before picking weighted median:
- Max-of-recent (initially shipped 2026-05-22 morning, reverted same day): eliminated the drift bug, but the score could only ratchet upward for as long as a high count stayed in the window, which is unfair to customers who legitimately decommissioned subdomains.
- Plain median: robust against noise but too laggy on genuine surface shrinkage (a real 120→35 subdomain drop wouldn't fully reflect for 4+ scans).
- Weighted average: a single outlier scan (e.g., a one-off 150-subdomain probe alongside consistent 22-count scans) could pull the score noticeably, defeating the noise-smoothing intent.
Weighted median strikes the balance: less noisy than raw count, less sticky than max-of-recent, more responsive than plain median.
- What this means for customers. If you scan the same domain twice in a week and your infrastructure hasn't changed, the score stays the same. If you genuinely shrink your subdomain footprint, the score reflects the change within 2-3 successful scans. If you genuinely grow it, same. A single anomalous CT-log result doesn't move the score. When the smoothed count differs from this scan's raw observation (the typical noise case), the rationale shows both numbers — e.g., "42 subdomains; latest CT observation was 39 — moderate surface." Cipherwake does not invent subdomain counts; the scoring input is a weighted median of recent successful CT observations, and the latest individual observation is always shown alongside it for transparency.
- Trust framing (R74-confirm 2026-05-22, GPT review). The earlier rationale wording "smoothed from N this-scan" was retired the same day it launched because it could be misread as "Cipherwake changed your number." The current phrasing distinguishes scoring input (a stabilized series, used for the grade) from this-scan CT observation (a raw upstream signal). Both appear in the customer report when they differ. The grade does not move on a single noisy CT-log response, but persistent infrastructure changes, including genuine subdomain reductions, do affect the score within 2-3 successful scans.
- What this does NOT change: scoring weights, the four scoring formulas, the grade thresholds, or any other component. subdomainScale still uses bin thresholds at >20 and >100; only the input to those thresholds changed from "this scan's count" to "weighted median over recent successful counts."
- Adversarial review: R74 (2026-05-22). The recommendation came from GPT. Its reasoning: avoid weighted average (single-outlier sensitivity), avoid plain median (lag on genuine shrinkage), avoid max (ratchet-up unfairness). The implementation matches its suggested weight schedule.
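The recency-weighted median described in this entry can be sketched as below. Only the weight schedule and the 7-entry history come from the changelog; the function names are hypothetical, and two details are assumptions: a history shorter than 7 entries uses the newest-aligned tail of the weights, and the median is taken as the first sorted value whose cumulative weight reaches half of the total.

```typescript
// Weight schedule from the changelog, oldest -> newest.
const RECENCY_WEIGHTS = [1, 1, 1, 2, 2, 3, 3];

// Keep at most the 7 most recent successful counts (oldest first).
function appendCount(recentCounts: number[], latest: number): number[] {
  return [...recentCounts, latest].slice(-RECENCY_WEIGHTS.length);
}

// Weighted median: sort by value, then walk the sorted list until the
// cumulative weight reaches half of the total weight.
function weightedMedian(recentCounts: number[]): number {
  if (recentCounts.length === 0) {
    throw new Error("need at least one successful CT observation");
  }
  const counts = recentCounts.slice(-RECENCY_WEIGHTS.length);
  // Assumption: short histories use the newest-aligned tail of the weights.
  const weights = RECENCY_WEIGHTS.slice(-counts.length);
  const pairs = counts
    .map((count, i) => ({ count, weight: weights[i] ?? 1 })) // fallback is never hit
    .sort((a, b) => a.count - b.count);
  const half = weights.reduce((sum, w) => sum + w, 0) / 2;

  let cumulative = 0;
  let median = Number.NaN;
  for (const p of pairs) {
    median = p.count;
    cumulative += p.weight;
    if (cumulative >= half) break;
  }
  return median;
}

// One outlier among stable counts does not move the scoring input.
console.log(weightedMedian([22, 22, 22, 22, 22, 22, 150])); // 22

// A genuine 120 -> 35 shrink lands on the third successful scan.
let counts = [120, 120, 120, 120, 120, 120, 120];
for (let scan = 1; scan <= 3; scan++) {
  counts = appendCount(counts, 35);
  console.log(scan, weightedMedian(counts)); // 1 120, then 2 120, then 3 35
}
```

With a full history the total weight is 13, so the newest three entries (3 + 3 + 2 = 8) outweigh the older four. That is why a real change takes over on the third scan, while a plain median would need a fourth.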
v1.1.2 — 2026-05-21 (data-source consolidation, scoring algorithm unchanged)
- CT-log enumeration is now crt.sh-only. Earlier versions raced crt.sh against a CertSpotter backup to cover the ~5% of cold-scan cases where crt.sh internally times out on huge result sets. As of 2026-05-21 we've disabled the CertSpotter integration. Rationale: the stale-cache fallback shipped in v1.1.1 already covers transient crt.sh failures for any previously-scanned domain (14-day window), which absorbs most of CertSpotter's prior value. Disabling the integration removes one third-party data-provider dependency and one billing relationship. The integration code remains in the codebase (lib/ctLogs.ts fetchFromCertSpotter) and can be reactivated by setting CERTSPOTTER_API_KEY in the Vercel env; no code change is required if we ever want it back. Per Rule 1 transparency: Cipherwake is the only ASM/TLS scanner we know of that publicly documents its CT-log source. Most competitors treat data provenance as opaque.
- What this means for your scan accuracy: ~95% of scans see no change, because crt.sh handles them fine and always did. The ~5% that previously relied on CertSpotter as a backup now fall through to the stale-cache (if available, ≤14d old) or to the neutral baseline (totalSubdomains=1) marked as degraded in the report. This affects first-time scans of very-large-result-set domains (think enterprise giants with thousands of certs) where crt.sh's internal query times out.
v1.1.1 — 2026-05-20 (data-path hardening, scoring algorithm unchanged)
- Stale-cache fallback in subdomainScale when CT-log probes fail. Pre-fix: when the CT-log probe returned empty for a domain (crt.sh internal timeout on huge result sets, transient upstream failure, or genuinely low-volume domains), subdomainScale collapsed to neutral (totalSubdomains=1) and the score dropped by ~1 point for that scan. Next scan, probe succeeded, score recovered. Customer-visible: flagship domains like stripe.com / github.com / cloudflare.com / google.com oscillated between consecutive daily grades (C ↔ D) without their actual posture changing, and customers monitoring them got daily false score_drop → score_recover alerts. Fix: when the upstream fails, reach for a stale-but-recent (≤14 days) cached count from a previous successful probe before falling back to neutral. For any domain with a recent cached count, the score no longer moves on a transient CT-log probe failure.
- How to spot the path in your scan output: a subdomainScale rationale of "Subdomain enumeration unavailable for this scan — surface size unknown" means the probe failed AND no stale cache was available. A rationale of "X subdomains observed in CT logs" with no "stale" qualifier means real-time probe data was used. Stale-cache reuse is transparent to the score itself but is recorded in scan_cache._meta.ctSource = "cache" if you parse JSON output.
- What this does NOT change: the four scoring weights, the four scoring formulas, the grade thresholds. Pure data-availability fix.
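The fallback order described in this entry (live probe, then a cached count no older than 14 days, then the neutral baseline) can be sketched like this. It is an illustration under assumptions, not the shipped code: the types and `resolveSubdomainCount` are hypothetical, the `"live"` and `"neutral"` labels are invented for the example (only `"cache"` appears in the changelog), and the v1.1.3 smoothing that later sits on top of this input is left out.

```typescript
// Hypothetical types. From the changelog: the 14-day limit, the neutral
// baseline of totalSubdomains=1, and the "cache" source label.
interface CachedCount {
  totalSubdomains: number;
  fetchedAt: Date; // time of the last successful probe
}

interface SubdomainInput {
  totalSubdomains: number;
  ctSource: "live" | "cache" | "neutral"; // "live" and "neutral" are illustrative
  degraded: boolean;
}

const MAX_STALE_MS = 14 * 24 * 60 * 60 * 1000;

function resolveSubdomainCount(
  liveCount: number | null, // null when the CT-log probe failed or returned empty
  cached: CachedCount | null,
  now: Date,
): SubdomainInput {
  if (liveCount !== null) {
    return { totalSubdomains: liveCount, ctSource: "live", degraded: false };
  }
  if (cached !== null && now.getTime() - cached.fetchedAt.getTime() <= MAX_STALE_MS) {
    return { totalSubdomains: cached.totalSubdomains, ctSource: "cache", degraded: false };
  }
  // No live data and no usable cache: neutral baseline, flagged as degraded.
  return { totalSubdomains: 1, ctSource: "neutral", degraded: true };
}

// Probe failed today, but a successful probe from three days ago is cached.
const scanTime = new Date("2026-05-20T00:00:00Z");
const threeDaysOld = { totalSubdomains: 42, fetchedAt: new Date("2026-05-17T00:00:00Z") };
console.log(resolveSubdomainCount(null, threeDaysOld, scanTime));
```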
v1.1 — 2026-05-09
- Removed HSTS from score. v1.0 penalized HSTS in keyPersistence on the rationale that "HSTS pinning amplifies key-compromise window." This was wrong: HSTS is a downgrade-defense mechanism, not a key-persistence signal. It does not pin specific keys, does not affect harvested-traffic decryptability, and is generally a security positive. Including it created an incentive to weaken downgrade protection in pursuit of a better grade, which is exactly the failure mode this scanner exists to surface in others. HSTS is now a finding (in httpHeaders) but not a score input.
- Redefined keyPersistence to measure actual key reuse. Now scored from CT-log data: distinct public keys observed across cert rotations, longest-reuse window in years. Fresh-key-per-renewal scores 1; multi-year reuse scores 6-8.
- Reweighted to put keyExchange first. 35% → 50%. The literal HNDL mechanism deserves dominance.
- Reduced certLifetime weight. 25% → 10%. Useful agility signal but not a primary HNDL mechanism.
- Tightened certLifetime granularity at the short end. 14-day cert was raw=4 (under-rewarded); now raw=2.
- Removed the universal cross-component PQC discount (was 0.7×). v1.0 applied a discount across non-keyExchange components when hybrid PQC was detected. v1.1 removes this: each component now honestly measures what it claims, and PQC's effect is fully credited inside keyExchange (raw=1 for hybrid PQC). Other components (cert lifetime, key reuse, subdomain breadth) measure non-HNDL risks PQC does not mitigate, so they shouldn't be artificially discounted.
v1.0 — 2026-05-06 (original)
Weights: keyExchange 35%, certLifetime 25%, keyPersistence 15%, subdomainScale 25%. PQC discount: 0.7× multiplier on cert/keyPersistence/subdomainScale when hybrid PQC detected. keyPersistence measured HSTS rather than CT-log key reuse. Methodology bug: HSTS-as-key-persistence created incentive to weaken downgrade defense. Replaced by v1.1.
Why no PQC discount in v1.1?
v1.0 applied a 0.7× multiplier to all non-keyExchange components when hybrid PQC TLS was detected, on the rationale that "PQC mitigates direct HNDL risk so secondary components should drop." On reflection, this is methodologically muddy:
- Cert lifetime measures classical-compromise blast window, not HNDL. PQC doesn't reduce this risk at all — a stolen RSA-2048 cert key in 2026 is just as dangerous regardless of whether the handshake was PQC.
- Key reuse history measures whether a single private key is sprawled across multi-year cert lifetimes. This is a real, directly-observable signal whose weight should not depend on PQC presence.
- Subdomain scale measures breadth multiplier. PQC doesn't make a wildcard cert any narrower in scope.
v1.1 makes the design honest: PQC's effect is fully captured inside keyExchange (raw=1 for hybrid PQC, vs. raw=5 for ECDHE-only and raw=7-9 for RSA fallback / RSA-only). Combined with the 50% weight on keyExchange, that already gives PQC adopters a strong, defensible advantage — without artificially discounting unrelated components.
Score range — why the floor is ~1.5, not 0
The lowest achievable score in 2026 is approximately 1.5/10 (Grade A), even for a domain doing everything right: hybrid PQC TLS, ≤30-day cert rotation with a fresh key per renewal, and a narrow non-wildcard public surface. The floor exists because subdomainScale bottoms out at raw=3 ("narrow public surface") — any reachable HTTPS domain has at least some blast radius, and the score honestly reflects that.
A literal 0 would require technology that is not yet deployed anywhere on the public internet: pure ML-KEM TLS without classical hybrid, ML-DSA (FIPS 204) certificate signatures throughout the chain, and zero public-facing surface. Hitting 1.5 today means you've done everything currently possible. The score is calibrated against present-day cryptographic reality rather than a theoretical future, and the floor can move down as deployment of pure-PQC primitives becomes practical.
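The ~1.5 floor can be checked against the v1.1 weights using raw values stated elsewhere on this page: keyExchange raw=1 for hybrid PQC, keyPersistence 1 for a fresh key per renewal, and subdomainScale raw=3 for a narrow surface. For certLifetime this worked example assumes raw=2, the value the v1.1 changelog gives for a 14-day cert. The page does not state the raw value for every lifetime up to 30 days.

```latex
\mathrm{DBR}_{\text{floor}}
  = (1 \times 0.50) + (2 \times 0.10) + (1 \times 0.20) + (3 \times 0.20)
  = 0.5 + 0.2 + 0.2 + 0.6
  = 1.5
```

The largest single term is subdomainScale at 0.6, which is why the floor is tied to that component bottoming out at raw=3.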
Coverage gaps (roadmap)
Things this score does NOT yet capture, queued for future versions:
- Session ticket TTL — long-lived TLS resumption tickets enable replay/persistence risk. Phase B detection in roadmap.
- 0-RTT (early data) — TLS 1.3 0-RTT has special replay semantics. Detection in roadmap.
- Cert chain weakest link — currently surfaced as a finding, not in the score. Future v1.2 may promote this to a score input.
How to read the bars
- One dominant bar. When a single component drives most of the score, the remediation path is obvious. A failing keyExchange bar means "disable RSA fallback"; a failing keyPersistence bar means "rotate the keypair."
- Two roughly-equal contributors. Suggests a structural issue (e.g. wildcard cert + long key reuse = systemic key-management practice).
- Four balanced bars. Rare. Indicates a generally weak crypto posture across the board, not a single fixable misconfiguration.
What this view does NOT claim
- It does not say which fix is cheapest. A 50% weight on keyExchange means it dominates the score, not that fixing keyExchange is the cheapest path to a better grade.
- It does not capture internal exposure. Public-surface only. Internal Blast Radius is empirically 12-40× the public score. See DBR: what the score does not measure.
- It does not cover post-handshake risks. Session-ticket TTL and 0-RTT (early data) are not scored in v1.x; both are roadmap items.
Try it
- Web: every report shows the four-bar visualizer.
- API: /api/scan?domain=... returns a components object with the four values.
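A minimal sketch of consuming the API response follows. Everything beyond the `/api/scan?domain=` path and the existence of a `components` object is an assumption: the host is left as a caller-supplied `origin`, the key names are assumed to match the four component names used on this page, and `buildScanUrl` and `readComponents` are hypothetical helpers.

```typescript
// Assumption: key names mirror the four component names used on this page.
interface ScanComponents {
  keyExchange: number;
  certLifetime: number;
  keyPersistence: number;
  subdomainScale: number;
}

// Build the request URL. `origin` is whatever host serves the API (not specified here).
function buildScanUrl(origin: string, domain: string): string {
  return `${origin}/api/scan?domain=${encodeURIComponent(domain)}`;
}

// Pull the components object out of a parsed JSON body, validating each field.
function readComponents(body: unknown): ScanComponents {
  const components = (body as { components?: Record<string, unknown> } | null)?.components;
  if (components === undefined || components === null) {
    throw new Error("response has no components object");
  }
  const pick = (key: string): number => {
    const value = components[key];
    if (typeof value !== "number") {
      throw new Error(`components.${key} is missing or not a number`);
    }
    return value;
  };
  return {
    keyExchange: pick("keyExchange"),
    certLifetime: pick("certLifetime"),
    keyPersistence: pick("keyPersistence"),
    subdomainScale: pick("subdomainScale"),
  };
}

// Made-up response body, for illustration only.
const sampleBody: unknown = JSON.parse(
  '{"components":{"keyExchange":7,"certLifetime":4,"keyPersistence":6,"subdomainScale":5}}',
);
console.log(readComponents(sampleBody));
```

Validating each field before use keeps a changed or partial response from silently producing a wrong total when the four values are re-weighted client-side.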