The integration spine is in place. Each future change publishes a regenerated EVAL.md so we can tell which moves help and which add complexity without a measurable win. None of these are speculative product features — they are concrete extensions of the typed schema, the semiring layer, and the Lean theory.
Shipped
Per-path attribution
emit_biology_v2.py enumerates every signed mechanism path through biology_v2 (23 paths across 14 peptide × endpoint pairs). The Swift LeanExternalVerifier surfaces per-path verdicts so opposing edges from the same peptide no longer cancel into a silent zero — both the protective and risk paths reach the reviewer.
Shipped
BiologyV2.lean signed-sum theorems
BiologyV2.lean encodes 14 signed-sum theorems and 28 path-count theorems decided by native_decide; lake build is green. auditIdentity.artifactChecksum binds to TensorLang's biology_v2 semantic hash so drift between the two backends is an audit failure.
Shipped
Weighted PathwayLean.Core predicates
PathwayLean.Core gained SignedPathRecord plus signedSum / riskPathCount / protectivePathCount / netActivator / netProtective — generic over Peptide and Endpoint inductive types. BiologyV3.lean ships 30 qualitative theorems (net_activator / net_protective / no_net_effect, one per non-zero peptide × endpoint cell) decided by native_decide.
Shipped
Reactome / Open Targets transpiler
reactome_loader + opentargets_loader + transpile_biology compose public mechanism data with the Peptiter-specific BIOLOGY_OVERLAY (cell-state and endpoint annotations) into a SignedV2Spec. biology_v3 (10 peptides, 7 targets, 9 pathways via the IL-23/IL-17/JAK-STAT fixture) and biology_v3_signed (47 enumerated signed paths) are emitted via the same pipeline; tl verify + lake build green. Real Reactome download bootstrap (download_reactome.py + build_biology_v4.py) ships gitignored under transpiler/real/ — pyarrow-gated Open Targets Parquet adapter for the next data scale-up.
Shipped
Closed-loop active learning seeded from biology_v3
V3LabLoopOrchestrator composes ActiveLearningEngine + AssayPlanDrafter + LabJobQueue: pick highest-uncertainty candidate via Beta(α, β) priors, draft an AssayPlan (vehicle / dose schedule / plate map / risk screen all gated), route through the approval gate, walk the LabJob lifecycle drafted → approved → queued → running → returned → analyzed, attach the simulated outcome, update the belief. SCIENTIST.md §11.6 contract demonstrated end-to-end.
Shipped
Probabilistic third tier (log-sum-exp)
A log-sum-exp semiring over the committed biology_v2 schema — same domains, same einsum spec. semiring_spine.py reproduces the hard tiers exactly (14/14 decisive claims) and adds calibrated P(net-protective) + evidence strength. Now Lean-verified to v2 parity: biology_prob_v1.tl (tl check green) + BiologyProbV1.lean (native_decide, lake build green) re-derive P = protective/(protective+risk) exactly — efficacy = 100%, the broad-JAK infection catch = 0%, MAP equals the signed verdict — all three artifacts bound to one semantic hash (3d7281c9…).
Long-term
Tiny neural backbone emitting BiologySpec
Today emit_biology.py is hand-curated. A small transformer that consumes (peptide, context) and emits a BiologySpec JSON validates the boundary in INTERFACE.md before any large pretraining run.
Shipped
Multi-step temporal dynamics
Beyond static reachability to perturbation-conditioned trajectories. The v2 signed mechanism is a per-step influence operator; a discrete Euler update state_t + intervention → state_t+1 over scaled-integer cell-state is now Lean-verified: BiologyTemporalV1.lean (native_decide, lake build green, hash 486594b9…) proves a guselkumab_like intervention is monotone-down on treatment, halved at trough, bounded, and rebounds after washout — plus a discriminator theorem that an adverse Th17-promoting driver is NOT suppressive.
Shipped
Real per-edge confidences (OpenTargets), Lean-verified
The probabilistic tier's uniform prior is replaced with live OpenTargets Platform target↔disease association scores. A noisy-OR over each path's MoA × association gives calibrated confidence varying 0.00–0.83 where the hard tier flattens to ~1. Validated: secukinumab → IBD scores only 0.23 (IL17A–IBD 0.13) — IL-17 blockade is genuinely weak/harmful in IBD — correctly down-weighted. Lean-verified and tightened: BiologyConfV1.lean encodes each score as a ±5% permille interval and native_decide pins an EXACT confidence band [lo,hi] for all 10 claims (width ≤120), proves the robust strongest/weakest, and — honestly — proves the overlapping middle is NOT strictly orderable (no false precision). hash da856445….
Shipped
Claim-specific evidence bands (per-datatype)
BiologyConfV2.lean tightens the uniform band into a claim-specific one from OpenTargets evidence breadth — band = clamp(80−20·n_datatypes,15,80) permille, so well-corroborated associations earn narrower bands. native_decide proves 2 robust orderings the uniform band could not: e.g. guselkumab→IBD [615,653] now sits above guselkumab→psoriasis [511,587] (same drug; IL23R↔IBD has 3 corroborating datatypes vs 2). And it proves the genuinely-similar IBD mid-cluster still overlaps — no false precision. hash 116deb02….
Shipped
Swift Semiring engine + catalog
The explicit semiring abstraction is lifted into the production engine: SemiringSpine.swift defines a Semiring protocol (zero/one/⊕/⊗/edge) with Bool / signed / log-sum-exp instances and one generic contraction that runs the same mechanism chain under any tier. All three new tiers are registered in the MCP catalog; SemiringSpineTests checks the engine and that each tier's manifest hash binds its Lean auditIdentity.
Continuous
Ground-truth coverage
peptide_ground_truth.json now covers 10 candidates (the original 5 plus risankizumab, ixekizumab, adalimumab, baricitinib, upadacitinib) with FDA-label provenance. Every release ships an EVAL.md diff via eval_harness --diff-against-base, plus an SVG badge consumable from any README or dashboard.