Pre-registered evaluation protocol
Every model release is scored against a frozen v2 benchmark: held-out receptor families, leave-one-target-out splits, and signed eval cards. No metric is reported without the matching protocol hash.
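As an illustration of the rule that no metric is reported without its protocol hash, here is a minimal Python sketch. It assumes the frozen protocol can be represented as a JSON-serializable dictionary and hashed with SHA-256. The names protocol_hash, report_metric, and leave_one_target_out are hypothetical and are not DiscoveryLab's actual API.

```python
import hashlib
import json


def protocol_hash(protocol: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(protocol, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def report_metric(name: str, value: float, claimed_hash: str, frozen_protocol: dict) -> dict:
    """Refuse to emit a metric unless it carries the hash of the frozen protocol."""
    expected = protocol_hash(frozen_protocol)
    if claimed_hash != expected:
        raise ValueError(f"protocol hash mismatch for metric {name!r}")
    return {"metric": name, "value": value, "protocol_hash": expected}


def leave_one_target_out(records: list):
    """Yield (held_out_target, train, test); each target is held out exactly once.

    Assumes every record is a dict with a "target" key (hypothetical schema).
    """
    for target in sorted({r["target"] for r in records}):
        train = [r for r in records if r["target"] != target]
        test = [r for r in records if r["target"] == target]
        yield target, train, test
```

Hashing a canonical encoding rather than the raw file means that cosmetic changes such as key order or whitespace do not change the hash, while any change to a split or threshold does.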
DiscoveryLab is pre-launch. In place of awards, here is the engineering and scientific basis we are willing to defend in review: pre-registered evaluation, calibrated probabilities, typed mechanism verification, and reproducible audit artifacts.
Candidate scores and BioFoundation outputs stay rank-only until a frozen calibration artifact exists. The current Bio-JEPA heads and biological foundation-model scorecards are context for prioritization, not calibrated biological probabilities.
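The rank-only rule can be pictured as a gate on how scores are presented. The sketch below is illustrative only. It assumes a calibration artifact in the form of a histogram-binning table, which is one common calibration method and not necessarily the one used here. CalibrationArtifact and present_scores are hypothetical names.

```python
import bisect
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CalibrationArtifact:
    """Hypothetical frozen artifact: a histogram-binning table plus its hash."""
    artifact_hash: str
    bin_edges: tuple  # ascending score thresholds
    bin_probs: tuple  # one probability per bin, len(bin_edges) + 1 entries


def ranks(scores: Sequence[float]) -> list:
    """Rank 1 is the highest score; ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    out = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def present_scores(scores: Sequence[float],
                   calibration: Optional[CalibrationArtifact] = None) -> dict:
    """Expose ranks only, unless a frozen calibration artifact is supplied."""
    if calibration is None:
        return {"kind": "rank_only", "ranks": ranks(scores)}
    probs = [calibration.bin_probs[bisect.bisect_right(calibration.bin_edges, s)]
             for s in scores]
    return {"kind": "calibrated",
            "probabilities": probs,
            "calibration_hash": calibration.artifact_hash}
```

Without an artifact the raw scores never leave the function, so downstream text cannot mistake them for probabilities.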
Pathway hypotheses are encoded as graphs carrying typed intervention, blockade, safety, and conservation claims. An in-process verifier checks each graph before it is attached to a candidate.
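To make the typed-claim idea concrete, here is a small sketch of a purely structural verifier. It assumes a directed graph stored as a set of edges and checks only three things: each claim has a known type, both of its endpoints exist in the graph, and intervention and blockade claims follow a directed path. ClaimType, Claim, MechanismGraph, and verify are hypothetical names. The production verifier presumably checks much more.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimType(Enum):
    INTERVENTION = "intervention"
    BLOCKADE = "blockade"
    SAFETY = "safety"
    CONSERVATION = "conservation"


@dataclass(frozen=True)
class Claim:
    kind: ClaimType
    source: str
    target: str


@dataclass
class MechanismGraph:
    edges: set                      # set of (src, dst) node-name pairs
    claims: list = field(default_factory=list)

    def nodes(self) -> set:
        return {n for edge in self.edges for n in edge}

    def reachable(self, start: str, goal: str) -> bool:
        """Depth-first search along directed edges."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(dst for src, dst in self.edges if src == node)
        return False


def verify(graph: MechanismGraph) -> list:
    """Return a list of error strings; an empty list means the graph passed."""
    errors, nodes = [], graph.nodes()
    for claim in graph.claims:
        if not isinstance(claim.kind, ClaimType):
            errors.append(f"untyped claim: {claim!r}")
            continue
        missing = [n for n in (claim.source, claim.target) if n not in nodes]
        if missing:
            errors.append(f"{claim.kind.value}: unknown node(s) {missing}")
            continue
        path_required = claim.kind in (ClaimType.INTERVENTION, ClaimType.BLOCKADE)
        if path_required and not graph.reachable(claim.source, claim.target):
            errors.append(f"{claim.kind.value}: no path {claim.source} -> {claim.target}")
    return errors
```

Returning the full error list rather than stopping at the first failure lets a reviewer see every unsupported claim in one pass.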
Each mechanism report emits a Lean 4 module with theorem names and a checksum binding the artifact to the candidate. A CLI verifier (peptiter-lean-verify) replays it in CI.
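One way to bind a Lean artifact to a candidate is to hash the candidate identifier together with the theorem names and the module source. The sketch below assumes SHA-256 with length-prefixed fields. The actual receipt format and the behavior of peptiter-lean-verify are not described here, so receipt_checksum and receipt_matches are hypothetical helpers.

```python
import hashlib
from typing import Iterable


def receipt_checksum(candidate_id: str, lean_source: str,
                     theorem_names: Iterable[str]) -> str:
    """Checksum over candidate ID, sorted theorem names, and Lean source.

    Each field is length-prefixed so that moving bytes between fields
    (for example from the ID into the source) changes the digest.
    """
    digest = hashlib.sha256()
    fields = (candidate_id, "\n".join(sorted(theorem_names)), lean_source)
    for text in fields:
        data = text.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def receipt_matches(expected: str, candidate_id: str, lean_source: str,
                    theorem_names: Iterable[str]) -> bool:
    """True only if the artifact still belongs to this candidate, unmodified."""
    return receipt_checksum(candidate_id, lean_source, theorem_names) == expected
```

Because the candidate identifier is part of the digest, a valid module copied onto a different candidate fails the check even though its Lean source is unchanged.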
Designed sequences are screened against configured select-agent and toxin deny-lists before any wet-lab handoff. Deny-list hits are blocked, and the screen fails closed if its database cannot load. It is an operational, rank-only gate — not a comprehensive biorisk clearance.
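Fail-closed behavior is easy to get wrong, so a short sketch may help. It assumes the deny-list is a collection of sequence motifs and uses plain substring matching, which is a deliberate simplification because real screens typically rely on homology search. screen_sequence and the loader callable are hypothetical.

```python
from typing import Callable, Iterable, NamedTuple


class ScreenResult(NamedTuple):
    allowed: bool
    reason: str


def screen_sequence(sequence: str,
                    load_deny_list: Callable[[], Iterable[str]]) -> ScreenResult:
    """Block on any deny-list hit, and also block if the list cannot be loaded."""
    try:
        motifs = {m.strip().upper() for m in load_deny_list() if m.strip()}
    except Exception as exc:
        # Fail closed: an unavailable database is treated as a block, not a pass.
        return ScreenResult(False, f"deny-list unavailable: {exc}")
    if not motifs:
        # An empty list is indistinguishable from a broken load, so block too.
        return ScreenResult(False, "deny-list is empty")
    seq = sequence.strip().upper()
    hits = sorted(m for m in motifs if m in seq)
    if hits:
        return ScreenResult(False, "deny-list hit: " + ", ".join(hits))
    return ScreenResult(True, "no configured deny-list entry matched")
```

The pass message says only that no configured entry matched, which keeps the result from being read as a broader biorisk clearance.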
Auditable evidence boundaries: every score, structure, heuristic, and wet-lab readout is tagged with what it is allowed to support — and what language it explicitly blocks.
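An evidence boundary can be represented as a tag that travels with each piece of evidence. The following sketch assumes a simple phrase-matching check. EvidenceTag, may_cite, and blocked_language are hypothetical names, and the example phrases in the comments are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceTag:
    evidence_type: str           # e.g. "score", "structure", "heuristic", "wet_lab"
    may_support: frozenset       # claim kinds this evidence is allowed to back
    blocked_phrases: frozenset   # wording this evidence must never justify


def may_cite(claim_kind: str, tag: EvidenceTag) -> bool:
    """True if this evidence is allowed to support the given kind of claim."""
    return claim_kind in tag.may_support


def blocked_language(statement: str, tag: EvidenceTag) -> list:
    """Return the blocked phrases that appear in a draft statement."""
    lowered = statement.lower()
    return sorted(p for p in tag.blocked_phrases if p.lower() in lowered)
```

A report generator could call blocked_language on every sentence that cites a tagged item and reject the draft if the returned list is not empty.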
DiscoveryLab does not predict clinical efficacy, replace IND-enabling studies, or assert mechanism without graph-level verification and perturbation evidence. BioFoundation scorecards add biological context, not target engagement or safety evidence.
Each candidate ships a reproducible artifact bundle: protocol hash, calibration curve, mechanism graph, Lean receipt, and perturbation ledger. Reviewers can re-run the verifier offline.
FAIR artifact bundles per candidate: findable, accessible, interoperable, and reusable.
Deterministic seeds, pinned weights, signed eval manifests.
ALCOA audit trail: attributable, legible, contemporaneous, original, and accurate.
Lean source and CLI published so receipts can be replayed by third parties.
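For reviewers re-running checks offline, a bundle completeness test is a natural first step. This sketch assumes the five bundle components named above are stored under the hypothetical keys shown and that their values are JSON-serializable. It is not the actual bundle format.

```python
import hashlib
import json

# Hypothetical key names for the five components of a candidate bundle.
REQUIRED_COMPONENTS = (
    "protocol_hash",
    "calibration_curve",
    "mechanism_graph",
    "lean_receipt",
    "perturbation_ledger",
)


def missing_components(bundle: dict) -> list:
    """List required components that are absent or empty."""
    return [key for key in REQUIRED_COMPONENTS if not bundle.get(key)]


def bundle_digest(bundle: dict) -> str:
    """Digest of a complete bundle; incomplete bundles are rejected."""
    missing = missing_components(bundle)
    if missing:
        raise ValueError("incomplete bundle, missing: " + ", ".join(missing))
    core = {key: bundle[key] for key in REQUIRED_COMPONENTS}
    canonical = json.dumps(core, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two parties who hold the same bundle can compare digests to confirm they are reviewing identical artifacts before replaying the verifier.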