Credibility · DiscoveryLab

Technical credibility

Methodology, validation, and the boundary of what DiscoveryLab is allowed to claim.

DiscoveryLab is pre-launch. In place of awards, here is the engineering and scientific basis we are willing to defend in review: pre-registered evaluation, calibrated probabilities, typed mechanism verification, and reproducible audit artifacts.

methodology · M01

Pre-registered evaluation protocol

Every model release is scored against a frozen v2 benchmark: held-out receptor families, leave-one-target-out splits, and signed eval cards. No metric is reported without the matching protocol hash.
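As a minimal sketch of how a protocol hash and leave-one-target-out splits could fit together, assume the protocol is a JSON-serializable dictionary. The field names, the helpers `protocol_hash`, `leave_one_target_out`, and `report_metric`, and the choice of SHA-256 are illustrative assumptions, not DiscoveryLab's published implementation.

```python
import hashlib
import json


def protocol_hash(protocol: dict) -> str:
    """Hash a protocol via canonical JSON (sorted keys, fixed separators)."""
    canonical = json.dumps(protocol, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def leave_one_target_out(records: list[dict]) -> dict[str, tuple[list[dict], list[dict]]]:
    """Return {held_out_target: (train, test)}; each target is held out once."""
    targets = sorted({r["target"] for r in records})
    return {
        t: (
            [r for r in records if r["target"] != t],
            [r for r in records if r["target"] == t],
        )
        for t in targets
    }


def report_metric(name: str, value: float, protocol: dict) -> dict:
    """Attach the protocol hash so a metric never travels without it."""
    return {"metric": name, "value": value, "protocol_hash": protocol_hash(protocol)}


# Hypothetical protocol and records, for illustration only.
protocol = {"benchmark": "v2", "split": "leave-one-target-out", "seed": 0}
records = [
    {"id": "c1", "target": "A"},
    {"id": "c2", "target": "A"},
    {"id": "c3", "target": "B"},
]
splits = leave_one_target_out(records)
train_a, test_a = splits["A"]  # target A never appears in train_a
```

Canonical serialization matters here: two protocols that differ only in key order hash identically, while any change to a value produces a different hash.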

methodology · M02

Calibration before probabilities

Candidate scores and BioFoundation outputs stay rank-only until a frozen calibration artifact exists. The current Bio-JEPA heads and biological foundation-model scorecards are context for prioritization, not calibrated biological probabilities.
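The rank-only rule can be illustrated with a small gate: without a calibration artifact the output carries ranks and nothing else, and a probability field appears only when an artifact is supplied. This is a sketch under stated assumptions: `CalibrationArtifact`, `present`, and the histogram-binning lookup are hypothetical stand-ins, not the calibration method DiscoveryLab uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalibrationArtifact:
    """Hypothetical frozen calibration: histogram bins fitted on held-out data."""
    bin_edges: tuple[float, ...]       # ascending upper edges; the last is 1.0
    observed_rates: tuple[float, ...]  # empirical positive rate in each bin

    def apply(self, score: float) -> float:
        """Map a raw score to the observed rate of the bin it falls in."""
        for edge, rate in zip(self.bin_edges, self.observed_rates):
            if score <= edge:
                return rate
        return self.observed_rates[-1]


def present(scores: dict[str, float],
            calibration: Optional[CalibrationArtifact] = None) -> list[dict]:
    """Rank candidates; emit probabilities only when a calibration exists."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    rows = []
    for rank, candidate in enumerate(ranked, start=1):
        row = {"candidate": candidate, "rank": rank}
        if calibration is not None:
            row["probability"] = calibration.apply(scores[candidate])
        rows.append(row)
    return rows


# Without an artifact the raw scores are never exposed as probabilities.
rank_only = present({"cand-1": 0.2, "cand-2": 0.9})
```

Dropping the raw score from the rank-only rows is a deliberate choice in this sketch: it keeps an uncalibrated number from being read as a probability downstream.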

methodology · M03

Mechanism as a typed object

Pathway hypotheses are encoded as graphs with intervention, blockade, safety, and conservation claims — then checked by an in-process verifier before being attached to a candidate.
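One way to picture a typed mechanism object is a small graph whose claims carry an explicit kind and are checked before use. The sketch below is illustrative only: the class names, the `verify` function, and the two rules it enforces (claims must reference known nodes; intervention and blockade claims need a directed path) are assumptions, not the rules of the actual verifier.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimKind(Enum):
    INTERVENTION = "intervention"
    BLOCKADE = "blockade"
    SAFETY = "safety"
    CONSERVATION = "conservation"


@dataclass(frozen=True)
class Claim:
    kind: ClaimKind
    source: str  # node the claim starts from
    target: str  # node the claim is about


@dataclass
class MechanismGraph:
    nodes: set[str]
    edges: set[tuple[str, str]]  # directed pathway edges (src, dst)
    claims: list[Claim] = field(default_factory=list)


def reachable(graph: MechanismGraph, start: str, goal: str) -> bool:
    """Depth-first search over the directed edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dst for src, dst in graph.edges if src == node)
    return False


def verify(graph: MechanismGraph) -> list[str]:
    """Return a list of problems; an empty list means the graph passes."""
    problems = []
    for src, dst in graph.edges:
        if src not in graph.nodes or dst not in graph.nodes:
            problems.append(f"edge {src}->{dst} references an unknown node")
    for claim in graph.claims:
        label = f"{claim.kind.value} claim {claim.source}->{claim.target}"
        if claim.source not in graph.nodes or claim.target not in graph.nodes:
            problems.append(f"{label} references an unknown node")
        elif claim.kind in (ClaimKind.INTERVENTION, ClaimKind.BLOCKADE):
            if not reachable(graph, claim.source, claim.target):
                problems.append(f"{label} has no supporting path")
    return problems


# Hypothetical three-node pathway with one intervention claim.
graph = MechanismGraph(
    nodes={"peptide", "receptor", "kinase"},
    edges={("peptide", "receptor"), ("receptor", "kinase")},
    claims=[Claim(ClaimKind.INTERVENTION, "peptide", "kinase")],
)
problems = verify(graph)  # empty: the claim is supported by a directed path
```

A mechanism would be attached to a candidate only when `verify` returns no problems; a claim with no supporting path is rejected rather than silently kept.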

methodology · M04

Formal audit artifacts

Each mechanism report emits a Lean 4 module with theorem names and a checksum binding the artifact to the candidate. A CLI verifier (peptiter-lean-verify) replays it in CI.
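The binding between artifact and candidate can be sketched as a digest over both the candidate identifier and the Lean source, so a receipt copied onto a different candidate no longer matches. The function names, the SHA-256 choice, and the length-prefixed encoding are assumptions for illustration; the source does not specify how the real checksum is computed.

```python
import hashlib


def binding_checksum(candidate_id: str, lean_source: str) -> str:
    """Digest over candidate id and Lean source, each length-prefixed."""
    digest = hashlib.sha256()
    for part in (candidate_id, lean_source):
        data = part.encode("utf-8")
        # The length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def is_bound(candidate_id: str, lean_source: str, recorded: str) -> bool:
    """True only if the recorded checksum matches this exact pairing."""
    return binding_checksum(candidate_id, lean_source) == recorded


# Hypothetical candidate id and a placeholder Lean module body.
lean_source = "theorem placeholder : True := trivial\n"
recorded = binding_checksum("cand-0001", lean_source)
```

This check covers the binding only. It does not replay the Lean module, which is the job the text assigns to the CLI verifier.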

methodology · M05

Generation-time biosecurity screen

Designed sequences are screened against configured select-agent and toxin deny-lists before any wet-lab handoff. Deny-list hits are blocked, and the screen fails closed if its database cannot load. It is an operational gate, not a comprehensive biorisk clearance.
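Fail-closed behaviour is easiest to see in code: any failure to load the deny-list yields a block, never a pass. The sketch below assumes a plain-text file with one motif per line and uses exact substring matching, a deliberate simplification, since a production screen would more plausibly rely on homology search. `ScreenUnavailable`, `load_deny_list`, and `screen` are hypothetical names.

```python
from pathlib import Path


class ScreenUnavailable(Exception):
    """Raised when the deny-list cannot be used; callers must block handoff."""


def load_deny_list(path: str) -> frozenset[str]:
    """Load one motif per line; treat unreadable or empty files as failures."""
    try:
        lines = Path(path).read_text(encoding="utf-8").splitlines()
    except (OSError, UnicodeError) as exc:
        raise ScreenUnavailable(f"deny-list unavailable: {path}") from exc
    motifs = frozenset(line.strip().upper() for line in lines if line.strip())
    if not motifs:
        raise ScreenUnavailable(f"deny-list is empty: {path}")
    return motifs


def screen(sequence: str, deny_list_path: str) -> dict:
    """Return a decision; every failure mode resolves to allowed=False."""
    try:
        motifs = load_deny_list(deny_list_path)
    except ScreenUnavailable as exc:
        return {"allowed": False, "reason": str(exc)}
    upper = sequence.upper()
    hits = sorted(motif for motif in motifs if motif in upper)
    if hits:
        return {"allowed": False, "reason": "deny-list hit", "hits": hits}
    return {"allowed": True, "reason": "no deny-list hit"}
```

An empty deny-list is treated as a load failure in this sketch, because a screen that matches nothing would otherwise pass every sequence.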

validation · eval card v2 (signed)

protocol hash · a3f1…9c2e

Repository candidates: 221 (evaluated for wet-lab selection and ESM-2 embedding)

Wet-lab selection: one candidate across each represented target

Synthesis specs: 10 candidate specs plus 3 epigenome placeholders

Assay protocols: binding, functional, and Lyapunov-shape tiers

ESM-2 embeddings: 221 (facebook/esm2_t36_3B_UR50D, 2560 dimensions)

BioFoundation providers: deterministic plus TranscriptFormer, Geneformer, scVI, AIDO.Cell, Arc State, AlphaGenome, and BioHub ESM contracts

BioFoundation tests: 5/5 adapter unit tests passing

Evidence graph tests: 6/6 Build 1 graph tests passing with BioFoundation ingestion

Backbone tests: 21/21 Python package tests passing

latest local artifacts: experiments/wetlab-prep and backbone/artifacts

Peer-review-style claim boundary

What we claim

Auditable evidence boundaries: every score, structure, heuristic, and wet-lab readout is tagged with what it is allowed to support — and what language it explicitly blocks.

What we do not claim

DiscoveryLab does not predict clinical efficacy, replace IND-enabling studies, or assert mechanism without graph-level verification and perturbation evidence. BioFoundation scorecards add biological context, not target engagement or safety evidence.

How to challenge a result

Each candidate ships a reproducible artifact bundle: protocol hash, calibration curve, mechanism graph, Lean receipt, and perturbation ledger. Reviewers can re-run the verifier offline.
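An offline check of a bundle could start by confirming that all five artifacts are present and unmodified. The sketch below assumes the bundle is available as named byte blobs alongside a manifest of SHA-256 digests; the artifact keys, the manifest layout, and `check_bundle` are hypothetical and only mirror the list in the text.

```python
import hashlib

# Artifact names taken from the bundle description; the keys are assumptions.
REQUIRED_ARTIFACTS = (
    "protocol_hash",
    "calibration_curve",
    "mechanism_graph",
    "lean_receipt",
    "perturbation_ledger",
)


def check_bundle(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return problems found; an empty list means the bundle is intact."""
    problems = []
    for name in REQUIRED_ARTIFACTS:
        if name not in files:
            problems.append(f"missing artifact: {name}")
        elif hashlib.sha256(files[name]).hexdigest() != manifest.get(name):
            problems.append(f"digest mismatch: {name}")
    return problems


# Hypothetical bundle contents, for illustration only.
files = {name: f"contents of {name}".encode("utf-8") for name in REQUIRED_ARTIFACTS}
manifest = {name: hashlib.sha256(blob).hexdigest() for name, blob in files.items()}
problems = check_bundle(files, manifest)  # empty for an intact bundle
```

Integrity of the bundle is a precondition for review, not a substitute for it: a reviewer would still re-run the verifier on the artifacts themselves.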

Standards we hold ourselves to

FAIR data

Findable, accessible, interoperable, reusable artifact bundles per candidate.

Reproducible by default

Deterministic seeds, pinned weights, signed eval manifests.

ALCOA+ aligned

Attributable, legible, contemporaneous, original, accurate audit trail.

Open verifier

Lean source and CLI published so receipts can be replayed by third parties.