DiscoveryLab — Discovery Strategy
Peptiter · Technical specification of the eight-stage peptide discovery workflow, now extended with pathway-mechanism verification and Lean 4 audit receipts
DiscoveryLab is a condition-first peptide discovery application. This document summarises the workflow stages, the artifact each stage produces, the diagram elements shown in the product UI, and the underlying methods or data sources. The current implementation adds an explicit mechanism-verification layer: AI and graph methods propose pathway hypotheses, Swift code checks finite graph properties, Lean 4 artifacts provide an audit path, and perturbation or wet-lab evidence tests the assumptions.
NEW Pathway intelligence and formal mechanism checks
Candidate peptides can now carry typed pathway mechanisms: biological nodes, causal edges, intervention-blocked nodes, therapeutic reachability goals, adverse-pathway blockade claims, protected-pathway safety claims, conservation laws, and perturbation evidence anchors.
Implemented artifacts
| Artifact | Description |
|---|---|
| PathwayMechanismHypothesis | Structured candidate mechanism with proposal source, evidence anchors, causal graph, intervention context, reachability, blockade, safety, and conservation claims. |
| PathwayMechanismVerifier | Local graph-level verifier that checks reachability, blockade, protected-node safety, and reaction conservation before wet-lab handoff. |
| LeanVerificationArtifact | Generated Lean 4 module containing node types, baseline and intervention reachability, theorem names, and checksum-bound report identity. |
| peptiter-lean-verify | External CLI that writes Lean source, invokes Lean or dry-run mode, captures diagnostics, validates checksums, and emits JSON verification receipts. |
| PerturbationEvidenceRecord | Assay, omics, CRISPR, chemical perturbation, or partner wet-lab evidence attached to mechanism assumptions and scored for support, contradiction, or gaps. |
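
The graph-level checks described for PathwayMechanismVerifier can be pictured with a minimal sketch. It is written in Python for brevity and is not the product's Swift code. The class `MechanismHypothesis`, its field names, the node names, and the rule that a blockade claim requires a baseline path are all assumptions made for illustration. Conservation checks are not covered.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MechanismHypothesis:
    """Hypothetical stand-in for a typed pathway mechanism (not the product schema)."""
    edges: set                                          # causal edges as (source, target)
    blocked: set = field(default_factory=set)           # intervention-blocked nodes
    reach_goals: set = field(default_factory=set)       # (src, dst) pairs that must stay reachable
    blockade_claims: set = field(default_factory=set)   # (src, dst) pairs the intervention must cut
    protected: set = field(default_factory=set)         # (src, dst) pairs that must survive


def reachable(edges, start, blocked=frozenset()):
    """Breadth-first search that never visits a blocked node."""
    if start in blocked:
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for src, dst in edges:
            if src == node and dst not in seen and dst not in blocked:
                seen.add(dst)
                queue.append(dst)
    return seen


def verify(h):
    """Return one boolean per claim, keyed by claim kind and endpoints."""
    report = {}
    for src, dst in sorted(h.reach_goals):
        report[f"reach:{src}->{dst}"] = dst in reachable(h.edges, src, h.blocked)
    for src, dst in sorted(h.blockade_claims):
        baseline = dst in reachable(h.edges, src)
        after = dst in reachable(h.edges, src, h.blocked)
        # Assumed rule: a blockade claim only holds if the path existed at baseline.
        report[f"block:{src}->{dst}"] = baseline and not after
    for src, dst in sorted(h.protected):
        baseline = dst in reachable(h.edges, src)
        after = dst in reachable(h.edges, src, h.blocked)
        report[f"safe:{src}->{dst}"] = baseline and after
    return report


# Toy graph with generic node names; none of this is curated pathway data.
hypothesis = MechanismHypothesis(
    edges={
        ("peptide", "receptor"),
        ("receptor", "second_messenger"),
        ("second_messenger", "therapeutic_endpoint"),
        ("stressor", "kinase_x"),
        ("kinase_x", "adverse_endpoint"),
        ("housekeeping_signal", "protected_endpoint"),
    },
    blocked={"kinase_x"},
    reach_goals={("peptide", "therapeutic_endpoint")},
    blockade_claims={("stressor", "adverse_endpoint")},
    protected={("housekeeping_signal", "protected_endpoint")},
)
report = verify(hypothesis)
print(report)
```

All three claims hold on this toy graph. Removing `kinase_x` from `blocked` makes the blockade claim fail while leaving the other two unchanged, which is the kind of finite property a local verifier can check before wet-lab handoff.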
Current assays: mechanismVerification · perturbationEvidence · receptor fit · stability · solubility · aggregation · synthesis · off-target · assay readiness
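
A checksum-bound receipt of the kind peptiter-lean-verify is described as emitting can be sketched as follows. The hash algorithm (SHA-256), the JSON field names, and both function names are assumptions; the source does not specify the receipt schema.

```python
import hashlib
import json


def make_receipt(lean_source, report_id, status):
    """Bind a verification status to the exact Lean source text via SHA-256."""
    digest = hashlib.sha256(lean_source.encode("utf-8")).hexdigest()
    receipt = {"reportId": report_id, "sha256": digest, "status": status}
    return json.dumps(receipt, sort_keys=True)


def receipt_matches(receipt_json, lean_source):
    """Recompute the digest and compare it with the one stored in the receipt."""
    receipt = json.loads(receipt_json)
    digest = hashlib.sha256(lean_source.encode("utf-8")).hexdigest()
    return receipt["sha256"] == digest


# Hypothetical Lean source and report ID, used only to exercise the functions.
source = "theorem demo : True := trivial\n"
receipt = make_receipt(source, report_id="demo-report", status="dry-run")
print(receipt_matches(receipt, source))                  # True
print(receipt_matches(receipt, source + "-- edited\n"))  # False
```

Because the digest covers the full source text, any edit to the generated module after verification invalidates the receipt, which is what makes it usable as an audit record.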
01 Select condition
Define therapeutic intent or phenotype as the entry point. Anchored to standard vocabularies so downstream queries are reproducible.
Diagram elements
| Element | Description |
|---|---|
| Condition node (E66 / M06 / G56) | ICD-10 code with surrounding evidence ring; standard vocabulary anchor. |
| phenotype | Observable clinical traits (HPO terms) used to constrain the search. |
| intent | Agonism, antagonism, allosteric modulation, or biased signalling. |
| comorbid | Co-occurring conditions adjusting target prioritisation and selectivity requirements. |
| contra-ind. | Receptor or pathway interactions to avoid (off-target, safety liabilities). |
References: ICD-10 · MeSH · MONDO · HPO
02 Map pathways & receptors
Build a directed graph from the condition through pathway layers to candidate receptors. Receptors carry ligand-class metadata, and the graph carries candidate mechanism claims that can be verified later.
Diagram elements
| Element | Description |
|---|---|
| Condition vertex | Entry point of the directed pathway graph. |
| Upstream / direct / downstream pathway | Signalling cascade reachable from the condition (e.g. incretin → GLP-1/cAMP → insulin secretion for obesity). |
| Receptor candidates (4 nodes) | Druggable receptors retrieved with ligand-class metadata, e.g. GLP1R, GIPR, GCGR, Y2R. |
| Convergence node | Receptors with overlapping endogenous ligands. |
References: Reactome · KEGG · IUPHAR / GtoPdb · UniProt · Lean 4
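
To show what a later-verifiable mechanism claim might look like, here is a small Lean 4 sketch that encodes the example cascade above (incretin → GLP-1/cAMP → insulin secretion) and proves a reachability claim. The node, edge, and theorem names are illustrative assumptions, not the contents of a generated LeanVerificationArtifact.

```lean
-- Toy encoding of the example cascade; all names are illustrative.
inductive Node where
  | incretin
  | glp1Camp
  | insulinSecretion

-- Causal edges asserted by the hypothesis (encoded assumptions, not proven biology).
inductive Edge : Node → Node → Prop where
  | release : Edge Node.incretin Node.glp1Camp
  | secrete : Edge Node.glp1Camp Node.insulinSecretion

-- Reachability as the reflexive-transitive closure of Edge.
inductive Reach : Node → Node → Prop where
  | refl (a : Node) : Reach a a
  | step {a b c : Node} : Edge a b → Reach b c → Reach a c

-- The therapeutic endpoint is reachable from the entry node.
theorem therapeutic_endpoint_reachable :
    Reach Node.incretin Node.insulinSecretion :=
  Reach.step Edge.release
    (Reach.step Edge.secrete (Reach.refl Node.insulinSecretion))
```

The theorem only states that the endpoint follows from the encoded edges. Whether those edges are biologically correct is outside what Lean checks.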
03 BioScout source systems
Evidence-backed mimicry plans drawn from evolved peptide systems; converted into a curated motif library.
Diagram elements
| Element | Description |
|---|---|
| Source organism rows | Program-specific lineages (e.g. Gila monster / amphibian / fish gut for obesity; cone snail / spider / scorpion for ion channels). |
| Motif library | Curated sequence motifs and pharmacophores indexed from APD3, DRAMP, and ConoServer. |
References: APD3 · DRAMP · ConoServer
04 Seeded evolution
Start from validated bioactive peptide seeds and evolve local analogs with full ancestry, parent IDs, and operator history.
Diagram elements
| Element | Description |
|---|---|
| Seed node | Validated parent peptide with known bioactivity (e.g. exendin-4, magainin-2, ω-MVIIA). |
| Branch operator | Substitution / cyclisation / N-methylation under family constraints. |
| Analog leaves | Candidates with operator history and rationale for ranking. |
References: Pfam / InterPro · Hopp & Woods 1981
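
Ancestry tracking for evolved analogs can be sketched with an immutable record that carries a parent ID and an operator history. The `Peptide` class, the `substitute` and `lineage` helpers, the IDs, and the toy seed string are hypothetical. Only the substitution operator is shown, and the family constraint is reduced to an allowed-residue set.

```python
from dataclasses import dataclass
from typing import Optional

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class Peptide:
    """Hypothetical candidate record with ancestry metadata."""
    peptide_id: str
    sequence: str
    parent_id: Optional[str] = None
    history: tuple = ()  # operator labels applied since the seed, oldest first


def substitute(parent, position, residue, new_id, allowed=STANDARD_RESIDUES):
    """Point substitution at a 0-based position, restricted to an allowed set."""
    if not 0 <= position < len(parent.sequence):
        raise ValueError("position outside the parent sequence")
    if residue not in allowed:
        raise ValueError(f"residue {residue!r} violates the family constraint")
    seq = parent.sequence[:position] + residue + parent.sequence[position + 1:]
    # Conventional mutation label: original residue, 1-based position, new residue.
    label = f"{parent.sequence[position]}{position + 1}{residue}"
    return Peptide(new_id, seq, parent.peptide_id, parent.history + (label,))


def lineage(peptide, registry):
    """Walk parent IDs back to the seed and return the chain of IDs."""
    chain = [peptide.peptide_id]
    while peptide.parent_id is not None:
        peptide = registry[peptide.parent_id]
        chain.append(peptide.peptide_id)
    return chain


# Toy seed string, not a real bioactive sequence.
seed = Peptide("SEED-0", "ACDEFGHIK")
analog_1 = substitute(seed, 2, "E", "ANALOG-1")
analog_2 = substitute(analog_1, 0, "G", "ANALOG-2")
registry = {p.peptide_id: p for p in (seed, analog_1, analog_2)}

print(analog_2.sequence, analog_2.history)  # GCEEFGHIK ('D3E', 'A1G')
print(lineage(analog_2, registry))          # ['ANALOG-2', 'ANALOG-1', 'SEED-0']
```

Keeping the record frozen means an analog's history cannot be altered after creation, so the rationale shown at ranking time matches the operators that were actually applied.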
05 Visualize peptide–receptor fit
Structure-aware 3D review of binding orientation; RealityKit-based visualisation on macOS / visionOS.
Diagram elements
| Element | Description |
|---|---|
| Receptor scaffold | Schematic of the target (Class A/B GPCR 7TM bundle, cytokine receptor, ion channel). |
| Peptide backbone | Cα trace of the candidate placed against the receptor pocket. |
| Cα atoms | Per-residue selectable nodes for side-chain inspection in the 3D view. |
| Key contact | Predicted polar / hydrophobic interaction with a pocket residue (distance < 4 Å). |
References: RCSB PDB · AlphaFold DB · PEP-FOLD
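
The 4 Å key-contact criterion amounts to a pairwise distance filter. The sketch below assumes atoms are given as labelled 3D coordinates in ångströms. The function name, the labels, and the coordinates are made up, and which atoms to compare (heavy atoms, side-chain atoms, or Cα only) is left open because the source does not say.

```python
import math


def find_contacts(peptide_atoms, pocket_atoms, cutoff=4.0):
    """Return (peptide_label, pocket_label, distance) for pairs closer than cutoff."""
    contacts = []
    for p_label, p_xyz in peptide_atoms.items():
        for r_label, r_xyz in pocket_atoms.items():
            distance = math.dist(p_xyz, r_xyz)  # Euclidean distance in 3D
            if distance < cutoff:
                contacts.append((p_label, r_label, round(distance, 2)))
    return sorted(contacts, key=lambda c: c[2])


# Made-up coordinates in ångströms, for illustration only.
peptide_atoms = {"pep:1": (0.0, 0.0, 0.0), "pep:2": (3.8, 0.0, 0.0)}
pocket_atoms = {"rec:A": (0.0, 3.0, 0.0), "rec:B": (10.0, 0.0, 0.0)}

print(find_contacts(peptide_atoms, pocket_atoms))  # [('pep:1', 'rec:A', 3.0)]
```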
06 In-silico lab assessment + mechanism verification
Multi-criteria scoring with explicit rejection gates applied before wet-lab handoff. Each gate stays conservative until calibrated outcome data exists. Mechanism claims are additionally checked for pathway reachability, intervention blockade, protected-node safety, conservation, Lean auditability, and perturbation evidence support.
Diagram elements
| Element | Description |
|---|---|
| fit | Composite score from docking pose quality and contact-residue agreement. |
| stab | Predicted resistance to proteolysis, combined with a conformational-entropy penalty. |
| sol | CamSol / SolubiS-style intrinsic solubility proxy. |
| aggr | Zyggregator / Tango-style β-aggregation score. |
| synth | SPPS coupling-difficulty estimate plus length and modification penalties. |
| tox | ToxinPred-class classifier; failing candidates are gated out. |
| mech | Graph-level mechanism verification: desired endpoint reachable, adverse endpoint blocked, safety claim explicit. |
| Lean receipt | Checksum-bound verification artifact generated for CI or reviewer-facing audit. |
| perturb | Evidence coverage across wet-lab, omics, CRISPR, chemical perturbation, or partner assay records. |
References: ATTRACT / AttractKit · Lean 4 · ToxinPred · CamSol · Tango / Zyggregator
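
Explicit rejection gates can be sketched as per-criterion minimum thresholds, with a missing score treated as a failure so that the triage stays conservative. The threshold values, the assumption that every score is normalised to [0, 1] with higher meaning better (so `aggr` and `tox` are inverted risk scores), and the `triage` function itself are illustrative, not the product's calibrated gates.

```python
# Hypothetical minimum scores per gate, on a [0, 1] scale where higher is better.
GATES = {
    "fit": 0.6,
    "stab": 0.5,
    "sol": 0.5,
    "aggr": 0.5,
    "synth": 0.4,
    "tox": 0.8,
}


def triage(scores, mech_verified):
    """Reject on any failed gate; a missing score counts as 0.0."""
    failed = [g for g, minimum in GATES.items() if scores.get(g, 0.0) < minimum]
    if not mech_verified:
        failed.append("mech")
    weakest = min(scores.get(g, 0.0) for g in GATES)
    return {"accepted": not failed, "failed": failed, "weakest": weakest}


candidate_a = {"fit": 0.8, "stab": 0.7, "sol": 0.9, "aggr": 0.6, "synth": 0.5, "tox": 0.95}
candidate_b = {"fit": 0.9, "stab": 0.7, "sol": 0.9, "aggr": 0.3, "synth": 0.5}  # no tox score

print(triage(candidate_a, mech_verified=True))
# {'accepted': True, 'failed': [], 'weakest': 0.5}
print(triage(candidate_b, mech_verified=True))
# {'accepted': False, 'failed': ['aggr', 'tox'], 'weakest': 0.0}
```

Reporting the list of failed gates, rather than a single blended score, keeps each rejection traceable to a named criterion.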
07 Prepare wet-lab batch
Hand off through LabSpace to capability-matched partners using machine-readable batch manifests.
Diagram elements
| Element | Description |
|---|---|
| Batch manifest | Sequences, modifications, and assay plan in a SiLA 2 / Allotrope ADF-compatible format. |
| Candidate IDs | Per-program identifier prefixes (e.g. DL-GLP-0421, DL-IL17-0438, DL-CAV-0455). |
| Vials | Synthesised quantity and assay readout; vial height encodes the activity proxy (EC50 / IC50). |
References: SiLA 2 · Allotrope ADF
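
A machine-readable batch manifest can be pictured as validated JSON. The sketch below is plain JSON with made-up field names; it is not an actual SiLA 2 or Allotrope ADF schema. The candidate ID comes from the examples above, while the sequence, the modification label, and the `build_manifest` function are placeholders.

```python
import json

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def build_manifest(program, candidates, assay_plan):
    """Validate candidates and return a JSON-serialisable manifest dict."""
    seen = set()
    entries = []
    for cand in candidates:
        if cand["id"] in seen:
            raise ValueError(f"duplicate candidate ID: {cand['id']}")
        if not set(cand["sequence"]) <= STANDARD_RESIDUES:
            raise ValueError(f"non-standard residue in {cand['id']}")
        seen.add(cand["id"])
        entries.append({
            "id": cand["id"],
            "sequence": cand["sequence"],
            "modifications": list(cand.get("modifications", [])),
        })
    return {"program": program, "assayPlan": list(assay_plan), "candidates": entries}


# Placeholder sequence and modification label; only the ID format follows the text.
candidate = {"id": "DL-GLP-0421", "sequence": "ACDEFGHIK", "modifications": ["N-methylation@3"]}
manifest = build_manifest("Obesity", [candidate], ["binding affinity", "stability"])
print(json.dumps(manifest, indent=2))
```

Rejecting duplicate IDs and unexpected residues at manifest-build time catches errors before a partner lab receives the batch.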
08 Receive results & refine
Wet-lab and perturbation feedback update the surrogate model and the encoded mechanism assumptions; re-ranking selects the next batch or the next assay under an acquisition function.
Diagram elements
| Element | Description |
|---|---|
| Closed loop | Bayesian / active-learning loop between wet-lab and re-rank nodes. |
| Wet-lab node | Assay results (binding affinity, cytotoxicity, stability) returned through LabSpace. |
| Re-rank node | Surrogate model updated with new evidence; acquisition function selects next batch. |
References: Shahriari et al. 2016 (Bayesian Opt.) · Settles 2009 (Active Learning)
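
Batch selection under an acquisition function can be illustrated with the upper confidence bound (UCB), one common choice in Bayesian optimisation. The source does not say which acquisition function DiscoveryLab uses, so UCB, the `kappa` value, the candidate labels, and the predicted means and standard deviations are all assumptions.

```python
def ucb(mean, std, kappa=2.0):
    """Upper confidence bound: predicted activity plus an exploration bonus."""
    return mean + kappa * std


def select_batch(predictions, batch_size, kappa=2.0):
    """Rank candidates by UCB and return the top batch_size IDs."""
    ranked = sorted(
        predictions,
        key=lambda cid: ucb(predictions[cid][0], predictions[cid][1], kappa),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical surrogate outputs as (predicted mean, predicted standard deviation).
predictions = {
    "cand-a": (0.70, 0.05),
    "cand-b": (0.55, 0.20),
    "cand-c": (0.60, 0.02),
}

print(select_batch(predictions, batch_size=2))             # ['cand-b', 'cand-a']
print(select_batch(predictions, batch_size=2, kappa=0.0))  # ['cand-a', 'cand-c']
```

With `kappa=2.0` the uncertain candidate `cand-b` is picked first because testing it is informative. With `kappa=0.0` the selection is purely greedy on the predicted mean.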
Program variants
DiscoveryLab ships with three reference programs that swap condition codes, pathway labels, source organisms, seed peptides, gate thresholds, and candidate ID prefixes throughout the workflow.
| Program | Condition | Primary target | Seed | Sources |
|---|---|---|---|---|
| Obesity | ICD-10 E66 | GLP1R · Class B GPCR | exendin-4 | Gila monster, amphibian, fish gut |
| Inflammation | ICD-10 M06 | IL-17RA · cytokine receptor | magainin-2 | amphibian, marine invert., human defensin |
| Neuropathic pain | ICD-10 G56 | Cav2.2 · N-type Ca²⁺ channel | ω-MVIIA | cone snail, spider venom, scorpion |
Language and posture
The system uses constrained search, receptor-conditioned design, evidence-gated generation, candidate-family evolution, in-silico triage, pathway-mechanism verification, Lean audit receipts, perturbation evidence scoring, and a wet-lab feedback loop. It does not promise instant discovery, guaranteed binding, clinical efficacy, or fully automated drug discovery. Formal verification means the encoded claims follow from encoded assumptions; it does not prove that the biology is complete or clinically true. Scoring is intentionally conservative until calibrated outcome data exists; at that point, specific gates are replaced by validated production packages and curated pathway importers.