Peptiter / DiscoveryLab

Operations — run, review, and inspect surfaces.

The operational surfaces used to run, review, and inspect DiscoveryLab in production.

Operational surface · production scaffolding

The MCP catalog, persistence layer, and CI bench that make TensFormer + AIScientist deployable.

Every typed surface above ships with the production scaffolding to run it: a stdio JSON-RPC binary external agents can plug into, an on-disk cache that survives restarts, recursive improvement proposal tools, smart short-circuits for repeated work, an offline inspector, and CI that posts the regression badge on every PR. The new improvement loop is designed to sit behind the same typed dispatcher and review gates as the existing MCP surfaces.

Live eval-harness badge: F1 score from the most recent regression run, synced from tensorlang/eval/eval_badge.svg on every build.
01 · MCP catalog

Five first-party MCP servers behind one composite dispatcher.

PeptiterMCPCatalog.makeComposite(...) returns a CompositeMCPServer that routes each tool prefix to one first-party server:

  literature/* → LiteratureMCPServer (4 adapters: fixture, PubMed, Semantic Scholar, Europe PMC)
  lean/* → LeanVerifierMCPServer (TensorLang manifest hash binding + lake build round-trip)
  lab_loop/* → LabLoopMCPServer (V3LabLoopOrchestrator behind a typed MCP surface)
  improvement/* → ImprovementMCPServer (recursive evidence improvement proposals)
  model/* → PeptiterModelMCPServer (semantic_hash-pinned biology_v* introspection plus world-model source, pathway assertion, and body-twin model-card inspection)

The same code path drives the in-process app, the stdio binary, and the Swift tests, so they cannot drift.
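
The prefix routing can be sketched in a few lines. This is a minimal Python illustration, not the Swift CompositeMCPServer: the CompositeRouter class, the stub handlers, and the use of JSON-RPC error -32601 for unknown prefixes are assumptions made for the example. Only the five prefixes and server names come from the catalog described above.

```python
from typing import Any, Callable, Dict

# A handler receives the tool name without its prefix, plus the call parameters.
Handler = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class CompositeRouter:
    """Hypothetical stand-in for a composite dispatcher keyed by tool prefix."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        # Prefixes are registered once, up front; duplicates are a setup error.
        if prefix in self._servers:
            raise ValueError(f"prefix already registered: {prefix}")
        self._servers[prefix] = handler

    def dispatch(self, tool: str, params: Dict[str, Any]) -> Dict[str, Any]:
        prefix, sep, name = tool.partition("/")
        if not sep or prefix not in self._servers:
            # Unknown prefixes are rejected rather than forwarded anywhere.
            return {"error": {"code": -32601, "message": f"unknown tool: {tool}"}}
        return self._servers[prefix](name, params)


def make_stub(server_name: str) -> Handler:
    """Builds a handler that only reports which server would have been called."""
    def handler(name: str, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"result": {"server": server_name, "tool": name, "params": params}}
    return handler


router = CompositeRouter()
for prefix, server in [
    ("literature", "LiteratureMCPServer"),
    ("lean", "LeanVerifierMCPServer"),
    ("lab_loop", "LabLoopMCPServer"),
    ("improvement", "ImprovementMCPServer"),
    ("model", "PeptiterModelMCPServer"),
]:
    router.register(prefix, make_stub(server))

print(router.dispatch("lean/verify", {}))
print(router.dispatch("community/tool", {}))
```

Because every entry point would go through one dispatch function, a test that exercises the router exercises the same routing table the binary uses.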

Sources/PeptiterDiscovery/MCP/{PeptiterMCPCatalog,CompositeMCPServer,…}.swift
02 · peptiter-mcp-stdio

Newline-JSON-RPC stdio server, single binary, ready for Claude Code.

Drop-in for any MCP client that speaks newline-delimited JSON-RPC over stdio. .claude/mcp.example.json + docs/MCP_CONFIG.md show the wiring with three flavors (built-binary path, swift-run dev, concurrent dispatcher). Runs sequentially by default; --concurrent N opts into a TaskGroup-backed pool with an actor-serialized stdout writer so JSON lines never interleave at the byte level.
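
To make the non-interleaving guarantee concrete, here is a minimal Python sketch of a newline-delimited JSON-RPC loop with a serialized writer. It is an illustration under assumptions, not the Swift binary: a thread pool and a lock stand in for the TaskGroup-backed pool and the actor-serialized stdout writer, and the handle function is a hypothetical echo handler.

```python
import io
import json
import threading
from concurrent.futures import ThreadPoolExecutor


class LineWriter:
    """Writes one JSON object per line; the lock keeps lines from interleaving."""

    def __init__(self, stream) -> None:
        self._stream = stream
        self._lock = threading.Lock()

    def write_message(self, message: dict) -> None:
        line = json.dumps(message, separators=(",", ":")) + "\n"
        with self._lock:
            self._stream.write(line)
            self._stream.flush()


def error(code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": None, "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    # Hypothetical handler: a real server would dispatch on request["method"].
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "result": {"method": request["method"]}}


def serve(in_stream, out_stream, concurrent: int = 1) -> None:
    """Reads one request per line and writes one response per line."""
    writer = LineWriter(out_stream)

    def process(raw: str) -> None:
        try:
            request = json.loads(raw)
        except json.JSONDecodeError:
            writer.write_message(error(-32700, "Parse error"))
            return
        if not isinstance(request, dict) or "method" not in request:
            writer.write_message(error(-32600, "Invalid Request"))
            return
        writer.write_message(handle(request))

    # One worker preserves request order; more workers trade order for throughput.
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        for raw in in_stream:
            if raw.strip():
                pool.submit(process, raw)


demo_out = io.StringIO()
serve(io.StringIO('{"jsonrpc":"2.0","id":1,"method":"tools/list"}\nnot json\n'), demo_out)
print(demo_out.getvalue(), end="")
```

With more than one worker, responses may arrive out of request order, so a client should match them by id rather than by position.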

peptiter-mcp-stdio --probe                  # tools/list as JSON, exit
peptiter-mcp-stdio --tools-list-version     # SHA256 of canonical catalog
peptiter-mcp-stdio --concurrent 4           # batch-friendly dispatch
peptiter-mcp-stdio --cache-dir ~/peptiter   # persist trace + verifier receipts
Sources/PeptiterMCPStdioCLI/main.swift
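
A version hash over a canonical catalog, as --tools-list-version reports, could work along these lines. This Python sketch assumes one particular canonical form (tools sorted by name, keys sorted, compact separators); the canonicalization the binary actually uses is not specified here, and the tool descriptions below are placeholders.

```python
import hashlib
import json
from typing import Any, Dict, List


def catalog_version(tools: List[Dict[str, Any]]) -> str:
    """SHA256 hex digest of a canonical JSON rendering of the tool list."""
    canonical = json.dumps(
        sorted(tools, key=lambda tool: tool["name"]),  # order-independent
        sort_keys=True,                                 # key-order-independent
        separators=(",", ":"),                          # no whitespace variance
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder catalog entries; only the tool names appear in this document.
tools = [
    {"name": "lean/verify", "description": "placeholder"},
    {"name": "lab_loop/run", "description": "placeholder"},
    {"name": "lab_loop/inspect", "description": "placeholder"},
]

print(catalog_version(tools))
```

A client can store the digest and re-fetch tools/list only when the value changes.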
03 · Cached receipts + traces

Lake builds and orchestrator runs persist across restarts.

OnDiskCache<T: Codable & Sendable> generalizes the persistence pattern: actor-backed in-memory dict, optional <dir>/<name>.json file per entry, sanitized filenames, corruption-tolerant load. lean/verify caches receipts keyed by manifest semantic hash so identical-artifact requests skip lake build (or pass noCache: true to force a fresh verify). lab_loop/run caches Trace + fingerprint so cold restarts answer lab_loop/inspect without re-dispatching the orchestrator. improvement/* keeps proposal state inside the MCP server instance; persistence can be added with the same cache pattern when promotion records need to survive process restarts. --prune-cache 7d evicts at startup; peptiter-mcp-cache --evict <ns>/<key> evicts a single entry on demand.
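
The persistence pattern (in-memory dict, optional file per entry, sanitized filenames, corruption-tolerant load) can be sketched as follows. This is a Python approximation under assumptions, not the Swift OnDiskCache: it omits the actor isolation, and the on-disk envelope with "key" and "value" fields and the write-then-rename step are choices made for this example.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def sanitize(key: str) -> str:
    """Maps a cache key to a filename-safe string (no path separators)."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", key)


class OnDiskCache:
    """In-memory dict with an optional JSON file per entry."""

    def __init__(self, directory: Optional[Path] = None) -> None:
        self._memory: Dict[str, Any] = {}
        self._dir = directory
        if directory is None:
            return
        directory.mkdir(parents=True, exist_ok=True)
        for path in directory.glob("*.json"):
            try:
                entry = json.loads(path.read_text(encoding="utf-8"))
                self._memory[entry["key"]] = entry["value"]
            except (OSError, ValueError, KeyError, TypeError):
                # Corruption-tolerant load: skip unreadable entries, keep the rest.
                continue

    def get(self, key: str) -> Optional[Any]:
        return self._memory.get(key)

    def put(self, key: str, value: Any) -> None:
        self._memory[key] = value
        if self._dir is None:
            return
        path = self._dir / f"{sanitize(key)}.json"
        tmp = self._dir / (path.name + ".tmp")
        # Write to a temp file, then rename, so a crash never leaves a half file.
        tmp.write_text(json.dumps({"key": key, "value": value}), encoding="utf-8")
        tmp.replace(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_dir = Path(tmp_dir)
    OnDiskCache(cache_dir).put("lean/biology_v2", {"ok": True})
    (cache_dir / "broken.json").write_text("{not json", encoding="utf-8")
    # A fresh instance simulates a restart; the corrupt file is ignored.
    restored = OnDiskCache(cache_dir).get("lean/biology_v2")

print(restored)
```

One limitation of this sketch: two keys that sanitize to the same filename overwrite each other on disk, so a real implementation needs a collision-free mapping.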

Sources/PeptiterDiscovery/MCP/OnDiskCache.swift
04 · Smart short-circuits

cachedHash and cachedFingerprint let agents skip identical work.

Every expensive tool accepts an optional cache hint from the caller. lean/verify takes cachedHash; lab_loop/run and lab_loop/inspect take cachedFingerprint (FNV-1a over attribution_hash + overrides). Match → compact { unchanged: true, … } delta. Mismatch → full receipt with a delta block flagging the change reason. Stateless on the wire — caller owns the cache.
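
The short-circuit can be illustrated with a 64-bit FNV-1a hash. In this Python sketch the hash constants are the standard FNV-1a 64-bit parameters, but how attribution_hash and overrides are serialized before hashing, and the response field names other than unchanged, are assumptions made for the example.

```python
import json
from typing import Any, Dict, Optional

FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3


def fnv1a_64(data: bytes) -> int:
    """Standard 64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    value = FNV_OFFSET_BASIS_64
    for byte in data:
        value ^= byte
        value = (value * FNV_PRIME_64) & 0xFFFFFFFFFFFFFFFF
    return value


def fingerprint(attribution_hash: str, overrides: Dict[str, Any]) -> str:
    # Assumed serialization: hash, newline, then overrides as sorted compact JSON.
    payload = attribution_hash + "\n" + json.dumps(
        overrides, sort_keys=True, separators=(",", ":"))
    return format(fnv1a_64(payload.encode("utf-8")), "016x")


def respond(current: str, full_receipt: Dict[str, Any],
            cached_fingerprint: Optional[str] = None) -> Dict[str, Any]:
    """Returns a compact delta on a match, the full receipt otherwise."""
    if cached_fingerprint == current:
        return {"unchanged": True, "fingerprint": current}
    response = {**full_receipt, "fingerprint": current}
    if cached_fingerprint is not None:
        # The caller sent a stale hint: say why the full receipt came back.
        response["delta"] = {"unchanged": False, "reason": "fingerprint_mismatch"}
    return response


fp = fingerprint("attr-123", {"dose": 2})
receipt = {"trace": ["step-1", "step-2"]}
print(respond(fp, receipt, cached_fingerprint=fp))
print(respond(fp, receipt, cached_fingerprint="0000000000000000"))
```

The server keeps no per-client record here: the hint arrives with the request, and the comparison is a pure function of the request and the current state.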

Sources/PeptiterDiscovery/MCP/{LeanVerifierMCPServer,LabLoopMCPServer}.swift
05 · Inspector CLI

peptiter-mcp-cache reads what's persisted without spinning up the server.

Inspector that is read-only unless --evict is passed. Walks the cache directory, reports per-namespace entry counts and bytes, and lists each entry's key, size, age, and fingerprint. --json mode emits a stable schema; --details prints the first 160 chars of each payload (text) or the full decoded JSON. Lets operators audit deployment state offline.

peptiter-mcp-cache --cache-dir ~/peptiter
  peptiter cache @ ~/peptiter
  ## lab_loop (1 entries, 11.3 KB)
    v3-il23-axis · 11538 bytes · 0s old · fingerprint 14ebfd30a3a9e488
  ## lean (3 entries, 4.7 KB)
    biology_v2__e065d18796a17c58… · 1820 bytes · 12m old
Sources/PeptiterMCPCacheCLI/main.swift
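
The per-namespace summary can be approximated with a read-only directory walk. This Python sketch assumes a layout of one subdirectory per namespace containing one .json file per entry; the actual on-disk layout is not spelled out here, and summarize is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict


def summarize(cache_dir: Path) -> Dict[str, Dict[str, Any]]:
    """Counts entries and bytes per namespace without modifying anything."""
    summary: Dict[str, Dict[str, Any]] = {}
    for ns_dir in sorted(p for p in cache_dir.iterdir() if p.is_dir()):
        entries = sorted(ns_dir.glob("*.json"))
        summary[ns_dir.name] = {
            "entries": len(entries),
            "bytes": sum(path.stat().st_size for path in entries),
            "keys": [path.stem for path in entries],
        }
    return summary


with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)
    (root / "lean").mkdir()
    (root / "lean" / "biology_v2.json").write_text("{}", encoding="utf-8")
    demo_summary = summarize(root)

print(demo_summary)
```
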
06 · CI + regression bench

Every PR posts an EVAL.md badge; overlay drift fails the build.

.github/workflows/eval-badge.yml runs the transpiler Python tests, regenerates BiologyV2/V3.lean from the (potentially modified) overlay JSON, runs lake build, executes the eval harness, and posts a sticky PR comment with the headline F1. check_overlay_sync.py blocks PRs where BIOLOGY_OVERLAY.md drifts from build_biology_v3. EVAL.md gains a 'Diff vs base ref' section via --diff-against-base origin/main so reviewers see cumulative regression.
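
For readers unfamiliar with the headline metric, the sketch below computes F1 from confusion counts and a head-versus-base delta of the kind a "Diff vs base ref" section could report. It is a generic Python illustration: the counts are invented, and the function names and tolerance parameter are hypothetical rather than taken from eval_harness.py.

```python
from typing import Dict


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing was predicted right."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def diff_vs_base(head_f1: float, base_f1: float, tolerance: float = 0.0) -> Dict[str, object]:
    """Flags a regression when head falls below base by more than the tolerance."""
    delta = head_f1 - base_f1
    return {"head": head_f1, "base": base_f1, "delta": delta,
            "regressed": delta < -tolerance}


# Invented counts, for illustration only.
head = f1_score(true_pos=8, false_pos=2, false_neg=2)
base = f1_score(true_pos=9, false_pos=1, false_neg=1)
print(diff_vs_base(head, base))
```
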

EVAL · v2 signed — F1 0.818 · safety precision 1.000 · 25 cells, 4 models
.github/workflows/eval-badge.yml + tensorlang/eval/eval_harness.py
Operational contracts

What the deployment surface promises an operator.

Seven guarantees enforced at the dispatcher level. Each is a line of code an operator can grep, not a policy memo.

Untrusted MCP

first-party servers only; tool prefixes pre-registered; no community MCP for sensitive data

Hash binding

TensorLang and Lean prove the same finite fragment under one semantic_hash

Audit ledger

every tool call logged through MCPDispatcher; every cache entry timestamped

Fail-loud-don't-fall-silent

missing real-data downloads exit with code 2 and instructions; fixtures are never substituted silently

Model-card boundaries

world-model MCP tools expose source licenses, context overlays, executable islands, VVUQ, and blocked claims

Stateless overrides

agents pass their own cache hints; servers don't track per-client state

L4 ceiling

no autonomous wet-lab loops until the human approval system is solid (per SCIENTIST.md §7)
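
As one example of how such a contract reduces to a line of code, the fail-loud guarantee might look like the guard below. This is a hedged Python sketch: require_real_data, the dataset path, and the instruction text are hypothetical; only the behavior (exit code 2, instructions, no silent fixture fallback) comes from the list above.

```python
import sys
from pathlib import Path


def require_real_data(path: Path, instructions: str) -> Path:
    """Returns the path if it exists; otherwise exits with code 2 and a hint."""
    if not path.exists():
        print(f"error: required dataset not found: {path}", file=sys.stderr)
        print(instructions, file=sys.stderr)
        # No fallback to fixtures: the caller must fetch the data explicitly.
        raise SystemExit(2)
    return path


# Hypothetical usage; an existing path passes straight through.
checked = require_real_data(Path("."), "Download the dataset before running.")
print(checked)
```
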

Live-system numbers

The surface, in commits and tests.

tests pass: recursive loop test coverage added
tools/list-version: in-band JSON-RPC method · stable SHA256
executables: peptiter-mcp-stdio · peptiter-mcp-cache · peptiter-research-copilot · peptiter-calibration-import
model tools: list_world_model_sources · inspect_pathway_assertions · body_twin_model_card
eval badge: JSON · MD · SVG · sticky PR comment
CI workflows: eval-badge.yml · pathway-lean.yml

Setup, end-to-end. Two commands: swift build produces every executable; cp .claude/mcp.example.json ~/.claude/mcp.json wires Claude Code into the catalog. From there an agent can search literature, verify mechanisms, run closed-loop experiments, and inspect the cache, all behind one dispatcher.