Helix CLI docs

Off-target public benchmark (in silico, pack-bound)

Helix supports deterministic, offline-verifiable AUROC evaluation against frozen evaluation packs.

This page records the current off-target risk ranking demo on the frozen evaluation pack:

  • evaluation_pack_id: helix_eval_v1_offtarget_public
  • evaluation_pack.json SHA256: 00d1b9f8f691bb5018c945115279ad3b34e576edf42a62b3cea332e7e3ac4391
  • Pack fingerprint SHA256: 6dc4ad385faf34df76eea41841abf67a7d9521917389b7651670d44d0b819da2
  • Pack manifest SHA256: 2b9fce77b0ae7cc798a274fa8be66e6708c6ed6b81ba9b22a18c6fed32faa229
  • Labels SHA256: 0526773b6217949f79159a0b2b630ca59fd162e65c5251b7027d119aff80b34b

This demo is a proof of the pack verification and evaluation pipeline, not a claim of state-of-the-art off-target prediction.

Important: AUROC reported by Helix is computed only against a declared in silico evaluation pack. It is not a claim of biological validation, real-world performance, or real-world accuracy.

Tie note: if a scorer assigns identical scores to positives and decoys, AUROC can mathematically collapse to 0.5 under the pack’s deterministic tie policy (tie_policy=average). Helix transcripts include tie counts so ties are explicit rather than hidden.
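The tie behavior can be illustrated with a minimal sketch. It assumes that tie_policy=average means each tied positive/decoy pair earns half credit, which is the usual pairwise (Mann-Whitney) reading of AUROC. The function name `pairwise_auroc` is hypothetical and is not part of Helix.

```python
from itertools import product


def pairwise_auroc(pos_scores, decoy_scores):
    """AUROC as the fraction of (positive, decoy) pairs ranked correctly.

    Assumption: a tied pair counts as half a win ("average" tie policy).
    Returns the AUROC plus the strict/tie/inversion pair counts.
    """
    strict = ties = inversions = 0
    for pos, decoy in product(pos_scores, decoy_scores):
        if pos > decoy:
            strict += 1       # positive ranked above decoy
        elif pos == decoy:
            ties += 1         # identical scores
        else:
            inversions += 1   # decoy ranked above positive
    total = strict + ties + inversions
    if total == 0:
        raise ValueError("need at least one positive and one decoy")
    auroc = (strict + 0.5 * ties) / total
    return auroc, {"strict": strict, "ties": ties, "inversions": inversions}
```

Under this reading, a scorer that gives every item the same score produces only ties, so the result is exactly 0.5 regardless of the labels. Reporting the pair counts alongside the metric is what makes that case distinguishable from a scorer that is merely uninformative.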

One-command run (evidence bundle emitted)

make eval-offtarget-auroc-baseline

Outputs (deterministic paths):

  • artifacts/evaluations/offtarget_public/baseline_helix_batch_report.zip
  • artifacts/evaluations/offtarget_public/baseline_transcript.txt

Verification checklist

  1. Confirm the transcript includes:
    • Evaluation Pack: helix_eval_v1_offtarget_public
    • Evaluation pack.json SHA256: ...
    • Scorer: helix.offtarget.demo.mismatch_count/v1
    • Scorer code: tools/eval_offtarget_pack_auroc.py::_scores_by_key_mismatch_count
    • Scorer params: score = -hamming_distance(join.guide20, join.offtarget_seq)
    • Pack fingerprint SHA256: ...
    • Pack manifest SHA256: ...
    • Labels SHA256: ...
    • Bundle SHA256: ...
  2. Confirm the pack digests match the shipped pack directory:
    • src/helix/datasets/evaluation_packs/helix_eval_v1_offtarget_public/evaluation_pack.json
    • src/helix/datasets/evaluation_packs/helix_eval_v1_offtarget_public/manifest.json
    • src/helix/datasets/evaluation_packs/helix_eval_v1_offtarget_public/labels.v1.json
  3. Confirm the evidence bundle is offline-verifiable:
    • PYTHONPATH=src python3 -c 'from helix.studio.batch.bundle import verify_batch_bundle_zip_bytes; import pathlib; b=pathlib.Path("artifacts/evaluations/offtarget_public/baseline_helix_batch_report.zip").read_bytes(); errs=verify_batch_bundle_zip_bytes(b); raise SystemExit(0 if not errs else 1)'
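Step 2 of the checklist can be scripted. The sketch below rests on two assumptions that this page does not confirm: each digest is taken over the raw bytes of the file, and the manifest and labels digests correspond to manifest.json and labels.v1.json. The pack fingerprint is left out because the page does not say what it is computed over. The helper names `sha256_hex` and `check_digests` are hypothetical.

```python
import hashlib
from pathlib import Path

PACK_DIR = "src/helix/datasets/evaluation_packs/helix_eval_v1_offtarget_public"

# Digests copied from this page; the file-to-digest mapping for
# manifest.json and labels.v1.json is an assumption based on the names.
EXPECTED = {
    f"{PACK_DIR}/evaluation_pack.json":
        "00d1b9f8f691bb5018c945115279ad3b34e576edf42a62b3cea332e7e3ac4391",
    f"{PACK_DIR}/manifest.json":
        "2b9fce77b0ae7cc798a274fa8be66e6708c6ed6b81ba9b22a18c6fed32faa229",
    f"{PACK_DIR}/labels.v1.json":
        "0526773b6217949f79159a0b2b630ca59fd162e65c5251b7027d119aff80b34b",
}


def sha256_hex(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file's raw bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_digests(expected):
    """Compare files against expected digests; return a list of problems."""
    problems = []
    for path, want in expected.items():
        if not Path(path).is_file():
            problems.append(f"{path}: file not found")
            continue
        got = sha256_hex(path)
        if got != want.lower():
            problems.append(f"{path}: expected {want}, got {got}")
    return problems
```

Calling `check_digests(EXPECTED)` from the repository root returns an empty list when every file matches. If the digests turn out to be computed over a canonicalized form of the JSON rather than the raw bytes, this check will report mismatches even for an intact pack, so treat a failure here as a prompt to check how Helix derives the digests.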

Recorded transcript (pack-bound)

From make eval-offtarget-auroc-baseline:

Evaluation Pack: helix_eval_v1_offtarget_public
Evaluation pack.json SHA256: 00d1b9f8f691bb5018c945115279ad3b34e576edf42a62b3cea332e7e3ac4391
AUROC (off_target_risk_ranking, in silico)

Scorer: helix.offtarget.demo.mismatch_count/v1
Scorer code: tools/eval_offtarget_pack_auroc.py::_scores_by_key_mismatch_count
Scorer params: score = -hamming_distance(join.guide20, join.offtarget_seq)

Macro AUROC (across guides): 1.0000
Pooled AUROC: 1.0000

Pairs (positive vs decoy): 2 | strict=2 ties=0 inversions=0
Pair fractions: strict=1.000 ties=0.000 inversions=0.000
Mean Δscore (pos - decoy): 1

Pack fingerprint SHA256: 6dc4ad385faf34df76eea41841abf67a7d9521917389b7651670d44d0b819da2
Pack manifest SHA256: 2b9fce77b0ae7cc798a274fa8be66e6708c6ed6b81ba9b22a18c6fed32faa229
Labels SHA256: 0526773b6217949f79159a0b2b630ca59fd162e65c5251b7027d119aff80b34b
Metric schema: helix.metrics.auroc/v1

Bundle SHA256: 6365b9bb7e7af1279e39afeb59255d9bc0d57c6ca3c20305f8d12843da1bd1ff
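The scorer line in the transcript, score = -hamming_distance(join.guide20, join.offtarget_seq), can be sketched as follows. This is an illustration of the formula only, not the code in tools/eval_offtarget_pack_auroc.py. The function names are hypothetical, the equal-length check is an assumption, and the sequences are toy values that do not come from the pack.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_count_score(guide20, offtarget_seq):
    """Negated mismatch count: fewer mismatches gives a higher score."""
    return -hamming_distance(guide20, offtarget_seq)


# Toy 20-nt sequences for illustration only (not taken from the pack).
GUIDE = "GACGTTACCGGATCAGTCAA"
CANDIDATE_ONE_MISMATCH = "GACGTTACCGGATCAGTCAT"
CANDIDATE_TWO_MISMATCHES = "GACGTTACCGGATCAGTCTT"

score_one = mismatch_count_score(GUIDE, CANDIDATE_ONE_MISMATCH)
score_two = mismatch_count_score(GUIDE, CANDIDATE_TWO_MISMATCHES)
```

Negating the distance turns "closer to the guide" into "higher score", so that a higher score means higher predicted off-target risk. Because the scores are small integers, ties between positives and decoys are easy to produce with this scorer, which is why the tie counts in the transcript matter.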