Azimuth public benchmark (in silico, pack-bound)

Helix supports deterministic, offline-verifiable AUROC evaluation against frozen public benchmarks via evaluation packs.

This page records the current, externally defensible Azimuth replication on the frozen evaluation pack:

  • evaluation_pack_id: helix_eval_v1_azimuth_public
  • Pack manifest SHA256: 4fc8e5c82ff2e0e4d6bb9ec000de3a5f873c1f5b00e7be46f3c0ba2de5bb2848
  • Labels SHA256: 2d4231b24fda7a429cb21cf5fce35db8c90f227d3d66d8650abb62aa95228a3d

!!! important
    AUROC reported by Helix is computed only against a declared in silico evaluation pack. It is not a claim of biological validation, real-world performance, or real-world accuracy.

One-command run (evidence bundle emitted)

The second command below runs the evaluation on its own. If the OnTargetX bundle is missing, run the first command beforehand to train it:

make train-ontargetx-azimuth
make eval-azimuth-auroc-helix-ontarget

Outputs (deterministic paths):

  • artifacts/evaluations/azimuth_public/physics_helix_batch_report.zip
  • artifacts/evaluations/azimuth_public/ontargetx_helix_batch_report.zip
  • artifacts/evaluations/azimuth_public/physics_transcript.txt
  • artifacts/evaluations/azimuth_public/ontargetx_transcript.txt
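A quick way to confirm that a run produced all four outputs is to test for each path. The following is a minimal sketch; `check_eval_outputs` is a hypothetical helper, not part of the Helix CLI, and it assumes the paths above are relative to the directory passed as its first argument (the current directory by default).

```shell
# Sketch only: check_eval_outputs is a hypothetical helper, not a Helix command.
# It reports which of the four documented outputs exist and are non-empty.
check_eval_outputs() {
  local root="${1:-.}"
  local missing=0
  local name path
  for name in \
    physics_helix_batch_report.zip \
    ontargetx_helix_batch_report.zip \
    physics_transcript.txt \
    ontargetx_transcript.txt
  do
    path="$root/artifacts/evaluations/azimuth_public/$name"
    if [ -s "$path" ]; then
      echo "ok       $path"
    else
      echo "missing  $path"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every output is present.
  [ "$missing" -eq 0 ]
}
```

Calling `check_eval_outputs` from the repository root after the evaluation target finishes should list four `ok` lines; any `missing` line makes the function return a non-zero status.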

Current scorer configuration (default)

OnTargetX (Helix) is evaluated as a source of activity-like scores for on-target guides:

  • embedding_id: seq30_v3_kmer2
  • model_kind: logit_residual_v2 (affine baseline + residual logit, novelty-gated)
  • Learner: logit_residual_logistic_l2_activity_v2 (trained against the pack’s continuous activity field; evaluated via pack labels)
  • OnTargetX bundle dir: models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2
  • bundle.json SHA256: a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b

Physics is included as the fixed baseline comparator:

  • score_source: helix_crispr_on_target_physics

Recorded results (pack-bound)

From make eval-azimuth-auroc-helix-ontarget:

  • Physics AUROC: macro 0.5277, pooled 0.5167
  • OnTargetX AUROC: macro 0.7489, pooled 0.7383
  • Δ macro AUROC: +0.2212 (computed only when pack id + digest match)
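As a quick check on the delta, and assuming Δ macro AUROC is the plain difference between the OnTargetX and Physics macro values (the recorded numbers are consistent with that reading):

```latex
\Delta_{\text{macro}}
  = \mathrm{AUROC}_{\text{macro}}^{\text{OnTargetX}} - \mathrm{AUROC}_{\text{macro}}^{\text{Physics}}
  = 0.7489 - 0.5277
  = 0.2212
```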

From make train-ontargetx-azimuth (holdout is the primary generalization gate):

  • Holdout macro AUROC: 0.7170
  • Full-pack macro AUROC: 0.7489
  • Full-pack pooled AUROC: 0.7383

To confirm that training is deterministic, train twice with the same container digest and check that both runs print identical hashes:

make train-ontargetx-azimuth
sha256sum \
  models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/tensors.npz \
  models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/training_receipt.sha256
make train-ontargetx-azimuth
sha256sum \
  models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/tensors.npz \
  models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/training_receipt.sha256
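The manual comparison above can be scripted. The following is a minimal sketch, not part of the Helix tooling: `digest_files` and `check_deterministic` are hypothetical helpers that run a command twice and compare a combined digest of the listed files after each run.

```shell
# Sketch only: digest_files and check_deterministic are hypothetical helpers,
# not part of the Helix CLI.

# Print one combined SHA256 over the individual digests of the given files.
digest_files() {
  local f
  for f in "$@"; do
    [ -f "$f" ] || { echo "missing file: $f" >&2; return 2; }
  done
  sha256sum "$@" | awk '{print $1}' | sha256sum | awk '{print $1}'
}

# Run CMD twice and compare the combined digest of FILES after each run.
# Usage: check_deterministic CMD FILE...
# Returns 0 on a match, 1 on a mismatch, 2 if CMD fails or a file is missing.
check_deterministic() {
  local cmd="$1"; shift
  local first second
  sh -c "$cmd" || return 2
  first="$(digest_files "$@")" || return 2
  sh -c "$cmd" || return 2
  second="$(digest_files "$@")" || return 2
  if [ "$first" = "$second" ]; then
    echo "deterministic: $first"
  else
    echo "MISMATCH: $first != $second" >&2
    return 1
  fi
}
```

To use it here, pass `'make train-ontargetx-azimuth'` as the command, followed by the same two file paths given to `sha256sum` above. The combined digest is only a convenience for comparison; it is not one of the digests recorded on this page.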

How to reproduce this transcript

make build-eval-pack-azimuth
make train-ontargetx-azimuth
make eval-azimuth-auroc-helix-ontarget

Expected digests (pack-bound):

  • Pack manifest SHA256: 4fc8e5c82ff2e0e4d6bb9ec000de3a5f873c1f5b00e7be46f3c0ba2de5bb2848
  • Labels SHA256: 2d4231b24fda7a429cb21cf5fce35db8c90f227d3d66d8650abb62aa95228a3d
  • bundle.json SHA256: a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b
  • tensors.npz SHA256: f862022e7517b0916a568ae5fc62f359d505fb6f8aaef7590139ca1c99ecd047
  • training_receipt.json SHA256: ed0c158639ed1c98e40da9898e1f6a7172d47cbdf065d9a011c82415676b0599
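The bundle digests listed above can be checked mechanically. The following is a minimal sketch using GNU `sha256sum --check`; `verify_digest` and `verify_ontargetx_bundle` are hypothetical helpers, and the sketch assumes that `bundle.json`, `tensors.npz`, and `training_receipt.json` all sit in the OnTargetX bundle directory. The pack manifest and labels are left out because this page does not give their file paths.

```shell
# Sketch only: both functions are hypothetical helpers, not Helix commands.

# Succeed only if FILE hashes to EXPECTED (hex SHA256). Relies on GNU sha256sum.
# Usage: verify_digest EXPECTED FILE
verify_digest() {
  printf '%s  %s\n' "$1" "$2" | sha256sum --check --status
}

# Check the three bundle files against the digests recorded on this page.
# Assumption: all three files live directly inside the bundle directory.
verify_ontargetx_bundle() {
  local dir="${1:-models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2}"
  local status=0
  verify_digest a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b \
    "$dir/bundle.json" || { echo "mismatch: bundle.json" >&2; status=1; }
  verify_digest f862022e7517b0916a568ae5fc62f359d505fb6f8aaef7590139ca1c99ecd047 \
    "$dir/tensors.npz" || { echo "mismatch: tensors.npz" >&2; status=1; }
  verify_digest ed0c158639ed1c98e40da9898e1f6a7172d47cbdf065d9a011c82415676b0599 \
    "$dir/training_receipt.json" || { echo "mismatch: training_receipt.json" >&2; status=1; }
  return "$status"
}
```

Running `verify_ontargetx_bundle` from the repository root after the three make targets returns zero only if every file matches. A missing file is reported as a mismatch.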