Azimuth public benchmark (in silico, pack-bound)
Helix supports deterministic, offline-verifiable AUROC evaluation against frozen public benchmarks via evaluation packs.
This page records the current, externally defensible Azimuth replication on the frozen evaluation pack:
- evaluation_pack_id: helix_eval_v1_azimuth_public
- Pack manifest SHA256: 4fc8e5c82ff2e0e4d6bb9ec000de3a5f873c1f5b00e7be46f3c0ba2de5bb2848
- Labels SHA256: 2d4231b24fda7a429cb21cf5fce35db8c90f227d3d66d8650abb62aa95228a3d
!!! important
    AUROC reported by Helix is computed only against a declared in silico evaluation pack. It is not a claim of biological validation, real‑world performance, or real‑world accuracy.
One-command run (evidence bundle emitted)
If the OnTargetX bundle is missing, train it first, then run the evaluation:
make train-ontargetx-azimuth
make eval-azimuth-auroc-helix-ontarget
Outputs (deterministic paths):
- artifacts/evaluations/azimuth_public/physics_helix_batch_report.zip
- artifacts/evaluations/azimuth_public/ontargetx_helix_batch_report.zip
- artifacts/evaluations/azimuth_public/physics_transcript.txt
- artifacts/evaluations/azimuth_public/ontargetx_transcript.txt
Current scorer configuration (default)
OnTargetX (Helix) is evaluated as an on-target guide activity-like score source:
- embedding_id: seq30_v3_kmer2
- model_kind: logit_residual_v2 (affine baseline + residual logit, novelty-gated)
- Learner: logit_residual_logistic_l2_activity_v2 (trained against the pack’s continuous activity field; evaluated via pack labels)
- OnTargetX bundle dir: models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2
- bundle.json SHA256: a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b
Physics is included as the fixed baseline comparator:
- score_source: helix_crispr_on_target_physics
Recorded results (pack-bound)
From make eval-azimuth-auroc-helix-ontarget:
- Physics AUROC: macro 0.5277, pooled 0.5167
- OnTargetX AUROC: macro 0.7489, pooled 0.7383
- Δ macro AUROC: +0.2212 (computed only when pack id + digest match)
From make train-ontargetx-azimuth (holdout is the primary generalization gate):
- Holdout macro AUROC: 0.7170
- Full-pack macro AUROC: 0.7489
- Full-pack pooled AUROC: 0.7383
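
Macro and pooled AUROC can differ on the same predictions. The sketch below illustrates one common reading, and it is an assumption rather than Helix's confirmed definition: macro is the unweighted mean of per-group AUROCs, and pooled is a single AUROC over all examples together. The grouping key, the function names (`auroc`, `macro_and_pooled`, `gated_delta`), and the toy data are all hypothetical. The last function mirrors the rule that the delta is reported only when pack id and digest match.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(lab for _, lab in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def macro_and_pooled(records):
    """records: iterable of (group, label, score); the group key is an assumption."""
    records = list(records)
    by_group = defaultdict(list)
    for group, label, score in records:
        by_group[group].append((label, score))
    per_group = [
        auroc([lab for lab, _ in rows], [s for _, s in rows])
        for rows in by_group.values()
    ]
    macro = sum(per_group) / len(per_group)
    pooled = auroc([lab for _, lab, _ in records], [s for _, _, s in records])
    return macro, pooled


def gated_delta(candidate_macro, baseline_macro, candidate_pack, baseline_pack):
    """Return the macro AUROC difference only if both runs name the same
    (pack id, manifest digest) pair; otherwise return None."""
    if candidate_pack != baseline_pack:
        return None
    return candidate_macro - baseline_macro


# Toy data: each group is ranked perfectly, but group "B" scores sit on a
# higher scale, so pooling the groups mixes a "B" negative above an "A" positive.
toy = [
    ("A", 0, 0.10), ("A", 1, 0.90),
    ("B", 0, 0.95), ("B", 1, 0.99),
]
macro, pooled = macro_and_pooled(toy)
print(macro, pooled)  # 1.0 0.75
```

With the figures recorded on this page, `gated_delta(0.7489, 0.5277, pack, pack)` gives approximately 0.2212 when `pack` is the same `(evaluation_pack_id, manifest SHA256)` tuple on both sides, and `None` otherwise.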
Training determinism check (recommended once per release)
Train twice with the same container digest and confirm hashes match:
make train-ontargetx-azimuth
sha256sum \
models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/tensors.npz \
models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/training_receipt.sha256
make train-ontargetx-azimuth
sha256sum \
models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/tensors.npz \
models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2/training_receipt.sha256
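
Comparing the two `sha256sum` outputs by eye is error-prone, so the comparison can be scripted. The following is a minimal sketch, not part of Helix: `sha256_file`, `snapshot`, and `changed_files` are hypothetical helper names, and the default file list simply mirrors the two files hashed above.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tensor files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(bundle_dir, names=("tensors.npz", "training_receipt.sha256")):
    """Record the SHA256 of each named file inside bundle_dir."""
    return {name: sha256_file(Path(bundle_dir) / name) for name in names}


def changed_files(before, after):
    """Names whose digest differs (or is absent) between two snapshots."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))
```

Take one `snapshot` of the bundle directory after each training run; training is deterministic for these files if `changed_files(first, second)` returns an empty list.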
How to reproduce this transcript
make build-eval-pack-azimuth
make train-ontargetx-azimuth
make eval-azimuth-auroc-helix-ontarget
Expected digests (pack-bound):
- Pack manifest SHA256: 4fc8e5c82ff2e0e4d6bb9ec000de3a5f873c1f5b00e7be46f3c0ba2de5bb2848
- Labels SHA256: 2d4231b24fda7a429cb21cf5fce35db8c90f227d3d66d8650abb62aa95228a3d
- bundle.json SHA256: a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b
- tensors.npz SHA256: f862022e7517b0916a568ae5fc62f359d505fb6f8aaef7590139ca1c99ecd047
- training_receipt.json SHA256: ed0c158639ed1c98e40da9898e1f6a7172d47cbdf065d9a011c82415676b0599
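
A reproduction can be checked against the expected bundle digests programmatically. The sketch below is an illustration under assumptions: `digest_mismatches` is a hypothetical helper, and it assumes `bundle.json`, `tensors.npz`, and `training_receipt.json` all sit directly in the OnTargetX bundle directory. The pack manifest and labels files are left out because their paths are not given on this page.

```python
import hashlib
from pathlib import Path

# Digests copied from the expected-digest list; the file layout is an assumption.
EXPECTED = {
    "bundle.json": "a240b735c09f8cafec4a737fe60825cd14223aad8c5e2b3072375dfe3180e57b",
    "tensors.npz": "f862022e7517b0916a568ae5fc62f359d505fb6f8aaef7590139ca1c99ecd047",
    "training_receipt.json": "ed0c158639ed1c98e40da9898e1f6a7172d47cbdf065d9a011c82415676b0599",
}


def digest_mismatches(bundle_dir, expected=EXPECTED):
    """Map file name -> (expected, actual) for every file whose SHA256 differs.

    actual is None when the file does not exist.
    """
    mismatches = {}
    for name, want in expected.items():
        path = Path(bundle_dir) / name
        got = hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

An empty result from `digest_mismatches("models/ontargetx/helix_eval_v1_azimuth_public/v3_kmer2_logistic_activity_v2")` would indicate that the three bundle files match the digests recorded here.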