Benchmark surface

Seven unimodal benchmarks plus one tri-modal fusion run, each tied to a named dataset, model, and validation protocol.

Evidence
Per-modality benchmark accuracy

Eight evaluated settings across seven biomedical modalities and one fusion run.

Reported accuracies range from 0.610 (clinical text, 5-class classification with a 0.20 random baseline) to 0.993 (PathMNIST, 9-class). The Validation column records, for every row, the protocol that produced the number.

Modality | Dataset | Model | Accuracy | Validation
Clinical tabular | Synthea readmission (n=6,625) | XGBoost + Optuna | 0.777 | 5-fold CV, 0.774 ± 0.008
Omics | Visium breast cancer (n=3,798 spots) | VAE | 0.928 | Validation split, 14-class Leiden clustering
Knowledge graphs | Hetionet ego-graphs (n=1,913) | GAT | 0.815 | Validation split, center-node readout, 8-class
Histopathology | PathMNIST (n=107,180) | DenseNet-121 | 0.993 | Validation split, 9-class
Clinical text | MTSamples (n≈3,500) | ClinicalBERT | 0.610 | 5-class specialty, random baseline 0.20
Single-cell imaging | BloodMNIST (n=17,092) | DenseNet-121 | 0.969 | Test split, 8-class
Drug discovery | ChEMBL bioactivity (n=4,685) | GCN | 0.975 | Validation split, molecular graph classification
Multi-modal fusion | Visium (omics + vision + spatial) | Attention fusion | 0.930 | Validation split; vs. 0.928 omics-only on same dataset
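
For concreteness, the clinical tabular row's protocol can be sketched in a few lines: XGBoost tuned by Optuna, scored with 5-fold cross-validated accuracy. This is a minimal illustration, not the project's actual pipeline; the synthetic feature matrix, search space, and trial budget are all assumptions.

```python
# Minimal sketch of the clinical-tabular protocol: XGBoost tuned by Optuna,
# scored with 5-fold cross-validated accuracy. Synthetic data stands in for
# the Synthea readmission features; the search space and trial budget are
# illustrative assumptions, not the project's configuration.
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=6_625, n_features=30, random_state=0)

def objective(trial: optuna.Trial) -> float:
    model = xgb.XGBClassifier(
        n_estimators=trial.suggest_int("n_estimators", 100, 600),
        max_depth=trial.suggest_int("max_depth", 3, 10),
        learning_rate=trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        subsample=trial.suggest_float("subsample", 0.5, 1.0),
        eval_metric="logloss",
    )
    # 5-fold CV accuracy is the quantity reported in the table row.
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(f"best 5-fold CV accuracy: {study.best_value:.3f}")
```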
Sorted view

Per-modality accuracy, descending

The same eight numbers as the table, ordered by accuracy; bar widths are linear over [0, 1].

Histopathology · DenseNet-121: 0.993
Drug discovery · GCN: 0.975
Single-cell · DenseNet-121: 0.969
Fusion · attention (Visium): 0.930
Omics · VAE: 0.928
Knowledge graphs · GAT: 0.815
Clinical tabular · XGBoost: 0.777
Clinical text · ClinicalBERT: 0.610
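
The sorted view is reproducible from the table alone; a minimal matplotlib sketch is below. The figure styling is an assumption; the values are the eight accuracies above.

```python
# Minimal sketch reproducing the sorted bar view: horizontal bars over a
# linear [0, 1] axis, using the eight accuracies from the table above.
import matplotlib.pyplot as plt

rows = [
    ("Histopathology · DenseNet-121", 0.993),
    ("Drug discovery · GCN", 0.975),
    ("Single-cell · DenseNet-121", 0.969),
    ("Fusion · attention (Visium)", 0.930),
    ("Omics · VAE", 0.928),
    ("Knowledge graphs · GAT", 0.815),
    ("Clinical tabular · XGBoost", 0.777),
    ("Clinical text · ClinicalBERT", 0.610),
]
labels, accs = zip(*rows)

fig, ax = plt.subplots(figsize=(7, 4))
ax.barh(labels, accs)
ax.set_xlim(0, 1)   # bar widths are linear over [0, 1]
ax.invert_yaxis()   # highest accuracy on top
ax.set_xlabel("Accuracy")
fig.tight_layout()
plt.show()
```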
Reading these numbers

Benchmark accuracy, not clinical performance

Every row reports accuracy on a held-out split or cross-validation protocol of a public or synthetic dataset. These are infrastructure and methodology measurements that show the framework can train, evaluate, and report across heterogeneous biomedical modalities under one execution surface. They are not estimates of clinical utility. Clinical evaluation, where applicable, lives on the Validation page.

Pipeline that produced these numbers

Every row in the table above is the output of the same staged DAG. Stage transitions persist artifacts; a feedback edge from reporting drives next-run selection.

[Pipeline diagram. Execution spine: Snakemake DAG, DVC artifacts, MLflow tracking. Stages: Ingestion (adapters) → Training (models) → Evaluation (metrics) → Diagnostics (calibration) → Aggregation (summaries) → Reporting (bundles) → Feedback (next-run selection). Closed-loop feedback: reporting drives next-run selection.]
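
As an illustration of the tracking leg of the spine only: each benchmark run could be logged to MLflow roughly as below. The experiment name, tags, and run naming are hypothetical, not the project's actual conventions.

```python
# Illustrative sketch of the MLflow tracking side of the spine: one run per
# benchmark, with the accuracy and validation protocol recorded as metric
# and tags. Experiment name, tags, and run name are hypothetical.
import mlflow

mlflow.set_experiment("benchmark-surface")  # hypothetical experiment name

with mlflow.start_run(run_name="histopathology-pathmnist"):
    mlflow.set_tag("modality", "histopathology")
    mlflow.set_tag("validation", "validation split, 9-class")
    mlflow.log_param("model", "DenseNet-121")
    mlflow.log_metric("accuracy", 0.993)
    # Reporting later reads these runs to assemble the bundle and the table.
```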
Artifact handoffs

Each stage persists state

Ingestion writes a dataset state, training writes a model, evaluation writes metrics, and reporting writes the bundle that produced the rows above. Stage transitions cross artifact boundaries rather than passing in-process state.
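
A minimal sketch of what crossing an artifact boundary means here, with hypothetical paths and payloads: each stage reads only the previous stage's persisted file and writes its own.

```python
# Minimal sketch of artifact-bounded stage transitions: each stage reads the
# previous stage's persisted file and writes its own; no in-process state is
# shared between stages. Paths and payloads are hypothetical placeholders.
import json
from pathlib import Path

def evaluation_stage(model_path: Path, data_path: Path, metrics_path: Path) -> None:
    """Consume training's and ingestion's artifacts, emit evaluation's artifact."""
    model = json.loads(model_path.read_text())  # stand-in for a real model load
    data = json.loads(data_path.read_text())    # stand-in for a dataset state
    metrics = {"model": model["name"], "accuracy": 0.0, "n_eval": len(data)}  # placeholder
    metrics_path.write_text(json.dumps(metrics))

def reporting_stage(metrics_path: Path, bundle_path: Path) -> None:
    """Consume evaluation's artifact, emit the reporting bundle."""
    metrics = json.loads(metrics_path.read_text())
    bundle_path.write_text(json.dumps({"rows": [metrics]}))
```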

Feedback edge

Reporting drives next-run selection

The feedback path links the reporting bundle back to next-run selection through declared interfaces. It influences which experiments run next; it does not bypass the staged sequence or the artifact record.
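
One hedged reading of that interface, with a hypothetical bundle schema: the selector consumes the reporting bundle, emits a rerun list, and touches nothing upstream.

```python
# Hedged sketch of the feedback edge: next-run selection reads only the
# reporting bundle (a declared interface) and proposes which benchmarks to
# rerun; it does not bypass the staged sequence. Bundle schema and the
# threshold policy are hypothetical.
import json
from pathlib import Path

def select_next_runs(bundle_path: Path, threshold: float = 0.90) -> list[str]:
    """Propose reruns for modalities whose accuracy falls below a threshold."""
    bundle = json.loads(bundle_path.read_text())
    return [
        row["modality"]
        for row in bundle["rows"]
        if row["accuracy"] < threshold
    ]

# With the table's numbers and a 0.90 threshold, clinical text (0.610),
# clinical tabular (0.777), and knowledge graphs (0.815) would be queued.
```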

Delivery state of the program

Phase markers for the build that produced the table above. Status is reported as complete or partial; nothing here implies clinical or production readiness.

Phase | Name | Status
A | Data foundation | Complete
A.6 | Theoretical hardening | Complete
B | Intelligence layer | Complete
B.6 | Theoretical intelligence | Complete
C | Multi-modal expansion | Complete
C.6 | Multi-modal hardening | Complete
D | Deployment & compliance | Partial
D.6 | Deployment hardening | Partial
E | Platform & scale | Partial
Status taxonomy

Complete means the phase has shipped artifacts that the table above depends on. Partial means in-progress work toward declared deliverables; deployment, compliance posture, and platform scaling are not finished and are not claimed as production-ready.