Methods
Predicting what chemists used to have to synthesize.
Our computational stack integrates structure-based docking, QSAR ensemble modeling, and off-target proteome screening into one ranked output.
Architecture
From PDB structure to ranked dossier: seven stages, one data pipeline.
Binding Affinity
Binding affinity from graph-level protein-ligand representation
We represent each protein-ligand complex as a heterogeneous graph: atoms as nodes, bonds and inter-atomic contacts as edges. A message-passing GNN propagates atom-level features through 8 layers, then pools them into a complex-level ΔG estimate. The model is trained on 14.3M complexes drawn from PDB, ChEMBL, and proprietary assay sets. On a held-out PDB benchmark it reaches Pearson r = 0.91, versus 0.76 for standard AutoDock Vina.
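To make the graph representation concrete, here is a minimal sketch of how atoms, bonds, and contacts could be turned into typed edges, followed by an 8-round propagate-and-pool pass. It is an illustration under stated assumptions, not the production model: the 4.5 Å contact cutoff, the `Atom` class, and both function names are hypothetical, and the unweighted averaging stands in for the learned transforms a trained GNN would apply.

```python
import math
from dataclasses import dataclass

CONTACT_CUTOFF = 4.5  # angstroms; assumed value, not taken from the methods text


@dataclass(frozen=True)
class Atom:
    element: str
    xyz: tuple       # (x, y, z) coordinates in angstroms
    is_ligand: bool  # node type: ligand atom vs. protein atom


def build_edges(atoms, bonds, cutoff=CONTACT_CUTOFF):
    """Return typed edges: the given covalent bonds plus non-bonded contacts within cutoff."""
    bonded = {frozenset(pair) for pair in bonds}
    edges = [(i, j, "bond") for i, j in bonds]
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if frozenset((i, j)) in bonded:
                continue  # already present as a bond edge
            if math.dist(atoms[i].xyz, atoms[j].xyz) <= cutoff:
                edges.append((i, j, "contact"))
    return edges


def propagate_and_pool(features, edges, layers=8):
    """Average each node with its neighbors for `layers` rounds, then mean-pool.

    A trained GNN would apply learned, edge-type-specific transforms at each
    layer; this unweighted update only shows the data flow.
    """
    neighbors = {i: [] for i in range(len(features))}
    for i, j, _kind in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    h = list(features)
    for _ in range(layers):
        h = [
            0.5 * h[i] + 0.5 * sum(h[j] for j in neighbors[i]) / len(neighbors[i])
            if neighbors[i] else h[i]  # isolated nodes keep their value
            for i in range(len(h))
        ]
    return sum(h) / len(h)  # one scalar per complex


# Toy complex: two bonded ligand atoms near one protein atom, one distant protein atom.
atoms = [
    Atom("C", (0.0, 0.0, 0.0), True),
    Atom("O", (1.2, 0.0, 0.0), True),
    Atom("N", (3.0, 0.0, 0.0), False),
    Atom("C", (10.0, 0.0, 0.0), False),
]
edges = build_edges(atoms, bonds=[(0, 1)])
pooled = propagate_and_pool([1.0, 0.0, 0.0, 0.0], edges)
```

In a real model the node features would be vectors (element, charge, hybridization, and so on) and the pooled value would pass through a regression head to give ΔG.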
ADMET
48 ADMET endpoints, not just logP and solubility
Most computational ADMET tools cover 8-12 endpoints. We run 48. The additional endpoints, including reactive metabolite flags, time-dependent CYP inhibition, mitochondrial toxicity, and phototoxicity, are routinely missed early and surface as Phase I surprises. Predictions come from an ensemble of six model architectures (RF, XGBoost, MPNN, AttentiveFP, GCN, DNN) with calibrated uncertainty estimates. Twelve of the 48 endpoints are listed below.
| Endpoint | Category |
|---|---|
| CYP3A4 inhibition | Metabolic |
| CYP2D6 inhibition | Metabolic |
| hERG block | Cardiotoxicity |
| PAMPA permeability | Absorption |
| Pgp substrate | Transport |
| BBB penetration | Distribution |
| Plasma protein binding | Distribution |
| Aqueous solubility (kinetic) | Physical |
| Reactive metabolite flag | Metabolic |
| Hepatotoxicity flag | Toxicity |
| Phototoxicity flag | Toxicity |
| Mitochondrial toxicity | Toxicity |
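As a rough illustration of how a six-architecture ensemble can yield both a prediction and an uncertainty signal for one endpoint, the sketch below averages per-model probabilities and uses their spread as a disagreement measure. The probabilities, the `ensemble_predict` helper, and the 0.15 review threshold are all made up for the example. Raw spread is only a proxy: calibrating it against held-out experimental data is a separate step not shown here.

```python
import statistics

REVIEW_THRESHOLD = 0.15  # hypothetical cutoff for flagging model disagreement


def ensemble_predict(per_model_probs):
    """Return (mean probability, population std dev) across ensemble members."""
    values = list(per_model_probs.values())
    return statistics.fmean(values), statistics.pstdev(values)


# Illustrative hERG-block probabilities for one compound (not real output).
herg_probs = {
    "RF": 0.62,
    "XGBoost": 0.58,
    "MPNN": 0.71,
    "AttentiveFP": 0.66,
    "GCN": 0.69,
    "DNN": 0.64,
}

mean_prob, spread = ensemble_predict(herg_probs)
needs_review = spread > REVIEW_THRESHOLD  # high disagreement -> treat prediction with caution
```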
Off-Target
2,300 off-target structures. Screened before synthesis.
Selectivity failures cause 22% of Phase II discontinuations. We run pan-proteome docking against 2,300 non-redundant protein structures spanning the major target classes, including kinases (519), GPCRs (387), nuclear receptors (48), ion channels (156), proteases (234), and epigenetic regulators (211). For each candidate we report a selectivity index (SI): the ratio of off-target docking score to primary target score. This screen does not replace experimental selectivity assays. It prioritizes which off-targets to test first, reducing the number of counterscreens from hundreds to a targeted panel of 8–12.
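The selectivity index and the panel-selection step can be sketched in a few lines. The example assumes docking scores are binding energies where more negative means stronger predicted binding, so an SI near or above 1 marks an off-target that is predicted to bind about as well as the primary target. That sign convention, the target names, the scores, and both function names are assumptions for illustration.

```python
def selectivity_index(off_target_score, primary_score):
    """SI = off-target docking score / primary target docking score."""
    if primary_score == 0:
        raise ValueError("primary target score must be non-zero")
    return off_target_score / primary_score


def prioritize_counterscreens(primary_score, off_target_scores, panel_size=12):
    """Rank off-targets by SI (highest first) and keep the top `panel_size`."""
    scored = [
        (name, selectivity_index(score, primary_score))
        for name, score in off_target_scores.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:panel_size]


# Hypothetical scores for one candidate (more negative = stronger predicted binding).
primary = -10.0
off_targets = {
    "hypothetical_kinase_A": -9.0,
    "hypothetical_GPCR_B": -5.0,
    "hypothetical_protease_C": -7.5,
}

panel = prioritize_counterscreens(primary, off_targets, panel_size=2)
panel_names = [name for name, _si in panel]
```

With these toy numbers the kinase and the protease would go to the counterscreen panel first, and the GPCR would be deprioritized.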
Validation
Our models are validated against experimental data, not internal benchmarks.
Validation is prospective: we score candidates, then compare against experimental results from our collaborating wet labs. We do not back-fill training data with our own predictions.
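One simple way to score a prospective comparison is the Pearson correlation between predictions made before the experiment and the values measured afterward. The sketch below shows that calculation in plain Python. The `pearson_r` helper is illustrative, and the full validation protocol may use additional metrics.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between predicted and experimentally measured values."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined when either series is constant")
    return cov / math.sqrt(var_p * var_m)
```

The key point for a prospective design is ordering: `predicted` must be frozen before `measured` exists, so the score cannot be influenced by the experimental outcome.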
Request the methods brief.
We share a technical methods document — model architecture details, training data provenance, validation protocols, and calibration certificates — with research collaborators. Write to us with your institutional email and the target class you're working on.