Technical writing from the team that builds the models.

Binding affinity, ADMET prediction, off-target screening, and the mechanics of running a target-to-lead campaign in silico — written by the people doing the work, not a content team.

Scaffold Hopping with Generative Models: When It Works and When It Hallucinates

Generative models have changed scaffold hopping throughput and scope — but the hallucination problem is real. Here's what distinguishes proposals that bind from those that only satisfy pharmacophore geometry.

Marcus Osei

ADMET Species Translation: Why Rat Liver Microsomes Don't Predict Human Clearance

Rat liver microsome data is cheap and fast. It's also an unreliable predictor of human hepatic clearance for a meaningful fraction of drug-like chemical space. What drives the gap and how we model the translation.

Lena Buchhardt

Multi-Objective Lead Optimization: Balancing Potency, Selectivity, and Synthesizability

Potency, selectivity, and synthesizability don't point in the same direction. We use Pareto frontier methods to navigate that tension — and the frontier reveals trade-off structure that composite scoring hides.

Siddhartha Mukherjee

Allosteric Site Discovery with Graph Neural Networks: Lessons from 40 Targets

We ran GNN-based allosteric site detection across 40 protein targets. Here's what the models find reliably, where they miss, and what we still need crystallography to confirm.

Priya Venkataraman

KRAS G12C: Why Covalent Inhibitor Design Rewards Computational Methods

KRAS G12C was considered undruggable until covalent chemistry targeting the mutant cysteine exposed the switch-II pocket of the GDP-bound protein. The covalent warhead placement problem is exactly where physics-based and ML scoring models diverge most.

Marcus Osei

Where Computational Drug Discovery Actually Saves Money (and Where It Doesn't)

The $2.6 billion drug development cost figure is often cited to justify every computational tool ever made. An honest accounting of where in silico methods genuinely reduce cost.

Siddhartha Mukherjee

What IND-Enabling Studies Look Like When Your Lead Came From a Computer

When a candidate reaches IND-enabling stage having never been synthesized before, the wet-lab package looks different. Here's what preclinical teams should expect.

Lena Buchhardt

Computational Hit-to-Lead: How We Do SAR Without Synthesizing Anything

Structure-activity relationship optimization traditionally requires iterative synthesis rounds. We do SAR computationally — generating and scoring scaffold variants until the model converges.

Siddhartha Mukherjee

Beyond Virtual Screening: The Case for Integrated In Silico Campaigns

Running binding prediction, ADMET prediction, and off-target screening as separate tools with manual hand-offs loses information at every step. Here's what changes when you treat them as one integrated model.

Priya Venkataraman

Pan-Proteome Off-Target Screening: What 2,300 Structures Reveal

22% of Phase II failures trace to selectivity. We run off-target docking against 2,300 protein structures before synthesis. This is what that screen consistently surfaces — and what it misses.

Marcus Osei

AlphaFold Changed Our Inputs. It Didn't Change the Scoring Problem.

AlphaFold-predicted structures now routinely feed docking pipelines. The structural coverage problem is largely solved. The scoring function problem — predicting whether a ligand actually binds — is not.

Lena Buchhardt

Running Target-to-Lead Entirely In Silico: What It Actually Takes

Three years ago, purely computational target-to-lead was a stretch claim. Today it's a workflow. Here's what the actual computational stack looks like and where the remaining hard problems are.

Siddhartha Mukherjee

Physics-Based Scoring vs. Machine Learning: Not a Competition

The debate between force-field docking and GNN scoring misses the point. The real question is where each method fails and how to combine them. We've run both on the same targets.

Siddhartha Mukherjee

ADMET Prediction in 2024: Eight Endpoints Are Not Enough

Standard ADMET tools cover CYP inhibition, solubility, and a few permeability endpoints. We surveyed the causes of Phase I failures over 10 years and found the endpoints that keep being missed.

Priya Venkataraman

A Practitioner's Guide to Binding Affinity Prediction Models in 2024

AutoDock Vina, Glide, FEP+, GNN-based models — a systematic comparison of what each does well, where each breaks down, and the training data coverage problem that most comparisons miss.

Marcus Osei

Why 95% of Virtual Screening Hits Fail When They Reach the Wet Lab

The hit rate problem isn't a wet lab problem — it's a scoring function problem. Here's why standard docking scores correlate poorly with experimental binding, and what the data says about fixing it.

Siddhartha Mukherjee