Research Notes

What IND-Enabling Studies Look Like When Your Lead Came From a Computer

Lena Buchhardt

There's a moment in every computational drug discovery program when the transition from in silico to IND-enabling package forces a confrontation with what the computational dossier actually is and isn't. Up to that point, the conversation is largely internal: does the model predict good binding? Does the ADMET profile look acceptable? Does the selectivity index clear our threshold? These are questions the team answers with its own tools and judgment.

The IND-enabling package changes the audience. Now you're writing for an FDA reviewer who may never have encountered a program structured around computational lead identification, CRO partners who need to run GLP toxicology studies on a compound they haven't seen synthesized, and eventually clinical investigators who need to explain the compound's safety basis to an IRB. The computational dossier needs to translate into a safety story that those audiences can evaluate on its own terms: not as a replacement for standard preclinical data, but as context that shapes the interpretation of it.

This post is about how that translation works in practice: what the IND-enabling experimental package looks like for a computationally derived lead, where the handoff from computational to experimental happens, and what the structural biology component contributes to the preclinical narrative.

What IND-Enabling Studies Require

The IND-enabling package under 21 CFR Part 312 requires, at minimum: adequate chemistry, manufacturing, and controls (CMC) information; pharmacology and toxicology data sufficient to establish a reasonable basis for human safety; and clinical protocols and investigator information. The toxicology component typically includes GLP single-dose and 14- to 28-day repeat-dose tox studies in two species (one rodent, one non-rodent), genotoxicity studies (Ames test, in vitro clastogenicity, in vivo micronucleus), and safety pharmacology studies covering the cardiac, CNS, and respiratory systems per ICH S7A guidelines.

None of this changes because the lead was computationally derived. The GLP tox studies are identical in design and scope to those run for any small molecule candidate. What changes is the thickness and character of the supporting context that accompanies those studies.

For a classically derived lead — coming from a medicinal chemistry optimization campaign with 3-4 rounds of synthesis and in vitro profiling — the IND package contains extensive experimental SAR data: potency series, metabolite identification studies, selectivity counterscreen panels, in vitro DMPK profiling. That data history tells the safety reviewer how the compound was arrived at, what was tried and rejected, and what the experimental basis is for believing the liabilities of earlier compounds have been addressed.

For a computationally derived lead, that experimental SAR history is thinner by design — that's the point of the approach. You synthesized fewer compounds, ran fewer experimental cycles. The compensating element is the computational dossier: the scoring models, the training data they were built on, the pan-proteome selectivity screen results, the predicted versus measured values for the few compounds that were synthesized during model validation. This is not a substitute for experimental data — it's a documented basis for believing the computational predictions are calibrated and the lead was selected by a principled process.

The Structural Biology Contribution

From a structural biology standpoint, the most valuable IND-enabling study for a computationally derived lead is the co-crystal structure of the lead compound in complex with the target protein. This does several things simultaneously.

First, it validates the computational binding mode. If the co-crystal pose matches the predicted docking pose within 1.5-2 Å RMSD, you have direct evidence that the scoring model found the correct binding geometry. If it diverges significantly, you have a calibration problem that needs to be understood before the IND package is submitted — because it raises questions about whether the selectivity predictions, which depend on the same scoring model, are reliable.
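As a rough illustration of the pose comparison described above, the following Python sketch computes an in-place heavy-atom RMSD between a predicted docking pose and the ligand pose from the co-crystal structure. It assumes both poses are already in the same protein reference frame (the protein structures were superposed beforehand), that atoms are listed in matching order, and that symmetry-equivalent atoms are not a concern. The function names and the 2.0 Å default cutoff (the upper end of the range above) are illustrative choices, not part of any specific toolkit.

```python
import math


def heavy_atom_rmsd(predicted, observed):
    """In-place RMSD in angstroms between two matched lists of (x, y, z) coordinates.

    No superposition is performed: both poses are assumed to share the
    protein reference frame and to list atoms in the same order.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared_sum = 0.0
    for (px, py, pz), (ox, oy, oz) in zip(predicted, observed):
        squared_sum += (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2
    return math.sqrt(squared_sum / len(predicted))


def classify_pose(rmsd, agree_cutoff=2.0):
    """Label a pose as consistent with the crystal structure or divergent.

    The cutoff is an assumption; a program would set its own value.
    """
    return "consistent" if rmsd <= agree_cutoff else "divergent"


# Hypothetical two-atom example: every atom is displaced by 1 angstrom along z.
docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = heavy_atom_rmsd(docked, crystal)
print(f"RMSD = {rmsd:.2f} A, pose is {classify_pose(rmsd)}")
```

A divergent result here would be the trigger for the calibration investigation described above, since the selectivity predictions rest on the same scoring model.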

Second, it grounds the mechanism section of the IND application. An X-ray or cryo-EM structure showing exactly how the lead compound contacts the target active site is the most compelling mechanistic evidence available for a small molecule program. It demonstrates target engagement at atomic resolution, shows the structural basis for selectivity over related family members (if their structures are available for comparison), and gives the scientific advisory board (SAB), collaborating clinicians, and FDA reviewers a concrete visual anchor for the mechanism of action narrative.

Third, it informs metabolic soft spot identification. Co-crystal structures often reveal solvent-exposed regions of the bound compound that are accessible for CYP-mediated metabolism. Identifying these regions from the crystal structure, rather than purely from microsomal incubation, allows the synthetic chemistry team to focus metabolic stabilization efforts efficiently — blocking the soft spots that the structure predicts are accessible rather than running a broad synthetic screen.

In our experience, X-ray crystallography is preferable to cryo-EM for small molecule lead structures below ~600 Da when crystals can be obtained, because resolution and unambiguous ligand placement are typically better with synchrotron data. For membrane proteins or targets that don't crystallize well in complex with small molecules, cryo-EM is increasingly viable at resolutions sufficient for lead-quality density. We aim for at least one structure at 2.2 Å or better before the IND is submitted. Structures at 2.8 Å or coarser are acceptable with appropriate uncertainty quantification, but they do not definitively resolve binding mode ambiguities.

PK Profiling: In Vivo Follows In Silico

Pharmacokinetic profiling for an IND candidate typically involves single-dose IV and oral administration in mouse and rat (or other species depending on the therapeutic context), returning half-life, oral bioavailability, volume of distribution, and clearance rate. For a computationally derived lead, the predicted ADMET profile should be compared explicitly against the in vivo PK results as a calibration check.
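The parameters listed above are related through standard noncompartmental formulas, and the calibration check amounts to asking whether a measured value falls inside the model's predicted range. The Python sketch below shows one minimal way to express both. All numeric inputs are hypothetical, units are assumed to be consistent (for example mg/kg, mg·h/L, and 1/h), and the function names are illustrative rather than drawn from any PK package.

```python
import math


def clearance(dose_iv, auc_iv):
    """Systemic clearance from an IV dose: CL = Dose_iv / AUC_iv."""
    return dose_iv / auc_iv


def terminal_half_life(lambda_z):
    """Terminal half-life from the terminal elimination rate constant."""
    return math.log(2) / lambda_z


def volume_of_distribution(dose_iv, auc_iv, lambda_z):
    """Terminal-phase volume of distribution: Vz = CL / lambda_z."""
    return clearance(dose_iv, auc_iv) / lambda_z


def oral_bioavailability(auc_po, dose_po, auc_iv, dose_iv):
    """Absolute oral bioavailability as a fraction: dose-normalized AUC ratio."""
    return (auc_po / dose_po) / (auc_iv / dose_iv)


def calibration_flag(measured, predicted_low, predicted_high):
    """Report where a measured value sits relative to the predicted range."""
    if measured < predicted_low:
        return "below predicted range"
    if measured > predicted_high:
        return "above predicted range"
    return "within predicted range"


# Hypothetical study values, for illustration only.
f_oral = oral_bioavailability(auc_po=8.0, dose_po=10.0, auc_iv=4.0, dose_iv=2.0)
print(f"F = {f_oral:.2f}:", calibration_flag(f_oral, 0.35, 0.50))
print(f"CL = {clearance(2.0, 4.0):.2f}, t1/2 = {terminal_half_life(0.2):.2f} h")
```

Recording the flag for every predicted parameter, not only the ones that agree, keeps the calibration record honest when it is later summarized for the IND package.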

In a recent oncology program, our lead candidate had predicted high microsomal stability (t½ > 60 min in human liver microsomes) and a predicted oral bioavailability of 42-55% based on the ADMET model. In vivo mouse PK returned t½ = 4.8 hours after IV dosing and an oral bioavailability of 38%, just below the lower end of the predicted range. The model had also predicted low P-gp efflux for this compound class, which the PK data supported. These calibration results become part of the IND package as evidence of model reliability.

Where predicted and measured PK diverge, the discrepancy needs to be explained, not dismissed. If the model predicted high oral bioavailability but in vivo bioavailability is low, you need to determine whether first-pass metabolism, poor aqueous solubility, or active efflux is responsible. Each has a different structural chemistry implication and each affects how you interpret the selectivity predictions for the same compound class.

GLP Tox Studies: What the Computational Dossier Contributes

GLP toxicology studies are conducted under standard ICH M3(R2) guidance, and the core study design is not altered by the computational origin of the lead. However, the computational off-target screen results can materially inform dose selection and monitoring parameters for GLP tox studies.

If the pan-proteome docking screen flagged potential activity against a specific receptor class (for example, moderate predicted docking scores against some adrenergic receptor subtypes), monitoring in the GLP tox study should include blood pressure and heart rate telemetry at doses above the expected therapeutic range. This is not a change to the GLP study design per se; these readouts are standard in repeat-dose tox anyway. What changes is the basis for paying close attention to specific readouts, and the doses at which you expect to see adverse signals if the computational prediction is correct.

We structure the computational dossier contribution to GLP tox as a "watch list" — off-target predictions above the selectivity concern threshold, presented with predicted vs. measured binding data where available for those targets, translated into specific monitoring parameters and histopathology focus areas for the CRO. The CRO's pathologists can then pay targeted attention to tissues where predicted off-target activity would be expected to manifest, rather than discovering unexpected histopathology during full slide review.
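One way to picture the watch list is as a small structured record per flagged target. The Python sketch below is a hypothetical representation, not our actual dossier format: the field names, the normalized "concern" score (assumed here to run from 0 to 1, with higher meaning stronger predicted off-target binding), the 0.6 threshold, and all example values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WatchListEntry:
    target: str
    predicted_concern: float          # assumed normalized score, higher = more concern
    measured_ki_nm: Optional[float]   # None when no experimental binding data exist
    monitoring: List[str]             # in-life readouts for the CRO to watch
    histopathology_focus: List[str]   # tissues for targeted pathology review


def build_watch_list(predicted_concern: Dict[str, float],
                     measured_ki_nm: Dict[str, float],
                     cro_plan: Dict[str, Dict[str, List[str]]],
                     threshold: float) -> List[WatchListEntry]:
    """Keep targets at or above the threshold, ordered by descending concern."""
    entries = []
    ranked = sorted(predicted_concern.items(), key=lambda item: item[1], reverse=True)
    for target, score in ranked:
        if score < threshold:
            continue
        plan = cro_plan.get(target, {})
        entries.append(WatchListEntry(
            target=target,
            predicted_concern=score,
            measured_ki_nm=measured_ki_nm.get(target),
            monitoring=list(plan.get("monitoring", [])),
            histopathology_focus=list(plan.get("histopathology", [])),
        ))
    return entries


# Hypothetical inputs: adrenergic subtypes flagged by an off-target screen.
predicted = {"ADRA1A": 0.72, "ADRA2A": 0.41, "ADRB1": 0.65}
measured = {"ADRA1A": 850.0}
plan = {"ADRA1A": {"monitoring": ["blood pressure telemetry", "heart rate telemetry"],
                   "histopathology": ["heart", "vasculature"]}}

watch_list = build_watch_list(predicted, measured, plan, threshold=0.6)
for entry in watch_list:
    print(entry)
```

An entry that comes back with empty monitoring or histopathology lists, as ADRB1 does in this example, is a prompt to agree on readouts with the CRO before the study starts.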

The Regulatory Conversation

FDA has not issued specific guidance on the use of computational lead identification in IND applications — the IND requirements remain as written. In pre-IND meetings for programs that explicitly frame their leads as computationally derived, the questions we've encountered from FDA reviewers cluster around three areas: the training data quality and coverage for the binding and ADMET models used; the validation evidence showing the models are calibrated (predicted vs. measured for a validation set of compounds, not just the lead itself); and the mechanism by which off-target liabilities were screened.

These are reasonable questions, and the computational dossier should address all three explicitly rather than relying on reviewers to infer them from the methods description. We structure the computational section of our IND applications with a model validation subsection upfront — before the lead-specific predictions — precisely because establishing model credibility is a prerequisite for the lead-specific results being interpretable.
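The predicted-versus-measured evidence in a model validation subsection can be summarized with a few simple statistics. The Python sketch below is a minimal example under stated assumptions: values are on a log scale (such as pIC50), the 1.0 log-unit tolerance is an arbitrary illustrative choice, and the validation numbers are invented. It is not a description of the metrics any regulator requires.

```python
import math


def calibration_summary(predicted, measured, tolerance=1.0):
    """Summarize agreement between predicted and measured values.

    Returns RMSE, mean signed error (predicted minus measured), and the
    fraction of compounds whose absolute error is within the tolerance.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    within = sum(1 for e in errors if abs(e) <= tolerance) / n
    return {"n": n, "rmse": rmse, "bias": bias, "fraction_within": within}


# Hypothetical validation set of four synthesized compounds (pIC50 units).
predicted_pic50 = [7.0, 6.0, 8.0, 5.0]
measured_pic50 = [6.5, 6.0, 9.5, 5.5]
summary = calibration_summary(predicted_pic50, measured_pic50)
print(summary)
```

Reporting the signed bias alongside RMSE shows a reviewer whether the model systematically over- or under-predicts, which matters more for safety margins than the overall error magnitude alone.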

We're not suggesting the FDA will apply different standards to computationally derived leads — the preclinical safety data requirements are the same regardless of how the lead was found. What we're saying is that a computational program with a well-documented dossier can support a stronger mechanistic narrative in the IND application than is sometimes achievable from empirical SAR alone, because the structural and modeling data add resolution to the safety story rather than simply meeting the minimum evidentiary bar.
