From Plate to Patient: Phenotypic Drug Discovery's Grand Challenge
This essay outlines some common goals for AI/ML-enabled PDD companies and builds on observations and motivations from Inflection Points.
Simply put, the current primary goal of every AI/ML-enabled techbio drug discovery company is to reduce preclinical development time and cost. In a little more detail, these companies want their primary screen to tell them as much as possible (perhaps via a loop of training, prediction, and re-testing) about downstream functional assays (DFAs). A future goal might be for primary screens to predict response in targeted patient groups. As more primary screens and DFAs accumulate, chemical structure alone may eventually be enough to trim compound lists and maximize the productivity of the primary screen itself. The current secondary goal is to unlock diverse, novel chemical space that was not accessible with the traditional drug discovery hammers of the past (CTG, qPCR, reporter cell lines). In the dream scenario, your company both shrinks the preclinical window and opens up this unexplored chemical territory using the imaging-based primary screen.
For a phenotypic drug discovery company in 2023, the primary screen would likely follow a familiar path: cells are treated with compounds, incubated, stained, imaged on a high-throughput microscope, and then analyzed. The cell lines, incubation times, choice of stained targets, and method of analysis differ from company to company. I’m being purposely vague about imaging styles, because this is a very personal question for companies to answer based on the biology they want to interrogate. The primary trade-off here is likely resolution vs. acquisition time.
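To make the "analysis" step concrete, here is a minimal sketch of turning one stained channel into a well-level profile. Everything here is illustrative and assumes a single nuclei-stain image per well; a real pipeline would segment multiple channels and extract far richer features.

```python
# Minimal sketch of the analysis step: segment one stained channel and
# aggregate per-cell features into a small well-level profile.
# All names are illustrative; a production pipeline has many more steps.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def well_profile(nuclei_channel: np.ndarray) -> np.ndarray:
    """Return a tiny feature vector: cell count, mean area, mean intensity."""
    mask = nuclei_channel > threshold_otsu(nuclei_channel)   # crude segmentation
    regions = regionprops(label(mask), intensity_image=nuclei_channel)
    if not regions:
        return np.zeros(3)
    areas = np.array([r.area for r in regions])
    intensities = np.array([r.mean_intensity for r in regions])
    return np.array([len(regions), areas.mean(), intensities.mean()])
```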
The goal, again, is to use images of drug-treated cells to predict favorable functional assay responses (the numerical content of DFAs). This prediction is really where the time savings come from (and when you save time, you can also test many more compounds, ultimately opening the door to new chemical space). Typically there is a window of weeks to months within which the primary screen, hit triaging, and DFA completion and analysis take place. The 'deliverable', so to speak, is a map of phenotypic clusters, with corresponding numerical outputs from imaging (number and intensity of features) and from functional assays as a sense check.
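As a rough sketch of what that deliverable and the image-to-DFA link could look like computationally (with made-up data shapes and a generic regressor, not any company's actual pipeline):

```python
# Sketch of the image -> DFA link and the clustering "deliverable".
# X: well-level image feature vectors, y: one numerical DFA readout
# (e.g. % target-gene knockdown). Shapes and model choice are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))          # 500 compounds x 64 image features
y = rng.normal(size=500)                # matched DFA readout per compound

# 1) How well do image features predict the DFA readout?
model = RandomForestRegressor(n_estimators=200, random_state=0)
r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

# 2) The phenotypic map: cluster compounds by image profile, then inspect
#    each cluster's DFA distribution as the sense check.
clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
for c in range(8):
    print(f"cluster {c}: n={np.sum(clusters == c)}, mean DFA={y[clusters == c].mean():.2f}")
print(f"cross-validated R^2 of image->DFA model: {r2:.2f}")
```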
After picking your favorite phenotypes, you can bias hit calling in all future primary screens for the program, with the hope that this group of diverse phenotypes will keep giving you favorable responses along the primary dimension you sorted the phenotypes by. There are two main questions here: 1) Does your imaging phenotype biologically represent the numerical output of the functional data? 2) How many times are you going to run primary screens, sort your clustering, and retest to convince yourself that the computer got it right?
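Mechanically, "biasing hit calling" can be as simple as ranking new compounds by how close their image profiles sit to the clusters that previously gave favorable DFA responses. A minimal, hypothetical sketch:

```python
# Rank new compounds by distance to the centroids of favored phenotype clusters.
# Names and shapes are hypothetical; favored_centroids could come from the
# KMeans fit above, restricted to clusters whose DFA sense check looked good.
import numpy as np

def rank_by_phenotype_bias(new_profiles: np.ndarray,
                           favored_centroids: np.ndarray) -> np.ndarray:
    """Return compound indices, closest-to-favored-phenotype first."""
    # distance of each new profile to its nearest favored centroid
    d = np.linalg.norm(new_profiles[:, None, :] - favored_centroids[None, :, :], axis=-1)
    return np.argsort(d.min(axis=1))
```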
I’ll start with the first question and then move on to the second. There may be differences between cell-health-restoration programs (cardio, neuro) and cell-killing programs (oncology, antibacterial) in terms of where along the spectrum of phenotypic response you would identify your target phenotype, so I’ll start with cell-killing.
In a cell-killing program, the goal is to develop efficacious and safe compounds that selectively kill diseased cells while sparing healthy cell lines. The killing should be linked to a mechanism of action (MOA), which mainly means target-gene downregulation plus proof of target engagement and limited promiscuity. So the end goal is a potent compound that interacts with known factors and differentially modulates their abundance. The assays typically run in 384- or 1536-well plates, and eventually a subset of these compounds is tested in mice to show translatability of plate-proven phenomena. Sometimes in vitro and in vivo profiling proceed in parallel, which is a non-AI/ML way of speeding up a discovery campaign.
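The "selective killing" part is often summarized as a fold-difference in potency between the healthy and diseased lines. A trivial sketch with hypothetical IC50 values:

```python
# A compound is interesting if it kills the diseased line at a much lower
# concentration than the healthy line. IC50 values are hypothetical, same units.
def selectivity_index(ic50_healthy: float, ic50_diseased: float) -> float:
    """Fold-difference in potency; higher = more selective killing."""
    return ic50_healthy / ic50_diseased

print(selectivity_index(ic50_healthy=25.0, ic50_diseased=0.5))  # 50-fold window
```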
One extra piece of the data linking: teams must decide which features in the phenotypic assays and the DFAs will actually be linked. There are many, many choices here, so take a break and do some drawing and thinking for yourself. Again, there are many choices, especially if you decide to incorporate emerging plate- and imaging-based technologies. You may also go a step further and reduce some of these curves and values to binary signals. For one assay, for example, you can threshold the target-gene downregulation and say that below the threshold is 0 and beyond is 1. You may do this because the data from that particular assay is noisy, or because you think the binary format effectively groups compounds. A medicinal chemist should also have control in such a system to preserve or reduce the richness of the DFA readouts.
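The binarization step itself is trivial; what matters is who sets the threshold and how often it gets revisited. A tiny sketch, assuming a percent-knockdown readout from qPCR:

```python
# Collapse a noisy DFA readout to a binary flag. The 50% threshold is an
# arbitrary example; in practice a chemist/biologist sets (and revisits) it.
import numpy as np

knockdown_pct = np.array([12.0, 48.0, 63.0, 81.0, 55.0])   # hypothetical values
THRESHOLD = 50.0
binary_call = (knockdown_pct >= THRESHOLD).astype(int)      # 0 = miss, 1 = hit
print(binary_call)   # [0 0 1 1 1]
```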
The second question was about the number of times a chemist, or a team of biologists, would run a primary screen and test in DFAs to confirm predictions. The ultimate goal of this screen is to computationally link images, and image changes under various chemical perturbations, to as much downstream data as possible. Beyond DFAs, you also have PK/PD, mouse, and human studies. You should let a few molecules that succeed during DFA re-testing through to these studies and see how you do. As the line you can draw from primary-screen images to deeper, increasingly patient-proximal layers of data grows longer, you have less and less reason to run DFAs at all.