Finding new and better treatments for many of today’s most complex diseases depends on uncovering new targets. However, target identification remains a formidable challenge in drug discovery. Evidence suggests that a wealth of opportunities is waiting to be discovered in less-explored areas of the human genome, and there is no shortage of published scientific data to consider, but the volume and complexity of this information make it a difficult path to follow. As a result, new research tends to focus on well-known targets rather than striking out on a novel path. This blog post discusses a new approach that unites CAS content, specialized algorithms and our scientists’ expertise to identify unique, under-explored targets for inflammatory bowel disease (IBD), a challenging and complex condition impacting millions of people worldwide.
Where to hunt for viable new targets?
When it comes to identifying suitable drug targets for high-impact diseases, there’s a plethora of genes to choose from. For example, more than 3,500 genes show some association with IBD in Open Targets and Pharos. But which of them are worth pursuing as druggable targets for this disease? A recent study estimates that 22% of the protein-coding genes in the Ensembl genome database are druggable. However, only about one-third of these druggable genes are currently known to be modulated by treatments on the market or in development. This gap reflects a current reality: ongoing research tends to focus on the small fraction of the genome that has already been well explored rather than on the much larger unexplored and under-explored regions. While pursuing known strategies is thought to carry less risk, is this natural bias causing us to overlook higher-potential opportunities in less-studied areas?
A recent CAS analysis of FDA-approved drugs over the last five years suggests there is significant upside for organizations that choose the road less travelled. The analysis demonstrated that drugs with new mechanisms of action were more than twice as likely to have received Breakthrough Therapy Designation (BTD) by the FDA. These drugs, which are expected to result in a substantial clinical improvement in serious or life-threatening diseases, are given expedited review by the FDA. Sponsors of BTD drugs benefit from intensive FDA guidance in the clinical trial design process and are eligible for rolling review of their marketing application to expedite drug development and approval. Analysis comparing approved BTD drugs to those without this designation shows that BTD drugs on average gain approval in half the time of non-BTD drugs. Thus, this designation has significant financial impacts for drug makers and offers them the potential to get ahead of their competition.
Pairing human intelligence and algorithms to maximize target identification
So, the question is: how can researchers capitalize on this opportunity by efficiently identifying viable novel targets? The key is to empower scientists to think in terms of what mechanisms could be involved in a given disease, rather than just those we already know. However, that exploration needs a defined process; otherwise, it can feel like fumbling around in the dark or searching for an oasis in the desert.
A comprehensive review of all the relevant published literature on the target disease and related conditions is a key first step to support successful target identification. However, that is easier said than done. CAS analysis shows that not only is the volume of global journal and patent literature increasing nearly exponentially, but the amount of information in each publication has also increased. Furthermore, there is a growing number of citations and links between documents from disparate fields. Given this reality, it’s impractical to mine the literature effectively for new targets with a purely manual approach.
To help researchers better connect the dots between targets and diseases, CAS developed an approach that unites the expertise of scientists and the power of algorithms to maximize the value of our curated data collection to identify new targets. We recently tested this approach in a study focusing on inflammatory bowel disease (IBD). First, our comprehensive CAS content collection, the world’s largest repository of curated scientific knowledge, was mined to isolate over 50,000 patent and non-patent documents pertaining to IBD. From these documents, we algorithmically extracted over 9,000 potential target proteins and mRNAs. CAS scientists then validated a random sample of 3,000 of these 9,000 potential targets by looking through the full text of the source document to confirm the associations between the potential targets and the disease.
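The validation step described above, in which a random subset of the algorithmically extracted candidates is drawn for expert review, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate identifiers are placeholders, the function name and fixed seed are hypothetical choices made for reproducibility, and nothing here reflects the actual CAS tooling.

```python
import random


def draw_validation_sample(candidates, sample_size, seed=0):
    """Return a reproducible simple random sample of candidates for manual review."""
    # De-duplicate and sort so the same inputs always give the same sample.
    unique_candidates = sorted(set(candidates))
    if sample_size > len(unique_candidates):
        raise ValueError("sample_size exceeds the number of unique candidates")
    rng = random.Random(seed)  # local generator; does not touch global state
    return rng.sample(unique_candidates, sample_size)


# Placeholder identifiers standing in for the ~9,000 extracted proteins and mRNAs.
candidates = [f"TARGET_{i:04d}" for i in range(9000)]
review_queue = draw_validation_sample(candidates, 3000)
print(len(review_queue))
```

Sampling without replacement keeps every candidate equally likely to be reviewed, which is what makes it reasonable to scale the sample results up to the full set afterwards.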
Results show promise in efficient identification of overlooked targets
The aim of this study was to see whether this combined algorithmic and human approach could identify unique targets not contained in the Open Targets and Pharos databases. The approach identified over 600 targets validated for association with IBD, of which 37% (238) were unique to the CAS content collection. This represents a roughly 60% increase over the validated targets that also appear in those databases. Of these unique targets, 101 (42%) were identified as under-explored, meaning they were either published very recently or identified a long time ago but not investigated since. Extrapolating to the entire data set of 9,000 extracted potential targets for IBD, we would anticipate that this approach would identify over 700 unique targets, more than 300 of which are under-explored. These unique, under-explored targets are expected to provide fertile hunting grounds for new therapeutic opportunities.
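The extrapolation above amounts to scaling the sample counts by the ratio of the full candidate set to the validated sample. Here is a minimal sketch, assuming simple proportional scaling (that is, that the random sample of 3,000 is representative of all 9,000 candidates) and using only the figures quoted in this post. The function and variable names are illustrative.

```python
def extrapolate(count_in_sample, sample_size, population_size):
    """Scale a count observed in a random sample up to the full population."""
    return count_in_sample * population_size / sample_size


SAMPLE_SIZE = 3000              # candidates validated by CAS scientists
POPULATION_SIZE = 9000          # candidates extracted algorithmically
UNIQUE_IN_SAMPLE = 238          # validated targets unique to the CAS collection
UNDER_EXPLORED_IN_SAMPLE = 101  # unique targets classed as under-explored

unique_estimate = extrapolate(UNIQUE_IN_SAMPLE, SAMPLE_SIZE, POPULATION_SIZE)
under_explored_estimate = extrapolate(
    UNDER_EXPLORED_IN_SAMPLE, SAMPLE_SIZE, POPULATION_SIZE
)
under_explored_share = UNDER_EXPLORED_IN_SAMPLE / UNIQUE_IN_SAMPLE

print(f"Estimated unique targets: {unique_estimate:.0f}")                   # 714
print(f"Estimated under-explored targets: {under_explored_estimate:.0f}")   # 303
print(f"Under-explored share of unique targets: {under_explored_share:.0%}")  # 42%
```

These point estimates line up with the "over 700" and "more than 300" figures quoted above. A fuller treatment would also attach an uncertainty range to each estimate.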
The remarkably high proportion of unique outcomes is due to the extensive breadth of the documents indexed by CAS, as well as the power of human curation. Notably, many of the documents containing these unique targets are also covered by Open Targets and Pharos. However, some targets of interest within those documents were missed and thus not included in those databases. This highlights the critical importance of using the expertise of trained scientists to complement automated approaches.
Ultimately, this study demonstrates the power of a comprehensive content collection, when paired with advanced algorithms and expert human insight, to identify many novel, overlooked targets that could open new avenues for therapy. By focusing on these unique, under-explored targets, forward-looking organizations can get ahead of the curve in developing innovative treatments to benefit patients in need.
Download a poster of the IBD case study and the results.