How mRNA COVID-19 Vaccines Work to Fight COVID-19


When mRNA (messenger RNA) vaccines are injected into a patient, tiny lipid vesicles transport mRNA molecules through the body fluids and merge with immune cells called antigen-presenting cells (APCs). The mRNA instructs APCs to manufacture proteins called antigens, which evoke an immune response to the threat.


Organizing Vital Data to Unlock Innovation from Brazil’s Biodiversity

Steven P. Watkins, Scientific Data Engineer

Consider a family photograph collection started long ago, pictures amassing over the years. Gathering all the images in one place, say a shoebox, as many people do, doesn’t really add value or make them more useful. Finding the image you are looking for takes a lot of time, and it is then difficult to share with others. As such, your shoebox collection often goes untouched and unused, forgotten in a closet.

With today’s digital tools, searching, sharing and organizing your photos is easier than ever before. Now, you can quickly upload images into organized collections that can be viewed around the world. With such resources available, people are moving away from scattered shoeboxes stuffed with pictures, choosing instead to curate powerful collections digitally.

Scientists are faced with similar challenges with their valuable research data. Simply collecting and gathering content is not enough; without proper structure and organization, innovators cannot fully utilize this information. A robust data foundation is crucial for nearly all R&D activities, from day-to-day research to implementation of digital technologies like AI, predictive analytics and machine learning.

Preserving Brazil’s biodiversity

Brazil hosts 15-20% of Earth’s biological diversity, yet a substantial part of this abundant biodiversity remains underexplored. A lack of organized information made it exceedingly difficult for researchers to search, screen or even compare relevant chemical substances. This was hindering their ability to identify new targets, build on prior discovery and drive innovation.

As a result, researchers at the Institute of Chemistry of São Paulo State University (IQ-UNESP) sought a better way to make relevant information on the unique Brazilian biodiversity accessible. Increasing urbanization and deforestation were endangering rare species and intensifying the need to systematically curate data. Unless samples were processed and categorized quickly, information regarding the substances could be lost forever.

The need to preserve valuable information was magnified in 2018, when a fire consumed the National Museum of Rio de Janeiro, impeding research with the permanent loss of rare samples. In response, CAS experts extended support to the scientific community in Brazil, partnering with IQ-UNESP to manage and organize information on natural bioactive compounds to ensure it will be available for future research.

The collaboration between the scientific information specialists at CAS and natural product researchers in Brazil resulted in an organized and refined collection of natural products. Data is being systematized into the publicly accessible Nucleus for Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE) Database, a project initiated by Dr. Vanderlan Bolzani (Institute of Chemistry of the São Paulo State University – IQ-UNESP) and Dr. Adriano Andricopulo (São Carlos Institute of Physics of the University of São Paulo – IFSC-USP).

The team at CAS used their in-depth scientific knowledge and expertise in managing data to extract and process information from over 30,000 relevant scientific publications. The resulting content collection maximizes access to and increases the utility of Brazil’s natural, bioactive compounds to support innovation.

Download the case study now to learn more about how collaboration between CAS and IQ-UNESP resulted in a data collection of over 54,000 substances from Brazil’s rich biodiversity.

Organized and accessible data empowers insight

Efficiency is key to timely innovation. Discoveries are hindered when scientific information is not accessible, searchable or reliable. In fact, data integrity and accessibility issues cause 10-20% of all development work to be repeated. Research teams, therefore, need seamless access to a wide range of consistent and accurate scientific and business information, or risk expensive delays and errors.

The volume and complexity of scientific information have exploded in the last few decades, creating a chaotic landscape of disconnected and unorganized data. Even internal systems have a variety of sources contributing data in different formats and levels of quality. As such, creating and maintaining a well-organized, searchable data repository is a challenging feat but more important than ever.

According to the FAIR Guiding Principles for scientific data management and stewardship, it is essential that data be findable, accessible, interoperable and reusable. Cleaning and normalizing data with the correct semantic meaning and connections is difficult and requires specialized skills and a significant investment of resources. As a result, many organizations are engaging external experts, such as CAS, to quickly and cost-efficiently unlock the power of their data.
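As a rough illustration of what cleaning and normalizing involves, the sketch below merges records that refer to the same substance under inconsistently formatted names. The record fields (`name`, `source`), the function names, and the sample data are all hypothetical, and name matching is only a stand-in: real curation typically relies on structure-based identifiers and expert review rather than string comparison.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Canonicalize a free-text substance name for matching (illustrative only)."""
    text = unicodedata.normalize("NFKC", name)
    # Case-fold and collapse runs of whitespace so trivial variants compare equal
    return " ".join(text.casefold().split())


def merge_records(records):
    """Group records whose normalized names match, listing each source once."""
    merged = {}
    for rec in records:
        key = normalize_name(rec["name"])
        entry = merged.setdefault(key, {"name": key, "sources": []})
        if rec["source"] not in entry["sources"]:
            entry["sources"].append(rec["source"])
    return merged


# Hypothetical input: the same substance entered three different ways
raw = [
    {"name": "Quercetin", "source": "lab_notebook"},
    {"name": "  quercetin ", "source": "literature"},
    {"name": "QUERCETIN", "source": "lab_notebook"},
    {"name": "Rutin", "source": "literature"},
]
clean = merge_records(raw)
for entry in clean.values():
    print(entry["name"], entry["sources"])
```

Even this toy case shows why the work is resource-intensive: each normalization rule encodes a judgment about which differences are meaningful, and those judgments multiply quickly across sources and formats.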

Scientific expertise maximizes the value of data

A strong foundation of consistent, validated data ensures your teams and technology can progress efficiently. In one example, a company struggling with the accessibility and accuracy of internal data partnered with CAS to harmonize and standardize its knowledge management system, saving its researchers more than 3,300 hours annually.

Building and maintaining a high-quality data set requires expertise. CAS employs hundreds of scientists who span a wide range of disciplines and collectively speak over 50 languages. Although algorithms can aid data processing, no algorithm can replace the ability of experienced scientists to interpret findings and make connections between seemingly disparate pieces of information.

CAS experts curate content collections tailored to the scope of your specific project. Building around your exact requirements streamlines workflows, increases discoverability and enhances the impact of internal and external resource investments to accelerate a wide range of initiatives.

Custom services from CAS are already helping organizations solve their biggest data curation and integration challenges. Contact us to unlock the power of your data with our expertise.


AI drug discovery: assessing the first AI-designed drug candidates to go into human clinical trials

Todd Wills, Managing Director, Consulting Services

AI-driven discovery continues to gain momentum and achieve critical milestones, including in drug discovery. The first AI-designed drug candidate to enter clinical trials was reported by Exscientia in early 2020, a pivotal moment in AI drug discovery. Since then, several companies, including Insilico Medicine, Evotec, and Schrödinger, have announced Phase 1 trials. Several candidates have had their clinical development accelerated through AI-enabled solutions. Among drug companies focused on AI-based discovery, there is publicly available information on about 160 discovery programs, of which 15 products are reportedly in clinical development.

As structurally novel molecules are significantly more likely to be the source of promising new therapies, it’s essential to consider ways to measure the novelty of AI-designed molecules. CAS published a new indicator of drug innovation based on the structural novelty of new molecular entities (NMEs) to better assess the innovativeness of new AI drugs.

We used this new measure of early-stage AI drug discovery to evaluate the structural novelty of the first three AI-designed drug candidates to go into human clinical trials. All three molecules (DSP-1181, EXS21546, and DSP-0038) are in Phase 1 trials and were discovered using Exscientia’s AI platform. Even though the exact structures were not disclosed, details contained in recent patent filings and Exscientia’s IPO prospectus help us focus our analysis on specific molecules of interest. 

How ground-breaking are the first three AI drug candidates? Here’s what our analysis found:

DSP-1181 in AI drug discovery

The initiation of a Phase 1 clinical study in Japan of DSP-1181 was announced in January 2020. DSP-1181 is a full serotonin 5-HT1a receptor agonist discovered as part of a collaboration between Exscientia and Sumitomo Dainippon Pharma. It is currently being investigated as a treatment for obsessive-compulsive disorder (OCD).

Only three drug molecules are specifically claimed in US10800755 (one of the two granted patents in the DSP-1181 family of patents). Our structural analysis revealed that all claimed molecules (Examples 1, 8, and 11) share their shape with haloperidol, a frequently used first-generation (typical) antipsychotic agent, which the FDA approved in 1967 (see the table below). Even though they are not FDA-approved for treating OCD, some antipsychotics, such as haloperidol, have been used with some success in augmenting selective serotonin reuptake inhibitors (SSRIs) in patients with OCD.

Figure 1: Structural analysis of claimed molecules in US10800755

The patent also contains 38 exemplified molecules with disclosed bioactivity data, which according to Exscientia’s prospectus, represent 11 percent of the 350 molecules that were synthesized and assayed during the discovery of DSP-1181. The shapes of these exemplified drug molecules lack structural diversity, as 58 percent of the exemplified molecules share the same shape as haloperidol, the FDA-approved drug highlighted in the table above. An additional 21 percent of the exemplified molecules are concentrated on a shape shared by 28 other FDA-approved drugs, including lamotrigine, an antiepileptic drug and mood stabilizer that has been occasionally used in OCD treatment. The remaining eight exemplified drug molecules are spread across three other shapes.

EXS21546 in AI drug discovery

In December 2020, Exscientia’s most advanced, internal lead drug candidate, EXS21546, began a Phase 1 clinical trial in the United Kingdom as an immuno-oncology treatment for several types of tumors. EXS21546 is an adenosine A2a receptor antagonist discovered as part of a collaboration between Exscientia and Evotec.

Its patent, WO2019233994, contains 46 exemplified molecules with disclosed bioactivity data, which according to Exscientia’s prospectus, represent 28 percent of the 163 molecules that were synthesized and assayed during the drug discovery of EXS21546. The exemplified molecules reflect three structurally similar shapes, as the size of only a few of their rings differs by one or two atoms (see the table below). Even though these shapes are not currently shared with any FDA-approved drugs, our analysis revealed that they are shared with other reported A2a antagonists, including several identified by Janssen and disclosed in WO2010045006, WO2010045013, and WO2010045017 (all patents filed in the late 2000s).

Figure 2: Molecular shape analysis of structures in WO2019233994

DSP-0038 in AI drug discovery

The initiation of a Phase 1 clinical study in the United States of DSP-0038 was announced in May 2021. DSP-0038 is a dual-targeted 5-HT1a receptor agonist and 5-HT2a receptor antagonist discovered as part of a collaboration between Exscientia and Sumitomo Dainippon Pharma. It is currently being investigated as a treatment for Alzheimer’s disease psychosis. 

Only three molecules are specifically claimed in US10745401 (currently the only granted patent in the DSP-0038 family of patents). The shapes of the claimed molecules (Examples 109, 135, and 171) are structurally similar, as the size of only a few of their rings differs by one or two atoms (see the table below). Our analysis of the clinical trial data found two of the claimed molecules (Examples 135 and 171) share their shape with previously FDA-approved atypical antipsychotics used to treat a variety of psychiatric conditions. The remaining shape is not currently shared with any FDA-approved drugs, but it is structurally similar to the shape of several serotonin receptor agonists/antagonists identified by Yoshitomi Pharma and Suntory and disclosed in US5141930 and US6258805 (both patents filed in the 1990s). Their linker lengths only differ by a few atoms from the shape of Example 109.

Figure 3: Molecular shape analysis of structures in US10745401

The patent also contains 194 exemplified drug molecules with disclosed bioactivity data, which, according to Exscientia’s prospectus, represent almost 40 percent of the 500 molecules that were synthesized and assayed during the discovery of DSP-0038. The shapes of these exemplified molecules lack structural diversity, as 78 percent of the exemplified molecules share the same shape as the FDA-approved drugs highlighted in the table above and 93 percent of the exemplified molecules are concentrated on the shapes of the three claimed molecules. The remaining 14 exemplified molecules are spread across eight other shapes.
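The proportions quoted for the three patents can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in this article; the helper name `share` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# (exemplified molecules, molecules synthesized and assayed), as quoted in the text
programs = {
    "DSP-1181": (38, 350),
    "EXS21546": (46, 163),
    "DSP-0038": (194, 500),
}

for name, (exemplified, synthesized) in programs.items():
    pct = share(exemplified, synthesized)
    print(f"{name}: {pct:.1f}% of synthesized molecules are exemplified")

# DSP-0038: 14 of the 194 exemplified molecules fall outside the shapes
# of the three claimed molecules, so the rest are concentrated on them.
concentrated = share(194 - 14, 194)
print(f"DSP-0038 molecules on claimed shapes: {concentrated:.1f}%")
```

The results round to the figures given in the text (11 percent, 28 percent, almost 40 percent, and 93 percent), which confirms that the quoted counts and percentages are mutually consistent.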

Final thoughts on AI drug discovery

The structural innovativeness of these candidates in AI drug discovery might not set the world on fire, but this does not diminish the potential impact AI will have on drug discovery. Rather than holding AI to a perfect standard, the novelty of AI drug molecules should be judged by the same standards as molecules designed by medicinal chemists. In this case, medicinal chemists more than likely would have identified these molecules as potential drug candidates using traditional approaches based on existing scientific literature.

As the futurist Roy Amara famously said, “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” Amara’s point is extremely relevant in this case. New technologies such as AI drug discovery will be initially overhyped, but given time, may significantly change the world.

The clinical advantages more often associated with structurally novel drugs highlight the importance of pushing the boundaries of chemical space even further in the search for new drugs. Measuring innovation within the pharmaceutical industry is challenging, but CAS has shown how pharmaceutical innovation has significantly increased over the last several decades; see our ACS publication on how structural novelty has been proposed and used to analyze recent trends in pharmaceutical innovation.


Chemical Space Analytics: Bridging the Gap Between Data and Action



In the Journal of Organic Chemistry (JOC) article “Recent Changes in the Scaffold Diversity of Organic Chemistry As Seen in the CAS Registry,” a large set of organic compounds extracted from the CAS Registry is analyzed to study recent changes in structural diversity. The diversity is characterized using the framework content of the compounds; the framework of a molecule is the scaffold consisting of all its ring systems and all the chain fragments connecting them. The compounds are partitioned based on their year of first report in the literature, which allows framework occurrence frequencies to be compared across a 10-year interval.
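The partition-and-compare step described above can be sketched in a few lines. This is a minimal illustration, not the method used in the article: the framework labels are placeholder strings rather than real scaffold encodings, and the records and years are invented for the example.

```python
from collections import Counter

# Hypothetical records: (framework label, year of first report).
# Labels are placeholders, not real CAS Registry data.
records = [
    ("benzene", 1999), ("benzene", 1999), ("pyridine", 1999),
    ("benzene", 2009), ("pyridine", 2009), ("indole", 2009), ("quinoline", 2009),
]


def framework_frequencies(records, year):
    """Count how often each framework occurs among compounds first reported in `year`."""
    return Counter(fw for fw, y in records if y == year)


# Two partitions separated by a 10-year interval
early = framework_frequencies(records, 1999)
late = framework_frequencies(records, 2009)

# Frameworks that appear in the later partition but not the earlier one
new_frameworks = set(late) - set(early)

# Fraction of later compounds built on a framework already seen earlier
reused = sum(n for fw, n in late.items() if fw in early) / sum(late.values())

print(sorted(new_frameworks), reused)
```

Comparing the two counters in this way separates compounds that reuse established frameworks from those that introduce new ones, which is the distinction the diversity analysis depends on.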

Results show the pace of innovation from a structural perspective is accelerating despite the extensive reuse of a relatively small number of scaffolds.

CAS helps clients navigate the explored and unexplored regions of chemical space by leveraging its unique indexing of chemistry-related data at a molecular level to identify areas with limited activity as well as meaningful open areas of chemical space (i.e., scaffolds that are potentially important but have never been used as the basis for a molecule).


Protecting and Commercializing the Promise of RNAi


In this article from The Patent Lawyer Magazine, Anne Marie Clark, Senior Search Analyst for CAS, discusses how to avoid patent pitfalls as the market for RNAi-derived therapeutics grows.


Hydrogen Fuel - Insights into a Growing Market


The emerging, transformative potential of hydrogen as an alternative renewable fuel source is widely recognized, propelling intense interest across multiple industries including electricity, heating and transportation.

With new innovations driving rapid progress, the worldwide hydrogen market is expected to see dramatic growth, exceeding $200bn by the end of 2025 and creating opportunity across commercial, academic and government sectors. For researchers, funding agencies, investors and business stakeholders in this space, it is essential to keep up to date with emerging trends in the dynamic and complex research and intellectual property landscape to maximize innovation progress and commercial opportunity.




Potential therapeutic agents and associated bioassay data for COVID-19 and related human coronavirus infections

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, has led to several million confirmed cases and hundreds of thousands of deaths worldwide. To support the ongoing research and development of COVID-19 therapeutics, this report provides an overview of protein targets and corresponding potential drug candidates with bioassay and structure-activity relationship data found in the scientific literature and patents for COVID-19 or related virus infections.

Highlighted are several sets of small molecules and biologics that act on specific targets, including 3CLpro, PLpro, RdRp, S protein-ACE2 interaction, helicase/NTPase, TMPRSS2, and furin, which are involved in the viral life cycle or in other aspects of the disease pathophysiology. We hope this report will be valuable to the ongoing drug repurposing efforts and the discovery of new therapeutics with the potential for treating COVID-19.



The Regulatory Environment for Lithium-Ion Battery Recycling


Lithium-ion batteries (LIBs) are commonly used in a variety of consumer products, including cellular phones, laptop computers, and more recently, electric and hybrid vehicles. Due to the growing usage of LIBs, recycling will likely be required to keep up with production and to mitigate potential environmental costs.

This peer-reviewed article published in ACS Energy Letters examines the status of LIB recycling regulations around the world with a focus on the United States, European Union, and China. Also discussed are the implications of these regulations as well as the logistics of large-scale recycling.


How Scientists are Restoring Carbon Balance in the Environment


Recent technological and medical advances have expanded human capabilities, but they have also contributed significantly to the release of around 830 gigatons of CO2 into the atmosphere in the last 30 years alone. The United Nations has pledged to reach “net zero” emissions by 2050, meaning that any CO2 released into the atmosphere will be balanced by an equal amount removed. Achieving this goal will require the collaboration of scientists, policymakers, and industries worldwide.

This manuscript on ChemRxiv uses the CAS Content Collection to analyze how scientists and industrialists have used different approaches to restore carbon balance in the environment. The article includes a unique landscape view of emerging topics and the latest trends in this area, as well as the challenges currently faced.


Lithium-Ion Battery Recycling: Overview of Techniques and Trends


As the production and usage of lithium-ion batteries (LIBs) have increased exponentially, their manufacturing and disposal have become subjects of political and environmental concern. World reserves of LIB components are limited and unevenly distributed, while their mining creates considerable pollution. Given these concerns about the impact of these materials on the environment, LIB recycling is being positioned as a potential remedy.

This peer-reviewed article published in ACS Energy Letters uses data from the CAS Content Collection to examine the types and methods of recycling within the last decade. The economic and environmental benefits and challenges are also discussed, along with the global landscape of recycling facilities.
