
Artificial intelligence (AI) for science is changing scientific discovery in fundamental ways. The capacity of large language models (LLMs), neural networks, generative AI, and machine learning to analyze huge datasets is making it possible to evaluate more information than a human researcher alone possibly could — and in ever-shorter timeframes. As the evolution of AI-powered technology opens new possibilities in fields ranging from personalized medicine to petrochemicals, here are some recent examples demonstrating how AI is making meaningful impacts on scientific discovery.
Gen AI enabling the design of novel materials based on desired properties

Designing materials with specific attributes can be a slow process that requires experts to manually screen extensive lists of candidates and employ trial-and-error experimentation. Gen AI is enabling a paradigm shift in materials science: by predicting substance properties from first principles, it helps researchers screen far more candidate materials in far less time.
Gen AI lets researchers start with an end goal in mind instead of iterating experiments with existing materials. This does more than save time: by exploring unconventional combinations and rapidly generating thousands or even millions of hypothetical compounds, researchers can use AI to increase the potential for innovation. For example, several generative AI models aimed at optimizing the materials design process have emerged over the last few years:
- Microsoft Research’s MatterGen uses a diffusion model as a foundation for material generation.
- MatAgent leverages LLMs augmented with a materials knowledge base to propose a new composition, a diffusion model structure estimator to determine the crystal structure based on the proposed composition, and a graph neural network (GNN)-based property evaluator to predict the properties of that crystal structure.
- IBM Research also released a suite of models using a mixture-of-experts approach, which has shown promise in designing PFAS replacements.
Meta also recently released Open Molecules 2025, a dataset for advancing molecular discovery, along with the Universal Model for Atoms (UMA).
Machine learning has also been very effective at predicting candidate materials’ properties. Microsoft developed MatterSim as a companion to MatterGen; it is a deep learning model trained on data that blends active learning, generative models, and molecular dynamics simulations. Google DeepMind has also released a tool for materials discovery called GNoME, which has been described as “AlphaFold for materials discovery.” Using GNoME, researchers narrowed a list of 2.2 million candidate materials down to the 381,000 predicted to be most stable, which included 528 potential lithium-ion conductors that could be used in batteries.
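The screening idea described above, narrowing a large candidate pool by predicted stability, can be sketched as a simple filter. This is a minimal illustration rather than the interface of GNoME or MatterSim: the `Candidate` class, the `energy_above_hull_ev` field, the 0.05 eV/atom cutoff, and the sample formulas and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    formula: str
    # Hypothetical model output: predicted energy above the convex hull
    # in eV/atom. Values near zero suggest a more stable material.
    energy_above_hull_ev: float
    contains_lithium: bool


def screen(candidates, max_energy_ev=0.05):
    """Keep candidates predicted to be stable, most stable first."""
    kept = [c for c in candidates if c.energy_above_hull_ev <= max_energy_ev]
    return sorted(kept, key=lambda c: c.energy_above_hull_ev)


# Made-up placeholder candidates, purely for illustration.
pool = [
    Candidate("A2B", 0.12, False),
    Candidate("LiXY2", 0.00, True),
    Candidate("C3D", 0.03, False),
    Candidate("Li2Z", 0.25, True),
]

stable = screen(pool)
lithium_hits = [c.formula for c in stable if c.contains_lithium]
```

In a real workflow the stability values would come from a trained property model, and the cutoff would be chosen by domain experts rather than fixed in code.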
AI for science powering the rise of precision medicine

AI-driven genetic modeling can now analyze and interpret vast amounts of genetic, epigenetic, and bioinformatics data, helping researchers understand genetic variations and their implications. This approach uses machine learning algorithms and other AI tools to identify patterns and correlations within genetic information to tailor preventive measures, treatments, and medicine to each individual. The goals are to enable early intervention, minimize side effects, and reduce costs through personalized care programs tailored to an individual patient’s needs.
The application of AI to genetic modeling has made significant progress in the following areas:
- Data integration and analysis: AI models process large amounts of genetic data, including DNA sequencing (e.g., whole exome/genome sequence), transcriptome, proteome, and metabolome data, to uncover insights that traditional algorithms may miss. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) identify complex associations between biomarkers, disease risk, and drug response. These models can capture intricate patterns within genomic data that traditional statistical methods may overlook, leading to more accurate predictions of disease susceptibility and treatment outcomes.
- Predictive analytics: AI is being applied to predict the likelihood of developing certain diseases based on a person’s genetic profile, family history, and even doctors’ notes. This approach uses machine learning algorithms and clinical informatics to analyze complex genetic data and familial patterns, offering insights into preventive measures and personalized treatment planning.
- Tailored treatment plans: AI supports developing precision therapies based on an individual’s unique genome and epigenome. AI models analyze and identify specific variations and mutations that may influence disease risk and treatment responses. AI solutions can also examine epigenetic modifications, such as DNA methylation and histone modifications, to understand how environmental factors and lifestyle choices impact gene expression.
- Integration of multi-omics data: Utilizing AI’s ability to relate genomic, transcriptomic, proteomic, and metabolomic data, researchers can create more comprehensive models of biological systems. This holistic approach enables a deeper understanding of the complex interplay between genetic factors, non-genetic factors, and environmental influences in disease development and progression.
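One simple way to relate several omics layers is to concatenate each patient's features into a single vector before modeling, sometimes called early fusion. The sketch below assumes that approach for illustration only; the function name, layer names, sample IDs, and values are hypothetical, and production pipelines typically add normalization and missing-data handling.

```python
def fuse_omics(layers):
    """Concatenate per-sample feature vectors across omics layers.

    `layers` maps a layer name (e.g. "genomic") to {sample_id: [features]}.
    Only samples present in every layer are kept, so each fused vector
    has the same length and layout. Layers are joined in sorted name order.
    """
    layer_names = sorted(layers)
    shared = set.intersection(*(set(layers[name]) for name in layer_names))
    fused = {}
    for sample_id in sorted(shared):
        vector = []
        for name in layer_names:
            vector.extend(layers[name][sample_id])
        fused[sample_id] = vector
    return fused


# Made-up measurements for three hypothetical patients.
layers = {
    "genomic": {"p1": [0, 1], "p2": [1, 1]},
    "transcriptomic": {"p1": [2.5], "p2": [0.7], "p3": [1.1]},
    "metabolomic": {"p1": [0.3, 0.9], "p2": [0.4, 0.2]},
}

fused = fuse_omics(layers)
```

Here patient `p3` is dropped because it lacks genomic and metabolomic measurements; whether to drop or impute such samples is a design choice that depends on the study.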
AI-driven solutions can analyze vast amounts of data to help develop diagnoses and treatments that consider the variety of factors at play for each patient. Whether it’s moving closer to an omniomic model of a patient’s health profile or addressing the computational complexity of genetic and epigenetic data, AI promises to be a driving force in the development of precision medicine.
AI algorithms improving the efficiency of carbon capture and sequestration

Even if all carbon dioxide emissions were halted today, CO2 already in the atmosphere would continue to trap heat and drive global warming. As the Intergovernmental Panel on Climate Change (IPCC) has noted, CO2 capture and removal is not a replacement for emissions reduction, but it will be needed, alongside lower greenhouse gas emissions, to hold warming to 2 °C.
AI for science is playing an increasingly important role in numerous approaches to capturing CO2, including carbon capture and sequestration (CCS) and enhanced rock weathering (ERW).
- Carbon capture and sequestration (CCS): AI algorithms are optimizing carbon capture systems by analyzing operational data to improve efficiency. For example, reinforcement learning algorithms have been used to optimize the removal of CO2 from industrial flue gas while minimizing the operational energy required. The results showed that predictive models and algorithms can reduce energy costs in more than 70 percent of cases. Machine learning can also identify porous materials for adsorption faster than traditional methods. Microsoft’s MOFDiff, a coarse-grained diffusion model, generates structures of metal-organic frameworks (MOFs), which are highly effective in carbon capture.
- Enhanced rock weathering (ERW): ERW removes CO2 from the atmosphere by accelerating normal geological processes. When a fine layer of crushed silicate or carbonate rock is spread on land, such as agricultural fields, chemical weathering in the presence of rainwater converts atmospheric CO2 to bicarbonate ions. AI-powered solutions can support ERW efforts by identifying target locations and soil conditions, as well as running the predictive models that are needed to determine project effectiveness.
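To make the weathering chemistry concrete, the sketch below computes a theoretical ceiling on CO2 captured per tonne of rock. It assumes wollastonite (CaSiO3) as an example silicate and the idealized reaction CaSiO3 + 2 CO2 + H2O -> Ca2+ + 2 HCO3- + SiO2. The article does not name a specific mineral, and real field rates depend on soil, climate, and particle size, so the result is an upper bound rather than a prediction.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"Ca": 40.078, "Si": 28.086, "O": 15.999, "C": 12.011}

# Assumed example mineral (wollastonite) and the captured gas.
WOLLASTONITE = {"Ca": 1, "Si": 1, "O": 3}
CO2 = {"C": 1, "O": 2}

# Idealized stoichiometry: 2 mol CO2 converted to bicarbonate per mol CaSiO3.
CO2_MOLES_PER_MOLE_MINERAL = 2


def molar_mass(composition):
    """Molar mass in g/mol for a {element: count} composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def max_co2_per_tonne_rock(mineral, co2_moles_per_mole):
    """Theoretical tonnes of CO2 captured per tonne of fully weathered mineral."""
    return co2_moles_per_mole * molar_mass(CO2) / molar_mass(mineral)


ceiling = max_co2_per_tonne_rock(WOLLASTONITE, CO2_MOLES_PER_MOLE_MINERAL)
print(f"{ceiling:.2f} t CO2 per t wollastonite (theoretical maximum)")
```

Predictive models for ERW projects would start from a ceiling like this and then estimate how much of it is realistically achieved at a given site.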
While AI has many applications in carbon capture activities, it is important to note that the emissions associated with running AI must also be considered. Otherwise, AI workloads risk contributing to global warming rather than mitigating it.
Making petrochemical operations safer and more efficient with digital twins

“Digital twinning” is an AI-based technology used to create high-fidelity virtual replicas of physical spaces to analyze real-time data and predict outcomes. In the petrochemical industry, these replicas allow for simulation, analysis, and optimization of complex chemical processes. By embedding detailed operational data and reaction kinetics, digital twinning provides insights into manufacturing plant systems under varying conditions.
Digital twin software has matured significantly in recent years, with many industries taking advantage of the predictive capabilities these simulations can provide. The successful implementation of digital twins in petrochemical manufacturing relies on several critical technologies:
- High-fidelity simulation platforms: Software tools like Aspen HYSYS, gPROMS, and AVEVA integrate detailed chemical kinetics and thermodynamics, creating precise virtual replicas of petrochemical operations.
- IoT and sensor integration: Extensive sensor networks provide real-time data feeds, continuously updating the digital twin to accurately reflect current operational states.
- Machine learning and AI integration: AI-driven analytics enhance predictive capabilities, enabling digital twins to anticipate operational disturbances and optimize plant performance proactively.
These innovations allow petrochemical firms to implement predictive maintenance and asset management. Digital twins can even improve safety by predicting equipment failures before they happen, scheduling maintenance proactively, and significantly reducing downtime and operational costs. They also incorporate real-time analytics, enabling dynamic adjustment of processes to optimize yields and reduce energy consumption. For example, by facilitating detailed modeling of ethylene crackers, digital twins maximize feedstock utilization and product yields through precise operational adjustments.
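The core loop behind predictive maintenance, comparing live sensor data against what the simulation expects and flagging sustained drift, can be sketched in a few lines. This toy example is not based on Aspen HYSYS, gPROMS, or AVEVA; the `PumpTwin` class, the linear temperature model, the smoothing factor, and the alert threshold are all hypothetical stand-ins.

```python
class PumpTwin:
    """Toy digital twin of a single pump (hypothetical, for illustration)."""

    def __init__(self, alpha=0.5, alert_threshold=5.0):
        self.alpha = alpha  # smoothing factor for the moving average
        self.alert_threshold = alert_threshold  # degrees C of sustained drift
        self.drift = 0.0  # smoothed (measured - expected) temperature

    @staticmethod
    def expected_temperature(load_fraction):
        # Stand-in for a physics-based simulation: linear in load.
        return 40.0 + 30.0 * load_fraction

    def update(self, load_fraction, measured_temperature):
        """Ingest one sensor reading; return True if maintenance is advised."""
        residual = measured_temperature - self.expected_temperature(load_fraction)
        # Exponentially weighted moving average damps one-off sensor noise.
        self.drift = self.alpha * residual + (1 - self.alpha) * self.drift
        return self.drift > self.alert_threshold


# Made-up readings: (load fraction, measured temperature in degrees C).
readings = [(0.5, 55.0), (0.5, 56.0), (0.5, 62.0), (0.5, 66.0)]

twin = PumpTwin()
alerts = [twin.update(load, temperature) for load, temperature in readings]
```

Smoothing the residual instead of alerting on a single reading trades a slightly slower response for fewer false alarms, which is a common design choice when sensors are noisy.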
Accelerating drug discovery and drug repurposing using AI models
Drug repurposing has long been a successful strategy in pharmaceutical development. However, it often requires slow, labor-intensive work to review clinical observations and research literature, followed by extensive testing and validation. Using AI for science, pharmaceutical companies can now predict drug indications and contraindications more efficiently, sometimes even without new parameters or data, through a method known as “zero-shot inference.”
Applications of AI in drug repurposing include:
- Using AI-powered models built on knowledge graphs to evaluate more diseases and potential drug candidates faster than human researchers could alone.
- Identifying repurposing candidates using genetics data and network biology to find multiple drugs that target multiple parts of a disease pathway.
- Exploring polypharmacology, in which off-target effects of one drug can be utilized in combination with another drug.
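As a rough illustration of the knowledge-graph idea, the sketch below ranks drugs for a disease by how many of the disease's associated genes they target. This shared-neighbor heuristic is a deliberately simple stand-in for the learned graph models used in practice, and every drug, gene, and disease name here is a placeholder rather than real data.

```python
# Hypothetical graph edges; names are placeholders, not real entities.
DRUG_TARGETS = {
    "drug_a": {"GENE1", "GENE2"},
    "drug_b": {"GENE3"},
    "drug_c": {"GENE2", "GENE4"},
}
DISEASE_GENES = {
    "disease_x": {"GENE2", "GENE4"},
    "disease_y": {"GENE3"},
}
# Drug-disease pairs that are already approved uses, so not "repurposing".
KNOWN_INDICATIONS = {("drug_b", "disease_y")}


def rank_repurposing_candidates(disease):
    """Score each drug by the fraction of the disease's genes it targets."""
    disease_genes = DISEASE_GENES[disease]
    scores = {}
    for drug, targets in DRUG_TARGETS.items():
        if (drug, disease) in KNOWN_INDICATIONS:
            continue  # skip existing indications
        shared = targets & disease_genes
        if shared:
            scores[drug] = len(shared) / len(disease_genes)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


candidates_x = rank_repurposing_candidates("disease_x")
candidates_y = rank_repurposing_candidates("disease_y")
```

Any candidate surfaced this way is only a hypothesis; it would still need the testing and validation described above.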
The success of these efforts depends largely on the availability of good data, which remains a challenge. By some estimates, 55 percent of organizational knowledge is “dark data,” unstructured information accumulated over time. Typically unharmonized and stored within disjointed systems, dark data often has limited accessibility, which impedes its integration into AI strategies.
The complexity of the human body and the challenges associated with diseases that lack viable treatments mean that AI’s computational ability will be crucial in identifying new uses for established medications.
Computer vision for microstructural analysis improving safety and quality

In materials science research, computer vision uses AI to process large volumes of visual data, such as microscopic or spectroscopic images, to identify material properties, structures, and defects. This type of analysis has long been a vital step in understanding the mechanical properties of materials and plays a critical role in safety, materials selection, and quality control. Grain size and distribution, for example, are important factors in material strength and ductility, and industries such as aerospace and electronics require extremely specific, accurate microstructural analyses of materials and components.
AI-powered computer vision is revolutionizing this process. What has traditionally been a time-consuming manual procedure can now be completed more quickly and accurately, automating tasks like phase identification and microstructure classification. Computer-vision algorithms can locate and identify defects, including cracks, voids, and foreign particles. They can also segment individual grain particles within a microstructure, detect different material phases present, and support quantitative measurements using machine-learned features.
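Once a model has marked which pixels belong to defects or grains, counting and sizing them is a classical step. The sketch below uses connected-component labeling on a tiny binary mask to do this; it is not a learned model, and the mask is made up. In practice the mask would come from a trained segmentation network applied to a real micrograph.

```python
def region_sizes(mask):
    """Return the pixel count of each 4-connected region of 1s in `mask`.

    Regions are reported in the order their first pixel is met when
    scanning rows top to bottom and columns left to right.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Flood fill from this unvisited pixel using an explicit stack.
            stack, size = [(r, c)], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            sizes.append(size)
    return sizes


# Made-up mask: 1 marks a pixel flagged as part of a defect.
mask = [
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]

sizes = region_sizes(mask)
defect_count = len(sizes)
```

Region sizes like these can feed quality-control rules, for example rejecting a part when any single defect exceeds a size limit.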
For manufacturers, computer vision-powered microstructural analysis improves quality control by confirming that materials meet certain specifications. By identifying potential defects faster than a human-led analysis, these systems can help ensure the integrity of materials under extreme conditions, such as in nuclear reactors, or confirm the stability of welded materials.
The future of AI for science: transformative and surprising
Technological advances and scientific breakthroughs have always gone hand-in-hand, and as AI-driven technologies continue to improve, they are expected to further transform many areas of scientific inquiry.
As we can see from the examples above, there is no “one size fits all” approach to using AI for science. Its impact is likely to be felt in different ways across the full range of both scientific and computing disciplines—from data mining, integration, analysis, and optimization to simulations, visualizations, predictive analysis, and modeling.
At CAS, we’re constantly analyzing the CAS Content Collection™, the largest human-curated repository of scientific information, to better understand the research landscape and how AI is making meaningful differences in fields from healthcare to materials science and more. With our ability to get a first-hand look at cutting-edge innovations, we are excited to see where AI takes scientific discovery in 2025 and beyond.