After undergraduate and doctoral education at the University of Manchester, Peter Johnson joined the synthetic chemistry group of Professor A. Eschenmoser at the ETH in Zurich. Later he joined E. J. Corey’s LHASA project at Harvard University. In 1980 Peter joined the University of Leeds, where he is now Emeritus Professor of Chemistry. His research there led to the development of a number of chemistry-based software applications such as SPROUT/CAESA and CLiDE. He also led a synthesis group which accomplished the total synthesis of a number of natural products. Peter has also founded a number of software companies, including LHASA Ltd, Synopsys, and SimBioSys.
Chemists working in discovery-related applications are frequently tasked with developing practical, efficient and cost-effective synthetic routes to complex chemical structures, whether they’re known molecules or new, innovative compounds. For many chemical scientists, the standard approach to solving this synthetic puzzle would include the application of retrosynthetic analysis strategies. Here, we look at how recent advances in the power and capabilities of computational retrosynthesis tools are helping chemists plan complex synthesis schemes more efficiently and creatively.
Retrosynthetic analysis: Perhaps the most valuable synthesis planning strategy in a chemist’s toolbox
Retrosynthetic analysis is the art of methodically working back from complex molecules to identify smaller and more readily available starting structures. Formalized and popularized as a problem-solving strategy by the American Nobel Prize-winning chemist E. J. Corey in the late 1960s, retrosynthesis involves applying chemical reaction rules to deconstruct target structures into precursors, and then assessing the availability and suitability of potential reactants.
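The two-part procedure described above (deconstruct the target with reaction rules, then check whether the precursors are available) can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: the molecule names, the `DISCONNECTIONS` table and the `PURCHASABLE` set are invented for illustration, and real tools work on chemical structures and reaction rules rather than on name lookups.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: each entry lists alternative precursor sets for a molecule.
DISCONNECTIONS: Dict[str, List[Tuple[str, ...]]] = {
    "target": [("intermediate_A", "reagent_X"), ("intermediate_B",)],
    "intermediate_A": [("starting_1", "starting_2")],
    "intermediate_B": [("starting_3", "unavailable_Y")],
}

# Hypothetical set of molecules assumed to be readily available.
PURCHASABLE: Set[str] = {"reagent_X", "starting_1", "starting_2", "starting_3"}


def find_routes(molecule: str, max_depth: int = 5) -> List[List[str]]:
    """Return each list of purchasable starting materials that leads to `molecule`."""
    if molecule in PURCHASABLE:
        return [[molecule]]
    if max_depth == 0 or molecule not in DISCONNECTIONS:
        return []  # dead end: neither available nor further deconstructible
    routes: List[List[str]] = []
    for precursors in DISCONNECTIONS[molecule]:
        # Combine the sub-routes of every precursor in this disconnection.
        partial: List[List[str]] = [[]]
        for precursor in precursors:
            sub_routes = find_routes(precursor, max_depth - 1)
            partial = [done + sub for done in partial for sub in sub_routes]
        routes.extend(partial)
    return routes


print(find_routes("target"))
```

In this toy data the route through `intermediate_B` is discarded because one of its precursors is neither purchasable nor deconstructible, which mirrors the availability check described in the text.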
But just like recreating a delicious restaurant meal that you’ve never cooked before, there are often many steps involved and many decisions to make. The broad range of chemical transformations at chemists’ disposal can present a tsunami of potential retrosynthetic pathways to a given target compound. Some routes identified by retrosynthetic means will be more practical, efficient and cost-effective than others. What’s more, the greater the target’s structural complexity, the more steps are required to work back to suitable starting materials. And with every step generating perhaps dozens of potential precursors, each with retrosynthetic pathways of their own, you soon have more options than the human brain can meaningfully process. In short, exhaustive retrosynthetic analysis can be challenging – even for the most experienced chemists.
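A rough calculation shows how quickly the options multiply. The sketch below assumes, purely for illustration, that every molecule offers the same number of precursor options at each step (20 is used as a stand-in for "dozens") and simply counts how many candidates accumulate as the analysis goes deeper.

```python
def candidate_count(branching: int, depth: int) -> int:
    """Total candidates generated when every molecule yields `branching` options per step."""
    return sum(branching ** k for k in range(1, depth + 1))


# Assumed branching factor of 20 options per step (an illustrative figure).
for depth in range(1, 6):
    print(depth, candidate_count(20, depth))
```

Under these assumptions, five steps back from the target already produce more than three million candidates, far more than anyone could assess by hand.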
Computational retrosynthetic analysis: A more efficient approach to creative synthesis design
Beginning in the late 1960s, pioneering work by Corey and his collaborators laid the foundations of computational retrosynthetic analysis, which assists brain-only retrosynthesis efforts by applying the same fundamental principles with the added speed and efficiency of computer algorithms. Although the computer hardware and chemistry data available in those early days were puny by today’s standards, the fundamental approach was spot on. Ongoing advances in computing power and the big data revolution have since elevated this approach to a whole new level: today’s digital retrosynthesis tools are supported by high-quality, comprehensive databases and rule-based algorithms that unlock the full power of the technique.
One of the biggest advantages of computational retrosynthetic analysis over manual efforts is the way in which digital search tools can draw on a wealth of published knowledge of reactions and reagents. An experienced chemist may know a great many chemical transformations – but it’s unlikely they’ll know every possible reaction that could be used to synthesize a given molecule, whether it is known or unknown.
Computational approaches, on the other hand, can make decisions based on evidence for most reactions that have ever been published, enabling a comprehensive analysis of the vast majority of possible synthetic pathways using information on the cost of materials, number of steps, atom efficiencies, and other factors. Because computational approaches are based on hard data and predictive algorithms, they allow chemists to identify promising routes to a specific target quickly and with confidence, reducing trial and error and the bottlenecks that come with it. If the basic scientific content used by a computational retrosynthesis tool is regularly updated, an added bonus is that the newest and often most effective chemistries are considered in any search.
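To make the idea of comparing pathways on several factors concrete, here is a minimal sketch of a weighted ranking. The `Route` fields, the three example routes and the weights are all assumptions chosen for illustration; they are not taken from any particular tool, and real systems use far richer scoring.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    name: str
    steps: int            # number of synthetic steps
    material_cost: float  # relative cost of materials (arbitrary units)
    atom_economy: float   # fraction of reactant mass ending up in the product, 0..1


def score(route: Route, w_steps: float = 1.0, w_cost: float = 0.1,
          w_atom: float = 5.0) -> float:
    """Lower is better: penalize steps and cost, reward atom economy (assumed weights)."""
    return (w_steps * route.steps
            + w_cost * route.material_cost
            - w_atom * route.atom_economy)


def rank(routes: List[Route]) -> List[Route]:
    """Return routes ordered from most to least favorable score."""
    return sorted(routes, key=score)


# Hypothetical candidate routes to the same target.
ROUTES = [
    Route("route_A", steps=6, material_cost=40.0, atom_economy=0.55),
    Route("route_B", steps=4, material_cost=90.0, atom_economy=0.70),
    Route("route_C", steps=5, material_cost=30.0, atom_economy=0.60),
]

for r in rank(ROUTES):
    print(r.name, round(score(r), 2))
```

Changing the weights changes the order, which is the point: the chemist decides which factors matter most for the project at hand.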
These features also give chemists the freedom to identify more creative pathways that lie beyond the limitations of their own knowledge base or synthetic repertoire. Because they are not biased towards any one chemist’s preferred reactions, the algorithms built into modern computational retrosynthesis tools will often suggest reaction steps that are completely outside the chemist’s range of expertise. Having ready access to this information can be a source of inspiration, resulting in more of those ‘eureka’ moments that can overcome even the most vexing of challenges. By fostering faster innovation, computational retrosynthesis systems are already accelerating the development of new products using more efficient and effective chemistries.
The digital revolution is expanding the capabilities of computational retrosynthetic analysis
Computational retrosynthetic analysis has also changed the way in which chemists optimize synthesis schemes for efficiency and cost, by drawing on a wide range of published data, including properties such as reaction yields and material pricing information. For example, modern computational approaches can take reaction yield data published in journals and databases and aggregate this information to show reported values for known transformations and even provide estimates for hypothetical transformations. Thanks to the improvements in functionality and ease of use, chemists can prioritize specific synthetic routes with the most favorable characteristics.
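As a simple illustration of how aggregated yield data can feed into route comparison, the sketch below takes several reported yields per step, summarizes each step by its median, and multiplies the results to estimate the overall yield of a linear route. The yield figures and step names are made up, and using the median is an assumption made here for simplicity, not a description of how any specific tool aggregates data.

```python
from math import prod
from statistics import median
from typing import Dict, List

# Hypothetical reported yields (as fractions) collected for each step.
REPORTED_YIELDS: Dict[str, List[float]] = {
    "step_1": [0.82, 0.90, 0.86],
    "step_2": [0.70, 0.75],
}


def typical_yield(values: List[float]) -> float:
    """Summarize several reported yields for one transformation by their median."""
    return median(values)


def overall_yield(steps: List[str], reported: Dict[str, List[float]]) -> float:
    """Estimate the overall yield of a linear route as the product of step yields."""
    return prod(typical_yield(reported[step]) for step in steps)


estimate = overall_yield(["step_1", "step_2"], REPORTED_YIELDS)
print(f"Estimated overall yield: {estimate:.1%}")
```

Because step yields multiply, even modest losses at each step compound quickly, which is one reason shorter routes are often preferred.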
A goal of many retrosynthetic analyses is to optimize syntheses based on cost – a key consideration when developing economically-sensitive products such as off-patent generic drugs. Some computational retrosynthesis tools can even utilize chemical pricing catalogs from suppliers all over the world, allowing researchers to design synthesis pathways based on up-to-date information on costs of reactants, catalysts, solvents and other materials. While it is difficult to compute the actual purchase cost without knowledge of the chemist’s specific supplier relationships, etc., the estimates provide a useful indication of relative prices for alternative starting materials. These advances are helping researchers make more informed decisions faster, accelerating productivity and improving confidence in R&D.
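The kind of relative price comparison described above can be sketched as follows. The suppliers, materials, pack sizes and prices are invented, and the sketch assumes all prices are already in one currency; it only shows how catalog entries might be reduced to a comparable price per gram.

```python
from typing import Dict, List, NamedTuple


class Offer(NamedTuple):
    supplier: str
    material: str
    grams: float  # pack size
    price: float  # pack price, assumed to be in a single common currency


def cheapest_per_gram(offers: List[Offer]) -> Dict[str, float]:
    """Return the lowest price per gram found for each material."""
    best: Dict[str, float] = {}
    for offer in offers:
        unit_price = offer.price / offer.grams
        if offer.material not in best or unit_price < best[offer.material]:
            best[offer.material] = unit_price
    return best


# Hypothetical catalog entries for two alternative starting materials.
OFFERS = [
    Offer("supplier_1", "material_A", grams=25.0, price=50.0),
    Offer("supplier_2", "material_A", grams=100.0, price=150.0),
    Offer("supplier_1", "material_B", grams=10.0, price=80.0),
]

print(cheapest_per_gram(OFFERS))
```

As the text notes, such figures indicate relative prices only; the actual purchase cost depends on the chemist’s own supplier arrangements.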
Simplifying the challenge of chemical synthesis
Computational retrosynthetic analysis has the potential to transform how chemists design and optimize synthesis schemes, helping them quickly refine an inordinate number of potential pathways to the most effective solutions. What’s more, ongoing advances in computing power and access to comprehensive collections of scientific content are empowering scientists to make better informed decisions using the most effective chemistries. But it must be emphasized that although computational retrosynthetic analysis is a powerful tool in its own right, it is the combination of this tool with the chemist’s knowledge and intuition that will provide the best pathway towards productive, efficient and confident work at the bench.
There is much work to be done before we can approach the planning skills of the best synthetic chemists, who regularly use their vast knowledge and experience to quickly pinpoint good routes without having to consider all possible routes. Even so, the way ahead is well charted, and I have no doubt that we will see continued and significant advances in this exciting realm of cheminformatics in the very near future.