Drugs are among the most common forms of medical treatment, serving as an accessible means of fighting illness worldwide. Because it demands rigorous screening and testing, however, drug development is an arduous process that costs billions of dollars and often lasts more than ten years [1]. The process of bringing a drug to market can be split into two components: discovery and development.


Discovery spans the work from identifying the drug's target through optimizing a compound against it, while development revolves around testing and refining the discovered drug to ensure that it is effective once it reaches the market [2]. Notably, lead optimization acts as the transition between in vitro and in vivo experimentation [3]. In vitro, Latin for “in glass,” refers to experimentation in a controlled environment outside a living organism, often a test tube. In vivo, Latin for “within the living,” refers to research conducted in living organisms, which are far more complex than a test-tube system. In short, lead optimization is the stage of drug development in which “hit” compounds, those identified as most likely to produce the desired result, are repeatedly modified and tested until they become “leads” that are then tested in vivo.


Lead optimization seeks to improve several properties of the prospective drug, such as efficacy, potency, and selectivity [4]. Potency describes how much of a drug is needed to produce an effect of a given strength: the lower the required dose, the more potent the drug. Efficacy describes the maximum effect the drug can produce in an organism. The two differ because several factors can leave a highly potent drug with low efficacy. One example is how the body processes the drug: rapid metabolism can clear it before it has had enough time to take full effect. Selectivity describes how specifically a drug acts on its intended target rather than on others, with low selectivity generally resulting in more unwanted side effects [5].
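The difference between potency and efficacy can be made concrete with a dose-response curve. The sketch below uses the Hill (sigmoidal Emax) model, a standard pharmacology formula that the sources above are not stated to use; it is swapped in here purely for illustration. In this model, `ec50` (the concentration giving half the maximum effect) stands in for potency and `emax` stands in for efficacy. Both drugs and all numbers are hypothetical.

```python
def hill_effect(conc, emax, ec50, hill=1.0):
    """Effect at a given concentration under the Hill (sigmoidal Emax) model."""
    return emax * conc**hill / (ec50**hill + conc**hill)


# Hypothetical drugs with made-up parameters:
# A is more potent (lower EC50); B is more efficacious (higher Emax).
drug_a = {"emax": 60.0, "ec50": 1.0}
drug_b = {"emax": 100.0, "ec50": 10.0}

for conc in (0.1, 1.0, 10.0, 100.0, 1000.0):
    a = hill_effect(conc, **drug_a)
    b = hill_effect(conc, **drug_b)
    print(f"conc={conc:>7}: A={a:5.1f}  B={b:5.1f}")
```

At low concentrations drug A produces the larger effect because it is more potent, but as the concentration rises drug B overtakes it and levels off at a higher maximum, which is what higher efficacy means.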


With these properties in mind, leads are optimized in a variety of ways. One key method is structure-activity relationship (SAR) analysis, which seeks to relate the structure of the lead to its biological activity [6]. By identifying the specific parts of the structure responsible for a given activity, those parts can be modified to achieve higher efficacy, higher potency, or higher selectivity. For example, suppose an important part of a drug binds to a particular target, such as a receptor. SAR analysis can identify this part and connect it to its binding activity. That part can then be modified so that it binds only to that specific target, which would increase the selectivity of the drug. One application of this technique is the creation of a me-too or me-better drug, which is a drug based on an existing one and optimized in certain ways [1]. One prime example of a me-too drug is atorvastatin, a cholesterol drug, which is based on lovastatin [7]. Atorvastatin has a higher efficacy than lovastatin because, through SAR analysis, scientists were able to engineer it to lower different types of cholesterol more efficiently. Similar SAR analyses can also identify parts of a drug’s structure that decrease its overall performance, allowing scientists to mitigate these unwanted effects.
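As a rough illustration of the idea behind SAR analysis, the sketch below groups a small series of analogs by the substituent at each position on a shared scaffold and compares the average activity of each group. This is a deliberately simplified toy, not the method used in the cited work: the positions `R1` and `R2`, the substituents, and every activity value are hypothetical, and real SAR studies rely on measured assay data and far richer models.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical analog series: substituents at two positions on a shared
# scaffold, with a made-up activity score (higher means more active).
analogs = [
    {"R1": "H",  "R2": "CH3", "activity": 5.0},
    {"R1": "F",  "R2": "CH3", "activity": 6.0},
    {"R1": "Cl", "R2": "CH3", "activity": 6.5},
    {"R1": "H",  "R2": "OH",  "activity": 5.5},
    {"R1": "F",  "R2": "OH",  "activity": 6.5},
    {"R1": "Cl", "R2": "OH",  "activity": 7.0},
]


def mean_activity_by_substituent(rows, position):
    """Average the activity of all analogs sharing a substituent at `position`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[position]].append(row["activity"])
    return {sub: mean(vals) for sub, vals in groups.items()}


def spread(effects):
    """Range of group means: a crude signal of how much a position matters."""
    return max(effects.values()) - min(effects.values())


r1_effect = mean_activity_by_substituent(analogs, "R1")
r2_effect = mean_activity_by_substituent(analogs, "R2")

print("R1:", r1_effect, "spread:", spread(r1_effect))
print("R2:", r2_effect, "spread:", spread(r2_effect))
```

In this made-up data set, changing the substituent at `R1` shifts the average activity more than changing `R2`, so a chemist following this reasoning would focus further modifications on `R1`.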


Another way leads are optimized is through further in vivo testing to see whether the optimizations made in previous steps worked [8]. Lead optimization as a whole is an iterative process, a cycle of testing and optimizing that continues until the aforementioned properties (efficacy, potency, and selectivity) are as close as possible to the drug's full potential. This makes the process quite tedious.


Modern applications of AI to lead optimization could not only make the process more efficient but also provide valuable insight through new methods. In particular, AI speeds up lead optimization by finding both hits and leads much more quickly than humans can. Computer-aided drug design (CADD), for example, streamlines this identification process and ultimately reduces cost as well [1]. As for the optimization itself, AI approaches it by splitting it into four sub-tasks: scaffold hopping, linker design, fragment replacement, and side-chain decoration. This division is necessary for AI-aided drug discovery (AIDD) because handling the whole optimization as a single task would require a greater depth of knowledge than current models can capture. Breaking the problem into these sub-tasks makes lead optimization tractable for the AI.


This, however, is not the main way AI is used in drug discovery. Most of the progress in AIDD has come from de novo, meaning “from new”, generative models [1]. These models treat molecule generation as a graph generation problem: under a set of constraints, the model connects a set of nodes to one another in the most optimal way. As an analogy, imagine that each house in a neighborhood is a node and that the streets are the constraints the AI must work within. The problem would then resemble finding the best path from one house to another. This approach can produce new structures that lead optimization alone would not be able to reach. These models also need less background information, allowing them to generate new structures more quickly than AI models for lead optimization.
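To make the graph framing more concrete, the sketch below represents a molecule as a graph whose nodes are atoms and whose edges are bonds, with each atom's maximum valence acting as the constraint on which connections are allowed. It is only an assumed, minimal illustration of constrained graph construction and not the architecture of any model in [1]. The class `MolGraph`, its methods, and the `MAX_VALENCE` table are hypothetical names introduced here, and hydrogens are left implicit.

```python
# Typical maximum valences for a few elements (hydrogens are implicit).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}


class MolGraph:
    """Toy molecular graph: atoms are nodes, bonds are edges with an order."""

    def __init__(self):
        self.atoms = []  # element symbol for each node index
        self.bonds = {}  # (i, j) with i < j  ->  bond order

    def add_atom(self, element):
        self.atoms.append(element)
        return len(self.atoms) - 1

    def used_valence(self, idx):
        return sum(order for pair, order in self.bonds.items() if idx in pair)

    def can_bond(self, i, j, order=1):
        """A bond is allowed only if it is new and neither atom exceeds its valence."""
        if i == j or (min(i, j), max(i, j)) in self.bonds:
            return False
        return all(
            self.used_valence(k) + order <= MAX_VALENCE[self.atoms[k]]
            for k in (i, j)
        )

    def add_bond(self, i, j, order=1):
        if not self.can_bond(i, j, order):
            return False
        self.bonds[(min(i, j), max(i, j))] = order
        return True


mol = MolGraph()
c1, c2, o = (mol.add_atom(e) for e in ("C", "C", "O"))
mol.add_bond(c1, c2)            # C-C single bond
mol.add_bond(c2, o, order=2)    # C=O double bond
rejected = mol.add_bond(o, c1)  # refused: oxygen has no valence left
print(mol.bonds, rejected)
```

A generative model would choose which atoms and bonds to add at each step based on learned scores, but the constraint check plays the same role as the streets in the neighborhood analogy: it limits which connections are possible.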


One limitation of AI models in lead optimization, however, is the lack of data [1]. Training a model generally requires large amounts of data. Since the technology is still novel, there is not yet enough data to build a highly accurate model for lead optimization. As data becomes increasingly available and accessible, however, a world in which the lead optimization process fully utilizes AI becomes more and more plausible. This could bring huge benefits to the pharmaceutical industry by greatly reducing the time it takes to produce new drugs [1].


In conclusion, lead optimization plays a vital role in drug development by striking a careful balance among three properties: potency, efficacy, and selectivity. More specifically, it functions as the bridge between discovery and development. Furthermore, recent developments in the industry, particularly those involving AI, expedite a process that could otherwise be extremely costly in both time and money. These innovations can also be applied to other phases of the drug development process, such as the identification of hit compounds and data review. Moreover, these changes offer indirect benefits to the average consumer, such as enabling personalized medication and accelerating the development of particular drugs. Ultimately, lead optimization proves to be a fascinating process crucial to the proper functioning of any drug [2].