DyRAMO

DyRAMO is a framework for finding promising candidates (e.g., molecules) by running multi-objective optimization with a generative model while accounting for the reliability (applicability domain) of each property-prediction model, thereby avoiding reward hacking.

DyRAMO: Highly Reliable Molecular Design Using Generative AI

Schematic Overview

Figure 1: Framework of DyRAMO (Dynamic Reliability Adjustment for Multi-objective Optimization) for adaptive reliability control.

Key Points

  • In conventional molecular design, molecules predicted by AI to be promising often turn out to be undesirable in practice. This phenomenon, known as reward hacking, has become a major obstacle to the practical application of AI-driven molecular design.
  • This problem is particularly severe in drug discovery and materials development, where multiple properties must be optimized simultaneously (multi-objective optimization), limiting the usefulness of generative AI.
  • We developed DyRAMO (Dynamic Reliability Adjustment for Multi-objective Optimization), a framework that enables simultaneous optimization of multiple properties while maintaining the predictive reliability of AI models in data-driven molecular design.

Terminology

  • Reward hacking: A phenomenon in which an AI system maximizes the designed objective function (reward) in unintended ways. For example, in molecular design, an AI may exploit biases in prediction models to generate molecules with high scores that are not genuinely promising.
  • Generative AI: A class of AI systems that generate new data using machine learning. In addition to images, text, and music, generative AI is increasingly used in science and engineering to design new materials and chemical compounds.
  • Applicability domain (AD): The region of input space in which a machine learning model can make predictions with a specified level of reliability. Predictions are generally reliable for data similar to the training set, but reliability degrades for extrapolative inputs.
  • Multi-objective optimization: An optimization problem that aims to simultaneously maximize or minimize multiple objective functions.
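As a concrete illustration of an applicability domain, the minimal sketch below treats a molecule as inside the domain when its highest Tanimoto similarity to any training molecule reaches a threshold. This is one common similarity-based criterion and is assumed here for illustration only. Fingerprints are represented as plain sets of on-bit indices, and the names tanimoto, max_similarity, and in_applicability_domain are hypothetical, not part of any DyRAMO interface.

```python
from typing import FrozenSet, Iterable

# Hypothetical stand-in for a binary molecular fingerprint:
# the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by all distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def max_similarity(query: Fingerprint, training: Iterable[Fingerprint]) -> float:
    """Similarity of the query to its nearest neighbour in the training set."""
    return max((tanimoto(query, fp) for fp in training), default=0.0)


def in_applicability_domain(
    query: Fingerprint, training: Iterable[Fingerprint], threshold: float
) -> bool:
    """Assumed criterion: inside the AD if the nearest neighbour is similar enough."""
    return max_similarity(query, training) >= threshold


# Toy data (made up for illustration, not real molecules).
training = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6})]
similar = frozenset({1, 2, 3, 5})  # shares 3 of 5 distinct bits with the first entry
novel = frozenset({7, 8, 9})       # shares no bits with any training entry

print(max_similarity(similar, training))                # 0.6
print(in_applicability_domain(similar, training, 0.5))  # True
print(in_applicability_domain(novel, training, 0.5))    # False
```

Raising the threshold shrinks the domain: at a threshold of 0.7 the first query would also be rejected, which is the trade-off between reliability and search space described in the Background section.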

Overview

Although molecular design combining generative AI and predictive models is highly promising, reward hacking caused by reduced predictive reliability in extrapolative regions remains a serious challenge. In this study, we developed DyRAMO, a framework that performs multi-objective optimization while automatically adjusting the required level of predictive reliability. DyRAMO cycles through three steps (setting reliability levels, designing molecules, and evaluating the outcome) and uses Bayesian optimization to search efficiently for appropriate combinations of reliability parameters.

Background

In recent years, molecular design using generative AI has attracted significant attention for its potential applications in drug discovery and functional materials development. However, simply generating molecules with AI is insufficient; it is essential to evaluate target properties accurately and feed the results back into the generative process. Consequently, data-driven molecular design frameworks that combine generative AI with supervised predictive models have emerged as efficient and high-throughput approaches.

Predictive models are generally reliable for interpolation but suffer from reduced accuracy in extrapolative regions. When generative AI proposes novel molecules outside the training data distribution, predictive performance is no longer guaranteed, potentially leading to reward hacking. While applicability-domain-based approaches have been proposed to mitigate this issue, multi-objective optimization requires simultaneous control of the reliability of multiple predictive models. Overly strict reliability constraints severely limit the search space, whereas overly loose constraints admit unreliable predictions. Moreover, it is not known in advance whether any molecules exist in the region where all the applicability domains overlap. As a result, reliability-aware multi-objective optimization has remained an open challenge.

Methods

We developed DyRAMO, a framework that automatically adjusts the reliability parameters defining the applicability domains of predictive models. DyRAMO iteratively performs three steps: (1) setting reliability parameters, (2) molecular design, and (3) evaluation of both the reliability settings and the molecular design outcomes (Figure 1).
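The three-step cycle can be sketched as a simple loop. The code below is a minimal illustration under stated assumptions, not the DyRAMO implementation: the generator and the evaluation are passed in as hypothetical callables (design, evaluate), the candidate reliability levels form a small fixed grid, and Step 1 picks an untried setting at random as a stand-in for Bayesian optimization.

```python
import itertools
import random
from typing import Callable, Dict, List, Sequence, Tuple

Levels = Tuple[float, ...]  # one reliability level per predictive model


def run_loop(
    candidates: Sequence[Levels],
    design: Callable[[Levels], List[float]],
    evaluate: Callable[[Levels, List[float]], float],
    n_iterations: int,
    seed: int = 0,
) -> Tuple[Levels, float]:
    """Repeat (1) set levels, (2) design, (3) evaluate; return the best setting."""
    if n_iterations < 1 or not candidates:
        raise ValueError("need at least one iteration and one candidate setting")
    rng = random.Random(seed)
    history: Dict[Levels, float] = {}
    for _ in range(n_iterations):
        untried = [lv for lv in candidates if lv not in history]
        if not untried:
            break
        # Step 1: choose reliability levels (random choice replaces
        # Bayesian optimization in this sketch).
        levels = rng.choice(untried)
        # Step 2: generate molecules under these levels; returns their rewards.
        rewards = design(levels)
        # Step 3: score the setting together with the design outcome.
        history[levels] = evaluate(levels, rewards)
    best = max(history, key=history.get)
    return best, history[best]


# Toy stand-ins (assumptions): stricter levels lower the achievable reward,
# and a setting is only as good as its least reliable model.
def toy_design(levels: Levels) -> List[float]:
    return [1.0 - sum(levels) / len(levels)]


def toy_evaluate(levels: Levels, rewards: List[float]) -> float:
    return min(levels) * (sum(rewards) / len(rewards))


grid = list(itertools.product([0.2, 0.5, 0.8], repeat=2))
best_levels, best_score = run_loop(grid, toy_design, toy_evaluate, n_iterations=len(grid))
print(best_levels, best_score)  # (0.5, 0.5) 0.25
```

In this toy setting the best compromise lies in the middle of the grid: very strict levels leave little reward to find, while very loose levels are penalized for low reliability. A model-based search would aim to reach such a setting in fewer iterations than trying every combination.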

In Step 1, a reliability parameter is set for each predictive model to define its applicability domain. In this study, structural similarity between candidate molecules and the training data was used as the reliability criterion. Users can also specify the range over which each reliability level is adjusted and give priority to specific properties. In Step 2, molecules are generated within the defined applicability domains such that all target properties are optimized. In Step 3, the reliability settings and the molecular design results are evaluated together, and the reliability parameters are updated accordingly. Bayesian optimization, implemented with PHYSBO, is used to search efficiently for good combinations of reliability parameters.
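To make Steps 2 and 3 more concrete, the sketch below shows one plausible way to express them. Both functions are illustrative assumptions rather than the formulas used in the study: gated_reward gives a molecule a zero reward unless it lies inside every applicability domain and otherwise returns the geometric mean of its property scores, and settings_score multiplies the geometric mean of the reliability levels by the mean reward of the top fraction of designed molecules. All names and the top_fraction parameter are hypothetical.

```python
import math
from typing import Sequence


def gated_reward(
    property_scores: Sequence[float],
    similarities: Sequence[float],
    levels: Sequence[float],
) -> float:
    """Step 2 sketch: reward is zero outside any AD, else a geometric mean.

    Assumes property scores are already scaled to [0, 1] and that
    similarities[i] is the molecule's similarity to model i's training data.
    """
    inside_all = all(s >= lv for s, lv in zip(similarities, levels))
    if not inside_all:
        return 0.0
    return math.prod(property_scores) ** (1.0 / len(property_scores))


def settings_score(
    levels: Sequence[float],
    rewards: Sequence[float],
    top_fraction: float = 0.1,
) -> float:
    """Step 3 sketch: high only if levels AND top rewards are both high."""
    k = max(1, math.ceil(top_fraction * len(rewards)))
    top = sorted(rewards, reverse=True)[:k]
    level_term = math.prod(levels) ** (1.0 / len(levels))
    return level_term * sum(top) / len(top)


levels = (0.5, 0.5, 0.5)
inside = gated_reward([0.5, 0.5, 0.5], similarities=[0.7, 0.6, 0.9], levels=levels)
outside = gated_reward([0.9, 0.9, 0.9], similarities=[0.7, 0.3, 0.9], levels=levels)
print(round(inside, 6), outside)  # 0.5 0.0

score = settings_score((0.4, 0.9), rewards=[0.2, 0.8, 0.6, 0.4], top_fraction=0.5)
print(round(score, 6))  # 0.42
```

The second molecule has higher predicted property scores but falls outside one applicability domain, so it earns no reward. Gating of this kind is what keeps the generator from exploiting unreliable predictions.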

Results

DyRAMO was applied to the design of drug candidates targeting the epidermal growth factor receptor (EGFR), a target in non-small cell lung cancer therapy. Three properties were optimized simultaneously: EGFR inhibitory activity, metabolic stability, and membrane permeability. ChemTSv2 was used as the molecular generative model. DyRAMO successfully generated molecules for which all three properties were optimized on the basis of predictions with sufficiently high reliability.

Compared with methods that do not consider predictive reliability, DyRAMO kept the designed molecules within regions of relatively high reliability. Notably, the generated molecules included gefitinib, an existing approved drug, demonstrating that the framework can identify practically relevant drug candidates. In addition, DyRAMO allows users to assign different priorities to different properties, enabling optimization strategies that distinguish between properties requiring high reliability and those for which some compromise is acceptable.

Outlook

In conventional AI-driven molecular design, molecules identified as promising by AI often prove unsuitable in practice, leading to inefficiencies and rework in development pipelines. With DyRAMO, molecules can be selected on the basis of reliable predictions at the design stage, reducing downstream failures. This is expected to accelerate drug discovery and materials development by making both processes more efficient and more robust.

References

  1. T. Yoshizawa, S. Ishida, T. Sato, M. Ohta, T. Honma, K. Terayama,
    “A data-driven generative strategy to avoid reward hacking in multi-objective molecular design,”
    Nature Communications, 2025, 16, 2409.
  2. S. Ishida, T. Aasawat, M. Sumita, M. Katouda, T. Yoshizawa,
    K. Yoshizoe, K. Tsuda, K. Terayama,
    “ChemTSv2: Functional molecular design using de novo molecule generator,”
    WIREs Computational Molecular Science, 2023, 13, e1680.