We propose a novel ML-guided materials discovery platform that combines synergistic innovations in automated flow synthesis and AI agents. A software-controlled, continuous polymer synthesis platform enables rapid iterative experimental–computational cycles that result in the synthesis of hundreds of unique copolymer compositions within a multi-variable compositional space. ML identified non-intuitive design criteria while exploring less than 0.9% of the overall compositional space, which led to the identification of >10 copolymer compositions that outperformed state-of-the-art materials. Under the reinforcement learning (RL) paradigm, one or more agents are trained to select actions that maximize the cumulative sum of rewards, which, in the context of chemical discovery, often corresponds to a target property, structural feature, or function. RL agents can learn to suggest synthesis protocols, potential reactants, and experimental conditions by training via value-based or policy-based iterative schemes. These findings demonstrate that machine-guided, human-augmented design is a powerful strategy for accelerating polymer discovery in applications where data is scarce and expensive to acquire, with broad applicability to multi-objective materials optimization.
 Prof. Olexandr Isayev
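
To make the RL idea in the abstract concrete, the following is a minimal sketch of a value-based scheme: an epsilon-greedy agent that repeatedly picks a candidate composition, observes a noisy reward, and updates its value estimate for that candidate. Everything here is an illustrative assumption rather than the platform's actual method: `toy_property` is a hypothetical stand-in for a measured property, the two-variable composition grid is invented, and the hyperparameters are arbitrary.

```python
import random


def toy_property(composition):
    """Hypothetical reward (assumption, not real data): peaks at (0.3, 0.6)."""
    a, b = composition
    return 1.0 - (a - 0.3) ** 2 - (b - 0.6) ** 2


def run_bandit(candidates, n_rounds=500, epsilon=0.1, noise=0.05, seed=0):
    """Epsilon-greedy, value-based selection over a fixed list of candidates."""
    rng = random.Random(seed)
    q = [0.0] * len(candidates)    # running estimate of each candidate's reward
    counts = [0] * len(candidates)  # how many times each candidate was tried
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: try a random candidate.
            i = rng.randrange(len(candidates))
        else:
            # Exploit: pick the candidate with the highest current estimate.
            i = max(range(len(candidates)), key=lambda k: q[k])
        # Simulated "experiment": true value plus Gaussian measurement noise.
        reward = toy_property(candidates[i]) + rng.gauss(0.0, noise)
        counts[i] += 1
        q[i] += (reward - q[i]) / counts[i]  # incremental mean update
        total_reward += reward
    best = max(range(len(candidates)), key=lambda k: q[k])
    return candidates[best], q, counts, total_reward


# Hypothetical two-variable composition grid with fractions a + b <= 1.
candidates = [(a / 10, b / 10) for a in range(11) for b in range(11) if a + b <= 10]
best, q_values, counts, total = run_bandit(candidates)
print("suggested composition:", best)
print("cumulative reward:", round(total, 2))
```

The agent here keeps a table of estimated values and acts greedily on it, which is the simplest form of a value-based scheme. A policy-based scheme would instead adjust the probabilities of choosing each action directly. With a small experimental budget the suggested composition is not guaranteed to be the true optimum, which is one reason exploration strategy matters when data is scarce.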