Most organic reactions occur between electrophiles and nucleophiles. In aqueous environments, electrophiles containing the carbonyl moiety play a central role, e.g., in biochemistry, green chemistry, and reactions considered relevant to the origin of life. However, predicting the rate and the underlying mechanism of nucleophilic additions to carbonyl groups, especially in water, remains largely impossible, because catalysis by acid and base can dramatically change how these reactions proceed. As a result, reaction discovery and optimization in water rely on extensive trial-and-error experimentation rather than rational prediction. We herein report the use of automated liquid-handling robots for high-throughput reaction screening to create a comprehensive dataset of experimental rate constants for the reaction of urea, as a test nucleophile, with a library of aldehydes across a broad array of reaction parameters. This systematic dataset allows us both to apply classical physical organic techniques to derive correlations and to benchmark machine learning regression algorithms. This work establishes a foundation for general predictions of reactivity in aqueous environments and provides new insights into reaction mechanisms that are crucial for our understanding of fundamental organic chemistry.
 Stefan Kuffer