While AI is advancing rapidly, data scarcity often remains a bottleneck in quantum chemistry, where generating high-accuracy training samples is computationally expensive. This calls for data-efficient models that rely on robust molecular representations and physical insight. However, state-of-the-art models and representations often suffer from hyperparameter overfitting, high computational cost, and an inability to accurately model intensive properties (e.g., excitation energies) because of their additivity assumptions. To make full use of the information in the training data, we explore several ways of incorporating physical insight into the models, mainly at the level of molecular representations, which keeps the approach general. We introduce a lightweight, universal representation that overcomes these limitations. By expanding local atomic environments into element-specific Slater-type orbitals and calculating a specific set of rotation invariants, we construct compact local representations with only a single hyperparameter. These can be combined into a global molecular representation or fed directly to a kernel function. Despite its simplicity and low computational cost, our approach matches or exceeds the accuracy of complex alternatives. We demonstrate its versatility using kernel ridge regression across diverse datasets, successfully modeling both chemical and configuration spaces covering a wide variety of elements, as well as both intensive and extensive properties.
 Dr. Štěpán Sršeň
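
For readers unfamiliar with the regression setup mentioned in the abstract, the following is a minimal sketch of kernel ridge regression on fixed-length representation vectors. It does not implement the Slater-type-orbital representation itself. The Gaussian kernel, the helper names (`gaussian_kernel`, `krr_fit`, `krr_predict`, `pool_local`), the pooling rules, and the random toy data are all illustrative assumptions and are not taken from the talk.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    # Pairwise squared Euclidean distances, clamped at zero against round-off.
    d2 = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, sigma):
    """Predict as a kernel-weighted sum over the training points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha


def pool_local(local_vectors, mode="sum"):
    """Hypothetical pooling of per-atom vectors into one molecular vector.

    Assumption for illustration only: 'sum' grows with the number of atoms
    (additive, extensive-like), while 'mean' does not depend on system size
    (intensive-like).
    """
    local_vectors = np.asarray(local_vectors, dtype=float)
    if mode == "sum":
        return local_vectors.sum(axis=0)
    if mode == "mean":
        return local_vectors.mean(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy data standing in for molecular representations and a target property.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

sigma, lam = 1.5, 1e-6
alpha = krr_fit(X, y, sigma, lam)
pred = krr_predict(X, alpha, X, sigma)

# Duplicating every atom doubles the summed vector but leaves the mean unchanged.
atoms = np.array([[1.0, 2.0], [3.0, 4.0]])
doubled = np.vstack([atoms, atoms])
```

In this sketch the kernel width `sigma` and the regularizer `lam` are the only quantities to tune. The comparison of `atoms` and `doubled` under the two pooling modes illustrates why a purely additive construction is a natural fit for extensive properties and a poor one for intensive properties.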