With recent advances in generative machine learning, different models have been adapted to predict novel materials, and new architectures are emerging frequently [1]. While several metrics rate individual characteristics (e.g., quality or novelty) of generated crystal structures at the instance level [2], approaches that evaluate the overall performance of generative models for materials prediction are missing. To close this gap, we developed the Transport Novelty Distance (TNovD) [3]. This metric evaluates generative models by jointly judging the novelty and quality of all newly generated crystal structures. To this end, the Wasserstein distance is calculated on an abstract feature-space distribution derived from the chemical and physical characteristics of the materials. These features are created by embedding the crystal descriptions with an invariant Graph Neural Network (GNN) trained with the InfoNCE loss on the same set of materials as the generative model. Contrastive learning makes it possible to account not only for the materials themselves, but also for their augmented counterparts and differently sized supercells. Based on the resulting feature space, couplings between the generated set and the training set are calculated and split by a threshold into a quality regime and a memorization regime, so that quality and novelty can be evaluated simultaneously. The TNovD was tested in various toy experiments on memorization and on different cases of artificial crystal structure deformation. Additionally, we validated it on the MP20 validation set [4] and the WBM substitution dataset [5]. The experimental results demonstrate the TNovD's capability to detect both memorization and low-quality materials. Finally, we benchmark popular generative models for materials on the MP20 validation data, in particular structures from MatterGen, DiffCSP, DiffCSP++, ADiT, CDVAE, and Chemeleon, as generated in [2].
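The coupling-and-threshold step can be illustrated with a minimal sketch. It is not the TNovD implementation from [3]: it assumes equal-sized sets with uniform weights (so an optimal transport plan can be taken as a one-to-one matching), uses the Euclidean distance, and treats couplings shorter than a hypothetical `threshold` as memorization. The function names, the toy 2-D embeddings, and the threshold value are all illustrative, and how the two regimes enter the final score is left open here.

```python
import itertools
import math


def optimal_coupling(generated, train):
    """Brute-force optimal one-to-one coupling between two equal-sized sets.

    Assumption: uniform weights and equal sizes, so an optimal transport plan
    can be chosen as a permutation. Only feasible for tiny toy sets; a
    practical version would use a dedicated solver such as
    scipy.optimize.linear_sum_assignment.
    """
    n = len(generated)
    assert n == len(train), "this sketch assumes equal-sized sets"
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(n)):
        cost = sum(math.dist(generated[i], train[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    # Mean transport cost equals the Wasserstein-1 distance in this setting.
    return best_perm, best_cost / n


def split_couplings(generated, train, threshold):
    """Split coupled pairs into a memorization and a quality regime.

    Assumption: a coupling shorter than `threshold` counts as memorization,
    everything else falls into the quality regime.
    """
    perm, w1 = optimal_coupling(generated, train)
    memorized, quality = [], []
    for i, j in enumerate(perm):
        d = math.dist(generated[i], train[j])
        (memorized if d < threshold else quality).append((i, j, d))
    return w1, memorized, quality


# Hypothetical 2-D embeddings; real features would come from the GNN encoder.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
generated = [(0.0, 0.0), (1.0, 0.2), (0.5, 1.5)]
w1, memorized, quality = split_couplings(generated, train, threshold=0.1)
```

In this toy case the first generated point coincides with a training point and lands in the memorization regime, while the other two couplings fall into the quality regime, where their lengths indicate how far the generated samples lie from the training distribution.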
While introduced for materials, our TNovD framework is domain-agnostic and can be adapted to other areas, such as images and molecules. The code is available at https://github.com/BAMeScience/TransportNoveltyDistance/tree/main
 Simon Müller