OXtal: 100M-Parameter Diffusion Model Predicts Organic Crystal Structures from 2D Graphs

Predicting the three-dimensional arrangement of molecules in organic crystals remains a significant challenge in computational science, with implications for fields ranging from drug discovery to materials design. Emily Jin, Andrei Cristian Nica from Synteny, Mikhail Galkin from Google, and colleagues now present OXtal, a new diffusion model that directly learns how molecular structure dictates crystal packing. The approach bypasses traditional methods that rely on predefined crystal symmetries, instead employing data augmentation and a novel training scheme to efficiently capture long-range interactions between molecules. Trained on a dataset of 600,000 experimentally verified crystal structures, OXtal achieves substantial improvements in both accuracy and efficiency, recovering experimental structures to within a fraction of an Angstrom and demonstrating an ability to model the complex process of molecular crystallization.
Computational Resources for Crystal Structure Prediction Challenges
Crystal structure prediction consumes significant computational resources, as data collected from blind testing challenges demonstrate. Analysis of CPU core hours and computation time using the OXtal software reveals the substantial effort required to model complex molecular arrangements. Research groups expend varying amounts of computational power, depending on their chosen methodology, implementation efficiency, available resources, and criteria for terminating calculations. Groups such as MNeumann, KSzalewicz-MTuckerman, and DBoese consistently invest substantial computational resources, indicating active participation and potentially the use of sophisticated methods. These data provide a benchmark for assessing the performance of different computational approaches and for estimating the resources needed for future crystal structure prediction projects.
Predicting Crystal Structures From Molecular Graphs
Scientists have developed OXtal, a new diffusion model that predicts three-dimensional molecular crystal structures directly from two-dimensional chemical graphs. The approach addresses a longstanding challenge in computational chemistry: accurately modeling how molecules arrange themselves within crystalline solids. The model was trained on a dataset of over 600,000 experimentally validated crystal structures, covering a diverse range of molecules to ensure broad applicability. OXtal abandons traditional methods that rely on predefined crystal symmetries, instead learning directly from Cartesian coordinates and employing data augmentation. A key innovation is Stoichiometric Stochastic Shell Sampling, a lattice-free training scheme that efficiently captures long-range interactions between molecules. Performance evaluations show substantial improvements over prior machine learning methods: the model recovers experimental structures with high accuracy and achieves over 80% packing similarity, positioning OXtal as a tool for materials discovery and design.
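The article does not describe how Stoichiometric Stochastic Shell Sampling works internally, so the following is not that algorithm. It is only a minimal sketch of the general idea of a lattice-free training example: converting a periodic crystal into plain Cartesian coordinates and cutting out a finite cluster of molecules around a central one with a randomly chosen radius. The function names, the use of molecular centroids, and the radius range are all assumptions made for illustration.

```python
import itertools
import math
import random


def frac_to_cart(frac, lattice):
    """Convert fractional coordinates to Cartesian ones.

    `lattice` holds the cell vectors a, b, c as rows.
    """
    return [sum(frac[k] * lattice[k][i] for k in range(3)) for i in range(3)]


def neighbour_shell(centroids_frac, lattice, radius, reps=1):
    """Cartesian centroids of all periodic images within `radius` of the
    first centroid. `reps` is how many cells are scanned in each direction
    and must be large enough to cover `radius` (hypothetical helper).
    """
    centre = frac_to_cart(centroids_frac[0], lattice)
    kept = []
    for shift in itertools.product(range(-reps, reps + 1), repeat=3):
        for frac in centroids_frac:
            image = frac_to_cart([f + s for f, s in zip(frac, shift)], lattice)
            if math.dist(image, centre) <= radius:
                kept.append(image)
    return kept


def sample_cluster(centroids_frac, lattice, r_min, r_max, rng):
    """Draw a random cutoff radius and return it with the resulting cluster.

    Assumption: a uniform radius distribution, chosen only for illustration.
    """
    radius = rng.uniform(r_min, r_max)
    return radius, neighbour_shell(centroids_frac, lattice, radius)


# Toy crystal: one molecule per cell in a 5 Angstrom cubic lattice.
lattice = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
radius, cluster = sample_cluster([[0.0, 0.0, 0.0]], lattice, 5.5, 6.5, random.Random(1))
```

Once a cluster like this has been extracted, the model no longer needs cell lengths or angles as inputs; it only sees atoms (here, centroids) in Cartesian space, which is what "lattice-free" suggests.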
Diffusion Model Predicts Crystal Structures Directly
OXtal is a large-scale diffusion model with 100 million parameters. It predicts three-dimensional molecular crystal structures directly from two-dimensional chemical graphs, overcoming limitations of previous methods by abandoning explicit architectural constraints based on crystal symmetries. Instead, OXtal uses data augmentation strategies and a novel training scheme called Stoichiometric Stochastic Shell Sampling. Trained on a dataset of 600,000 experimentally validated crystal structures, the model recovers experimental structures with high accuracy and achieves over 80% packing similarity, demonstrating its ability to model both the thermodynamic and kinetic regularities governing molecular crystallization. Further testing on rigid and flexible molecules shows that OXtal outperforms existing machine learning models, recovering up to 90% of solid-state molecular conformers. Evaluation against structures from blind tests shows that OXtal outperforms other machine learning baselines and performs comparably to expensive density functional theory methods, identifying likely motifs with fewer computational resources.
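For readers unfamiliar with diffusion models, the sketch below illustrates the generic training setup on Cartesian coordinates: clean atom positions are perturbed with Gaussian noise at a chosen scale, and a network is trained to recover the clean positions. The geometric noise ladder, the parameter values, and the function names are assumptions for illustration; the article does not state which noise schedule or loss OXtal uses.

```python
import random


def add_noise(coords, sigma, rng):
    """Forward (noising) step: add Gaussian noise of scale `sigma` to every
    Cartesian coordinate of every atom."""
    return [[c + sigma * rng.gauss(0.0, 1.0) for c in atom] for atom in coords]


def noise_levels(sigma_min, sigma_max, steps):
    """Geometric ladder of noise scales from `sigma_max` down to `sigma_min`
    (an assumed schedule, not taken from the article)."""
    if steps == 1:
        return [sigma_max]
    ratio = (sigma_min / sigma_max) ** (1.0 / (steps - 1))
    return [sigma_max * ratio ** i for i in range(steps)]


def denoising_loss(predicted, clean):
    """Mean squared error between predicted and clean coordinates, the usual
    regression target for a denoiser."""
    total = sum((p - c) ** 2 for pa, ca in zip(predicted, clean) for p, c in zip(pa, ca))
    return total / (3 * len(clean))


rng = random.Random(0)
clean = [[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [2.1, 1.2, 0.0]]  # toy 3-atom fragment
noisy = add_noise(clean, sigma=0.5, rng=rng)
# An untrained "denoiser" that returns its input unchanged has non-zero loss.
baseline = denoising_loss(noisy, clean)
```

At generation time the process runs in reverse: sampling starts from pure noise at the largest scale and the trained denoiser is applied repeatedly while the noise scale is stepped down.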
Predicting Crystal Structures with Diffusion Models
Researchers have developed OXtal, a new diffusion model capable of predicting three-dimensional molecular crystal structures directly from their two-dimensional chemical representations. The work represents a significant advance in crystal structure prediction, with implications for pharmaceuticals and materials science. By training on a dataset of over 600,000 experimentally validated crystal structures, OXtal learns the complex relationship between molecular structure and crystal packing, achieving substantial improvements in both accuracy and computational efficiency. OXtal's success stems from prioritizing data augmentation over explicitly encoding crystal symmetries within the model's architecture.
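One common form of data augmentation for 3D structures is to apply a random rigid-body transform to each training example, so that the model sees the same structure in many orientations instead of having rotational symmetry built into its layers. The sketch below shows that idea only; the article does not list which augmentations OXtal actually uses, and the function names and the shift range are assumptions.

```python
import math
import random


def random_rotation_matrix(rng):
    """Uniformly random 3x3 rotation matrix, built from a random unit quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def augment(coords, rng, max_shift=1.0):
    """Apply one random rotation followed by one random translation to all atoms."""
    rot = random_rotation_matrix(rng)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    out = []
    for p in coords:
        rotated = [sum(rot[i][j] * p[j] for j in range(3)) for i in range(3)]
        out.append([rotated[i] + shift[i] for i in range(3)])
    return out


def pairwise_distances(coords):
    """All interatomic distances, which a rigid transform must leave unchanged."""
    return [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]


rng = random.Random(0)
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.5]]
augmented = augment(coords, rng)
```

Because the transform is rigid, every interatomic distance in `augmented` matches the original, so the physical content of the example is unchanged while its pose differs.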
The team introduced Stoichiometric Stochastic Shell Sampling, a new training scheme that efficiently captures long-range interactions between molecules without relying on traditional lattice-based parameterization. The results show that OXtal can accurately recover experimental structures, predicting conformer positions with high accuracy and achieving over 80% packing similarity, which indicates that it captures both the thermodynamic and kinetic factors governing molecular crystallization. Future research will focus on improving the model's handling of flexible molecules and on expanding the dataset to include a wider range of chemical compounds and crystallization conditions.
👉 More information
🗞 OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction
🧠 ArXiv: https://arxiv.org/abs/2512.06987
