No Pretraining, No Equivariant Architecture – Learning MLIPs without Explicit Equivariance

Paper
🤗 TransIP-L checkpoint
Code

Machine-learned interatomic potentials (MLIPs) have become a cornerstone of modern computational chemistry, enabling simulations that approach quantum accuracy at a fraction of the cost of traditional methods such as density functional theory (DFT). However, a central challenge in designing MLIPs lies in respecting the fundamental symmetries of molecular systems, especially rotational and translational invariance, while maintaining scalability and flexibility.

In our recent work, we introduced TransIP, a framework that rethinks how symmetry is incorporated into molecular models. Instead of hard-coding equivariance into the neural network architecture, TransIP learns symmetry directly in the latent space of an atomic Transformer, in which atoms are treated as tokens.

At the core of TransIP is a simple yet powerful idea: instead of enforcing SO(3) equivariance through specialized layers, the model is trained with a contrastive objective that aligns representations of rotated molecular configurations. A learned transformation network predicts how latent embeddings change when the input configuration is rotated, encouraging the model to discover symmetry-consistent representations implicitly. This design preserves the flexibility and scalability of standard Transformers while still capturing the geometric structure of molecular systems.
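To make the idea of latent equivariance concrete, here is a minimal, self-contained sketch. It is not the TransIP implementation: the encoder, the transformation functions, and all names (`toy_encoder`, `exact_transform`, `latent_equivariance_loss`) are hypothetical stand-ins, and a plain mean-squared alignment term is used in place of the actual contrastive objective, whose exact form is given in the paper. The sketch only illustrates the quantity being driven down: the mismatch between the embedding of a rotated molecule and the transformed embedding of the original one.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matvec(matrix, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def latent_equivariance_loss(encoder, transform, positions, rot):
    """Mean squared mismatch between encoder(R x) and transform(R, encoder(x)).

    Assumption: a simple alignment loss standing in for the contrastive
    objective described in the text.
    """
    rotated_positions = [matvec(rot, p) for p in positions]
    z_rotated = encoder(rotated_positions)
    z_predicted = transform(rot, encoder(positions))
    total, count = 0.0, 0
    for token_rot, token_pred in zip(z_rotated, z_predicted):
        for a, b in zip(token_rot, token_pred):
            total += (a - b) ** 2
            count += 1
    return total / count


# Toy stand-ins (hypothetical, for illustration only).
SCALES = [1.0, 2.0, 3.0]


def toy_encoder(positions):
    """One latent token per atom: a fixed per-axis scaling of its coordinates."""
    return [[s * x for s, x in zip(SCALES, p)] for p in positions]


def exact_transform(rot, latents):
    """For this linear encoder z = S x, a rotation R acts in latent space as S R S^-1."""
    out = []
    for z in latents:
        unscaled = [x / s for s, x in zip(SCALES, z)]
        rotated = matvec(rot, unscaled)
        out.append([s * x for s, x in zip(SCALES, rotated)])
    return out


def identity_transform(rot, latents):
    """A transform that ignores the rotation entirely."""
    return latents


# Illustrative water-like geometry (approximate coordinates in angstroms).
water = [[0.0, 0.0, 0.117], [0.0, 0.757, -0.469], [0.0, -0.757, -0.469]]
rot = rotation_z(math.pi / 3)

loss_exact = latent_equivariance_loss(toy_encoder, exact_transform, water, rot)
loss_identity = latent_equivariance_loss(toy_encoder, identity_transform, water, rot)
```

In this toy setting, `loss_exact` vanishes up to floating-point error because the transform matches the true latent action of the rotation, while `loss_identity` is clearly positive. In a trained model, both the encoder and the transformation network would be learned jointly rather than fixed by hand as they are here.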

Empirically, TransIP demonstrates strong performance on the large-scale OMol25 dataset, achieving results comparable to leading equivariant models while outperforming data augmentation baselines by 40–60% in certain regimes. Notably, our approach scales well with both data and model size, and offers improved inference efficiency compared to traditional equivariant architectures.

Beyond performance, our work carries broader implications. It suggests that explicit architectural constraints may not be strictly necessary for encoding physical symmetries; instead, these properties can emerge through appropriately designed training objectives such as our contrastive objective for latent equivariance. As molecular datasets continue to grow, such flexible and scalable approaches may play a key role in advancing simulation-driven drug discovery and materials design.

We are making our methodology, code, and trained model checkpoint openly available for the community to explore this research direction further (links provided above).
