For the first time, we can see proteins not as still portraits, but as dynamic movies, all from a single sequence.
When you think of a protein, you might picture a static, ribbon-like structure frozen in a single shape. This is the image presented in textbooks and in the spectacular results from AI tools like AlphaFold. But this picture is only a snapshot. Proteins are not static; they are dynamic machines, constantly shifting, wobbling, and changing shape to perform their functions. Capturing these movements has long been one of biology's biggest challenges, and that is now starting to change.
A new artificial intelligence tool called ESMDynamic is revolutionizing our view of the molecular world. Developed by researchers at the Shukla Group, this deep learning model can predict the intricate dance of a protein directly from its amino acid sequence, and it does so in seconds rather than months. This breakthrough promises to accelerate discoveries in drug design, enzyme engineering, and our fundamental understanding of life itself 2 3 .
Proteins are the workhorses of life, responsible for everything from digesting food to firing neurons. Their function is intimately tied to their motion. For example:
Enzymes must change shape to catalyze chemical reactions.
Transporters shift their structures to carry cargo across cell membranes.
Signaling proteins alter their conformations to turn cellular processes on and off.
These functional movements often involve subtle, coordinated fluctuations of amino acids. Residues that are far apart in the linear sequence may transiently come into contact, acting as a kind of molecular switch 3 . Traditionally, capturing these "dynamic contacts" required extremely sophisticated and time-consuming experimental techniques or massive computer simulations that could take months to run. ESMDynamic offers a shortcut, providing a fast and accurate prediction of these essential movements 3 .
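To make the idea of a dynamic contact concrete, here is a minimal Python sketch. It assumes a simulation has already been reduced to a list of frames, each holding one representative atom position per residue, and it labels a residue pair as a dynamic contact when the pair is within a distance cutoff in some frames but not in nearly all of them. The 8 Å cutoff, the 10% to 90% frequency window, the sequence-separation filter, and the function names are illustrative assumptions, not the definitions used by ESMDynamic.

```python
from itertools import combinations
from math import dist


def contact_frequencies(frames, cutoff=8.0, min_separation=3):
    """Return, for each residue pair, the fraction of frames in which
    the two residues lie within `cutoff` (same units as the coordinates).

    frames: list of conformations; each conformation is a list of
            (x, y, z) tuples, one representative atom per residue.
    """
    n_res = len(frames[0])
    freqs = {}
    for i, j in combinations(range(n_res), 2):
        if j - i < min_separation:
            continue  # sequence neighbours are trivially close; skip them
        hits = sum(1 for frame in frames if dist(frame[i], frame[j]) <= cutoff)
        freqs[(i, j)] = hits / len(frames)
    return freqs


def dynamic_contacts(freqs, low=0.1, high=0.9):
    """Pairs that are in contact in some frames but not in (nearly) all."""
    return sorted(pair for pair, f in freqs.items() if low <= f <= high)


# Toy five-residue chain (hypothetical coordinates in angstroms).
# Extended state: residues lie on a straight line.
open_frame = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
              (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
# Hairpin state: the chain folds back so the last residue nears the first.
closed_frame = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
                (7.6, 3.8, 0.0), (3.8, 3.8, 0.0)]

freqs = contact_frequencies([open_frame, closed_frame, open_frame, closed_frame])
print(dynamic_contacts(freqs))
```

In this toy case the pairs that come together only in the hairpin state are reported, while a pair that never comes within the cutoff is not. A real analysis would read coordinates from a trajectory file rather than hand-written tuples.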
So, how does ESMDynamic work? The model is built on a powerful foundation: ESMFold, a structure predictor driven by a protein language model, the same kind of AI that sits behind advanced chatbots. Just as language models learn the patterns and grammar of human language from vast amounts of text, ESMFold learns the "grammar" of protein sequences and structures from millions of known proteins 3 .
ESMDynamic takes this a step further. It was specially trained to understand not just a protein's final shape, but its entire range of motion. The researchers used a two-stage training process:
The model was first trained on proteins with multiple experimentally determined structures, learning that a single sequence can correspond to several slightly different shapes 3 .
It was then refined using data from molecular dynamics (MD) simulations. These simulations, which calculate the physical movements of every atom in a protein, provided a rich dataset of how contacts between residues form and break over nanoseconds to microseconds. The training focused on the mdCATH dataset, a massive collection of MD simulations covering 5,398 soluble protein domains 3 .
To prove its worth, the ESMDynamic team put their model through a rigorous series of tests, benchmarking it against other state-of-the-art methods and validating its predictions on real-world proteins.
The researchers assessed ESMDynamic's performance on two large-scale MD simulation datasets: mdCATH and ATLAS. These datasets served as the "ground truth" for protein dynamics. The model's task was to classify residue pairs as either dynamic contacts or static/non-contacts, based solely on the protein's sequence 3 .
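Its performance was compared against other ensemble prediction models such as AlphaFlow and ESMFlow. On mdCATH, the model reached a balanced accuracy of 80% and a recall of 73%. In other words, it recovered roughly three out of four true dynamic contacts, and the balanced accuracy, which averages the hit rates on contacts and on non-contacts, shows that it did so without flagging large numbers of false positives 3 .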
| Dataset | Balanced Accuracy | Recall | Key Strength |
|---|---|---|---|
| mdCATH | 80% | 73% | Excellent identification of dynamic contacts from simulation data |
| ATLAS | Matched or outperformed other models | Matched or outperformed other models | Generalizes well across different protein families |
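For readers unfamiliar with the two metrics in the table, the following Python sketch shows how they are computed from a set of binary labels. Recall is the fraction of true dynamic contacts that the model finds; balanced accuracy is the average of recall and specificity (the fraction of non-contacts correctly rejected), which keeps the score meaningful when dynamic contacts are far rarer than non-contacts. The labels below are made-up toy values, not data from the ESMDynamic benchmarks.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = dynamic contact)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def recall(tp, fn):
    """Fraction of true dynamic contacts that were predicted as such."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-contacts that were correctly left unflagged."""
    return tn / (tn + fp)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of recall and specificity; robust to class imbalance."""
    return 0.5 * (recall(tp, fn) + specificity(tn, fp))


# Toy example: 4 true dynamic contacts among 16 residue pairs.
y_true = [1, 1, 1, 1] + [0] * 12
# The hypothetical predictor finds 3 of the 4 and raises 2 false alarms.
y_pred = [1, 1, 1, 0] + [1, 1] + [0] * 10

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
recall_value = recall(tp, fn)
balanced_value = balanced_accuracy(tp, tn, fp, fn)
print(f"recall={recall_value:.2f}, balanced accuracy={balanced_value:.2f}")
```

Because dynamic contacts are a small minority of all residue pairs, plain accuracy would reward a model that simply predicts "no contact" everywhere; balanced accuracy does not.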
The results were striking. ESMDynamic not only matched but in many cases surpassed the performance of other advanced models. Perhaps more importantly, it achieved this with a massive speed advantage, producing predictions in seconds compared to the weeks or months required for MD simulations 3 .
The model's power was then demonstrated in several case studies:
A particularly powerful application involved using ESMDynamic's predictions to guide further computational analysis. The researchers created an automated pipeline where the predicted dynamic contact maps were used to select "collective variables" for building Markov State Models from unbiased MD simulations. This approach, tested on the SWEET2b transporter, allowed them to generate high-quality models of the protein's energy landscape and kinetics much more efficiently than before 3 .
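The selection step in such a pipeline can be pictured with a short Python sketch. The article does not describe how the researchers chose collective variables from the predicted map, so the rule below (keep residue pairs whose predicted dynamic-contact probability exceeds a threshold, ignore pairs close in sequence, and take the top-ranked few) is only one plausible assumption. The function name, parameters, and toy probability map are all hypothetical.

```python
def select_collective_variables(prob_map, threshold=0.5, min_separation=6, top_k=10):
    """Pick residue pairs whose distance could serve as a collective variable.

    prob_map: square list of lists; prob_map[i][j] is the predicted
              probability that residues i and j form a dynamic contact.
    Returns up to `top_k` (i, j) pairs, most probable first.
    """
    n = len(prob_map)
    candidates = [
        (prob_map[i][j], i, j)
        for i in range(n)
        for j in range(i + min_separation, n)  # only pairs far apart in sequence
        if prob_map[i][j] >= threshold
    ]
    # Highest probability first; ties broken by residue indices for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    return [(i, j) for _, i, j in candidates[:top_k]]


# Toy 8-residue probability map (made-up values).
n_residues = 8
toy_map = [[0.0] * n_residues for _ in range(n_residues)]
toy_map[0][7] = 0.90   # strong long-range dynamic contact
toy_map[1][7] = 0.60   # weaker, but above threshold
toy_map[0][6] = 0.40   # below threshold, dropped
toy_map[2][4] = 0.95   # high probability, but too close in sequence

selected = select_collective_variables(toy_map)
print(selected)
```

In a workflow of this kind, the distance between each selected pair would then be tracked across the simulation frames and used as an input feature when building the Markov State Model.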
| Feature | ESMDynamic | Traditional MD Simulations | Other Ensemble AI Models |
|---|---|---|---|
| Speed | Seconds to minutes | Weeks to months | Minutes to hours |
| Input Required | Single amino acid sequence | 3D structure, force fields, water models | Multiple Sequence Alignments (MSAs) or other data |
| Primary Output | Probabilistic dynamic contact map | Atomic-level trajectory | Ensemble of 3D structures |
The Shukla Group has made ESMDynamic accessible to the broader scientific community, providing a full suite of tools for anyone to start predicting protein dynamics.
| Tool / Resource | Type | Function | Availability |
|---|---|---|---|
| ESMDynamic Model | Software / AI Model | Predicts dynamic contact maps from a single protein sequence. Core engine of the technology. | GitHub 4 |
| Google Colab Notebook | Software | Web-based interface for easy prediction of individual protein sequences without local installation. | GitHub 4 |
| Model Weights & Datasets | Data | Pretrained model parameters and training data (mdCATH, etc.). Essential for running the model or retraining it. | Illinois Data Bank (DOI: 10.13012/B2IDB-3773897_V1) 1 4 |
| Docker Container | Software | A pre-configured software environment that simplifies installation and ensures consistent results. | GitHub 4 |
Access the ESMDynamic model and tools through the GitHub repository to start predicting protein dynamics today.
Read the full preprint on bioRxiv for detailed methodology, results, and technical specifications.
ESMDynamic represents a fundamental shift in how we computationally study proteins. By moving beyond static structures to embrace the dynamics that are central to function, it opens up new frontiers in biology and medicine. Its ability to rapidly identify key residues involved in motion provides invaluable guidance for designing smarter experiments, from mutational studies to advanced spectroscopy.
For protein engineers, it offers a blueprint for designing molecules that can move in specific, useful ways. For drug discoverers, it reveals allosteric sites and cryptic pockets that are invisible in static structures. As this technology matures and is adopted by researchers worldwide, we can expect a deeper, more dynamic understanding of the very machinery of life. The age of the moving protein has arrived.