How INPS Decodes Mutation Mysteries Without a Blueprint
Proteins are nature's nanomachines, performing tasks essential for life, from digesting food to repairing DNA. Their function depends on a precise three-dimensional shape, maintained by delicate thermodynamic forces. When genetic mutations alter a single amino acid (a non-synonymous variation), this balance can shatter, leading to diseases like cancer or cystic fibrosis.
For decades, scientists relied on protein structures to predict mutation impacts, a major bottleneck since <80% of human proteins lack structural data.
Enter INPS (Impact of Non-synonymous Variations on Protein Stability), an AI-powered tool that predicts stability changes using only protein sequences.
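Stability changes are measured as ΔΔG: the difference in folding free energy (Gibbs free energy, ΔG) between the mutant and the wild-type protein, reported in kcal/mol. Negative ΔΔG means destabilization (high disease risk); positive values suggest stabilization 1 .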
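The sign convention above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_ddg` and the optional `tolerance` band around zero are assumptions made here, not part of INPS, and the input values are made up.

```python
def classify_ddg(ddg_kcal_mol: float, tolerance: float = 0.0) -> str:
    """Label a stability change using the sign convention in the text.

    `tolerance` is a hypothetical neutral band (kcal/mol) around zero;
    the article itself only distinguishes negative from positive values.
    """
    if ddg_kcal_mol < -tolerance:
        return "destabilizing"
    if ddg_kcal_mol > tolerance:
        return "stabilizing"
    return "neutral"


# Illustrative values only, not measurements.
for ddg in (-2.0, 0.3, 1.5):
    print(ddg, classify_ddg(ddg, tolerance=0.5))
```

A neutral band is often useful in practice because small predicted changes sit within experimental noise, but the width chosen here is arbitrary.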
INPS exploits evolutionary patterns. Well-conserved residues often stabilize proteins; mutations here are likely harmful 6 .
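One common way to put a number on "well-conserved" is the Shannon entropy of a column in a multiple sequence alignment. The sketch below is a generic illustration of that idea, not the feature computation INPS itself uses; the function name `column_conservation` and the toy alignment columns are assumptions.

```python
import math
from collections import Counter


def column_conservation(column: str) -> float:
    """Return 1 - normalized Shannon entropy for one alignment column.

    1.0 means every sequence has the same residue at this position;
    0.0 would mean all 20 amino acids appear equally often.
    """
    residues = [aa for aa in column.upper() if aa != "-"]  # ignore gaps
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


# Toy columns (hypothetical): one invariant position, one variable position.
print(column_conservation("RRRRRR"))
print(column_conservation("RKRK"))
```

A substitution at a position scoring close to 1.0 would, by the reasoning above, be a stronger candidate for a harmful effect than one at a variable position.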
Developed by Fariselli, Martelli, and colleagues, INPS uses Support Vector Machine (SVM) regression trained on thermodynamic data. Its input features include:
The tumor suppressor p53 is mutated in roughly 50% of human cancers. But which mutations destabilize the protein? INPS tackled 42 variants with unknown stability impacts 1 .
INPS was trained on S2648, a dataset of 2,648 mutations with known ΔΔG values.
Tested on p53 mutations absent from training data.
Compared predictions to experimental ΔΔG measurements.
Combined INPS with mCSM (structure-based) for enhanced accuracy 1 .
Variant | Experimental ΔΔG (kcal/mol) | INPS Prediction (kcal/mol) | Clinical Relevance |
---|---|---|---|
R175H | -2.8 | -2.9 | High cancer risk |
Y220C | -1.4 | -1.3 | Drug target |
R282W | -3.1 | -3.0 | Aggressive tumors |
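Comparing predictions with measurements usually comes down to two numbers: the root-mean-square error (RMSE) and the Pearson correlation. The sketch below computes both for the three rows in the table above, using only the Python standard library. The helper names are chosen here for illustration, and three data points are far too few for a meaningful benchmark.

```python
import math

# (variant, experimental ΔΔG, INPS prediction) in kcal/mol, copied from the table.
rows = [("R175H", -2.8, -2.9), ("Y220C", -1.4, -1.3), ("R282W", -3.1, -3.0)]


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


experimental = [row[1] for row in rows]
predicted = [row[2] for row in rows]

print(f"RMSE      = {rmse(experimental, predicted):.2f} kcal/mol")
print(f"Pearson r = {pearson(experimental, predicted):.3f}")
```

Each prediction in the table is off by 0.1 kcal/mol in magnitude, so the RMSE for these rows is 0.1 kcal/mol.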
This proved INPS's clinical value. For example, Y220C's mild destabilization (-1.4 kcal/mol) makes it a druggable site: rescue drugs are now in trials.
Reagent/Resource | Function | Example in INPS Development |
---|---|---|
ProTherm Database | Curates experimental ΔΔG values | Training data for SVM regression |
HMMER/PSSMs | Builds evolutionary profiles | Input for conservation features |
SVM Regression | Machine learning for ΔΔG prediction | INPS's core algorithm |
DSSP | Computes solvent accessibility | Feature in INPS3D |
BLOSUM62 Matrix | Scores amino acid substitutions | Quantifies mutation severity |
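To make the BLOSUM62 entry in the toolkit table concrete, the sketch below parses a variant name such as R175H and looks up the substitution score for the wild-type and mutant residues. Only the three matrix entries needed for the p53 variants discussed earlier are included. The function names are hypothetical, and how INPS actually encodes this score as a feature is not specified here.

```python
import re

# Excerpt of the BLOSUM62 matrix: only the pairs used in this example.
# The matrix is symmetric, so each pair is stored as an unordered set.
BLOSUM62_EXCERPT = {
    frozenset("RH"): 0,
    frozenset("YC"): -2,
    frozenset("RW"): -3,
}


def parse_variant(variant: str):
    """Split a name like 'R175H' into (wild-type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {variant!r}")
    wild, position, mutant = match.groups()
    return wild, int(position), mutant


def substitution_score(variant: str) -> int:
    """Look up the BLOSUM62 score for the substitution in `variant`."""
    wild, _, mutant = parse_variant(variant)
    return BLOSUM62_EXCERPT[frozenset((wild, mutant))]


for name in ("R175H", "Y220C", "R282W"):
    print(name, substitution_score(name))
```

Lower scores mark substitutions that are rarely tolerated in evolution, which is the sense in which the matrix "quantifies mutation severity". A full implementation would load the complete 20x20 matrix rather than an excerpt.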
Recent extensions like INPS-MD integrate sequence and structure data, while DDGun explores untrained models for multi-site mutations 3 6 . Deep learning tools like DDGemb now leverage protein language models for higher accuracy 5 .
New approaches are needed to predict effects of multiple simultaneous mutations on protein stability.
Protein language models like ESM and ProtTrans are being adapted for stability prediction tasks.
INPS transforms genetic mutation analysis from a structural puzzle into an accessible sequence-based prediction. By revealing how invisible changes destabilize proteins, it illuminates disease mechanisms and accelerates targeted therapies.
As one team noted: "When combined, sequence and structure tools offer unparalleled power" 1 . For biologists battling undruggable targets or unexplained genetic disorders, INPS isn't just a tool; it's a stability oracle.