High Protein Docking Score, but the Experiments Fail?

During protein–protein docking studies, researchers frequently encounter a frustrating situation:

The docking score looks excellent.

The interface appears tightly packed.

There are many hydrogen bonds and salt bridges.

However, when performing Co-IP, pull-down, SPR, MST, or other interaction experiments, the signals are weak or completely absent.

At this point, many people start questioning whether docking itself is unreliable.

Actually, not necessarily.

More accurately, a high protein–protein docking score only indicates that under the current structural model and scoring function, there exists a binding pose that appears geometrically and energetically reasonable.

It does not directly prove that the two proteins will interact stably in a real cellular environment or under experimental conditions.

I. What Does a High Docking Score Actually Mean?

Protein–protein docking mainly attempts to predict:

If two proteins come into contact, which spatial orientations are more reasonable in terms of shape complementarity, electrostatic complementarity, and interface contacts.

Therefore, a high score usually suggests:

The two protein surfaces contain regions that can fit together
The predicted pose has relatively few steric clashes
Hydrogen bonds, salt bridges, or hydrophobic contacts may form
The model is worth further investigation

However, it does not directly mean:

The proteins definitely interact in cells
The binding affinity is necessarily strong
The current experimental method can detect the interaction
The predicted interface is the true physiological interface

In other words:

A high score is a clue, not a conclusion.

II. Why Can Docking Scores Be High While Experiments Remain Negative?

1. The Interface May Look Good but Still Be Too Small

Some docking models look well packed at first glance but involve only limited local contacts.

For example:

Contact occurs only at the edge of the proteins
The interface is discontinuous
The buried surface area is too small
No stable interaction core is formed

Such models may receive favorable scores computationally, yet dissociate easily in solution.

A simple analogy:

They may appear to “embrace,” but in reality they only “briefly touch.”

2. Many Hydrogen Bonds and Salt Bridges Do Not Guarantee Strong Binding

Many reports emphasize:

Number of hydrogen bonds
Number of salt bridges
Residues involved in the interaction

These are useful references, but quantity alone is insufficient.

Protein surfaces naturally contain many polar residues, so docking algorithms can often identify multiple hydrogen bonds or salt bridges.

Truly stable protein–protein interfaces usually also depend on:

A hydrophobic core
Proper electrostatic complementarity
Reasonable clustering of polar interactions
Persistence of interactions during dynamics
Absence of severe steric or electrostatic conflicts

In many protein interfaces, only a few key residues, the so-called hot spots, contribute most of the binding energy.

Therefore:

“More interactions” does not necessarily mean “stronger binding.”

3. The Desolvation Penalty May Be Too High

In aqueous solution, proteins are surrounded by water molecules.

During binding, interfacial water molecules must be displaced.

This process carries an energetic cost.

If the interface is highly hydrophilic or contains many charged residues, the system may behave as follows:

Although hydrogen bonds and salt bridges form after binding, the energetic cost of removing water molecules is even greater.

In other words:

The proteins may appear able to bind in docking simulations, but the real system may not energetically favor stable association.

Put simply:

Docking mostly evaluates “what interactions exist after binding,” while real binding also depends on “what energetic cost must be paid to get there,” a cost that many scoring functions capture only approximately.
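The balance described above can be written schematically. The decomposition below is the generic form used in end-point methods such as MM/PBSA, shown only as an illustration; it is not the scoring function of any particular docking program.

```latex
\Delta G_{\text{bind}} \approx \Delta E_{\text{MM}} + \Delta G_{\text{solv}} - T\Delta S
```

Here \Delta E_{\text{MM}} is the direct interaction energy between the two proteins (van der Waals plus electrostatics), \Delta G_{\text{solv}} is the change in solvation free energy on binding, and -T\Delta S is the entropic term. Burying charged or strongly polar groups makes \Delta G_{\text{solv}} positive, so a favorable \Delta E_{\text{MM}} from new hydrogen bonds and salt bridges can be largely cancelled.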

4. The Docking Used Monomers, but the Real Proteins Are Oligomeric

This is a very common but often overlooked issue.

Many proteins do not exist as isolated monomers in reality.

Instead, they function as:

Dimers
Trimers
Tetramers
Multi-subunit complexes
Membrane-associated assemblies

If monomeric structures are docked directly, an apparently favorable interface may emerge.

However, in the real oligomeric state:

The interface may already be occupied
The orientation may become inaccessible
The geometry may no longer be biologically feasible

In other words:

You docked “theoretical monomers,” while the experiment tested “real biological assemblies.”

So the docking score itself may not be wrong; the structural input may simply not reflect biological reality.

5. Binding May Require Specific Conditions

Some protein interactions are conditional rather than constitutive.

For example, binding may require:

Phosphorylation, acetylation, or ubiquitination
ATP, metal ions, RNA, or small molecules
Opening of a particular domain
Membrane insertion
Specific pH, salt concentration, or redox conditions
A bridging protein partner

If docking simulations ignore these factors, the software may still identify a favorable pose, while the experimental system lacks the conditions necessary for interaction.

6. The Interaction May Simply Be Weak or Transient

Another possibility is that the interaction truly exists, but is:

Weak
Short-lived
Highly condition-dependent

Examples include:

Micromolar-affinity interactions
Transient signaling contacts
Stimulus-induced interactions
Local concentration-dependent interactions
Interactions stabilized by additional proteins

Such interactions may produce reasonable docking poses, yet remain difficult to capture experimentally, especially by Co-IP, where wash steps give weak complexes time to dissociate.

Therefore:

A negative experiment does not necessarily prove that no interaction exists.

It may simply indicate that the current method cannot effectively capture weak or transient binding.
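A rough equilibrium estimate shows why a weak interaction can be real and still go undetected. The sketch below assumes simple 1:1 binding with the partner in large excess, so its free concentration is close to its total concentration. The function name, the 10 µM dissociation constant, and the concentrations are illustrative assumptions, not values from any specific experiment.

```python
def fraction_bound(partner_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of a protein bound to its partner (1:1 model).

    Assumes the partner is in excess, so free partner ~ total partner.
    """
    if partner_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return partner_conc_molar / (partner_conc_molar + kd_molar)


# Hypothetical micromolar-affinity interaction (Kd = 10 µM)
kd = 10e-6
for conc in (100e-9, 1e-6, 10e-6, 100e-6):
    print(f"{conc:.1e} M partner: {fraction_bound(conc, kd):.1%} bound")
```

At partner concentrations far below the dissociation constant, only a small fraction of the protein is bound at any moment, and washing reduces the recovered amount further.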

7. The Structural Model Itself May Not Be Suitable for Docking

Many researchers now directly use:

AlphaFold monomer structures
Truncated structures
Incomplete PDB models

These can be useful starting points, but caution is required.

Common issues include:

AlphaFold monomer predictions may not represent binding conformations
Flexible loops may be inaccurate
Missing regions may participate in binding
Truncation may destroy native interfaces
Domain orientations may differ from reality
Ligands, metal ions, membranes, or modifications may be absent

In short:

If the input structure is inaccurate, even a high docking score should be interpreted cautiously.

III. How Should a High-Scoring Docking Model Be Evaluated?

1. Is the Interface Area Sufficiently Large?

Check whether the proteins form a continuous interface rather than isolated point contacts.

Useful metrics include:

Interface area
Buried surface area (BSA)
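As a minimal sketch, buried surface area can be derived from three solvent-accessible surface area (SASA) values: each partner alone and the complex. The code assumes those SASA values were already computed with a separate tool (for example a Shrake-Rupley implementation), and the numbers in the example are hypothetical. Note that some tools report the total buried area while others report half of it as the per-side interface area, so it is worth checking which convention a report uses.

```python
def buried_surface_area(sasa_a: float, sasa_b: float, sasa_complex: float) -> float:
    """Total surface (in Å^2) buried when A and B form the complex."""
    bsa = sasa_a + sasa_b - sasa_complex
    if bsa < 0:
        raise ValueError("complex SASA exceeds the sum of the isolated partners")
    return bsa


def interface_area(sasa_a: float, sasa_b: float, sasa_complex: float) -> float:
    """Per-side interface area: half of the total buried surface."""
    return buried_surface_area(sasa_a, sasa_b, sasa_complex) / 2


# Hypothetical SASA values in Å^2
print(buried_surface_area(5000.0, 6000.0, 9400.0))
print(interface_area(5000.0, 6000.0, 9400.0))
```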

2. Is There a Hydrophobic Core?

Do not focus exclusively on hydrogen bonds.

Examine whether residues such as:

Leu, Ile, Val, Phe, Tyr, Trp, or Met

form stable hydrophobic packing.
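A quick first pass is to ask what fraction of the interface residues belongs to the hydrophobic set listed above. The sketch assumes the interface residues were already extracted from the model as three-letter codes; the example list is hypothetical. Composition alone does not show packing, so this only flags interfaces that clearly lack hydrophobic residues.

```python
# Residue set taken from the list in the text
HYDROPHOBIC = {"LEU", "ILE", "VAL", "PHE", "TYR", "TRP", "MET"}


def hydrophobic_fraction(interface_residues):
    """Fraction of interface residues that are in the hydrophobic set."""
    residues = [name.upper() for name in interface_residues]
    if not residues:
        return 0.0
    return sum(1 for name in residues if name in HYDROPHOBIC) / len(residues)


# Hypothetical interface residue list
print(hydrophobic_fraction(["LEU", "LYS", "PHE", "ASP"]))
```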

3. Are Hydrogen Bonds and Salt Bridges Stable?

The key issue is not quantity, but quality.

Evaluate whether:

Distances are reasonable
Angles are appropriate
Interactions cluster around the interface core
They correspond to known functional regions
Their occupancy remains high during MD simulations
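The distance, angle, and occupancy checks can be combined in a few lines. This sketch assumes that, for one donor-acceptor pair, the donor-acceptor distance and donor-hydrogen-acceptor angle have already been measured in each trajectory frame. The 3.5 Å and 120° cutoffs are common geometric conventions used here as assumptions; analysis packages differ in their defaults.

```python
def is_hbond(distance_angstrom, angle_deg, max_distance=3.5, min_angle=120.0):
    """Geometric hydrogen bond test for a single frame (cutoffs are assumptions)."""
    return distance_angstrom <= max_distance and angle_deg >= min_angle


def hbond_occupancy(frames, max_distance=3.5, min_angle=120.0):
    """Fraction of frames in which the pair satisfies the geometric test.

    frames: sequence of (distance_angstrom, angle_deg) tuples, one per frame.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("at least one frame is required")
    hits = sum(1 for d, a in frames if is_hbond(d, a, max_distance, min_angle))
    return hits / len(frames)


# Hypothetical measurements for one donor-acceptor pair over four frames
trajectory = [(2.9, 165.0), (3.1, 150.0), (4.2, 140.0), (3.0, 100.0)]
print(hbond_occupancy(trajectory))
```

A bond that looks ideal in the docked pose but shows low occupancy in simulation is weaker evidence than one that persists through most frames.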

4. Are There Hot Spot Residues?

Focus on residues contributing disproportionately to binding energy.

Potential approaches include:

MM/PBSA residue decomposition
Computational alanine scanning
Literature-supported mutations
Conservation analysis
Disease mutation databases
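Once per-residue results from computational alanine scanning are available, candidate hot spots can be shortlisted with a simple filter. The sketch assumes ΔΔG is defined as mutant minus wild type in kcal/mol, so positive values mean weaker binding. The 2.0 kcal/mol cutoff is a commonly used convention adopted here as an assumption, and the residue labels and values are hypothetical.

```python
def find_hot_spots(ddg_by_residue, threshold=2.0):
    """Return residues with ddG >= threshold, most destabilizing first.

    ddg_by_residue: dict mapping a residue label to its predicted ddG
    (kcal/mol) for mutation to alanine.
    """
    hits = [res for res, ddg in ddg_by_residue.items() if ddg >= threshold]
    return sorted(hits, key=lambda res: ddg_by_residue[res], reverse=True)


# Hypothetical alanine-scanning output
scan = {"A:Trp45": 3.1, "A:Lys12": 0.4, "B:Tyr88": 2.0, "B:Asp30": -0.2}
print(find_hot_spots(scan))
```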

5. Does the Interface Make Biological Sense?

Determine whether the predicted interface overlaps with:

Known functional domains
Conserved regions
Literature-reported binding sites
Disease-associated mutations
Known interaction motifs
Experimentally implicated fragments

6. Does the Model Conflict with Known Oligomeric States?

Place the docking model back into the biological assembly context.

Check whether the interface:

Conflicts with known dimerization interfaces
Collides with membrane orientation
Overlaps DNA/RNA binding regions
Is sterically blocked by other subunits
Requires unrealistic geometry
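Some of these checks reduce to asking whether the predicted interface residues overlap regions that are already in use. The sketch below assumes residue numbers for the predicted interface and for each known region have been collected beforehand, for example from the biological assembly or the literature. All region names and numbers are hypothetical, and overlap by residue number is only a first filter, not a substitute for inspecting the assembled structure.

```python
def interface_conflicts(predicted_interface, occupied_regions):
    """Report known regions that share residues with the predicted interface.

    predicted_interface: iterable of residue numbers on one protein.
    occupied_regions: dict mapping a region name to its residue numbers.
    Returns a dict of region name -> sorted list of shared residues.
    """
    predicted = set(predicted_interface)
    conflicts = {}
    for name, residues in occupied_regions.items():
        shared = predicted & set(residues)
        if shared:
            conflicts[name] = sorted(shared)
    return conflicts


# Hypothetical residue numbering
predicted = {10, 11, 12, 45}
known = {"homodimer interface": {11, 12, 13}, "DNA-binding region": {100, 101}}
print(interface_conflicts(predicted, known))
```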

7. Does the Complex Remain Stable During MD Simulations?

If the docking model will drive downstream decisions, molecular dynamics (MD) simulations are strongly recommended.

Key analyses include:

Complex RMSD
Interface RMSD
Center-of-mass distance
Contact residue persistence
Hydrogen bond occupancy
Interface SASA
MM/PBSA binding energy
Residue-wise energy decomposition

Static docking only addresses:

“Can the proteins fit together?”

MD further addresses:

“Can they remain associated over time?”
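Of the analyses listed above, the center-of-mass distance is the simplest to compute by hand. The sketch assumes per-atom coordinates (in Å) and masses for each protein have been extracted from every frame by a trajectory tool. The function names and the 5 Å drift tolerance are illustrative assumptions rather than established criteria.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    if not coords or len(coords) != len(masses):
        raise ValueError("coords and masses must be non-empty and equal in length")
    total = sum(masses)
    return tuple(
        sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centers of mass of two proteins in one frame."""
    return math.dist(
        center_of_mass(coords_a, masses_a), center_of_mass(coords_b, masses_b)
    )


def drifted_apart(distances, tolerance=5.0):
    """True if the last frame's distance exceeds the first by more than tolerance."""
    return distances[-1] - distances[0] > tolerance


# Hypothetical per-frame center-of-mass distances in Å
print(drifted_apart([30.0, 31.2, 38.5]))
```

A steadily increasing distance suggests the docked pose is dissociating, whereas a stable distance is consistent with, but does not prove, a persistent complex.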

IV. How Should Experimental Negatives Be Interpreted?

1. Co-IP Negative Results May Reflect

Weak or transient interactions
Lysis conditions disrupting binding
Tag placement blocking the interface
Missing cellular stimulation
Missing post-translational modifications
Requirement for bridging proteins

2. Pull-down Negative Results May Reflect

Incorrect folding after purification
Immobilization masking the interface
Excessively harsh washing
Salt concentrations disrupting electrostatics
Missing cofactors or ligands
Protein aggregation or degradation

3. SPR/MST Negative Results May Reflect

Affinity too weak for the assay's detection range
Immobilization artifacts
Sample heterogeneity
Requirement for oligomerization
Buffer incompatibility
Extremely fast association/dissociation kinetics

4. Y2H Negative Results May Reflect

Dependence on membrane environments
Requirement for post-translational modifications
Improper localization
Incorrect folding
Indirect interactions
Requirement for bridging proteins

Therefore, experimental negatives should not immediately be interpreted as:

“The docking was wrong.”

A more productive question is:

Was the issue caused by the model, the conditions, the assay system, or the intrinsic nature of the interaction?

V. How Can Docking Results Guide the Next Experiments?

1. If Key Interface Residues Are Predicted

Perform alanine scanning mutagenesis.

Mutate critical residues to Ala and examine whether Co-IP, pull-down, or SPR signals decrease.
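Turning a list of predicted interface residues into a mutagenesis list is mechanical. This sketch assumes residues are given as (one-letter code, position) pairs and skips positions that are already alanine; the example residues are hypothetical.

```python
def alanine_scan(residues):
    """Build mutation labels such as 'K45A' for each non-alanine residue.

    residues: iterable of (one_letter_code, position) pairs.
    """
    return [
        f"{aa.upper()}{pos}A" for aa, pos in residues if aa.upper() != "A"
    ]


# Hypothetical predicted interface residues
print(alanine_scan([("K", 45), ("A", 46), ("W", 88)]))
```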

2. If Salt Bridges Appear Important

Perform charge-reversal mutations.

Examples:

Lys/Arg → Glu/Asp
Glu/Asp → Lys/Arg
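The same idea applies to charge-reversal designs. The mapping below picks one option from each of the swaps listed above (Lys/Arg to Glu, Asp/Glu to Lys); that choice is an assumption for illustration, and the example residues are hypothetical. Uncharged residues are left out.

```python
# One possible choice within the swaps named in the text
CHARGE_REVERSAL = {"K": "E", "R": "E", "D": "K", "E": "K"}


def charge_reversal_mutations(residues):
    """Build labels such as 'K45E' for charged residues; skip the rest.

    residues: iterable of (one_letter_code, position) pairs.
    """
    mutations = []
    for aa, pos in residues:
        target = CHARGE_REVERSAL.get(aa.upper())
        if target is not None:
            mutations.append(f"{aa.upper()}{pos}{target}")
    return mutations


# Hypothetical salt-bridge residues plus one uncharged residue
print(charge_reversal_mutations([("K", 45), ("L", 50), ("D", 102)]))
```

If a predicted salt bridge is real, reversing one side should weaken binding, and reversing both sides together may partly restore it.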

3. If Post-Translational Modification Is Suspected

Consider:

Phosphomimetic mutations (Ser/Thr → Asp/Glu)
Non-phosphorylatable mutations (Ser/Thr → Ala)
Docking/MD comparisons before and after modification
Stimulus-dependent experiments

4. If the Real Biological State Is Oligomeric

Do not restrict docking to monomers.

Instead consider:

Dimer docking
Tetramer docking
AlphaFold-Multimer predictions
Biological assembly reconstruction from PDB data
Interface accessibility analysis

5. If Weak or Transient Interactions Are Suspected

Potential strategies include:

Reducing washing stringency
Using crosslinkers
Switching assay methods
Increasing local concentration
Comparing stimulated versus unstimulated states
Performing mutational validation rather than relying solely on pull-down success

VI. Final Thoughts

Protein–protein interaction prediction is inherently difficult.

A high docking score does not guarantee experimental success.

A low score does not necessarily rule out biological relevance.

The truly important questions are:

Is the structure biologically reasonable?
Is the interface credible?
Are the interactions dynamically stable?
Are the experimental conditions appropriate?
Is the assay suitable for this type of interaction?

Relying on a docking score alone is risky.

The real value of docking comes from integrating:

Structural interpretation
Interface analysis
Energetics
Dynamics
Experimental design

into a coherent, testable biological hypothesis.
