Fixed 3SEK and 1JCK_A_B
2018-09-16 11:51:20

We have fixed two errors in our database:
- For 1JCK_A_B in column "Hold_out_proteins", the value "3QFJ_ABC,3QIB_ABP_CD_DE" was replaced with "3QFJ_ABC_DE,3QIB_ABP_CD".
- For the 3SEK reference, "3SEK_AB_C" was replaced by the correct "3SEK_B_C".
We would like to thank our users for their feedback.
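For users who parse the "Hold_out_proteins" column themselves, the following is a minimal sketch of how such a value might be split and sanity-checked. It assumes, based only on the identifiers quoted in this announcement, that each entry has the form PDB code, underscore, chain group, underscore, chain group, and that entries are comma-separated. The function names are hypothetical and are not part of any SKEMPI tooling.

```python
def parse_complex_id(identifier):
    """Split e.g. '3QIB_ABP_CD' into ('3QIB', ['ABP', 'CD'])."""
    pdb_code, *chain_groups = identifier.strip().split("_")
    return pdb_code, chain_groups


def parse_hold_out(value):
    """Parse a comma-separated 'Hold_out_proteins' value into a list of complexes."""
    return [parse_complex_id(item) for item in value.split(",") if item.strip()]


def has_two_partners(entry):
    # Assumption: a well-formed entry names exactly two chain groups.
    _, chain_groups = entry
    return len(chain_groups) == 2


old = parse_hold_out("3QFJ_ABC,3QIB_ABP_CD_DE")   # value before the fix
new = parse_hold_out("3QFJ_ABC_DE,3QIB_ABP_CD")   # corrected value

print([has_two_partners(e) for e in old])
print([has_two_partners(e) for e in new])
```

Under this assumed format, both entries of the old value fail the two-partner check (one chain group was attached to the wrong complex), while both entries of the corrected value pass.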

Added missing 3QIB
2018-07-30 14:39:41

We have added the missing PDB structure 3QIB to the compressed file of cleaned structures. We would like to thank our users for the feedback.

2018-05-02 15:20:13

SKEMPI version 2.0 has been released. This update of the database contains a total of 7085 mutations.

2012-10-03 11:02:31

SKEMPI version 1.1 has been released. For consistency, three homology groups have been merged as follows:
- 1REW_AB_C,2QJ9_AB_C,2QJA_AB_C,2QJB_AB_C,1KTZ_A_B has been merged with 3BK3_A_C.
- 1LFD_A_B,1HE8_A_B has been merged with 2A9K_A_B.
- 1E96_A_B has been merged with 1GRN_A_B.
Please, visit
The previous version is still available as a semicolon-delimited CSV file here.
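The three merges announced for version 1.1 can be written out as set unions, which may help when updating a local copy of the homology groups. This is only an illustrative sketch: it uses just the identifiers named in this announcement (the groups being merged into may contain further members not listed here), and the variable names are hypothetical.

```python
# Each pair is (group being merged, group it is merged with), as comma-separated
# identifiers taken verbatim from the version 1.1 announcement.
merges = [
    ("1REW_AB_C,2QJ9_AB_C,2QJA_AB_C,2QJB_AB_C,1KTZ_A_B", "3BK3_A_C"),
    ("1LFD_A_B,1HE8_A_B", "2A9K_A_B"),
    ("1E96_A_B", "1GRN_A_B"),
]

# The merged group is the union of the identifiers on both sides.
merged_groups = [
    sorted(set(source.split(",")) | set(target.split(",")))
    for source, target in merges
]

for group in merged_groups:
    print(",".join(group))
```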

2012-08-01 10:57:55

Motivation: Empirical models for the prediction of how changes in sequence alter protein–protein binding kinetics and thermodynamics can garner insights into many aspects of molecular biology. However, such models require empirical training data and proper validation before they can be widely applied. Previous databases contained few stabilizing mutations and no discussion of their inherent biases or how this impacts model construction or validation.

Results: We present SKEMPI, a database of 3047 binding free energy changes upon mutation assembled from the scientific literature, for protein–protein heterodimeric complexes with experimentally determined structures. This represents over four times more data than previously collected. Changes in 713 association and dissociation rates and 127 enthalpies and entropies were also recorded. The existence of biases towards specific mutations, residues, interfaces, proteins and protein families is discussed in the context of how the data can be used to construct predictive models. Finally, a cross-validation scheme is presented which is capable of estimating the efficacy of derived models on future data in which these biases are not present.
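As a rough illustration of the idea of validating on data that does not share the biases of the training set, the sketch below holds out one group of related complexes at a time. It is a generic leave-one-group-out split, not the cross-validation scheme of the paper, whose details are not given in this abstract. The record layout, the group labels and the function name are all hypothetical placeholders.

```python
from collections import defaultdict


def leave_one_group_out(records, group_of):
    """Yield (held_out_group, train, test) with one group held out per fold."""
    by_group = defaultdict(list)
    for record in records:
        by_group[group_of(record)].append(record)
    for held_out, test in by_group.items():
        # Train on every record whose group differs from the held-out group.
        train = [r for g, rs in by_group.items() if g != held_out for r in rs]
        yield held_out, train, test


# Placeholder records: (group label, mutation label). Not real SKEMPI data.
records = [
    ("group_1", "mutation_a"),
    ("group_1", "mutation_b"),
    ("group_2", "mutation_c"),
    ("group_3", "mutation_d"),
]

folds = list(leave_one_group_out(records, group_of=lambda r: r[0]))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Because no record from the held-out group appears in the training fold, a model cannot score well simply by memorising a heavily sampled protein or family.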

Availability: The database is available online at