FAQ and Help



Q. Can the whole database be downloaded?

A. Yes, it can be downloaded from the Download section on the menu.


Q. How can I reference the database?

A. Please cite:

Jankauskaitė J, Jiménez-García B, Dapkūnas J, Fernández-Recio J, Moal IH (2019) SKEMPI 2.0: an updated benchmark of changes in protein–protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 35, 462–469 (https://doi.org/10.1093/bioinformatics/bty635)


Q. What if I see an error or cannot access the data?

A. Please contact us as soon as possible, so that we can investigate.


Q. What if I have some data I would like to add?

A. We encourage you to contact us, as we are always looking for new data.


Q. I don't see any ΔG or ΔΔG values in the table. How do I calculate these?

A. The affinities (Kd) of the wild-type complexes are in the column 'affinity_wt' and the affinities of the mutants are in the column 'affinity_mut'. These can be converted to ΔG values using the relationship ΔG = RT ln Kd. At room temperature (25 °C) this gives, in kcal/mol, ΔG = (8.314/4184) * (273.15 + 25.0) * ln(affinity_wt) for the wild type, and the same expression with affinity_mut for the mutant, where ln() is the natural logarithm. The change in affinity upon mutation is then calculated as ΔΔG = ΔGmut - ΔGwt.
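The conversion above can be sketched in a few lines of Python. This is only an illustration: the function names and the example Kd values are made up for the sketch and are not part of the database, and the default temperature is the 25 °C used in the formula above.

```python
import math

# Gas constant converted from J mol^-1 K^-1 to kcal mol^-1 K^-1
R_KCAL = 8.314 / 4184


def delta_g(kd_molar, temperature_k=273.15 + 25.0):
    """Return the binding free energy (kcal/mol) for a Kd given in M."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_mut, kd_wt, temperature_k=273.15 + 25.0):
    """Return ddG = dG(mutant) - dG(wild type) in kcal/mol."""
    return delta_g(kd_mut, temperature_k) - delta_g(kd_wt, temperature_k)


# Hypothetical values: the mutant binds ten times more weakly than the wild type
dg_wt = delta_g(1e-9)
ddg = delta_delta_g(kd_mut=1e-8, kd_wt=1e-9)
print(f"dG(wt) = {dg_wt:.2f} kcal/mol, ddG = {ddg:.2f} kcal/mol")
```

With this sign convention a positive ΔΔG means that the mutation weakens binding. If the experiment was not performed at 25 °C, the temperature in kelvin can be passed as the 'temperature_k' argument instead of relying on the default.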


Q. What is the format of the database?

A. The information and format of the columns are as follows:

1) The PDB entry for the complex, followed by the chain identifiers for the two subunits. The first chain(s) correspond to protein 1 (column 10) and the second chain(s) correspond to protein 2 (column 11). Following this link will lead you to the relevant page in the Protein Data Bank.

2) The mutation(s), using the residue numbering found in the Protein Data Bank. The first character is the one-letter amino acid code for the original residue, the second character is the chain identifier, the third to penultimate characters give the residue number, followed by the residue insertion code where applicable, and the final character is the one-letter code for the mutant amino acid. Where multiple mutations are present, they are separated by commas.

3) The mutation(s), using the residue numbering in the 'cleaned' PDB files, in the same format as for column 2.

4) The location(s) of the mutation(s) in or away from the binding site, as defined in "A simple definition of structural regions in proteins and its use in analyzing interface evolution", ED Levy, J Mol Biol. 2010, 403(4):660-70.

5) The hold-out type. Some of the complexes are classified as protease-inhibitor (Pr/PI), antibody-antigen (AB/AG) or pMHC-TCR (TCR/pMHC). This classification was introduced to aid in the cross-validation of empirical models trained using the data in the SKEMPI database, so that proteins of a similar type can be simultaneously held out during a cross-validation.

6) The hold-out proteins. This column contains the PDB identifiers (in column 1) and/or hold-out types (column 5) for all the protein complexes which may be excluded from the training when cross-validating an empirical model trained on this data, so as to avoid contaminating the training set with information pertaining to the binding site being evaluated.

7) The affinity of the mutant form (M).

8) The affinity of the wild-type form, or form in the PDB structure (M).

9) The reference for the affinities, as well as any further kinetic or thermodynamic information. Where available, the PubMed ID is given with a link to the relevant entry in PubMed; otherwise the whole reference is given.

10) Protein 1. This is the name of the protein which corresponds to the first chain(s) given in column 1.

11) Protein 2. This is the name of the protein which corresponds to the second chain(s) given in column 1.

12) The temperature at which the experiment was performed.

13) The association rate for the mutant protein, where available (M^(-1)s^(-1)).

14) The association rate for the wild-type protein or protein in the crystal structure, where available (M^(-1)s^(-1)).

15) The dissociation rate for the mutant protein, where available (s^(-1)).

16) The dissociation rate for the wild-type protein or protein in the crystal structure, where available (s^(-1)).

17) The enthalpy of association for the mutant protein, where available (kcal mol^(-1)).

18) The enthalpy of association for the wild-type protein or protein in the crystal structure, where available (kcal mol^(-1)).

19) The entropy of association for the mutant protein, where available (cal mol^(-1) K^(-1)).

20) The entropy of association for the wild-type protein or protein in the crystal structure, where available (cal mol^(-1) K^(-1)).

21) Notes regarding the entry.

22) The experimental method used to measure the affinities.

23) The SKEMPI version number.
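
The mutation format described for columns 2 and 3 can be split into its parts with a short Python routine. This is a minimal sketch under stated assumptions rather than an official parser: the names 'Mutation' and 'parse_mutations' are hypothetical, and the pattern assumes upper-case one-letter amino acid codes, a single-character chain identifier and a single-letter insertion code.

```python
import re
from typing import List, NamedTuple


class Mutation(NamedTuple):
    wild_type: str       # one-letter code of the original residue
    chain: str           # chain identifier
    residue_number: int  # residue number
    insertion_code: str  # insertion code, or "" when there is none
    mutant: str          # one-letter code of the mutant residue


# Original residue, chain, residue number, optional insertion code, mutant residue.
# Assumption: the insertion code, when present, is a single letter.
_MUTATION = re.compile(r"^([A-Z])(.)(-?\d+)([A-Za-z]?)([A-Z])$")


def parse_mutations(field: str) -> List[Mutation]:
    """Parse a comma-separated mutation field such as 'LI38G,RH100aA'."""
    mutations = []
    for token in field.split(","):
        token = token.strip()
        match = _MUTATION.match(token)
        if match is None:
            raise ValueError(f"Unrecognised mutation string: {token!r}")
        wild_type, chain, number, icode, mutant = match.groups()
        mutations.append(Mutation(wild_type, chain, int(number), icode, mutant))
    return mutations


# Hypothetical example: a double mutant, the second residue with insertion code 'a'
for mutation in parse_mutations("LI38G,RH100aA"):
    print(mutation)
```

Because the fields are positional, the second character is always taken as the chain identifier, even when it is a digit.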