CAMEO - Continuous Automated Model EvaluatiOn - Help page

The CAMEO CP category was discontinued in February 2020.
The information below is kept for historical purposes and will no longer be updated.

CAMEO contact prediction (CAMEO CP) is intended to continuously evaluate the performance of local contact prediction programs.
CAMEO CP participant groups can register their methods either as a web server or as a standalone package. The reliability of the participating contact prediction tools is then evaluated by comparing the predicted contacts with the contacts observed in the experimental reference structure, using various scoring methods developed by the community.

If you are interested in registering your method with CAMEO CP, please validate your server's compliance with the submission mechanisms and prediction format.

Submission/Prediction Format

Targets to be evaluated are provided as amino acid sequences in FASTA format.

Predicted contacts should be sent back by email, adhering to the CAMEO CP prediction format.

Baseline Predictor

As a baseline for the contact prediction, we use mutual information (MI) to predict the most likely contacts from a multiple sequence alignment (MSA) obtained with HHblits. Specifically, the MSA for the target protein is obtained from a search against the UniProt nr20 database. For our calculation of MI, we use the small number correction introduced by Buslje et al. [Buslje et al., Bioinformatics 25:1125-1131, 2009], which simply consists of adding a small number (here 0.05) to the number of observations of every residue pair when calculating the probabilities. Gaps are not counted as an amino acid type and are therefore not included in the calculation of MI, whereas we always use the total number of sequences in the alignment as the normalisation factor in the calculation of the probabilities. This effectively penalises columns with gaps in the calculation of MI.
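The counting scheme described above can be sketched in Python as follows. This is a minimal illustration, not the CAMEO implementation: the function name `mutual_information`, the restriction to the 20 standard amino acids, and the way the single-column probabilities are obtained (by summing the corrected pair probabilities) are assumptions, since the text does not specify them.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PSEUDOCOUNT = 0.05  # small number added to every residue-pair count (value from the text)


def mutual_information(msa, i, j, pseudocount=PSEUDOCOUNT):
    """Score alignment columns i and j of an MSA given as equal-length strings.

    Hypothetical helper for illustration only.
    """
    n_seq = len(msa)

    # Every one of the 20 x 20 residue pairs starts at the pseudocount.
    pair_counts = {pair: pseudocount for pair in product(AMINO_ACIDS, repeat=2)}
    for seq in msa:
        a, b = seq[i], seq[j]
        # Gaps (and any non-standard symbol) are not an amino acid type: skip them.
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            pair_counts[(a, b)] += 1

    # Normalise by the total number of sequences, not by the gap-free count.
    p_ab = {pair: count / n_seq for pair, count in pair_counts.items()}

    # ASSUMPTION: single-column probabilities are sums over the pair probabilities.
    p_a = {a: sum(p_ab[(a, b)] for b in AMINO_ACIDS) for a in AMINO_ACIDS}
    p_b = {b: sum(p_ab[(a, b)] for a in AMINO_ACIDS) for b in AMINO_ACIDS}

    return sum(p * math.log(p / (p_a[a] * p_b[b])) for (a, b), p in p_ab.items())


# Toy alignments: columns 0 and 1 co-vary in the first, are independent in the second.
covarying = ["AC", "AC", "DE", "DE"]
independent = ["AC", "AE", "DC", "DE"]
print(mutual_information(covarying, 0, 1) > mutual_information(independent, 0, 1))
```

Because pseudocounts are added while the counts are divided by the number of sequences, the resulting "probabilities" need not sum to one. In this sketch the value is therefore best read as a score for ranking column pairs rather than as an exact mutual information.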

Scores

The Xd score is defined as:

\begin{equation*} Xd = \sum_{i=1}^{15} \frac{P_{ip} - P_{ia}}{15 \times d_i} \end{equation*}

where $P_{ip}$ is the fraction of predicted contacts in bin $i$, $P_{ia}$ is the fraction of all residue pairs in bin $i$, and $d_i$ is the distance representing bin $i$. The 15 bins cover the distance ranges 0 to 4 Å, 4 Å to 8 Å, 8 Å to 12 Å, etc. This score estimates how far the distribution of distances in the list of predicted contacts deviates from the distribution of distances over all pairs of residues in the protein. Because each bin is weighted by $1/d_i$, an excess of predicted pairs at short distances contributes most. The higher the Xd, the higher the precision of the predicted contacts with respect to randomly selected pairs. Xd is close to zero for randomly selected pairs. [ref.: PMID 21928322]
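As a rough illustration of the Xd definition above, the following Python sketch bins two lists of distances and sums the weighted differences. The function name `xd_score` is hypothetical, and two details are assumptions that the text does not fix: $d_i$ is taken as the upper edge of bin $i$ in Å, and distances beyond the last bin are counted in the last bin.

```python
def xd_score(predicted_distances, all_distances, n_bins=15, bin_width=4.0):
    """Xd-style score from distances (in Å) of predicted pairs and of all pairs.

    Hypothetical helper for illustration only.
    """

    def bin_fractions(distances):
        counts = [0] * n_bins
        for d in distances:
            # ASSUMPTION: distances beyond the last bin fall into the last bin.
            idx = min(int(d // bin_width), n_bins - 1)
            counts[idx] += 1
        return [c / len(distances) for c in counts]

    p_pred = bin_fractions(predicted_distances)  # fraction of predicted contacts per bin
    p_all = bin_fractions(all_distances)         # fraction of all residue pairs per bin

    # ASSUMPTION: d_i is the upper edge of bin i (4 Å, 8 Å, ..., 60 Å).
    d = [(i + 1) * bin_width for i in range(n_bins)]

    return sum((pp - pa) / (n_bins * di) for pp, pa, di in zip(p_pred, p_all, d))


# Toy example: predictions concentrated at short distances give a positive score,
# while a list with the same distribution as all pairs gives zero.
all_pairs = [2.0, 6.0, 10.0, 14.0, 30.0, 50.0]
print(xd_score([2.0, 6.0], all_pairs) > 0)
print(xd_score(all_pairs, all_pairs))
```

The absolute scale of the result depends on how $d_i$ is normalised, so only the sign and the relative ordering of scores should be read from this sketch.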

The Response Time is the time a contact prediction takes from submission to the server until reception by CAMEO. The Response Time depends strongly on the age of the hardware and the load on the server, and is hence not necessarily an indication of the efficiency of the algorithms. Currently, the CAMEO workflow reflects as closely as possible a user's experience of submitting a job and receiving the results by email. Timings sometimes suffer from individual delays caused by MTA forwarding hosts being unavailable; over a period of several months this effect is averaged out. In addition, the logged response times may not reflect a typical user's experience.
