As part of its weekly release procedure, the PDB publishes the sequences of the entries scheduled for release on the following Wednesday four days in advance. This pre-release takes place every Saturday at 03:00 UTC. CAMEO collects the pre-release and, after the sequence pre-processing and filtering steps described below, submits a selected set of sequences (targets) to the registered servers. Participants have until the following Wednesday at 02:00 UTC to return their predictions. The evaluation is performed once the PDB has released the reference structures the following Wednesday.
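The weekly timing can be made concrete with a short calculation. The following is a minimal sketch, not part of CAMEO itself: the function name `submission_deadline` is hypothetical, and the only facts taken from the text are the Saturday 03:00 UTC pre-release and the Wednesday 02:00 UTC deadline.

```python
from datetime import datetime, time, timedelta, timezone

SATURDAY = 5  # datetime.weekday(): Monday is 0, Saturday is 5


def submission_deadline(prerelease: datetime) -> datetime:
    """Return the prediction deadline for a given pre-release timestamp.

    Hypothetical helper: assumes the pre-release falls on a Saturday (UTC)
    and that predictions are due the following Wednesday at 02:00 UTC.
    """
    if prerelease.tzinfo is None:
        raise ValueError("pre-release timestamp must be timezone-aware")
    prerelease = prerelease.astimezone(timezone.utc)
    if prerelease.weekday() != SATURDAY:
        raise ValueError("pre-release is expected on a Saturday (UTC)")
    # Saturday + 4 days = the following Wednesday
    wednesday = prerelease.date() + timedelta(days=4)
    return datetime.combine(wednesday, time(2, 0), tzinfo=timezone.utc)


if __name__ == "__main__":
    start = datetime(2024, 1, 6, 3, 0, tzinfo=timezone.utc)  # a Saturday
    deadline = submission_deadline(start)
    print("deadline:", deadline.isoformat())
    print("modeling window:", deadline - start)
```

Under these assumptions, servers have just under four days (95 hours) between the pre-release and the deadline.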
CAMEO currently supports the categories protein structure modeling (3D), protein model quality assessment (QE), and protein contact prediction (CP). The upcoming complete modeling (CM) category extends the 3D category and supports the evaluation of heteromers. Other categories might follow as requested by the community and subject to available funding. The ligand binding site prediction (LB) category was discontinued in April 2016.
A CAMEO server can be registered either as a public server under its full name or as an anonymous server. For an anonymous server, all scoring is performed and is visible to other method developers, but not to the public. Here, only the name is anonymized ('serverx'). See our complete list of registered servers.
A CAMEO target is a pre-released PDB entry that is submitted to all registered servers. In CAMEO CM, one or more protein sequences belonging to the same pre-released PDB entry constitute a target. A target can thus be a monomer, a homo-oligomer, or a hetero-oligomer.
To submit a limited number of high-quality targets for modeling, CAMEO CM performs the following actions after downloading the pre-released sequences from the PDB on Saturday and before submitting them to the participants:
CAMEO Complete Modeling only submits cleaned and filtered amino acid sequences to the participants. Non-standard and modified amino acid residues are replaced by the one-letter code of the parent amino acid according to the PDB Chemical Component Dictionary.
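The replacement of modified residues can be illustrated with a minimal sketch. The `PARENT_OF` table below is a small hand-picked subset for illustration only; a real pipeline would read the parent component from the Chemical Component Dictionary. The function name `canonical_sequence` and the use of `X` for unmapped residues are assumptions, not CAMEO's documented behavior.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Illustrative subset of modified residues and their parent amino acids.
PARENT_OF = {
    "MSE": "MET",  # selenomethionine
    "SEP": "SER",  # phosphoserine
    "TPO": "THR",  # phosphothreonine
    "PTR": "TYR",  # phosphotyrosine
}


def canonical_sequence(residue_names):
    """Map three-letter residue names to a one-letter canonical sequence.

    Modified residues are replaced by their parent amino acid first.
    Residues without a known parent become 'X' (an assumption of this sketch).
    """
    letters = []
    for name in residue_names:
        parent = PARENT_OF.get(name.upper(), name.upper())
        letters.append(THREE_TO_ONE.get(parent, "X"))
    return "".join(letters)


if __name__ == "__main__":
    print(canonical_sequence(["MET", "MSE", "SEP", "ALA"]))
```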
A filtering step is performed on the canonical sequences, which removes targets if any of their sequences:
In a third step, to avoid "duplicate" submissions of very similar targets, CAMEO CM clusters the remaining targets according to the following method.
Targets that are believed to be too trivial to model are not submitted to the participating servers. All representative targets (after clustering) are assessed for difficulty. Each sequence is searched separately for templates with BLAST against the full list of protein sequences currently in the PDB. A target that is too "trivial" to model would feature a template with 85% sequence identity or more and, additionally:
A target is referred to as "too easy" if a trivial template covers all the sequences of the target and there is an exhaustive mapping between every sequence of the target and every sequence of the template.
This implies:
Only targets that passed filtering, are representatives of their cluster, and are not "too easy" are then submitted to the participating servers.
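The "too easy" criterion described above can be sketched as a check over per-sequence template hits. This is one possible reading, not CAMEO's actual implementation: it uses only the 85% identity threshold stated in the text (any additional conditions are not modeled), and it interprets "exhaustive mapping" as every target sequence having a trivial hit in the template entry and every sequence of that template entry being hit. The function name and input layout are hypothetical.

```python
TRIVIAL_IDENTITY = 85.0  # percent sequence identity, from the text


def is_too_easy(target_hits, template_sequences):
    """Decide whether a single template entry trivially covers a whole target.

    target_hits: dict mapping each target sequence id to a list of
        (template_entry, template_sequence_id, identity_percent) hits.
    template_sequences: dict mapping a template entry id to the set of all
        sequence ids in that entry.
    Both input layouts are assumptions made for this sketch.
    """
    # For each template entry, record which target sequences hit which
    # template sequences at or above the identity threshold.
    per_template = {}
    for target_seq, hits in target_hits.items():
        for entry, template_seq, identity in hits:
            if identity >= TRIVIAL_IDENTITY:
                per_template.setdefault(entry, {}).setdefault(
                    target_seq, set()
                ).add(template_seq)

    for entry, mapping in per_template.items():
        covers_target = set(mapping) == set(target_hits)
        hit_sequences = set().union(*mapping.values())
        covers_template = hit_sequences == template_sequences.get(entry, set())
        if covers_target and covers_template:
            return True
    return False


if __name__ == "__main__":
    hits = {
        "A": [("tmpl1", "X", 92.0)],
        "B": [("tmpl1", "Y", 88.5)],
    }
    print(is_too_easy(hits, {"tmpl1": {"X", "Y"}}))
```

A stricter reading would require a one-to-one pairing between target and template sequences; that would need a bipartite matching step, which this sketch leaves out.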
We are working hard to implement as many of the scores from the 3D category as possible in CAMEO CM. Please note that we primarily consider superposition-free scores, as CAMEO targets might consist of multiple domains. So far, the following scores are available:
The lDDT score (Local Distance Difference Test on All Atoms) evaluates the quality of the local atomic environment of a model. lDDT rewards the fraction of correctly predicted inter-atomic distances in a model at different threshold levels. lDDT does not depend on a global superposition of the prediction and target structure.
Specifically, interaction distances (cutoff 15 Å) between atoms in the reference protein structure are compared with the distances between the corresponding atoms in the prediction. If the difference between the two distances is within a defined threshold, the interaction is considered to be preserved in the prediction. The final lDDT-all score is computed by averaging the fraction of correctly modeled interactions over the following four distance difference thresholds: 0.5, 1, 2, and 4 Å (the same thresholds as GDT_HA). A filter based on the Engh and Huber bond lengths and angles removes stereochemical violations and steric clashes. CAMEO additionally offers a Cα-based lDDT score. [ref.: CASP9 TBM Assessment]
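The core of the score described above can be illustrated with a simplified sketch. It is not the reference lDDT implementation: it omits the stereochemistry and clash filter, assumes that reference and model atoms are already matched one-to-one in the same order, skips atom pairs within the same residue, and treats a difference equal to the threshold as preserved. The function name `lddt_sketch` and the input layout are hypothetical.

```python
import math
from itertools import combinations

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # distance difference thresholds in Å
INCLUSION_RADIUS = 15.0            # reference distance cutoff in Å


def lddt_sketch(ref_atoms, model_atoms):
    """Simplified global lDDT over matched atoms.

    ref_atoms, model_atoms: lists of (residue_index, (x, y, z)) tuples,
    where position i in both lists refers to the same atom.
    """
    if len(ref_atoms) != len(model_atoms):
        raise ValueError("reference and model must contain the same atoms")

    preserved = [0] * len(THRESHOLDS)
    total = 0
    for i, j in combinations(range(len(ref_atoms)), 2):
        res_i, xyz_i = ref_atoms[i]
        res_j, xyz_j = ref_atoms[j]
        if res_i == res_j:
            continue  # only distances between different residues are scored
        d_ref = math.dist(xyz_i, xyz_j)
        if d_ref >= INCLUSION_RADIUS:
            continue  # not an interaction in the reference structure
        total += 1
        d_model = math.dist(model_atoms[i][1], model_atoms[j][1])
        diff = abs(d_ref - d_model)
        for k, threshold in enumerate(THRESHOLDS):
            if diff <= threshold:
                preserved[k] += 1

    if total == 0:
        return 0.0
    # Average the preserved fraction over the four thresholds.
    return sum(count / total for count in preserved) / len(THRESHOLDS)


if __name__ == "__main__":
    ref = [(1, (0.0, 0.0, 0.0)), (2, (3.0, 0.0, 0.0)), (3, (6.0, 0.0, 0.0))]
    model = [(1, (0.0, 0.0, 0.0)), (2, (3.0, 0.0, 0.0)), (3, (7.5, 0.0, 0.0))]
    print(round(lddt_sketch(ref, model), 3))  # prints 0.667
```

In the example, two of the three reference distances differ by 1.5 Å in the model, so they are preserved only at the 2 and 4 Å thresholds, giving (1/3 + 1/3 + 1 + 1) / 4 = 2/3. Because only pairwise distances are compared, no superposition of model and reference is needed.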