The CAMEO Workflow: as part of its weekly release procedure, the PDB publishes the sequences of the entries to be released in the following week five days ahead of the actual release (i.e. on Friday). CAMEO collects these sequences and, after some pre-processing, submits them to the servers within the individual categories. The assessment can be performed once the actual structure is released by the PDB, usually the following Wednesday.
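The timing of one weekly cycle can be sketched as follows. This is only an illustration of the Friday/Wednesday schedule described above; the function name `cameo_cycle` is hypothetical, and the Wednesday date is the usual release day, not a guarantee.

```python
from datetime import date, timedelta


def cameo_cycle(any_day):
    """Return (pre_release_friday, release_wednesday) for the next cycle.

    Hypothetical helper: finds the first Friday on or after `any_day`
    and adds five days to reach the usual Wednesday release.
    """
    # date.weekday(): Monday == 0, ..., Friday == 4
    days_to_friday = (4 - any_day.weekday()) % 7
    friday = any_day + timedelta(days=days_to_friday)
    wednesday = friday + timedelta(days=5)
    return friday, wednesday


friday, wednesday = cameo_cycle(date(2024, 1, 1))
print("sequences published:", friday.isoformat())
print("structures released:", wednesday.isoformat())
```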
The categories supported by CAMEO at the moment are 3D Structure Prediction and Ligand Binding Site Prediction. The assessment of Quality Estimation Methods is currently in preparation. Other categories will follow as requested by the community.
The Correlation is computed as the Pearson correlation coefficient, which is a measure of the linear dependence between the prediction and the reference. Correlation is calculated for each of the four ligand categories individually and as the average over all assessed categories (depending on the ligands observed in the target structures). Correlation ranges from -1.0 to 1.0, where 1.0 translates to a perfect and 0.0 to a random prediction.
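As an illustration, the Pearson coefficient for one ligand category could be computed along these lines. This is a minimal sketch, assuming the prediction and the reference are given as two equally long lists of per-residue values; the function name `pearson` and the input layout are assumptions, not the CAMEO implementation.

```python
from math import sqrt


def pearson(pred, ref):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(pred)
    if n != len(ref) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    if var_p == 0 or var_r == 0:
        # The coefficient is undefined when one input is constant.
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_p * var_r)


# Per-residue binding probabilities vs. observed binding residues (1 = binding).
print(pearson([0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0]))
```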
The difficulty of a target is currently defined by the average accuracy of the models received from the servers. A low score for all servers clearly indicates a hard target.
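A minimal sketch of this idea: average the per-server accuracy for each target and sort the targets from hardest to easiest. The target names, server names, and scores below are made up for illustration, and no particular accuracy measure or difficulty threshold is implied.

```python
from statistics import mean


def average_accuracy(server_scores):
    """Mean accuracy over all models received for one target."""
    return mean(server_scores.values())


# Hypothetical example data: accuracy per server for two targets.
targets = {
    "target_A": {"server1": 0.82, "server2": 0.78, "server3": 0.80},
    "target_B": {"server1": 0.31, "server2": 0.25, "server3": 0.28},
}

# Lowest average accuracy first, i.e. hardest target first.
hardest_first = sorted(targets, key=lambda t: average_accuracy(targets[t]))
print(hardest_first)
```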
The preferred prediction format for CAMEO LB allows ligand binding sites to be predicted in the four categories described below. It also supports predicting the actual compounds (identified by the three-letter code used by the PDB) for the most precise description of a binding site. In addition, predictions can be made at the atom level to achieve the highest possible prediction accuracy. See our extensive format description.
The CAMEO ligand categories are based on the preliminary ligand classification of the PDB, as defined in the chemical component dictionary (item _chem_comp.pdbx_type). The four CAMEO ligand categories are:
Category | Description | Rule / Enclosed PDB Type | Examples |
---|---|---|---|
I | ions | type = HETAI, HETIC (non polymeric) | ZN, SO4, ACT, NH4, IOD (All metal and inorganic ions) |
O | organic | type = HETAIN, ATOMS, ATOMN, ATOMP, HETAC, HETAD (non polymeric with n <= 2) | ATP, FAD, SAM, GLA, F3S |
N | Poly-Nucleotides | type = ATOMN (polymeric with n > 2) | A, DA, G, U, T |
P | Peptides | type = ATOMP (polymeric with 2 < n <= 10) | ALA, ARG, KCX, LLP, PTR |
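
The rules in the table can be written down as a small lookup. This is a sketch only: the function name `ligand_category` is hypothetical, and `n` is assumed to be the number of linked residues in the ligand, which is how the table's "n" is read here.

```python
from typing import Optional

ION_TYPES = {"HETAI", "HETIC"}
ORGANIC_TYPES = {"HETAIN", "ATOMS", "ATOMN", "ATOMP", "HETAC", "HETAD"}


def ligand_category(pdbx_type: str, n: int = 1) -> Optional[str]:
    """Map a _chem_comp.pdbx_type value and chain length n to I, O, N or P.

    Returns None when no rule from the table applies.
    """
    t = pdbx_type.upper()
    if t in ION_TYPES:
        return "I"  # ions
    if t == "ATOMN" and n > 2:
        return "N"  # poly-nucleotides
    if t == "ATOMP" and 2 < n <= 10:
        return "P"  # peptides
    if t in ORGANIC_TYPES and n <= 2:
        return "O"  # organic, non-polymeric
    return None


print(ligand_category("HETAI"))     # ion
print(ligand_category("ATOMP", 4))  # short peptide
```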
The Matthews correlation coefficient (MCC), which accounts for both over- and under-prediction, is calculated for each of the four ligand categories individually and as the average over all assessed categories (depending on the ligands observed in the target structures). MCC ranges from +1 (perfect prediction) through 0 (random prediction) to -1 (inverse prediction).
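For a single ligand category, the MCC can be computed from the sets of predicted and observed binding residues. The sketch below assumes residues are identified by number and that both sets are subsets of the full residue set; returning 0.0 when the denominator vanishes is a common convention and an assumption here.

```python
from math import sqrt


def mcc(predicted, observed, all_residues):
    """Matthews correlation coefficient for a binary binding-site prediction."""
    tp = len(predicted & observed)   # correctly predicted binding residues
    fp = len(predicted - observed)   # over-predictions
    fn = len(observed - predicted)   # under-predictions
    tn = len(all_residues) - tp - fp - fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention for degenerate cases
    return (tp * tn - fp * fn) / denom


residues = set(range(1, 11))
observed = {2, 3, 4}
print(mcc({2, 3, 5}, observed, residues))
```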
The BDT method, proposed by Roche D.B. et al., produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site score higher than those more distant, providing a better reflection of the true accuracy of predictions. The threshold used in CAMEO to calculate the BDT score is 5.0 Å. [PMID: 20861025]
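A distance-weighted score of this kind can be sketched as below. This assumes the commonly cited form of the score, in which each predicted residue contributes 1 / (1 + (d / d0)^2) for its distance d to the nearest observed residue and the sum is divided by the larger of the two residue counts. Representing each residue by a single coordinate and using the 5.0 Å threshold as d0 are assumptions of this sketch; consult the cited paper for the authoritative definition.

```python
from math import dist


def bdt_score(predicted_coords, observed_coords, d0=5.0):
    """Distance-weighted binding-site score in [0, 1] (illustrative sketch)."""
    if not predicted_coords or not observed_coords:
        return 0.0
    total = 0.0
    for p in predicted_coords:
        # Distance from this predicted residue to the nearest observed one.
        d_min = min(dist(p, o) for o in observed_coords)
        total += 1.0 / (1.0 + (d_min / d0) ** 2)
    # Normalising by the larger set penalises over- and under-prediction.
    return total / max(len(predicted_coords), len(observed_coords))


observed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
print(bdt_score(observed, observed))           # exact prediction
print(bdt_score([(0.0, 5.0, 0.0)], observed))  # near miss, under-predicted
```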
The Rank is computed as the Spearman rank correlation coefficient, which quantifies the non-parametric statistical dependency between the prediction and the reference. [ref.: C. Spearman,
Rank is calculated for each of the four ligand categories individually and as the average over all assessed categories (depending on the ligands observed in the target structures). Rank ranges from -1.0 to 1.0, where 1.0 translates to a perfect and 0.0 to a random prediction.
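The Spearman coefficient is the Pearson coefficient applied to the ranks of the values. A minimal sketch, assuming tied values receive their average rank and that inputs are equally long lists of per-residue values (function names are illustrative):

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(pred, ref):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(pred) != len(ref) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rp, rr = average_ranks(pred), average_ranks(ref)
    n = len(rp)
    mp, mr = sum(rp) / n, sum(rr) / n
    cov = sum((a - mp) * (b - mr) for a, b in zip(rp, rr))
    vp = sum((a - mp) ** 2 for a in rp)
    vr = sum((b - mr) ** 2 for b in rr)
    if vp == 0 or vr == 0:
        raise ValueError("rank correlation undefined for constant input")
    return cov / sqrt(vp * vr)


# A monotonic but non-linear relationship still gives a perfect rank score.
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```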
The Response Time is the time from the submission of a target to the server until CAMEO receives the model. It depends strongly on the age of the hardware and the load on the server, and is hence not necessarily an indication of the efficiency of the algorithms. In addition, the logged response times may not reflect a typical user's experience.
The area under the curve (AUC) of the receiver operating characteristic (ROC) is calculated for each of the four ligand categories individually and as the average over all assessed categories (depending on the ligands observed in the target structures). The ROC AUC ranges from 0.0 to 1.0, where 1.0 translates to a perfect and 0.5 to a random prediction. [ref.: T. Fawcett,
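The ROC AUC equals the probability that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue, with ties counted as one half. The sketch below uses this pairwise formulation; the function name and the per-residue score/label layout are assumptions for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))


# Predicted binding probabilities vs. observed binding residues (1 = binding).
print(roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
```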
A CAMEO server can be registered as a public server under its full name, or as an anonymous server, in which case all scoring is performed but the name is disguised ('serverx'). See our complete list of registered servers.
A CAMEO target is based on the weekly pre-release of new PDB structures, whose sequences are submitted to all registered servers. Targets are sometimes referred to as reference predictions.